
Read a multi-line JSON file in Spark (Scala)

I'm learning Spark with Scala. I have a JSON file as follows:

[
  {
    "name": "ali",
    "age": "13",
    "phone": "09123455737",
    "sex": "m"
  },{
    "name": "amir",
    "age": "24",
    "phone": "09123475737",
    "sex": "m"
  }
]

and this is the only code I have:

val sqlContext = new org.apache.spark.sql.SQLContext(sc)
val jsonFile = sqlContext.read.json("path-to-json-file")

The resulting schema contains only a single column, _corrupt_record: string, and nothing else. When I put each person (object) on a single line, the code works fine.

How can I read a JSON file whose records span multiple lines with sqlContext in Spark?

reza asked Dec 28 '25 07:12


1 Answer

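By default, the JSON reader expects one complete JSON object per line, so a pretty-printed array like yours ends up as corrupt records. You will have to read each file as a whole into an RDD yourself and then convert it to a Dataset: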

spark.read.json(sparkContext.wholeTextFiles(...).values)
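
To make the one-liner above concrete, here is a minimal sketch under a few assumptions: Spark 2.2 or later, a local session, and a temporary sample file that mirrors the JSON from the question. The object name MultiLineJson and its helper names are hypothetical and only serve the illustration. It also shows the reader's multiLine option, which should give the same rows without going through an RDD.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.Files

import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical helper object, for illustration only.
object MultiLineJson {

  // Writes a sample file shaped like the JSON array in the question
  // and returns its path as a string.
  def writeSample(): String = {
    val json =
      """[
        |  {
        |    "name": "ali",
        |    "age": "13",
        |    "phone": "09123455737",
        |    "sex": "m"
        |  },{
        |    "name": "amir",
        |    "age": "24",
        |    "phone": "09123475737",
        |    "sex": "m"
        |  }
        |]""".stripMargin
    val path = Files.createTempFile("people", ".json")
    Files.write(path, json.getBytes(StandardCharsets.UTF_8))
    path.toString
  }

  // Approach from the answer: wholeTextFiles yields one (path, content)
  // pair per file, so the parser sees the whole array as one document.
  def readViaWholeTextFiles(spark: SparkSession, path: String): DataFrame = {
    import spark.implicits._
    val contents = spark.sparkContext.wholeTextFiles(path).values
    spark.read.json(spark.createDataset(contents))
  }

  // Alternative (assumes Spark 2.2+): let the JSON reader itself
  // accept records that span several lines.
  def readViaMultiLineOption(spark: SparkSession, path: String): DataFrame =
    spark.read.option("multiLine", "true").json(path)

  def main(args: Array[String]): Unit = {
    // Local session for the sketch; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("multiline-json-sketch")
      .master("local[*]")
      .getOrCreate()

    val path = writeSample()
    readViaWholeTextFiles(spark, path).show()
    readViaMultiLineOption(spark, path).show()

    spark.stop()
  }
}
```

Note that wholeTextFiles loads each file fully into memory as a single string, so it is best suited to many small or medium files rather than one very large one.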
Justin Pihony answered Dec 30 '25 22:12