
How to convert a JSON string to a DataFrame in Spark

I want to convert the string variable below to a DataFrame in Spark.

val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""

I know how to create a DataFrame from a JSON file:

sqlContext.read.json("file.json") 

but I don't know how to create a DataFrame from a string variable.

How can I convert a JSON string variable to a DataFrame?

lucas kim asked Jul 08 '16 16:07



People also ask

How do I read a JSON file in Spark?

Spark SQL can automatically infer the schema of a JSON dataset and load it as a DataFrame using the read.json() function, which loads data from a directory of JSON files where each line of the files is a JSON object. Note that a file offered as a JSON file here is not a typical JSON file: each line must contain a separate, self-contained, valid JSON object.


1 Answer

For Spark 2.2+:

import spark.implicits._
val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""
val df = spark.read.json(Seq(jsonStr).toDS)
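To make the Spark 2.2+ variant easier to try outside spark-shell, here is a minimal sketch written as a Scala script. It assumes a local SparkSession (the master setting and the app name are illustrative choices, not part of the answer) and reads the nested fields back to check that the string was parsed as expected.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a local session for experimentation. In spark-shell, `spark`
// already exists and getOrCreate() simply returns it.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("json-string-to-df") // hypothetical app name
  .getOrCreate()
import spark.implicits._ // provides .toDS on Seq[String]

val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""

// Each element of the Dataset[String] is parsed as one JSON document.
val df = spark.read.json(Seq(jsonStr).toDS)

// Show the inferred schema: a struct column named "metadata".
df.printSchema()

// Nested fields can be addressed with dot notation; whole numbers are
// inferred as Long.
val key = df.select("metadata.key").head.getLong(0)
val value = df.select("metadata.value").head.getLong(0)

spark.stop()
```

If the JSON strings come from somewhere else (a queue, a database column), the same call should work on any Dataset[String], one JSON document per element.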

For Spark 2.1.x:

val events = sc.parallelize("""{"action":"create","timestamp":"2016-01-07T00:01:17Z"}""" :: Nil)
val df = sqlContext.read.json(events)

Hint: this uses the sqlContext.read.json(jsonRDD: RDD[String]) overload. There is also sqlContext.read.json(path: String), which reads a JSON file directly.

For older versions:

val jsonStr = """{ "metadata": { "key": 84896, "value": 54 }}"""
val rdd = sc.parallelize(Seq(jsonStr))
val df = sqlContext.read.json(rdd)
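The RDD-based approach also handles several JSON strings at once, since each RDD element becomes one row. The following is a rough sketch as a Scala script, assuming a Spark version where SparkContext.getOrCreate and sqlContext.read are available (1.4 or later); the second record, the master setting, and the app name are made up for illustration.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

// Assumption: local mode for experimentation. In spark-shell, `sc` and
// `sqlContext` are already defined and these three lines can be skipped.
val conf = new SparkConf().setMaster("local[*]").setAppName("json-rdd-to-df")
val sc = SparkContext.getOrCreate(conf)
val sqlContext = new SQLContext(sc)

val jsonStrings = Seq(
  """{ "metadata": { "key": 84896, "value": 54 }}""",
  """{ "metadata": { "key": 84897, "value": 55 }}""" // hypothetical second record
)

// One RDD element per JSON document; each element becomes one row.
val rdd = sc.parallelize(jsonStrings)
val df = sqlContext.read.json(rdd)

val rowCount = df.count()
// Collect the nested keys back to the driver to inspect them.
val keys = df.select("metadata.key").collect().map(_.getLong(0)).sorted

sc.stop()
```

Collecting to the driver is only sensible for small results like this one; for larger data, df.show() or further DataFrame operations are the usual choice.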
Jean Logeart answered Sep 19 '22 19:09
