Here is my JSON:
[{"dict": {"key": "value1"}}, {"dict": {"key": "value2"}}]
Here is my parse code:
val mdf = sparkSession.read.option("multiLine","true").json("multi2.json")
mdf.show(false)
This outputs:
+--------+
|dict    |
+--------+
|[value1]|
|[value2]|
+--------+
I want to see the name-value pairs, i.e. both the keys and the values.
How do I do this?
Thanks
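If you want to expand the data, just select dict.* (the option is written multiline here; the Spark documentation spells it multiLine, and since option keys are matched case-insensitively, either spelling should work):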
val df = spark.read.option("multiline", "true").json("multi2.json")
df.select($"dict.*").show
// +------+
// |   key|
// +------+
// |value1|
// |value2|
// +------+
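To see why show printed [value1] in the first place, it can help to look at the inferred schema: dict is read as a struct, and show only prints a struct's values, not its field names. The following is a minimal sketch rather than a definitive recipe. It assumes a local SparkSession and feeds the question's two records from memory (one JSON object per line, so no multiline option is needed); the object name InspectDictStruct and the helper load are made up for illustration.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

// Hypothetical helper object, not part of the original answer.
object InspectDictStruct {
  // Reads the question's two records from an in-memory Dataset[String].
  def load(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val lines = Seq(
      """{"dict": {"key": "value1"}}""",
      """{"dict": {"key": "value2"}}"""
    ).toDS()
    spark.read.json(lines)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run
      .appName("inspect-dict-struct")
      .getOrCreate()

    val df = load(spark)
    // Shows that dict is inferred as a struct with one string field per JSON key.
    df.printSchema()
    // Star expansion turns each struct field into a top-level column.
    println(df.select(col("dict.*")).columns.mkString(", "))

    spark.stop()
  }
}
```

Because the struct fields are inferred from the data, a key that appears in only some records becomes a nullable column after dict.* is expanded.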
If you want to treat it as a dictionary just provide the schema:
import org.apache.spark.sql.types._
val schema = StructType(Seq(
  StructField("dict", MapType(StringType, StringType))
))
val dfm = spark.read
  .schema(schema)
  .option("multiline", "true")
  .json("multi2.json")
dfm.show
// +------------------+
// |              dict|
// +------------------+
// |Map(key -> value1)|
// |Map(key -> value2)|
// +------------------+
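The same map schema can also be written as a DDL string, which is shorter when there are only a few columns. This is a sketch under some assumptions: a Spark version where DataFrameReader.schema accepts a DDL string (2.3 or later, as far as I know), a local SparkSession, and the question's records supplied from memory instead of multi2.json. The names DdlMapSchema, ddl and load are illustrative only.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical helper object, not part of the original answer.
object DdlMapSchema {
  // DDL spelling of StructType(Seq(StructField("dict", MapType(StringType, StringType))))
  val ddl = "dict MAP<STRING, STRING>"

  // Reads the question's two records from memory using the DDL schema.
  def load(spark: SparkSession): DataFrame = {
    import spark.implicits._
    val lines = Seq(
      """{"dict": {"key": "value1"}}""",
      """{"dict": {"key": "value2"}}"""
    ).toDS()
    spark.read.schema(ddl).json(lines)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run
      .appName("ddl-map-schema")
      .getOrCreate()

    // Parse the DDL on its own to check what schema it describes.
    println(StructType.fromDDL(ddl).treeString)
    // Each row now carries a Map, so keys and values are both available.
    load(spark).collect().foreach(row => println(row.getMap[String, String](0)))

    spark.stop()
  }
}
```

A map schema suits data where the keys vary from record to record; all values are then read as strings.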
And if you want one key-value pair per row, just explode the result:
import org.apache.spark.sql.functions._
dfm.select(explode(col("dict"))).show
// +---+------+
// |key| value|
// +---+------+
// |key|value1|
// |key|value2|
// +---+------+
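Exploding a map yields two columns called key and value by default, which is confusing here because the JSON key itself is named "key". As a sketch, the columns can be renamed by passing a Seq of aliases to as. The assumptions: a local SparkSession, in-memory input instead of multi2.json, and a second record extended with a made-up "other" entry to show that each map entry becomes its own row. The names ExplodeNamedPairs and pairs are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.{col, explode}

// Hypothetical helper object, not part of the original answer.
object ExplodeNamedPairs {
  // Returns one (name, value) tuple per map entry.
  def pairs(spark: SparkSession): Array[(String, String)] = {
    import spark.implicits._
    val lines = Seq(
      """{"dict": {"key": "value1"}}""",
      // "other" is an invented extra entry, added only for illustration
      """{"dict": {"key": "value2", "other": "value3"}}"""
    ).toDS()

    val dfm = spark.read.schema("dict MAP<STRING, STRING>").json(lines)

    dfm
      // explode on a map produces two columns; the Seq names both of them
      .select(explode(col("dict")).as(Seq("name", "value")))
      .as[(String, String)]
      .collect()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]") // assumption: local run
      .appName("explode-named-pairs")
      .getOrCreate()

    pairs(spark).foreach { case (name, value) => println(s"$name -> $value") }

    spark.stop()
  }
}
```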