 

Error while exploding a struct column in Spark

I have a dataframe whose schema looks like this:

event: struct (nullable = true)
|    | event_category: string (nullable = true)
|    | event_name: string (nullable = true)
|    | properties: struct (nullable = true)
|    |    | ErrorCode: string (nullable = true)
|    |    | ErrorDescription: string (nullable = true)

I am trying to explode the struct column properties using the following code:

df_json.withColumn("event_properties", explode($"event.properties"))

But it is throwing the following exception:

cannot resolve 'explode(`event`.`properties`)' due to data type mismatch: 
input to function explode should be array or map type, 
not StructType(StructField(IDFA,StringType,true),

How can I explode the properties column?

shiva.n404 asked Jan 18 '18 06:01

People also ask

How do you explode a struct in Spark?

Problem: how to explode an array of StructType columns into rows with Spark. Solution: Spark's explode function turns an ArrayType(StructType) column into one row per array element; the Scala example below starts from a DataFrame that has struct columns inside an array.
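
As a minimal sketch in the Scala API (the DataFrame, column names, and sample values here are made up for illustration):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("explode-array-demo").getOrCreate()
import spark.implicits._

// items becomes an array<struct<_1:string,_2:int>> column
val orders = Seq(
  ("order-1", Seq(("sku-1", 2), ("sku-2", 1))),
  ("order-2", Seq(("sku-3", 5)))
).toDF("order_id", "items")

// explode yields one row per array element; each element is still a struct
val exploded = orders.withColumn("item", explode($"items"))

// the struct's fields can then be selected out as ordinary columns
exploded.select($"order_id", $"item._1".as("sku"), $"item._2".as("qty")).show(false)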

What is a struct column PySpark?

In PySpark, StructType defines the schema of a DataFrame. It is a collection (list) of StructField objects, each of which specifies a column's name, its data type, and a nullability flag.
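
The same classes exist in the Scala API used elsewhere on this page; a minimal sketch of defining a schema with them (the column names are hypothetical):

import org.apache.spark.sql.types._

// each StructField carries a column name, a data type and a nullability flag;
// the StructType groups them into the row schema of a DataFrame
val eventSchema = StructType(Seq(
  StructField("event_category", StringType, nullable = true),
  StructField("event_name", StringType, nullable = true)
))

// the schema can then be supplied when reading data, e.g.
// spark.read.schema(eventSchema).json("events.json")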

What is struct in Spark SQL?

The Spark SQL StructType and StructField classes are used to programmatically specify a DataFrame's schema and to create complex columns such as nested struct, array, and map columns. A StructType is a collection of StructFields.
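
For example, a nested schema along the lines of the one in the question could be declared as below (the tags and attrs columns are invented purely to show array and map fields):

import org.apache.spark.sql.types._

// properties is itself a struct, matching the shape shown in the question
val propertiesType = StructType(Seq(
  StructField("ErrorCode", StringType, nullable = true),
  StructField("ErrorDescription", StringType, nullable = true)
))

val eventType = StructType(Seq(
  StructField("event_category", StringType, nullable = true),
  StructField("event_name", StringType, nullable = true),
  StructField("properties", propertiesType, nullable = true)
))

// array and map columns are declared the same way
val fullSchema = StructType(Seq(
  StructField("event", eventType, nullable = true),
  StructField("tags", ArrayType(StringType), nullable = true),
  StructField("attrs", MapType(StringType, StringType), nullable = true)
))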




1 Answer

explode can only be applied to array or map columns, so you need to wrap the fields of the properties struct in an array and then apply the explode function, as below:

import org.apache.spark.sql.functions._
df_json.withColumn("event_properties", explode(array($"event.properties.*"))).show(false)

That should give you the result you need.
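
For context, a self-contained sketch of the above, assuming a one-row DataFrame built to match the schema in the question (the sample values are made up):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

val spark = SparkSession.builder().master("local[*]").appName("explode-struct").getOrCreate()
import spark.implicits._

// build a single-row DataFrame whose schema matches the one in the question
val df_json = Seq(("click", "checkout_failed", "500", "Internal error"))
  .toDF("event_category", "event_name", "ErrorCode", "ErrorDescription")
  .select(
    struct(
      $"event_category",
      $"event_name",
      struct($"ErrorCode", $"ErrorDescription").as("properties")
    ).as("event")
  )

// wrapping the struct's fields in an array gives explode a valid input:
// one output row per field of the struct
df_json.withColumn("event_properties", explode(array($"event.properties.*"))).show(false)

// if the goal is one column per field rather than one row per field,
// the struct can also be flattened without explode
df_json.select($"event.properties.*").show(false)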

Ramesh Maharjan answered Oct 17 '22 13:10