Schema:
|-- c0: string (nullable = true)
|-- c1: struct (nullable = true)
| |-- c2: array (nullable = true)
| | |-- element: struct (containsNull = true)
| | | |-- orangeID: string (nullable = true)
| | | |-- orangeId: string (nullable = true)
I am trying to flatten the schema above in Spark.
Code:
var df = data.select($"c0", $"c1.*").select($"c0", explode($"c2")).select($"c0", $"col.orangeID", $"col.orangeId")
The flattening code works fine. The problem is in the last select, where the two column names differ only in the case of one letter (orangeID vs. orangeId). Hence I am getting this error:
Error:
org.apache.spark.sql.AnalysisException: Ambiguous reference to fields StructField(orangeID,StringType,true), StructField(orangeId,StringType,true);
Any suggestions on how to avoid this ambiguity would be great.
Turn on the Spark SQL case-sensitivity configuration and try again:
spark.sql("set spark.sql.caseSensitive=true")
With this setting enabled, the analyzer treats orangeID and orangeId as distinct field names, so the final select no longer raises the ambiguous-reference error.