
Spark 1.5.2: org.apache.spark.sql.AnalysisException: unresolved operator 'Union;

Tags:

apache-spark

I have two dataframes df1 and df2. Both of them have the following schema:

 |-- ts: long (nullable = true)
 |-- id: integer (nullable = true)
 |-- managers: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- projects: array (nullable = true)
 |    |-- element: string (containsNull = true)

df1 is created from an Avro file, while df2 is created from an equivalent Parquet file. However, when I execute df1.unionAll(df2).show(), I get the following error:

    org.apache.spark.sql.AnalysisException: unresolved operator 'Union;
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:37)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.failAnalysis(Analyzer.scala:44)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:174)
    at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$$anonfun$checkAnalysis$1.apply(CheckAnalysis.scala:49)
    at org.apache.spark.sql.catalyst.trees.TreeNode.foreachUp(TreeNode.scala:103)
Neel asked Jul 29 '16 04:07


2 Answers

I ran into the same situation. It turns out that not only do the fields need to be the same, but they must also appear in exactly the same order in both dataframes for unionAll to work.
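To see where the two column orders diverge, a small check along these lines may help. This is only a sketch: `ColumnOrderCheck` and `positionalMismatches` are hypothetical names, not part of Spark, and it compares column names only, not types.

```scala
// Hypothetical helper (not part of Spark): reports every position at which
// two column-name sequences disagree, as (index, leftName, rightName).
object ColumnOrderCheck {
  def positionalMismatches(left: Seq[String], right: Seq[String]): Seq[(Int, String, String)] =
    left
      .zipAll(right, "<missing>", "<missing>") // pad the shorter side
      .zipWithIndex
      .collect { case ((l, r), i) if l != r => (i, l, r) }
}
```

With the dataframes from the question, it could be called as ColumnOrderCheck.positionalMismatches(df1.columns, df2.columns). An empty result means the names line up positionally; if the union still fails after that, the column types are the next thing to compare.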

Even Cheng answered Sep 25 '22 00:09


This is old and there are already some answers around, but I just hit this problem while trying to union two dataframes, as in:

// Union 2 dataframes (rows of right appended to rows of left)
val df = left.unionAll(right)

As others have mentioned, order matters. So just select the right dataframe's columns in the same order as the left dataframe's columns:

// Union 2 dataframes, selecting the right columns in the left dataframe's order
import org.apache.spark.sql.functions.col
val df = left.unionAll(right.select(left.columns.map(col): _*))
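If the two dataframes might not have exactly the same column names, the select itself can fail with its own AnalysisException. A guarded variant could look like the sketch below. `UnionHelpers` and `unionAligned` are hypothetical names, and the code assumes Spark SQL is on the classpath and that column names contain no dots (which `col` would treat as nested field access).

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// Hypothetical helper (not part of Spark): reorders `right` to match `left`
// before unionAll, failing early when the column names differ.
object UnionHelpers {
  def unionAligned(left: DataFrame, right: DataFrame): DataFrame = {
    val missing = left.columns.toSet -- right.columns.toSet // in left only
    val extra   = right.columns.toSet -- left.columns.toSet // in right only
    require(
      missing.isEmpty && extra.isEmpty,
      s"Column names differ: missing from right = $missing, extra in right = $extra"
    )
    // Same one-liner as above: select right's columns in left's order.
    left.unionAll(right.select(left.columns.map(col): _*))
  }
}
```

This only aligns names and order. A union can still fail if the column types differ. On Spark 2.3 and later, the built-in unionByName resolves columns by name and may make a helper like this unnecessary.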
David Royo answered Sep 27 '22 00:09