DataFrame fail to find the column name after join condition

I am getting the error "sql.AnalysisException: cannot resolve column_name" when performing a join with the DataFrame API. The column name exists, and the same join works fine when written as SQL against a HiveContext. The code in question:

DataFrame df = df1
  .join(df2, df1.col("MERCHANT").equalTo(df2.col("MERCHANT")))
  .select(df2.col("MERCH_ID"), df1.col("MERCHANT"));

I have tried the alias function too, but got the same "cannot resolve column name" problem, with the following exception:

resolved attribute(s) MERCH_ID#738 missing from MERCHANT#737,MERCHANT#928,MERCH_ID#929,MER_LOC#930 in operator !Project [MERCH_ID#738,MERCHANT#737];

at org.apache.spark.sql.catalyst.analysis.CheckAnalysis$class.failAnalysis(CheckAnalysis.scala:38)

Spark Version: 1.6

The problem has been faced in both Scala and Java Spark.

In Scala the issue was resolved using alias, but in Java I am still getting the error.

abhijit nag asked Aug 25 '17 13:08

People also ask

How do you resolve ambiguous column names in join operation?

One of the simplest ways to solve an "ambiguous column name" error, without renaming any columns, is to give each table you join an alias. This tells the SQL engine unambiguously which table every column comes from.
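The aliasing idea can be shown with plain SQL (here a minimal SQLite sketch rather than Spark; the table and column names are made up to mirror the question):

```python
import sqlite3

# Two hypothetical tables that share a column name "merchant".
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE df1 (merchant TEXT);
    CREATE TABLE df2 (merchant TEXT, merch_id INTEGER);
    INSERT INTO df1 VALUES ('acme');
    INSERT INTO df2 VALUES ('acme', 1);
""")

# Selecting bare "merchant" after the join would raise
# "ambiguous column name"; aliasing the tables (AS a / AS b)
# lets us qualify each reference unambiguously.
rows = conn.execute("""
    SELECT b.merch_id, a.merchant
    FROM df1 AS a
    JOIN df2 AS b ON a.merchant = b.merchant
""").fetchall()
print(rows)  # [(1, 'acme')]
```

The same pattern is what `df1.alias("df1")` / `df2.alias("df2")` achieves in the Spark answer below.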

How do you resolve ambiguous column name in PySpark?

Solution: Renaming one of the ambiguous columns to a different name will sort out this issue. Spark has no direct method for this case, so you need to use df.columns to find the count and index of the duplicated columns, and then rename the duplicates in the Spark DataFrame.
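The renaming step can be sketched in plain Python (no Spark; this only shows the logic you would apply to `df.columns` before rebuilding the DataFrame, e.g. with `toDF(*new_names)`):

```python
def dedupe_columns(columns):
    """Make column names unique by suffixing repeats with _1, _2, ..."""
    seen = {}
    out = []
    for name in columns:
        count = seen.get(name, 0)
        out.append(name if count == 0 else f"{name}_{count}")
        seen[name] = count + 1
    return out

# Column list as it might look after a join on MERCHANT:
print(dedupe_columns(["MERCHANT", "MERCH_ID", "MERCHANT"]))
# ['MERCHANT', 'MERCH_ID', 'MERCHANT_1']
```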

How do you remove duplicate columns after a join in Spark?

Method 1: Using drop(). With dataframe as the first DataFrame, dataframe1 as the second, and "inner" specifying an inner join, calling drop() on the join column removes the duplicate copy contributed by the first DataFrame.
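A minimal plain-Python sketch of the "drop the duplicate join column" idea (no Spark; the data is made up):

```python
# Two record sets sharing the join key "merchant".
df1 = [{"merchant": "acme"}]
df2 = [{"merchant": "acme", "merch_id": 1}]

joined = []
for left in df1:
    for right in df2:
        if left["merchant"] == right["merchant"]:  # inner join condition
            # Merging the dicts keeps a single copy of the join key,
            # which is what drop() achieves on the joined DataFrame.
            joined.append({**left, **right})

print(joined)  # [{'merchant': 'acme', 'merch_id': 1}]
```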


2 Answers

import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Row;

Dataset<Row> joinedDF = DF1.join(DF2, DF1.col("_c1").equalTo(DF2.col("_c1")));

Here _c1 refers to the column at index 1; Spark auto-generates the names _c0, _c1, … when the input has no header.

Arun Mohan answered Oct 20 '22 01:10


From my experience it is best to avoid DataFrame.col and DataFrame.apply unless they are necessary for disambiguation (and even then aliasing is better). Try using independent Column objects instead:

import org.apache.spark.sql.functions;

DataFrame df = df1.alias("df1")
  .join(df2.alias("df2"), functions.col("df1.MERCHANT").equalTo(functions.col("df2.MERCHANT")))
  .select(functions.col("df2.MERCH_ID"), functions.col("df2.MERCHANT"));

Alper t. Turker answered Oct 20 '22 02:10