I have a Spark program that reads CSV files into DataFrames. Once loaded, I manipulate them using Spark SQL.
When I run the Spark job, it fails with the following exception:
org.apache.spark.sql.AnalysisException: cannot resolve 'action' given input columns ["alpha", "beta", "gamma", "delta", "action"]
The exception above is thrown when Spark SQL tries to parse the following query:
SELECT *,
IF(action = 'A', 1, 0) a_count,
IF(action = 'B', 1, 0) b_count,
IF(action = 'C', 1, 0) c_count,
IF(action = 'D', 1, 0) d_count,
IF(action = 'E', 1, 0) e_count
FROM my_table
This code worked fine before I upgraded to Spark 2.0. Does anyone have an idea what could cause this issue?
Edit: I'm loading the CSV files using the Databricks CSV parser:
sqlContext.read().format("csv")
.option("header", "false")
.option("inferSchema", "false")
.option("parserLib", "univocity")
.load(pathToLoad);
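One thing worth checking first (an assumption on my part, not confirmed from the post): "cannot resolve" errors where the name *does* appear in the input-column list are often caused by an invisible character in the actual column name, such as a UTF-8 BOM or a non-breaking space. A minimal Python sketch to spot this, assuming you can paste the schema's column names into a list (the names below are hypothetical):

```python
# Sketch: spot invisible characters in column names.
# Paste the real names here, e.g. from df.schema or the error message.
names = ["alpha", "beta", "gamma", "delta", "\ufeffaction"]  # BOM hidden in last name

for name in names:
    # repr() makes BOMs, non-breaking spaces, and trailing blanks visible
    if name != name.strip() or not name.isascii():
        print(f"suspicious column: {name!r}")
    else:
        print(f"ok: {name!r}")
```

If a name prints with extra characters in its repr, the fix is to clean the schema or rename the column, not to change the query.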
Try adding backquotes (backticks) around the column name, so the parser treats action as an identifier rather than a keyword:
SELECT *,
IF(`action` = 'A', 1, 0) a_count,
IF(`action` = 'B', 1, 0) b_count,
IF(`action` = 'C', 1, 0) c_count,
IF(`action` = 'D', 1, 0) d_count,
IF(`action` = 'E', 1, 0) e_count
FROM my_table
The same identifier-quoting syntax applies to some databases, such as MySQL, as well.
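If you build queries like this programmatically, it is safer to quote every column name up front. A small sketch (the helper name quote_ident and the action codes are my own, not from the original post); note that Spark SQL, like MySQL, escapes a literal backquote inside a quoted identifier by doubling it:

```python
def quote_ident(name: str) -> str:
    # Wrap a column name in backquotes, doubling any embedded backquote.
    return "`" + name.replace("`", "``") + "`"

actions = ["A", "B", "C", "D", "E"]
clauses = ", ".join(
    f"IF({quote_ident('action')} = '{a}', 1, 0) {a.lower()}_count" for a in actions
)
sql = f"SELECT *, {clauses} FROM my_table"
print(sql)
```

This produces the same backquoted query as above and keeps working even if a column name collides with a reserved word.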