Enable case sensitivity for spark.sql globally

The option spark.sql.caseSensitive controls whether column names and the like are treated as case sensitive. It can be set, for example, with

spark_session.sql('set spark.sql.caseSensitive=true')

and is false by default.
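For illustration, here is a minimal sketch of what the flag changes; the DataFrame and its column names are made up for this example:

# Assumes an existing SparkSession named spark_session; the DataFrame is
# purely illustrative.
df = spark_session.createDataFrame([(1, 'a')], ['Id', 'Value'])

spark_session.sql('set spark.sql.caseSensitive=false')
df.select('id')   # resolves: 'id' matches 'Id' case-insensitively

spark_session.sql('set spark.sql.caseSensitive=true')
df.select('id')   # now fails with an AnalysisException: cannot resolve 'id'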

It does not seem to be possible, though, to enable it globally in $SPARK_HOME/conf/spark-defaults.conf with

spark.sql.caseSensitive: True

Is that intended, or is there some other file for setting SQL options?

The source code also states that enabling this at all is highly discouraged. What is the rationale behind that advice?

asked Mar 22 '17 by karlson

3 Answers

Yet another way for PySpark, using a SparkSession object named spark:

spark.conf.set('spark.sql.caseSensitive', True)
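If you want the option in place from the start, it can also be passed when the session is built; this is just a sketch, and the app name is a placeholder:

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName('my-app')                        # placeholder name
         .config('spark.sql.caseSensitive', 'true')
         .getOrCreate())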
answered Oct 09 '22 by Ankur


As it turns out, setting

spark.sql.caseSensitive: True

in $SPARK_HOME/conf/spark-defaults.conf DOES work after all. It just has to be done in the configuration of the Spark driver as well, not only on the master or workers. Apparently I had forgotten that when I last tried.
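A quick way to check that the value from spark-defaults.conf actually reached the driver is to read it back from the running session (a sketch, assuming a SparkSession named spark):

# Returns the effective value as a string, e.g. 'true' if the driver
# picked up spark-defaults.conf.
spark.conf.get('spark.sql.caseSensitive')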

answered Oct 09 '22 by karlson


Try sqlContext.sql("set spark.sql.caseSensitive=true") in your Python code; that worked for me.
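For completeness, here is a minimal sketch of how such a sqlContext is typically obtained in older PySpark code; the app name is just a placeholder:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext(appName='my-app')   # placeholder app name
sqlContext = SQLContext(sc)
sqlContext.sql("set spark.sql.caseSensitive=true")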

answered Oct 09 '22 by Jie