The option spark.sql.caseSensitive controls whether column names and other identifiers are treated as case sensitive. It can be set at runtime, e.g. with
spark_session.sql('set spark.sql.caseSensitive=true')
and is false by default.
It does not seem to be possible to enable it globally in $SPARK_HOME/conf/spark-defaults.conf with
spark.sql.caseSensitive: True
though. Is that intended, or is there some other file for setting SQL options?
Also, the source code states that enabling this is highly discouraged. What is the rationale behind that advice?
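
To make the question concrete, here is a minimal sketch of what the flag changes (the session and column names are made up for illustration):

from pyspark.sql import SparkSession

spark_session = SparkSession.builder.master("local[*]").getOrCreate()
df = spark_session.createDataFrame([(1, "a")], ["Id", "Value"])

# Default, case insensitive: "id" resolves to the column "Id".
spark_session.sql("set spark.sql.caseSensitive=false")
df.select("id").show()

# Case sensitive: only the exact spelling resolves.
spark_session.sql("set spark.sql.caseSensitive=true")
df.select("Id").show()
# df.select("id")  # would now raise an AnalysisException
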
Yet another way for PySpark: using a SparkSession object named spark,
spark.conf.set('spark.sql.caseSensitive', True)
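
If the setting needs to be in effect from the moment the session exists, it can also be passed when the session is built. A short sketch (the app name is made up):

from pyspark.sql import SparkSession

spark = (SparkSession.builder
         .appName("case-sensitive-demo")  # made-up name
         .config("spark.sql.caseSensitive", "true")
         .getOrCreate())

print(spark.conf.get("spark.sql.caseSensitive"))  # 'true'
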
As it turns out, setting
spark.sql.caseSensitive: True
in $SPARK_HOME/conf/spark-defaults.conf
DOES work after all. It just has to be done in the configuration of the Spark driver as well, not the master or workers. Apparently I forgot that the last time I tried.
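
For reference, the Spark documentation shows spark-defaults.conf entries as whitespace-separated key/value pairs, and the same property can also be passed to the driver per job via spark-submit (the script name below is made up):

spark.sql.caseSensitive true

spark-submit --conf spark.sql.caseSensitive=true your_app.py
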
Try sqlContext.sql("set spark.sql.caseSensitive=true") in your Python code, which worked for me.
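
In case sqlContext is not already defined (e.g. outside an interactive shell), a minimal sketch of that approach using the legacy SQLContext entry point:

from pyspark import SparkContext
from pyspark.sql import SQLContext

sc = SparkContext.getOrCreate()
sqlContext = SQLContext(sc)  # legacy entry point; SparkSession is preferred on newer versions
sqlContext.sql("set spark.sql.caseSensitive=true")
print(sqlContext.getConf("spark.sql.caseSensitive"))  # 'true'
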