I'm experimenting with the spark-csv package (https://github.com/databricks/spark-csv) for reading CSV files into Spark DataFrames. Everything works, but all columns are assumed to be of StringType.
As shown in the Spark SQL documentation (https://spark.apache.org/docs/latest/sql-programming-guide.html), for built-in sources such as JSON the schema, including data types, can be inferred automatically. Can the types of the columns in a CSV file be inferred automatically as well?
Spark SQL can convert an RDD of Row objects to a DataFrame, inferring the data types. In addition, the Scala interface for Spark SQL supports automatically converting an RDD containing case classes to a DataFrame: the case class defines the schema of the table, and the names of the case class's constructor arguments are read via reflection and become the column names.
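As a minimal sketch of that reflection-based approach (assuming Spark 2.x and a hypothetical `Person` case class), Spark derives both column names and types from the case class:

```scala
import org.apache.spark.sql.SparkSession

// The case class defines the schema: name -> string, age -> integer.
case class Person(name: String, age: Int)

object ReflectionSchemaExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("reflection-schema")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._ // enables .toDF() on RDDs of case classes

    // Column names and types are read from Person via reflection.
    val peopleDF = spark.sparkContext
      .parallelize(Seq(Person("Alice", 30), Person("Bob", 25)))
      .toDF()

    peopleDF.printSchema() // root |-- name: string |-- age: integer
    spark.stop()
  }
}
```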
Starting from Spark 2, you can use the 'inferSchema' option, for example: getSparkSession().read().option("inferSchema", "true").csv("YOUR_CSV_PATH")
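A minimal sketch of that in Scala, assuming Spark 2.x and a hypothetical path "data/people.csv":

```scala
import org.apache.spark.sql.SparkSession

object CsvInferSchemaExample {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("csv-infer-schema")
      .master("local[*]")
      .getOrCreate()

    // With inferSchema, Spark scans the data and guesses column types
    // (int, double, timestamp, ...) instead of defaulting to StringType.
    val df = spark.read
      .option("header", "true")      // treat the first line as column names
      .option("inferSchema", "true") // infer column types from the data
      .csv("data/people.csv")        // hypothetical path

    df.printSchema()
    spark.stop()
  }
}
```

On Spark 1.x with the spark-csv package, the equivalent is to read via `format("com.databricks.spark.csv")` with the same `inferSchema` and `header` options and then call `load(path)`. Note that inferring the schema requires an extra pass over the data.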