 

Read CSV file in pyspark with ANSI encoding

I am trying to read in a CSV/text file that needs to be read using ANSI encoding, but this is not working. Any ideas?

mainDF= spark.read.format("csv")\
                  .option("encoding","ANSI")\
                  .option("header","true")\
                  .option("maxRowsInMemory",1000)\
                  .option("inferSchema","false")\
                  .option("delimiter", "¬")\
                  .load(path)

java.nio.charset.UnsupportedCharsetException: ANSI

The file is over 5GB hence the spark requirement.

I have also tried "ansi" in lower case.

Asked by Tiger_Stripes on Nov 14 '25
1 Answer

"ANSI" is not a registered Java charset name, which is why Spark throws `UnsupportedCharsetException`. What Windows tools call "ANSI" is in practice Windows-1252, a superset of ISO-8859-1, so replace the option with `.option("encoding", "ISO-8859-1")` (or `"windows-1252"`) and the read will work.
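A quick way to see why the rename works, sketched in plain Python rather than Spark (an analogy: Python's `codecs` registry behaves like Java's charset registry here, rejecting "ANSI" but accepting the real names). Note that the `¬` delimiter from the question is a single byte, 0xAC, in both encodings:

```python
import codecs

# "ANSI" is not a real charset name, so lookups fail --
# the same reason Spark raises UnsupportedCharsetException.
try:
    codecs.lookup("ANSI")
    ansi_known = True
except LookupError:
    ansi_known = False
assert not ansi_known

# The names that actually cover "ANSI" text are registered:
codecs.lookup("iso-8859-1")   # Latin-1
codecs.lookup("cp1252")       # Windows-1252, superset of Latin-1

# The "¬" delimiter (U+00AC) is byte 0xAC in both encodings,
# so the single-character delimiter option keeps working:
assert b"\xac".decode("iso-8859-1") == "¬"
assert b"\xac".decode("cp1252") == "¬"
```

In the original snippet, only the encoding line changes: `.option("encoding", "ISO-8859-1")`.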

Answered by Tiger_Stripes on Nov 17 '25

