Spark - Read csv file with quote

Question

I have a CSV file which has data contained in double quotes (").

"0001", "A", "001", "2017/01/01 12"

"0001", "B", "002", "2017/01/01 13"

I would like to read only the data itself, without the surrounding " characters.

spark.read
 .option("encoding", encoding)
 .option("header", header)
 .option("quote", quote)
 .option("sep", sep)
 .csv(path)  // path is a placeholder for the CSV file location

The other options work as expected, but quote does not seem to take effect: the values are loaded with the quote character (") still attached. How can I strip this character from the loaded data?

Result of dataframe.show:

+----+----+------+----------------+
| _c0| _c1|   _c2|             _c3|
+----+----+------+----------------+
|0001| "A"| "001"| "2017/01/01 12"|
|0001| "B"| "002"| "2017/01/01 13"|
+----+----+------+----------------+

koiralo · Accepted Answer

You can set the quote option as follows:

option("quote", "\"")

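If there is an extra space between fields, as in "abc", "xyz", the quote is no longer the first character of the field, so the parser treats it as ordinary data. In that case you also need: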

option("ignoreLeadingWhiteSpace", true)
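
Putting the two options together, a minimal Scala sketch might look like the following. It feeds the two sample rows from the question to the reader as an in-memory Dataset[String] so that no file is needed. The object name ReadQuotedCsv, the helper readQuoted, and the local master and app name are assumptions made for illustration, not part of the original question.

```scala
import org.apache.spark.sql.{DataFrame, Dataset, SparkSession}

object ReadQuotedCsv {
  // Hypothetical helper: parses CSV lines whose fields are quoted and
  // separated by a comma followed by a space.
  def readQuoted(spark: SparkSession, lines: Dataset[String]): DataFrame =
    spark.read
      .option("header", false)
      .option("sep", ",")
      .option("quote", "\"")                    // same character as Spark's default quote
      .option("ignoreLeadingWhiteSpace", true)  // skip the space before each opening quote
      .csv(lines)

  def main(args: Array[String]): Unit = {
    // Local session for illustration only; master and app name are assumptions.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("read-quoted-csv")
      .getOrCreate()
    import spark.implicits._

    // The two sample rows from the question, held in memory instead of a file.
    val lines = Seq(
      "\"0001\", \"A\", \"001\", \"2017/01/01 12\"",
      "\"0001\", \"B\", \"002\", \"2017/01/01 13\""
    ).toDS()

    readQuoted(spark, lines).show()
    spark.stop()
  }
}
```

When reading from a file, the same options should apply unchanged; only the final call would take a path instead of a Dataset[String].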

Hope this helps

Tags:

apache-spark

J.Done

1 Answers

koiralo
