CSV files with quote and comma chars inside fields

Tags: python, csv, quote

I have a stack of CSV files I want to parse - the problem is that half of them have quote marks used as quote marks, and commas, inside the main field. They are not really CSV, but they do have a fixed number of identifiable fields. The dialect="excel" setting works perfectly on files without the extra " and , chars inside the field.

This data is old/unsupported. I am trying to push some life into it.

e.g.

"AAAAA
AAAA
AAAA
AAAA","AAAAAAAA


AAAAAA
AAAAA "AAAAAA" AAAAAAAAA
AAAAAAAAAAAAAAAAAAAAAAAAA, AAAAA
AAAAAAAAA AAAAA AAAAAAAAAA
AAAAA, "AAAAA", AAAAAAAAA
AAAAAAAA AAAAAAAA
AAAAAAA
"

This trips the file parser, which throws the error _csv.Error: newline inside string. I narrowed it down to this being the issue by removing the quote marks from inside the 2nd field, after which csv.reader parses the file OK.

Some of the fields are multi-line - I'm not sure if that's important to know.

I have been poking around at the dialect settings, and whilst I can find 'skipinitialspace', this doesn't seem to solve the problem.

To be clear - this is not valid 'CSV'; it's data objects that loosely follow a CSV structure but have , and " chars inside the field text.

The lineterminator is \x0d\x0a

I have tried a number of different permutations of the doublequote and quoting settings in the dialect, but I can't get this to parse correctly.

I cannot be confident that a ," or ", combination exists only on field boundaries.

This problem only exists for one (the last) of several fields in the file, and there are several thousand files.

asked Feb 10 '12 23:02

Jay Gattuso

1 Answer

Have you tried passing csv.QUOTE_NONE via the quoting keyword arg? Without having some code or data to test this on, I have no way to know whether this actually works on your data, but it seems to work with the fragment you provided.

>>> import csv
>>> r = csv.reader(open('foo.csv', 'rb'), quoting=csv.QUOTE_NONE)
>>> for row in r: print row
... 
['"A"', '"B"', '"ccc "ccccccc" cccccc"']
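One caveat worth checking on real data: with csv.QUOTE_NONE the reader splits on every comma and treats every physical line as a row, so a last field containing commas or line breaks comes back in pieces. Since the question says the field count is fixed and only the last field is irregular, a possible alternative is to split each record by hand. The sketch below is written for Python 3 and is only an illustration: split_record is a hypothetical helper, the sample string is made up, and it assumes each record has already been isolated as one string, that every field is wrapped in double quotes, and that the earlier fields never contain the sequence "," themselves.

```python
import csv
import io


def split_record(record, n_fields):
    """Split one raw record into n_fields fields (hypothetical helper).

    Assumes every field is wrapped in double quotes, fields are separated
    by the exact sequence '","', and only the last field may contain
    stray quotes, commas, or line breaks.
    """
    body = record.strip("\r\n")
    if len(body) >= 2 and body.startswith('"') and body.endswith('"'):
        body = body[1:-1]  # drop the outermost pair of quotes
    # Split at most n_fields - 1 times; the remainder stays in the last field.
    fields = body.split('","', n_fields - 1)
    if len(fields) != n_fields:
        raise ValueError(f"expected {n_fields} fields, found {len(fields)}")
    return fields


# Made-up record: the last field holds quotes, a comma, and a line break.
sample = '"A","B","ccc "ccccccc", cc\r\nccc"\r\n'

# Python 3 form of the QUOTE_NONE call: every comma and line break splits.
quote_none_rows = list(
    csv.reader(io.StringIO(sample, newline=""), quoting=csv.QUOTE_NONE)
)

fields = split_record(sample, 3)
print(quote_none_rows)
print(fields)
```

Here the QUOTE_NONE reader returns two rows for the single record, while split_record keeps the last field whole. Finding where one record ends and the next begins in a multi-record file is not covered by this sketch and depends on the actual data.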

answered Nov 15 '22 16:11

senderle
