I am working with Python's CSV module, specifically the writer. My question is how can I add double quotes to a single item in a list and have the writer write the string the same way as a print statement would?
For example:
import csv
# test "data"
test = ['item1', '01', '001', 1]
# 'a' (append) is used to keep past results; the with block closes the file
with open('file.txt', 'a', newline='') as f:
    csvOut = csv.writer(f)
    test[1] = '"' + test[1] + '"'
    print(test)
    # prints: ['item1', '"01"', '001', 1]
    csvOut.writerow(test)
    # written in the output file: item1,"""01""",001,1
    # I was expecting: item1,"01",001,1
I tried adding a quoting=csv.QUOTE_NONE option, but that raised an error. I am guessing this is related to the many csv dialects, and I was hoping to avoid digging too far into that.
In retrospect I could probably have built my initial data set smarter and perhaps avoided the need for this situation but at this point curiosity is really getting the better of me (this is a simplified example): how do you keep the written output from adding those extra quotes?
Python's triple quotes let a string span multiple lines, including verbatim newlines, tabs, and any other special characters. As the name suggests, the syntax consists of three consecutive single or double quotes. Triple-quoted strings are also sometimes used for long comments in code.
csv.QUOTE_NONE tells the writer never to quote fields on output. When reading, it tells the reader to do no special processing of quote characters, so any quotes are kept as part of the field values.
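As a rough illustration of why QUOTE_NONE raised an error in the question, here is a minimal sketch that writes to an in-memory buffer instead of file.txt. The variable names and the choice of backslash as escapechar are my own assumptions, not something from the original post.

```python
import csv
import io

row = ['item1', '"01"', '001', 1]

# QUOTE_NONE with the default quotechar ('"') and no escapechar:
# the writer has no way to represent the embedded quotes, so it raises.
buf = io.StringIO()
try:
    csv.writer(buf, quoting=csv.QUOTE_NONE).writerow(row)
    raised = False
except csv.Error:
    raised = True

# Same settings plus an escapechar (backslash is an arbitrary choice):
# the quotes are written, but each one is prefixed with the escape character.
escaped_buf = io.StringIO()
csv.writer(escaped_buf, quoting=csv.QUOTE_NONE, escapechar='\\').writerow(row)
escaped_line = escaped_buf.getvalue()

print(raised)               # True
print(repr(escaped_line))   # 'item1,\\"01\\",001,1\r\n'
```

So QUOTE_NONE on its own does not give item1,"01",001,1 either: as long as " is still the quotechar, the writer insists on escaping it.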
wrap(text, width=70, **kwargs): This function wraps the input paragraph such that each line in the paragraph is at most width characters long. The wrap method returns a list of output lines. The returned list is empty if the wrapped output has no content.
It's not actually triple-quoting, although it looks that way. Try it with another example to see:
test = ['item1', 'abc"def']
Now you'll see that it writes this:
"abc""def"
In other words, it's just wrapping quotes around your string, and escaping the literal quote characters by doubling them, because that's how default Excel-style CSV handles quote characters.
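To see the wrap-and-double behaviour without touching a file, something like the following sketch should do. It uses io.StringIO as a stand-in for the output file, which is my own substitution, and reads the result back to show that the doubling is reversible.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf)  # default dialect: Excel-style CSV

writer.writerow(['item1', 'abc"def'])
writer.writerow(['item1', '"01"'])
written = buf.getvalue()
print(written)
# item1,"abc""def"
# item1,"""01"""

# Reading the text back with the same dialect undoes the quoting.
rows = list(csv.reader(io.StringIO(written)))
print(rows)
# [['item1', 'abc"def'], ['item1', '"01"']]
```

The three quotes around 01 are one wrapping quote plus one doubled literal quote on each side.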
The question is, what format do you want here? Almost anything you want (within reason) is doable, but you have to pick something. Backslash-escaping quotes? Backslash-escaping everything instead of using quotes in the first place? Single quotes instead of double quotes?
For example, this looks like an answer:
csvOut = csv.writer(open('file.txt','a'), quotechar="'")
… until you have an item like Filet O'Fish, and the whole thing gets single-quoted, the ' gets doubled, and you have the exact same problem you were trying to avoid. If you're aiming for human readability, and ' is a lot less common in your data than ", that may actually be the right answer, but it's not a perfect answer.
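Here is a small sketch of that trade-off, again writing to an in-memory buffer rather than file.txt (my substitution). The row contents are made up for illustration.

```python
import csv
import io

buf = io.StringIO()
writer = csv.writer(buf, quotechar="'")

# The double quotes now pass through untouched, but the apostrophe
# triggers single-quoting and gets doubled.
writer.writerow(['item1', '"01"', "Filet O'Fish"])
line = buf.getvalue()
print(line)
# item1,"01",'Filet O''Fish'
```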
And really, no answer can be perfect: you need some way to either quote or escape commas (and other things, like newlines), and the way you do that is going to add at least one more character that needs to be quote-doubled or escaped. If you know there are never any commas, newlines, etc. in your data, and there's at least one other character you know will never show up, you can get away with setting either quotechar to that other character, or escapechar to that other character and quoting=QUOTE_NONE. But the first time someone unexpectedly uses the character you were sure would never appear, your code will break, so you'd better actually be sure.
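A minimal sketch of the quotechar variant, assuming | is the character you are sure never shows up. Both that choice and the write_row helper are hypothetical, made up for this example.

```python
import csv
import io

def write_row(row, **fmt):
    """Hypothetical helper: write one row to a string with the given format options."""
    buf = io.StringIO()
    csv.writer(buf, **fmt).writerow(row)
    return buf.getvalue()

# Assumption: '|' never appears in the data, so nothing ever needs quoting.
safe = write_row(['item1', '"01"', '001', 1], quotechar='|')
print(safe)
# item1,"01",001,1

# The moment a comma or a '|' sneaks in, the quoting and doubling are back.
surprise = write_row(['item1', '"01"', 'a|b,c'], quotechar='|')
print(surprise)
# item1,"01",|a||b,c|
```

The first call produces exactly the line the question was hoping for, and the second shows how it falls apart once the "never appears" assumption is violated.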