I'm generating some CSV output using Ruby's built-in CSV. Everything works fine, but the customer wants the name field in the output to have wrapping double-quotes so the output looks like the input file. For instance, the input looks something like this:

    1,1.1.1.1,"Firstname Lastname",more,fields
    2,2.2.2.2,"Firstname Lastname, Jr.",more,fields

CSV's output, which is correct, looks like:

    1,1.1.1.1,Firstname Lastname,more,fields
    2,2.2.2.2,"Firstname Lastname, Jr.",more,fields

I know CSV is doing the right thing by not double-quoting the third field just because it has embedded blanks, and wrapping the field with double-quotes when it has the embedded comma. What I'd like to do, to help the customer feel warm and fuzzy, is tell CSV to always double-quote the third field.
I tried wrapping the field in double-quotes in my to_a method, which creates a "Firstname Lastname" field being passed to CSV, but CSV laughed at my puny-human attempt and output """Firstname Lastname""". That is the correct thing to do because it's escaping the double-quotes, so that didn't work.
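The escaping described above is easy to reproduce in isolation. This is a minimal sketch using CSV.generate_line rather than the to_a/open setup from my real code; the row values are just the sample data from the question:

```ruby
require 'csv'

# A field that already carries literal double-quotes...
pre_quoted = '"Firstname Lastname"'

# ...is treated as data: CSV wraps the field in quotes and doubles
# every embedded quote, which yields the triple quotes.
line = CSV.generate_line(['1', '1.1.1.1', pre_quoted, 'more', 'fields'])
puts line
# prints: 1,1.1.1.1,"""Firstname Lastname""",more,fields
```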
Then I tried setting CSV's :force_quotes => true in the open method, which output double-quotes wrapping all fields as expected, but the customer didn't like that, which I also expected. So, that didn't work either.
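For comparison, here is a small sketch of what :force_quotes does to a row. It uses CSV.generate_line instead of CSV.open so it runs without a file, and the row is the sample data from the question:

```ruby
require 'csv'

row = ['1', '1.1.1.1', 'Firstname Lastname', 'more', 'fields']

# Default behaviour: only fields that need quoting are quoted.
default_line = CSV.generate_line(row)

# :force_quotes is all-or-nothing: every field gets wrapped.
forced_line = CSV.generate_line(row, :force_quotes => true)

puts default_line  # 1,1.1.1.1,Firstname Lastname,more,fields
puts forced_line   # "1","1.1.1.1","Firstname Lastname","more","fields"
```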
I've looked through the Table and Row docs and nothing appeared to give me access to the "generate a String field" method, or a way to set a "for field n always use quoting" flag.
I'm about to dive into the source to see if there's some super-secret tweaks, or if there's a way to monkey-patch CSV and bend it to do my will, but wondered if anyone had some special knowledge or had run into this before.
And, yes, I know I could roll my own CSV output, but I prefer not to reinvent well-tested wheels. I'm also aware of FasterCSV; that's now part of Ruby 1.9.2, which I'm using, so explicitly using FasterCSV buys me nothing special. Also, I'm not using Rails and have no intention of rewriting it in Rails, so unless you have a cute way of implementing it using a small subset of Rails, don't bother. I'll downvote any recommendations to use any of those ways just because you didn't bother to read this far.
Well, there's a way to do it but it wasn't as clean as I'd hoped the CSV code could allow.
I had to subclass CSV, then override its << method (CSV#<<) and add another method, forced_quote_fields=, to make it possible to define the fields I want to force quoting on, plus pull two lambdas from other methods. At least it works for what I want:

    require 'csv'

    class MyCSV < CSV
      def <<(row)
        # make sure headers have been assigned
        if header_row? and [Array, String].include? @use_headers.class
          parse_headers  # won't read data for Array or String
          self << @headers if @write_headers
        end

        # handle CSV::Row objects and Hashes
        row = case row
          when self.class::Row then row.fields
          when Hash            then @headers.map { |header| row[header] }
          else                      row
        end

        @headers = row if header_row?
        @lineno  += 1

        @do_quote ||= lambda do |field|
          field         = String(field)
          encoded_quote = @quote_char.encode(field.encoding)
          encoded_quote + field.gsub(encoded_quote, encoded_quote * 2) + encoded_quote
        end

        @quotable_chars      ||= encode_str("\r\n", @col_sep, @quote_char)
        @forced_quote_fields ||= []

        @my_quote_lambda ||= lambda do |field, index|
          if field.nil?  # represent +nil+ fields as empty unquoted fields
            ""
          else
            field = String(field)  # Stringify fields
            # represent empty fields as empty quoted fields
            if (
              field.empty? or
              field.count(@quotable_chars).nonzero? or
              @forced_quote_fields.include?(index)
            )
              @do_quote.call(field)
            else
              field  # unquoted field
            end
          end
        end

        output = row.map.with_index(&@my_quote_lambda).join(@col_sep) + @row_sep  # quote and separate
        if (
          @io.is_a?(StringIO) and
          output.encoding != raw_encoding and
          (compatible_encoding = Encoding.compatible?(@io.string, output))
        )
          @io = StringIO.new(@io.string.force_encoding(compatible_encoding))
          @io.seek(0, IO::SEEK_END)
        end
        @io << output

        self  # for chaining
      end
      alias_method :add_row, :<<
      alias_method :puts,    :<<

      def forced_quote_fields=(indexes=[])
        @forced_quote_fields = indexes
      end
    end

That's the code. Calling it:

    data = [
      %w[1 2 3],
      [ 2, 'two too',  3 ],
      [ 3, 'two, too', 3 ]
    ]

    quote_fields = [1]

    puts "Ruby version: #{ RUBY_VERSION }"
    puts "Quoting fields: #{ quote_fields.join(', ') }", "\n"

    csv = MyCSV.generate do |_csv|
      _csv.forced_quote_fields = quote_fields
      data.each do |d|
        _csv << d
      end
    end

    puts csv

results in:

    # >> Ruby version: 1.9.2
    # >> Quoting fields: 1
    # >>
    # >> 1,"2",3
    # >> 2,"two too",3
    # >> 3,"two, too",3

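The subclass above leans on CSV's private instance variables, so it is tied to the CSV source it was copied from. A rough alternative sketch, assuming only the public CSV.generate_line call and its :force_quotes option, is to render each field as its own one-column line and join the pieces. The helper name generate_line_with_forced_quotes is hypothetical, the separator is fixed to a comma, and it is slower than a single CSV call:

```ruby
require 'csv'

# Hypothetical helper, not part of CSV: renders one row, force-quoting
# only the zero-based column indexes listed in forced_indexes.
def generate_line_with_forced_quotes(row, forced_indexes)
  fields = row.each_with_index.map do |field, index|
    # Render the field as a one-column CSV line so CSV still does the
    # escaping, then drop the trailing row separator.
    CSV.generate_line([field], :force_quotes => forced_indexes.include?(index)).chomp
  end
  fields.join(',') + "\n"
end

data = [
  %w[1 2 3],
  [ 2, 'two too',  3 ],
  [ 3, 'two, too', 3 ]
]

output = data.map { |row| generate_line_with_forced_quotes(row, [1]) }.join
puts output
```

With the same data and the same forced index, this should produce the same three lines as the MyCSV run above.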
This post is old, but I can't believe no one thought of this.
Why not do:

    csv = CSV.generate :quote_char => "\0" do |csv|

where \0 is a null character, then just add quotes to each field where they are needed:

    csv << [product.upc, "\"" + product.name + "\"" # ...

Then at the end you can do a

    csv.gsub!(/\0/, '')

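Put together, the idea might look like the sketch below. The rows array and the hand_quote helper are hypothetical stand-ins for the product data, and the helper doubles embedded quotes itself because CSV no longer treats `"` as special once the quote character is "\0":

```ruby
require 'csv'

# Hypothetical sample rows standing in for the product data.
rows = [
  ['1', '1.1.1.1', 'Firstname Lastname'],
  ['2', '2.2.2.2', 'Firstname Lastname, Jr.']
]

# Wrap a value in double-quotes by hand, doubling any embedded quotes.
def hand_quote(value)
  '"' + value.to_s.gsub('"', '""') + '"'
end

raw = CSV.generate(:quote_char => "\0") do |csv|
  rows.each do |id, ip, name|
    csv << [id, ip, hand_quote(name)]
  end
end

# Strip the NUL characters CSV wrapped around fields it chose to quote.
clean = raw.gsub(/\0/, '')
puts clean
# prints:
# 1,1.1.1.1,"Firstname Lastname"
# 2,2.2.2.2,"Firstname Lastname, Jr."
```

One caveat with this approach: any field that is not hand-quoted but contains a comma or newline loses its quoting when the NULs are stripped, so every field that might need quotes has to be quoted by hand.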