
CSV.foreach Not Reading First Column in CSV File

Tags: ruby, csv

I'm learning Ruby for the first time to automate cleaning up some CSV files.

I've managed to piece together the script below from other SO questions, but for some reason it does not read the first column of the original CSV file. If I add a dummy first column, everything works perfectly. What am I missing?

require 'csv'
require 'date'

# Input column names, listed in the order they should appear in the output
COLUMNS = ['SFID', 'Date', 'Num', 'Transaction Type']

CSV.open("invoicesfixed.csv", "wb",
  :write_headers => true,
  :headers => ["Account__c", "Invoice_Date__c", "Invoice_Number__c", "Transaction_Type__c"]) do |csv|

  CSV.foreach('invoices.csv', :headers => true, :converters => :all) do |row|
    # Convert the date format to be compatible with Salesforce
    row['Date'] = Date.strptime(row['Date'], '%m/%d/%y').strftime('%Y-%m-%d')
    csv << COLUMNS.map { |col| row[col] }
  end
end

This input file:

Transaction Type,Date,Num,SFID
Invoice,7/1/19,151466,SFID1
Invoice,7/1/19,151466,SFID2
Invoice,7/1/19,151466,SFID3
Invoice,7/1/19,151466,SFID4
Invoice,7/1/19,151466,SFID5
Invoice,7/1/19,151466,SFID6
Invoice,7/1/19,151153,SFID7
Sales Receipt,7/1/19,149487,SFID8
Sales Receipt,7/1/19,149487,SFID9
Sales Receipt,7/1/19,149758,SFID10
Sales Receipt,7/1/19,149758,SFID11

Yields this output:

Account__c,Invoice_Date__c,Invoice_Number__c,Transaction_Type__c
SFID1,2019-07-01,151466,
SFID2,2019-07-01,151466,
SFID3,2019-07-01,151466,
SFID4,2019-07-01,151466,
SFID5,2019-07-01,151466,
SFID6,2019-07-01,151466,
SFID7,2019-07-01,151153,
SFID8,2019-07-01,149487,
SFID9,2019-07-01,149487,
SFID10,2019-07-01,149758,
SFID11,2019-07-01,149758,

However, this input:

Dummy,Transaction Type,Date,Num,SFID
,Invoice,7/1/19,151466,SFID1
,Invoice,7/1/19,151466,SFID2
,Invoice,7/1/19,151466,SFID3
,Invoice,7/1/19,151466,SFID4
,Invoice,7/1/19,151466,SFID5
,Invoice,7/1/19,151466,SFID6
,Invoice,7/1/19,151153,SFID7
,Sales Receipt,7/1/19,149487,SFID8
,Sales Receipt,7/1/19,149487,SFID9
,Sales Receipt,7/1/19,149758,SFID10
,Sales Receipt,7/1/19,149758,SFID11

Yields the correct output of:

Account__c,Invoice_Date__c,Invoice_Number__c,Transaction_Type__c
SFID1,2019-07-01,151466,Invoice
SFID2,2019-07-01,151466,Invoice
SFID3,2019-07-01,151466,Invoice
SFID4,2019-07-01,151466,Invoice
SFID5,2019-07-01,151466,Invoice
SFID6,2019-07-01,151466,Invoice
SFID7,2019-07-01,151153,Invoice
SFID8,2019-07-01,149487,Sales Receipt
SFID9,2019-07-01,149487,Sales Receipt
SFID10,2019-07-01,149758,Sales Receipt
SFID11,2019-07-01,149758,Sales Receipt

Any ideas why this might be happening?

asked Aug 08 '19 18:08 by Steven Carlton



1 Answer

I had a similar problem, although your example ran fine for me. In my case, the cause was that I had created the CSV file with Excel's "Save As UTF-8 CSV" option.

That option adds a BOM (byte order mark) to the beginning of the file, immediately before the first column header name. The first header therefore no longer matched the name I was looking up, and row['firstColumnName'] returned nil.
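To check whether a file is affected, you can look at its first three bytes. This is only a rough sketch: the helper name starts_with_bom? and the temporary file names are made up for illustration, and it only detects the UTF-8 BOM (bytes EF BB BF).

```ruby
require 'tmpdir'

# UTF-8 byte order mark: the three bytes EF BB BF, as a binary string.
UTF8_BOM = "\xEF\xBB\xBF".b

# Hypothetical helper: true when the file at +path+ begins with a UTF-8 BOM.
def starts_with_bom?(path)
  File.binread(path, 3) == UTF8_BOM
end

# Demo files (illustrative names), written byte-for-byte with binwrite.
dir = Dir.mktmpdir
with_bom = File.join(dir, 'with_bom.csv')
without_bom = File.join(dir, 'without_bom.csv')
File.binwrite(with_bom, "\uFEFFTransaction Type,Date,Num,SFID\n")
File.binwrite(without_bom, "Transaction Type,Date,Num,SFID\n")

puts starts_with_bom?(with_bom)
puts starts_with_bom?(without_bom)
```

If the check is true for your invoices.csv, the BOM is the likely reason the first column lookup comes back empty.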

Saving the file as plain CSV instead fixed the issue for me.
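If re-saving the file is not an option, another approach that should work is to ask Ruby to skip the BOM while reading, by passing the encoding 'bom|utf-8'. The sketch below is not the original script: the temporary file and its single data row are invented for the demo, and only the header names come from the question.

```ruby
require 'csv'
require 'tmpdir'

# Write a small CSV that starts with a UTF-8 BOM, similar to what
# Excel's UTF-8 CSV export produces (file name is illustrative).
path = File.join(Dir.mktmpdir, 'invoices_bom.csv')
File.binwrite(path, "\uFEFFTransaction Type,Date,Num,SFID\nInvoice,7/1/19,151466,SFID1\n")

# 'bom|utf-8' tells Ruby to drop a leading BOM, if present, before the
# CSV parser sees the header line.
rows = CSV.read(path, headers: true, encoding: 'bom|utf-8')
first_type = rows[0]['Transaction Type']
puts first_type
```

The same encoding option can be passed to CSV.foreach in the original script. Whether the BOM is stripped without it appears to depend on the version of the csv library, so setting the encoding explicitly is the safer choice.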

answered Sep 24 '22 10:09 by Milan