Say I have the following csv file:
id,message,time
123,"Sorry, This message
has commas and newlines",2016-03-28T20:26:39
456,"It makes the problem non-trivial",2016-03-28T20:26:41
I want to write a bash command that will return only the time column. i.e.
time
2016-03-28T20:26:39
2016-03-28T20:26:41
What is the most straight forward way to do this? You can assume the availability of standard unix utils such as awk, gawk, cut, grep, etc.
Note the presence of "" which escape , and newline characters which make trivial attempts with
cut -d , -f 3 file.csv
futile.
Since CSV files use the comma character "," to separate columns, values that contain commas must be handled as a special case. These fields are wrapped within double quotation marks. The first double quote signifies the beginning of the column data, and the last double quote marks the end.
Re: Handling 'comma' in the data while writing to a CSV. So for data fields that contain a comma, you should just be able to wrap them in a double quote. Fields containing line breaks (CRLF), double quotes, and commas should be enclosed in double-quotes.
As chepner said, you are encouraged to use a programming language which is able to parse csv.
Here comes an example in python:
import csv
with open('a.csv', 'rb') as csvfile:
reader = csv.reader(csvfile, quotechar='"')
for row in reader:
print(row[-1]) # row[-1] gives the last column
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With