I have a simple grep command trying to get only the first column of a CSV file including the comma. It goes like this...
grep -Eo '^[^,]+,' some.csv
So in my head, that reads like "get me only the matching part of the line where each line starts with at least one character that is not a comma, followed by a single comma."
So on a file, some.csv, that looks like this:
column1,column2,column3,column4
column1,column2,column3,column4
column1,column2,column3,column4
I'm expecting this output:
column1,
column1,
column1,
But I get this output:
column1,
column2,
column3,
column1,
column2,
column3,
column1,
column2,
column3,
Why is that? What am I missing from my grep/regex? Is my expected output incorrect?
If I remove the requirement of the trailing comma in the regex, the command works as I expect.
grep -Eo '^[^,]+' some.csv
Gives me:
column1
column1
column1
NOTE: I'm on macOS High Sierra with grep version: grep (BSD grep) 2.5.1-FreeBSD 
BSD grep is buggy in general. See the following related posts:
That last link above mentions your case: when the -o option is used, BSD grep ignores the ^ anchor and keeps matching further along the line. This issue is also described in a FreeBSD bug:
I've noticed some more issues with the same version of grep. I don't know whether they're related, but I'll append them here for now.
$ printf abc | grep -o '^[a-c]'
should just print 'a', but instead gives three hits, one for each letter of the input text.
As a workaround, it might be a better idea to just install GNU grep, which works as expected.
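A minimal sketch of the GNU grep workaround, assuming GNU grep was installed through Homebrew, which exposes it as ggrep so it does not shadow the system grep. The grep_bin and first_col variable names are only illustrative, and the fallback to plain grep is an assumption for systems where grep is already the GNU version:

```shell
# Prefer Homebrew's GNU grep (ggrep) when it is on PATH; otherwise fall back to grep.
if command -v ggrep >/dev/null 2>&1; then
  grep_bin=ggrep
else
  grep_bin=grep
fi

# Same pattern as in the question, applied to one sample row from stdin.
first_col="$(printf 'column1,column2,column3,column4\n' | "$grep_bin" -Eo '^[^,]+,')"
printf '%s\n' "$first_col"
```

With GNU grep this should print only the first field and its comma. If the fallback picks up BSD grep, the extra matches described above will still appear.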
Or, use sed with a BRE POSIX pattern:
sed -i '' 's/^\([^,]*,\).*/\1/' file
where the pattern matches
^ - start of a line
\([^,]*,\) - Group 1 (later referred to with the \1 backreference from the RHS):
[^,]* - zero or more chars other than ,
, - a , char
.* - the rest of the line.
Note that -i will change the file contents in place. Use -i.bak to create a backup file if needed (in that case, you don't need the empty '' that follows -i).
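To try the substitution without touching the original file, the same sed expression can be run without -i so the result goes to stdout. This is a small sketch using a throwaway temporary file; the sample and first_col names are made up for illustration:

```shell
# Build a throwaway copy of the sample data so the real file is untouched.
sample="$(mktemp)"
printf '%s\n' \
  'column1,column2,column3,column4' \
  'column1,column2,column3,column4' \
  'column1,column2,column3,column4' > "$sample"

# Without -i, sed writes the edited lines to stdout and leaves the file alone.
first_col="$(sed 's/^\([^,]*,\).*/\1/' "$sample")"
printf '%s\n' "$first_col"

# Clean up the temporary file.
rm -f "$sample"
```

Each input line is reduced to its first field plus the trailing comma. Because this is a plain BRE with no in-place editing, it should behave the same with both the BSD and GNU versions of sed.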