I have a simple grep command trying to get only the first column of a CSV file, including the comma. It goes like this:
grep -Eo '^[^,]+,' some.csv
So in my head, that reads like "get me only the matching part of the line where each line starts with at least one character that is not a comma, followed by a single comma."
So on a file, some.csv, that looks like this:
column1,column2,column3,column4
column1,column2,column3,column4
column1,column2,column3,column4
I'm expecting this output:
column1,
column1,
column1,
But I get this output:
column1,
column2,
column3,
column1,
column2,
column3,
column1,
column2,
column3,
Why is that? What am I missing from my grep/regex? Is my expected output incorrect?
If I remove the requirement of the trailing comma in the regex, the command works as I expect.
grep -Eo '^[^,]+' some.csv
Gives me:
column1
column1
column1
NOTE: I'm on macOS High Sierra with grep version: grep (BSD grep) 2.5.1-FreeBSD
BSD grep is buggy in general. See the following related posts:
That last link above mentions your case: when the -o option is used, grep ignores the ^ anchor for some reason. This issue is also described in a FreeBSD bug:
I've noticed some more issues with the same version of grep. I don't know whether they're related, but I'll append them here for now.
$ printf abc | grep -o '^[a-c]'
should just print 'a', but instead gives three hits, against each letter of the incoming text.
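A quick way to check whether a particular grep is affected is to count the matches for the printf abc example from the bug report. This is only a sketch: the variable name hits is made up for illustration, and the 1-versus-3 interpretation assumes the behavior described in the report.

```shell
# Count how many matches grep -o reports for an anchored pattern.
# tr strips the padding that some wc implementations add.
hits=$(printf 'abc\n' | grep -o '^[a-c]' | wc -l | tr -d ' ')

if [ "$hits" -eq 1 ]; then
    echo "grep honors ^ with -o"
else
    echo "grep ignores ^ with -o ($hits matches)"
fi
```

One match means the anchor is respected; three matches means the grep in use shows the problem described above.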
As a workaround, it might be a better idea to just install GNU grep, which works as expected.
Or, use sed with a POSIX BRE pattern:
sed -i '' 's/^\([^,]*,\).*/\1/' file
where the pattern matches
^ - start of a line
\([^,]*,\) - Group 1 (later referred to with the \1 backreference from the RHS):
    [^,]* - zero or more chars other than ,
    , - a , char
.* - the rest of the line.

Note that -i will change the file contents in place. Use -i.bak to create a backup file if needed (in that case, drop the empty '' argument that follows -i).
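To try the substitution without modifying anything, the same sed expression can be run without -i so that the result goes to standard output. The sketch below recreates the sample data from the question in a temporary file; the names sample and first_cols are hypothetical and used only for illustration.

```shell
# Recreate the sample CSV from the question in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
column1,column2,column3,column4
column1,column2,column3,column4
column1,column2,column3,column4
EOF

# No -i here: sed prints the result and leaves the file untouched.
first_cols=$(sed 's/^\([^,]*,\).*/\1/' "$sample")
printf '%s\n' "$first_cols"

# Clean up the temporary file.
rm -f "$sample"
```

This prints column1, once per input line. Leaving out -i also sidesteps the difference in how BSD and GNU sed handle the argument after -i.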