
How to search/replace a bunch of text files in unix (osx)

I have a regular expression which I tested successfully on http://regexpal.com/ :

^(\".+?\"),\d.+?,"X",-99,-99,-99,-99,-99,-99,-99,(\d*),(\d*)

Where my test data looks like:

"AB101AA",10,"X",-99,-99,-99,-99,-99,-99,-99,394251,806376,179,"S00","SN9","00","QA","MH","X"
"AB101AF",10,"X",-99,-99,-99,-99,-99,-99,-99,394181,806429,179,"S00","SN9","00","QA","MH","X"
"AB101AG",10,"X",-99,-99,-99,-99,-99,-99,-99,394251,806376,179,"S00","SN9","00","QA","MH","X"
"AB101AH",10,"X",-99,-99,-99,-99,-99,-99,-99,394371,806359,179,"S00","SN9","00","QA","MH","X"
"AB101AJ",10,"X",-99,-99,-99,-99,-99,-99,-99,394171,806398,179,"S00","SN9","00","QA","MH","X"
"AB101AL",10,"X",-99,-99,-99,-99,-99,-99,-99,394331,806530,179,"S00","SN9","00","QA","MH","X"

I want to replace it with \1,\2,\3 on each line so for example line 1 would give

"AB101AA",394251,806376

How can I run this regex search & replace against all csv files in my folder in osx? I tried using sed, but it complains about a syntax error (and I'm unsure whether it supports this regex). Additionally, will the ^ (beginning of line) and $ (end of line) anchors work line by line, or will they match the beginning and end of the file?

UPDATE: Some good responses with cut, awk etc. that get specific fields from the csv, but I've recently learnt that I need to take the numbers from that list and chop each of them into 2 sub-values, so my example output from above would need to look like:

"AB101AA",3,94251,8,06376

As far as I know, I need to use a regex for this.

Matt Roberts asked Feb 24 '23 18:02


1 Answer

You would like to extract fields 1, 11 and 12? For a task like this, awk or cut really excels. E.g.

awk -F, -v OFS=, '{print $1, $11, $12}' input

using cut:

cut -d, -f1,11,12 input 
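
The field-based approach also stretches to the updated requirement in the question, where each number is chopped into two sub-values. This is only a sketch: it assumes the split always falls after the first digit, as in the example output, and the function name split_fields and the .out suffix are made up for illustration. substr keeps the remainder as a string, so a leading zero such as 06376 survives.

```shell
#!/bin/sh
# split_fields: hypothetical helper, not part of the original answer.
# Prints field 1, then fields 11 and 12 each split into
# "first digit" and "remaining digits".
split_fields() {
    awk -F, -v OFS=, '{
        print $1, substr($11, 1, 1), substr($11, 2), substr($12, 1, 1), substr($12, 2)
    }' "$1"
}

# Apply it to every csv file in the current directory,
# writing the result next to the original (e.g. data.csv.out).
for f in *.csv; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    split_fields "$f" > "$f.out"
done
```

Writing to a separate file keeps the originals untouched, which makes it easy to check the result before replacing anything.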

Using perl: -a turns on autosplit mode, so perl automatically splits each input line into the @F array (on whitespace by default). -F is used together with -a to choose the delimiter on which to split lines.

perl -F, -lane 'printf "%s, %d, %d\n", $F[0], $F[10], $F[11]' input 

...and finally, a pure bash solution:

#!/bin/bash
# Set IFS only for read, so the rest of the script keeps default word splitting.
while IFS=, read -ra ARRAY
do
    echo "${ARRAY[0]}, ${ARRAY[10]}, ${ARRAY[11]}"
done < input
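
If you do want to stay with a regex, sed can handle it once the pattern is rewritten. sed works line by line, so ^ and $ anchor each line rather than the whole file. The \d and lazy .+? in the question are Perl-style syntax that POSIX extended regular expressions do not define, which is a likely cause of the syntax error. The following is a rough sketch rather than a drop-in fix: it assumes the second field is purely numeric, that the numbers split after their first digit, and the names trim_line and .trimmed are invented for the example.

```shell
#!/bin/sh
# trim_line: hypothetical helper, not part of the original answer.
# -E selects extended regular expressions (accepted by both macOS and GNU sed).
# [0-9] replaces \d, [^"]+ replaces the lazy .+?, and (,-99){7} replaces the
# seven repeated -99 fields. Group 2 is that repeated group, so the
# replacement uses groups 1, 3, 4, 5 and 6.
trim_line() {
    sed -E 's/^("[^"]+"),[0-9]+,"X"(,-99){7},([0-9])([0-9]*),([0-9])([0-9]*).*$/\1,\3,\4,\5,\6/' "$1"
}

# Run it against every csv file in the current directory,
# writing each result to a new file (data.csv -> data.trimmed).
for f in *.csv; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    trim_line "$f" > "${f%.csv}.trimmed"
done
```

Lines that do not match the pattern pass through unchanged, so it is worth checking the output files before overwriting the originals.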

Fredrik Pihl answered Mar 05 '23 18:03