I have to transform (preprocess) a CSV file, by generating / inserting a new column, being the result of the concat of existing columns.
For example, transform:
A|B|C|D|E
into:
A|B|C|D|C > D|E
In this example, I do it with:
cat myfile.csv | awk 'BEGIN{FS=OFS="|"} {$4 = $4 OFS $3" > "$4} 1'
But now I have something more complex to do, and dont find how to do this.
I have to transform:
A|B|C|x,y,z|E
into
A|B|C|x,y,z|C > x,C > y,C > z|E
How can it be done in awk (or other command) efficiently (my csv file can contains thousands of lines)?
Thanks.
With GNU awk (for gensub which is a GNU extension):
awk -F'|' '{$6=$5; $5=gensub(/(^|,)/,"\\1" $3 " > ","g",$4); print}' OFS='|'
You can split the 4th field into an array:
awk 'BEGIN{FS=OFS="|"} {split($4,a,",");$4="";for(i=1;i in a;i++)$4=($4? $4 "," : "") $3 " > " a[i]} 1' myfile.csv
A|B|C|C > x,C > y,C > z|E
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With