Currently I'm having a problem on the ff CSV data.
COLUMN1,COLUMN2,COLUMN3,COLUMN4
apple1,apple2,apple3,apple4
banana1,banana2,banana3,
caimito1,"caimito21
caimito22","caimito31
caimito32",caimito4
That would look like this:
╔══════════╦═══════════╦═══════════╦══════════╗
║ COLUMN1 ║ COLUMN2 ║ COLUMN3 ║ COLUMN4 ║
╠══════════╬═══════════╬═══════════╬══════════╬
║ apple1 ║ apple2 ║ apple3 ║ apple4 ║
║ banana1 ║ banana2 ║ banana3 ║ ║
║ caimito1 ║ caimito21 ║ caimito31 ║ caimito4 ║
║ ║ caimito22 ║ caimito32 ║ ║
╚══════════╩═══════════╩═══════════╩══════════╝
So my plan is to add COLUMN5 and every row of it will have a value "FRUIT".
Command used:
sed "1 s/$/,COLUMN5/g" FILE.csv | sed "2,$ s/$/,FRUIT/g" > OUTPUT.csv
Output:
╔══════════╦════════════════╦════════════════╦══════════╦═════════╗
║ COLUMN1 ║ COLUMN2 ║ COLUMN3 ║ COLUMN4 ║ COLUMN5 ║
╠══════════╬════════════════╬════════════════╬══════════╬═════════╣
║ apple1 ║ apple2 ║ apple3 ║ apple4 ║ FRUIT ║
║ banana1 ║ banana2 ║ banana3 ║ ║ FRUIT ║
║ caimito1 ║ caimito21FRUIT ║ caimito31FRUIT ║ caimito4 ║ FRUIT ║
║ ║ caimito22 ║ caimito32 ║ ║ ║
╚══════════╩════════════════╩════════════════╩══════════╩═════════╝
Is there any way to add the "FRUIT" without affecting "caimito" row?
I also tried the ff. command but it didn't work out. Added "," before "$".
sed "1 s/$/,COLUMN5/g" FILE.csv | sed "2,$ s/,$/,FRUIT/g" > OUTPUT.csv
There are different ways to insert a new line in a file using sed, such as using the “a” command, the “i” command, or the substitution command, “s“. sed's “a” command and “i” command are pretty similar.
Sed is probably not the right tool to handle csv files since the rules are way more complicated than how it may look (it's probably possible, but such scripts are in general rather error-prone, etc.). You can however use csvtools
to handle this:
file="FILE.csv"
nr=$(csvtool height $file)
ot=$(perl -e "print \"COLUMN5\\n\";for\$i(2..$nr){print \"FRUIT\\n\";}")
echo "$ot" | csvtool paste "$file" -
The script works as follows:
csvtool height
,COLUMN5
followed by n-1 times FRUIT
.EDIT: I only just saw the csvtool solution; it is of course much more practical. I leave this solution up mainly because it would be a pity to hide it and its Lovecraftian beauty.
Well, here goes. This is a way to do it in sed:
sed ':a $!{ N; ba }; s/"[^"]*"/{&}/g; :b s/\({"[^"]*\)\n\([^"]*"}\)/\1~"~\2/g; tb; s/\n\|$/,FRUIT&/g; s/,FRUIT\(\n\|$\)/,COLUMN5\1/; :c s/\({"[^"]\)*~"~/\1\n/g; tc; s/{"\|"}/"/g' filename
This is going to be a bit of a ride. It works as follows:
:a $!{ N; ba } # assemble the whole file in the
# hold buffer
s/"[^"]*"/{&}/g # encase all "-enclosed fields in
# {"..."} to make matching the beginning
# and end separately possible.
:b # jump mark for looping
s/\({"[^"]*\)\n\([^"]*"}\)/\1~"~\2/g # replace the first newline in all
# {"..."} fields with ~"~
tb # loop until all were replaced
s/\n\|$/,FRUIT&/g # Put FRUIT at the end of all lines
s/,FRUIT\(\n\|$\)/,COLUMN5\1/ # Replace the first ,FRUIT with ,COLUMN5
# The \(\n\|$\) bit is so that this
# works with empty files (that only
# have a header line)
:c # Jump mark for looping
s/\({"[^"]\)*~"~/\1\n/g # replace the first ~"~ in all {"..."}
# fields with a newline
tc # loop until all were replaced
s/{"\|"}/"/g # replace all {", "} markers with "
# again.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With