
Replacing specific characters in first column of text

Tags: replace, awk

I have a text file and I'm trying to replace a specific character (.) in the first column with another character (-). Fields are delimited by commas. Some lines have the last 3 columns empty, so they end with 3 commas.

Example of text file:

abc.def.ghi,123.4561.789,ABC,DEF,GHI
abc.def.ghq,124.4562.789,ABC,DEF,GHI
abc.def.ghw,125.4563.789,ABC,DEF,GHI
abc.def.ghe,126.4564.789,,,
abc.def.ghr,127.4565.789,,,

What I tried was using awk to replace '.' in the first column with '-', then print out the contents.

ETA: Tried out sarnold's suggestion and got the output I want.

ETA2: The first column could be longer. Is there a way to change ONLY the first 3 '.' in the first column to '-', so that I get this output:

abc-def-ghi-qqq.www,123.4561.789,ABC,DEF,GHI
abc-def-ghq-qqq.www,124.4562.789,ABC,DEF,GHI
abc-def-ghw-qqq.www,125.4563.789,ABC,DEF,GHI
abc-def-ghe-qqq.www,126.4564.789,,,
abc-def-ghr-qqq.www,127.4565.789,,,
Rayne asked May 02 '12 02:05



1 Answer

In a regular expression, . means "any character". Escape it with a backslash (\.) and it matches only a literal dot:

$ awk -F, '{gsub(/\./,"-",$1); print}' textfile.csv 
abc-def-ghi 123.4561.789 ABC DEF GHI
abc-def-ghq 124.4562.789 ABC DEF GHI
abc-def-ghw 125.4563.789 ABC DEF GHI
abc-def-ghe 126.4564.789   
abc-def-ghr 127.4565.789   
$ 
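For contrast, here is a small sketch of what the unescaped pattern does to the first field. The one-line sample input and the shell variable names (unescaped, escaped, bracket) are only for illustration; the bracket form [.] is another common way to write a literal dot.

```shell
# Sample line taken from the question's data (first two fields only).
line='abc.def.ghi,123.4561.789'

# Unescaped: /./ matches every character, so the whole field becomes dashes.
unescaped=$(printf '%s\n' "$line" | awk -F, '{gsub(/./, "-", $1); print $1}')

# Escaped: /\./ matches only literal dots.
escaped=$(printf '%s\n' "$line" | awk -F, '{gsub(/\./, "-", $1); print $1}')

# Bracket expression: inside [...] a dot has no special meaning.
bracket=$(printf '%s\n' "$line" | awk -F, '{gsub(/[.]/, "-", $1); print $1}')

printf '%s\n' "$unescaped" "$escaped" "$bracket"
```

The first result is a run of 11 dashes (one per character of abc.def.ghi); the other two are both abc-def-ghi.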

By default the output field separator is a single space, and awk rebuilds the line with it whenever a field is modified. Set OFS="," to keep the commas:

$ awk  -F, 'BEGIN {OFS=","} {gsub(/\./,"-",$1); print}' textfile.csv 
abc-def-ghi,123.4561.789,ABC,DEF,GHI
abc-def-ghq,124.4562.789,ABC,DEF,GHI
abc-def-ghw,125.4563.789,ABC,DEF,GHI
abc-def-ghe,126.4564.789,,,
abc-def-ghr,127.4565.789,,,
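OFS can also be set from the command line with the standard -v option, which assigns the variable before the program starts. A minimal sketch that should be equivalent to the BEGIN form above (the single sample line and the variable name out are only for illustration):

```shell
# -v OFS=, sets the output field separator before any input is read,
# so no BEGIN block is needed.
out=$(printf 'abc.def.ghe,126.4564.789,,,\n' |
  awk -F, -v OFS=, '{gsub(/\./, "-", $1); print}')

printf '%s\n' "$out"
```

The trailing empty fields survive because awk rebuilds the record from all five fields, joined with OFS.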

This still allows changing multiple fields:

$ awk  -F, 'BEGIN {OFS=","} {gsub(/\./,"-",$1); gsub("1", "#",$2); print}' textfile.csv 
abc-def-ghi,#23.456#.789,ABC,DEF,GHI
abc-def-ghq,#24.4562.789,ABC,DEF,GHI
abc-def-ghw,#25.4563.789,ABC,DEF,GHI
abc-def-ghe,#26.4564.789,,,
abc-def-ghr,#27.4565.789,,,

I don't know what -OFS, does, but it isn't a supported command line option; using it to set the output field separator was a mistake on my part. Setting OFS within the awk program works well.
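For the follow-up in ETA2 (only the first three dots of the first column), one possible approach is to call sub() three times, since sub() replaces only the first match each time it runs. This is a sketch, not something tested against the real file; the two sample lines are made up from the question's expected output, and out is just an illustrative variable name.

```shell
# sub() replaces only the first remaining match in $1, so running it
# three times replaces the first three dots and leaves the rest alone.
out=$(printf '%s\n' \
  'abc.def.ghi.qqq.www,123.4561.789,ABC,DEF,GHI' \
  'abc.def.ghe.qqq.www,126.4564.789,,,' |
  awk -F, 'BEGIN {OFS=","} {for (i = 1; i <= 3; i++) sub(/\./, "-", $1); print}')

printf '%s\n' "$out"
```

If the first column has fewer than three dots, the extra sub() calls simply find nothing to replace.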

sarnold answered Oct 13 '22 23:10
