
Repeating regex replace with SED

Tags:

regex

linux

sed

I have the following lines (in reality there are ~1M of these lines):

foo|||bar
qux||boo|fzx

Note that every line contains exactly 4 fields, but a field can be longer than 3 characters.

What I want to do is replace every empty field (every || ) with |nil|, resulting in:

foo|nil|nil|bar
qux|nil|boo|fzx

What's the way to do it with sed?

I tried this, but it fails:

sed 's/||/|nil/g'

neversaint asked Feb 06 '13 08:02


1 Answer

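A single pass is not enough, because matches cannot overlap: in ||| the first two pipes are consumed by one match, leaving the third on its own. You need to repeat the substitution until it no longer changes the line: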

sed ':a; s/||/|nil|/g; ta'
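
To see why the loop matters, here is a small sketch comparing a single global substitution with the looped one on the first sample line. The function names one_pass and looped are made up for this illustration, and the script is split across -e options on the assumption that this is the safer spelling for sed implementations that read a label up to the end of the line.

```shell
# Hypothetical helper names, used only for this illustration.
one_pass() { sed 's/||/|nil|/g'; }                     # one global substitution
looped()   { sed -e ':a' -e 's/||/|nil|/g' -e 'ta'; }  # 't' jumps back to label 'a' after a successful substitution

printf 'foo|||bar\n' | one_pass   # foo|nil||bar   (second empty field missed)
printf 'foo|||bar\n' | looped     # foo|nil|nil|bar
```

The loop ends on the first pass in which s/||/|nil|/g finds nothing to replace.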

However, the loop alone does not handle empty fields at the beginning or end of a line. For that you need two more substitutions:

sed 's/^|/nil|/; s/|$/|nil/; :a; s/||/|nil|/g; ta'
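
As a rough trace of what each step contributes, the sketch below runs the substitutions one at a time on a line whose four fields are all empty. The wrapper name fill_empty is hypothetical, and the -e form is used only as a portability assumption; the commands are otherwise the same as above.

```shell
# Hypothetical wrapper around the full command, for illustration only.
fill_empty() {
  sed -e 's/^|/nil|/' -e 's/|$/|nil/' -e ':a' -e 's/||/|nil|/g' -e 'ta'
}

printf '|||\n' | sed 's/^|/nil|/'               # nil|||     leading empty field filled
printf '|||\n' | sed 's/^|/nil|/; s/|$/|nil/'   # nil|||nil  trailing empty field filled
printf '|||\n' | fill_empty                     # nil|nil|nil|nil
```

The two anchored substitutions run once per line; only the inner || replacement needs the loop.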

Testing

Input:

cat << EOF > infile
foo|||bar
qux||boo|fzx
|||
EOF

Run it:

<infile sed 's/^|/nil|/; s/|$/|nil/; :a; s/||/|nil|/g; ta'

Output:

foo|nil|nil|bar
qux|nil|boo|fzx
nil|nil|nil|nil

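An awk way: split each line on |, set every empty field to nil, and let the trailing 1 print the rebuilt line: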

awk '{ for(i=1;i<=NF;i++) if(length($i)==0) $i="nil" } 1' FS='|' OFS='|'

Thor answered Oct 11 '22 01:10