
Execute command on the same line multiple times with sed

Tags: regex, linux, sed

I need to highlight every duplicate word in a text with the * symbol.
For example,

lol foo lol bar foo bar

should be

lol foo *lol* bar *foo* *bar*

I tried with the following command:

echo "lol foo lol bar foo bar" | sed -r -e 's/(\b[a-zA-Z]+\b)([^*]+)(\1)/\1\2*\3*/'

It gives me:

lol foo *lol* bar foo bar

Then I added the g flag, and the output became:

lol foo *lol* bar foo *bar*

But foo is still not highlighted.
I know this happens because, once a match is found, sed resumes searching after the end of that match and never looks back at text it has already consumed.

Can I handle it with only sed?

asked Sep 27 '13 21:09 by Dany


2 Answers

Sed is not the best tool for this task. It doesn't support look-ahead, look-behind, or non-greedy quantifiers, but give the following command a try:

sed -r -e ':a ; s/\b([a-zA-Z]+)\b(.*) (\1)( |$)/\1\2 *\3*\4/ ; ta'

It uses conditional branching to repeat the substitution command until it no longer matches. Also, you cannot use ([^*]+), because on the second pass the pattern has to cross some of the * characters inserted by the first substitution; your option is a greedy .* instead. Finally, you cannot match (\1) alone, because it would match the same string lol again and again. You need some context, such as being surrounded by spaces or followed by the end of the line.

The command yields:

lol foo *lol* bar *foo* *bar*
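
To see how the loop gets there, the substitution can be applied one pass at a time without the :a / ta branching. This is only a sketch, assuming GNU sed (-r and \b are GNU extensions). It uses the same pattern as the command above, with the replacement ending in \4 so that the matched delimiter (a space or the end of the line) is put back unchanged. The helper name one_pass is made up for illustration.

```shell
# Hypothetical helper: run the substitution exactly once (no :a / ta loop).
# Assumes GNU sed; -r and \b are not portable to every sed.
one_pass() {
  sed -r -e 's/\b([a-zA-Z]+)\b(.*) (\1)( |$)/\1\2 *\3*\4/'
}

line='lol foo lol bar foo bar'
pass1=$(printf '%s\n' "$line"  | one_pass)   # marks the second lol
pass2=$(printf '%s\n' "$pass1" | one_pass)   # marks the second foo
pass3=$(printf '%s\n' "$pass2" | one_pass)   # marks the second bar

printf '%s\n' "$pass1" "$pass2" "$pass3"
```

Each pass highlights one more repeated word. A fourth pass would change nothing, which is the point where ta stops branching back to :a.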

UPDATE: An improvement provided by potong in comments:

sed -r ':a;s/\b(([[:alpha:]]+)\s.*\s)\2\b/\1*\2*/;ta' file
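
For a quick check without creating a file, the same command can read from a pipe. This is a sketch assuming GNU sed (-r, \b and \s are GNU extensions); the function name highlight is made up for illustration. One thing to be aware of: the pattern asks for whitespace on both sides of the .* in the middle, so a repeat that directly follows its first occurrence (as in foo foo) appears to be left unmarked.

```shell
# Hypothetical wrapper: potong's command reading standard input instead of a file.
# Assumes GNU sed; -r, \b and \s are GNU extensions.
highlight() {
  sed -r ':a;s/\b(([[:alpha:]]+)\s.*\s)\2\b/\1*\2*/;ta'
}

# Sample line from the question.
sample=$(printf '%s\n' 'lol foo lol bar foo bar' | highlight)

# Repeat directly after the first occurrence: only one whitespace character
# separates the two words, but the pattern needs \s, then .*, then \s again.
adjacent=$(printf '%s\n' 'foo foo' | highlight)

printf '%s\n' "$sample" "$adjacent"
```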
answered Oct 22 '22 13:10 by Birei


Using awk:

awk '{for (i=1;i<=NF;i++) if (a[$i]++>=1) printf "*%s* ",$i; else printf "%s ",$i; print ""}' file

This prints:

lol foo *lol* bar *foo* *bar*
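
The awk one-liner is easier to follow when spread over several lines. The sketch below is the same logic with comments added; the function name highlight_dups and the array name seen are made up for illustration, and it reads standard input instead of a file.

```shell
# Hypothetical wrapper around the awk one-liner, reading standard input.
highlight_dups() {
  awk '{
    for (i = 1; i <= NF; i++) {
      # seen[w] holds how many times w has appeared so far. It is never
      # reset, so a word repeated on a later line is highlighted as well.
      if (seen[$i]++ >= 1) {
        printf "*%s* ", $i    # repeated word: wrap it in asterisks
      } else {
        printf "%s ", $i      # first occurrence: print it unchanged
      }
    }
    print ""                  # terminate the output line
  }'
}

result=$(printf '%s\n' 'lol foo lol bar foo bar' | highlight_dups)
printf '%s\n' "$result"
```

Note that every word is printed with a trailing space, so each output line ends in one extra space.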
answered Oct 22 '22 14:10 by Jotne