I'm not sure whether what I want is possible in sed (or awk, or any other shell tool). I want to write a script that replaces : ) in a string with <happy> and ) : with <sad>. Each replacement on its own is easy with sed:
echo "test : )" | sed 's/: )/<happy>/g'
echo "test ) :" | sed 's/) :/<sad>/g'
Unfortunately, sometimes I have strings like these:
I'm happy : ) : ) : )
I'm sad ) : ) : ) :
In that case, the output should be:
I'm happy <happy> <happy> <happy>
I'm sad <sad> <sad> <sad>
But by combining the two commands above:
echo "I'm happy : ) : ) : )" | sed 's/: )/<happy>/g' | sed 's/) :/<sad>/g'
echo "I'm sad ) : ) : ) :" | sed 's/: )/<happy>/g' | sed 's/) :/<sad>/g'
I will get:
I'm happy <happy> <happy> <happy>
I'm sad ) <happy> <happy> :
The way to solve this would be to do both replacements in parallel, scanning the string from left to right. I tried something like sed 's/a/b/g;s/c/d/g', but that still applies one substitution after the other to the whole line, so it doesn't solve the problem.
With GNU awk for the 3rd arg to match():
$ cat script1.awk
BEGIN {
    map[": )"] = "<happy>"
    map[") :"] = "<sad>"
}
{
    while ( match($0,/(.*)(: \)|\) :)(.*)/,a) ) {
        $0 = a[1] map[a[2]] a[3]
    }
    print
}
$ awk -f script1.awk file
I'm happy <happy> <happy> <happy>
I'm sad <sad> <sad> <sad>
With any awk:
$ cat script2.awk
BEGIN {
    map[": )"] = "<happy>"
    map[") :"] = "<sad>"
}
{
    while ( match($0,/: \)|\) :/) ) {
        $0 = substr($0,1,RSTART-1) map[substr($0,RSTART,RLENGTH)] substr($0,RSTART+RLENGTH)
    }
    print
}
$ awk -f script2.awk file
I'm happy <happy> <happy> <happy>
I'm sad <sad> <sad> <sad>
Although both approaches produce the same output in this case, the first approach actually works from the end of the string to the front, courtesy of the leading .*, while the second approach works front to back. You can see that with this test:
$ echo ': ) :' | awk -f script1.awk
: <sad>
$ echo ': ) :' | awk -f script2.awk
<happy> :
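script2.awk works front to back, but it calls match() on the whole of $0 again after every replacement, so already-replaced text is rescanned. That is harmless here because neither <happy> nor <sad> contains an emoticon. If a replacement could contain one, a variant that only ever scans the not-yet-processed remainder might be safer. The following is a minimal sketch of that idea, not part of the original answer; the function name replace_single_pass and the variables out and rest are made up for illustration.

```shell
# replace_single_pass is a hypothetical name; it reads lines on stdin.
replace_single_pass() {
  awk '
  BEGIN {
    map[": )"] = "<happy>"
    map[") :"] = "<sad>"
  }
  {
    out  = ""    # text already handled, never scanned again
    rest = $0    # text still to be scanned
    while ( match(rest, /: \)|\) :/) ) {
      # copy the text before the match, then the mapped replacement
      out  = out substr(rest, 1, RSTART-1) map[substr(rest, RSTART, RLENGTH)]
      # continue scanning right after the matched emoticon
      rest = substr(rest, RSTART+RLENGTH)
    }
    print out rest
  }'
}

echo "I'm sad ) : ) : ) :" | replace_single_pass   # I'm sad <sad> <sad> <sad>
echo ": ) :" | replace_single_pass                 # <happy> :
```

It should work with any POSIX awk, since it only uses the 2-argument match() plus RSTART and RLENGTH.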
You can do a back-to-front pass with any awk with a tweak but I don't think that's what you really want anyway.
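For completeness, here is one possible form of such a tweak. It is only a sketch and not necessarily what was meant above. It assumes every emoticon is exactly 3 characters long, as both are in this example, and the function name replace_back_to_front is made up. Prefixing the regexp with .* makes the match end at the end of the rightmost emoticon, so the emoticon itself is the last 3 characters of the match.

```shell
# replace_back_to_front is a hypothetical name; it reads lines on stdin.
replace_back_to_front() {
  awk '
  BEGIN {
    map[": )"] = "<happy>"
    map[") :"] = "<sad>"
  }
  {
    # The greedy .* pushes the match as far right as possible,
    # so it ends on the last character of the rightmost emoticon.
    while ( match($0, /.*(: \)|\) :)/) ) {
      end = RSTART + RLENGTH - 1      # position of that last character
      tok = substr($0, end - 2, 3)    # assumption: emoticons are 3 chars long
      $0  = substr($0, 1, end - 3) map[tok] substr($0, end + 1)
    }
    print
  }'
}

echo ": ) :" | replace_back_to_front   # : <sad>
```

This gives the same result as script1.awk for the ': ) :' test without needing GNU awk. Emoticons of differing lengths would need extra work to find where the last match starts.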
Edit to build the regexp from the map:
$ cat tst.awk
BEGIN {
    map[": )"] = "<happy>"
    map[") :"] = "<sad>"
    for (emoji in map) {
        gsub(/[^^]/,"[&]",emoji)
        gsub(/\^/,"\\^",emoji)
        emojis = (emojis == "" ? "" : emojis "|") emoji
    }
}
{
    while ( match($0,emojis) ) {
        $0 = substr($0,1,RSTART-1) map[substr($0,RSTART,RLENGTH)] substr($0,RSTART+RLENGTH)
    }
    print
}
$ awk -f tst.awk file
I'm happy <happy> <happy> <happy>
I'm sad <sad> <sad> <sad>
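The two gsub() calls in the BEGIN block of tst.awk are what turn each map key into a regexp that matches it literally. To see what they do on their own, here is a small sketch that applies the same two substitutions to each input line. The function name to_literal_regexp is made up for this illustration.

```shell
# to_literal_regexp is a hypothetical name; it reads lines on stdin.
to_literal_regexp() {
  awk '{
    # Wrap every character except ^ in its own bracket expression,
    # which strips regexp metacharacters such as ) of their special meaning.
    gsub(/[^^]/, "[&]")
    # [^] would be read as the start of a negated bracket expression,
    # so carets are backslash-escaped instead.
    gsub(/\^/, "\\^")
    print
  }'
}

echo ": )" | to_literal_regexp   # [:][ ][)]
```

The pieces produced this way are then joined with | to form the alternation stored in emojis. The order of the alternatives depends on the order in which for (emoji in map) visits the keys, which awk does not specify. That does not matter here because neither emoticon is a prefix of the other.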