Replace/delete special characters within matched strings in sed

Tags: regex, sed
I have a file containing lines like

I want a lot <*tag 1> more <*tag 2>*cheese *cakes.

I am trying to remove the * characters inside <...> but not those outside. The tags can be more complicated than above, for example <*better *tag 1>.

I tried /\bregex\b/s/\*//g, which works for tag 1 but not tag 2. So how can I make it work for tag 2 as well?

Many thanks.

ToonZ asked May 30 '13 17:05


3 Answers

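Obligatory Perl solution. Split each line on the tags (the capturing group keeps them in the list), then strip the asterisks from every second element. The counter has to be reset for each line, otherwise the even/odd positions drift on multi-line input:

perl -pe 'my $i = 0;
        $_ = join "",
        map +($i++ % 2 == 0 ? $_ : s/\*//gr),
        split /(<[^>]+>)/, $_;' FILE

Alternatively, as a single substitution (the /r flag used in both versions needs Perl 5.14 or later):

perl -pe 's/(<[^>]+>)/$1 =~ s(\*)()gr/ge' FILE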
bambams answered Nov 02 '22 23:11


A simple solution, if each tag contains at most one asterisk:

sed 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g'

Each pass removes only one asterisk per tag, so if a tag can contain more, repeat the substitution using a sed label and a conditional branch:

sed ':doagain s/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g; t doagain'

Here doagain is the label that marks the start of the loop, and t doagain is a conditional jump back to that label. From the sed manual:

t label

 Branch to label only if there has been a successful substitution since the last 
 input line was read or conditional branch was taken. The label may be omitted, in 
 which case the next cycle is started.
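To make the loop concrete, here is a minimal sketch of the same label-and-branch idea, run on a line with several asterisks in one tag. The function name strip_tag_stars is invented for this example. The script is split into separate -e expressions on the assumption that some sed implementations read a label up to the end of the line, so this form may be more portable than the one-string version.

```shell
#!/bin/sh
# strip_tag_stars is a hypothetical helper name used only in this sketch.
strip_tag_stars() {
    # :doagain   marks the top of the loop
    # s/.../g    removes one * from each <...> tag per pass
    # t doagain  jumps back only if the last s command changed something
    sed -e ':doagain' \
        -e 's/<\([^>]*\)\*\([^>]*\)>/<\1\2>/g' \
        -e 't doagain'
}

printf '%s\n' 'I want a lot <*tag 1> more <*tag 2>*cheese *cakes. <*better *tag X*>' | strip_tag_stars
# prints: I want a lot <tag 1> more <tag 2>*cheese *cakes. <better tag X>
```

The last tag needs three passes, one per asterisk, and a fourth pass that changes nothing ends the loop.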
bartimar answered Nov 02 '22 23:11


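awk can solve this as well. Note that the four-argument split(), which stores the matched separators in s, is a GNU awk extension, and that r has to be cleared for each line or the output accumulates across lines:

awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file

A more readable version:

 awk '{r = ""                         # reset the result for each line
       x = split($0, a, /<[^>]*>/, s) # a: text between tags, s: the tags
       for (i in s) gsub(/\*/, "", s[i])
       for (j = 1; j <= x; j++) r = r a[j] s[j]
       print r}' file

Test with your data:

kent$  cat file
I want a lot <*tag 1> more <*tag 2>*cheese *cakes. <*better *tag X*>

kent$  awk '{r="";x=split($0,a,/<[^>]*>/,s);for(i in s)gsub(/\*/,"",s[i]);for(j=1;j<=x;j++)r=r a[j] s[j]; print r}' file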
I want a lot <tag 1> more <tag 2>*cheese *cakes. <better tag X>
Kent answered Nov 03 '22 00:11