
Regular Expression in sed for multiple replacements in one statement

I want to sanitise some input and replace several characters with acceptable input, e.g. a Danish 'å' with 'aa'.

This is easily done using several statements, e.g. /æ/ae/, /å/aa/, /ø/oe/, but due to tool limitations, I want to be able to do this in a single regular expression.

I can catch all of the relevant cases (/[(æ)(ø)(å)(Æ)(Ø)(Å)]/), but the replacement does not work as I want it to (though it probably works exactly as intended):

 $ temp="RødgrØd med flæsk"
 $ echo $temp
 RødgrØd med flæsk
 $ echo $temp | sed 's/[(æ)(ø)(å)(Æ)(Ø)(Å)]/(ae)(oe)(aa)(Ae)(Oe)(Aa)/g'
 R(ae)(oe)(aa)(Ae)(Oe)(Aa)dgr(ae)(oe)(aa)(Ae)(Oe)(Aa)d med fl(ae)(oe)(aa)(Ae)(Oe)(Aa)sk

(first echo line is to show that it isn't an encoding issue)

Just as an aside, the tool issue is that I should like to also use the same regex in a Sublime Text 2 snippet.

Anyone able to discern what is wrong with my regex statement?

Thanks in advance.

Jan asked Jan 03 '13 07:01




2 Answers

Split it up into several sed statements, separated by ;:

sed 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g' 
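For context on why the attempt in the question fails: a bracket expression such as [(æ)(ø)(å)] matches any single one of the listed characters (the parentheses are just more list members), and the replacement side of s is literal text, so one s command cannot pick a different replacement per match. A minimal sketch of the semicolon-separated form applied to the sample string from the question (the variable name result is only for illustration):

```shell
# Sample string taken from the question.
temp="RødgrØd med flæsk"

# One sed invocation, six substitute commands separated by ';'.
# Each command handles exactly one character-to-digraph mapping.
result=$(printf '%s\n' "$temp" | sed 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g')

printf '%s\n' "$result"
```

This should print RoedgrOed med flaesk. It is still a single sed call, but not a single regular expression, so it may not carry over to a Sublime Text snippet unchanged.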
Anders Johansson answered Sep 30 '22 21:09


A command of the form

sed -e 's/Find/Replace/g;s/Find/Replace/g;[....];s/Find/Replace/g'

does the trick.

Translated into what you need:

sed -e 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g' 
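If the long one-liner gets hard to read, the same substitutions can probably be written with one -e option per expression, which sed applies in order. A rough sketch, wrapped in a shell function whose name (sanitise) is made up for this example:

```shell
# Hypothetical helper: reads text on stdin, writes the sanitised text to stdout.
# Each -e adds one substitute command; together they behave like the
# semicolon-separated script.
sanitise() {
  sed -e 's/æ/ae/g' \
      -e 's/ø/oe/g' \
      -e 's/å/aa/g' \
      -e 's/Æ/Ae/g' \
      -e 's/Ø/Oe/g' \
      -e 's/Å/Aa/g'
}

# Usage: pipe any text through the function.
printf '%s\n' 'Blåbærsyltetøj fra Århus' | sanitise
```

The usage line should print Blaabaersyltetoej fra Aarhus.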
DonCallisto answered Sep 30 '22 21:09