Regular Expression in sed for multiple replacements in one statement

Tags: regex, regex-group, sed, sublimetext2

I want to sanitise some input and replace several characters with acceptable input, e.g. a Danish 'å' with 'aa'.

This is easily done using several statements, e.g. /æ/ae/, /å/aa/, /ø/oe/, but due to tool limitations, I want to be able to do this in a single regular expression.

I can catch all of the relevant cases (/[(æ)(ø)(å)(Æ)(Ø)(Å)]/), but the replacement does not work as I want it to (though it is probably working exactly as designed):

    $ temp="RødgrØd med flæsk"
    $ echo $temp
    RødgrØd med flæsk
    $ echo $temp | sed 's/[(æ)(ø)(å)(Æ)(Ø)(Å)]/(ae)(oe)(aa)(Ae)(Oe)(Aa)/g'
    R(ae)(oe)(aa)(Ae)(Oe)(Aa)dgr(ae)(oe)(aa)(Ae)(Oe)(Aa)d med fl(ae)(oe)(aa)(Ae)(Oe)(Aa)sk

(The first echo is there to show that it isn't an encoding issue.)

As an aside, the tool limitation is that I would also like to use the same regex in a Sublime Text 2 snippet.

Can anyone see what is wrong with my regex?

Thanks in advance.

asked Jan 03 '13 07:01

Jan

2 Answers

Split it up into several sed substitution commands, separated by semicolons:

sed 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g'
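To see the effect on the sample input from the question, here is a minimal sketch. The variable name `result` is only for illustration, and it assumes the script file and the terminal use the same encoding (e.g. UTF-8), so that the literal characters in the sed script match the bytes in the input.

```shell
#!/bin/sh
# Sample input taken from the question.
temp="RødgrØd med flæsk"

# One sed invocation, six substitutions separated by ';'.
# Each s/// command is applied in turn to every input line.
result=$(printf '%s\n' "$temp" | sed 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g')

printf '%s\n' "$result"
```

With the input above this should print RoedgrOed med flaesk. It is still one sed call, but note that it is six expressions rather than a single regular expression.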

answered Sep 30 '22 21:09

Anders Johansson

With

sed -e 's/Find/Replace/g;s/Find/Replace/g;[....];s/Find/Replace/g'

you'll do the trick.

So, translated into what you need:

sed -e 's/æ/ae/g;s/ø/oe/g;s/å/aa/g;s/Æ/Ae/g;s/Ø/Oe/g;s/Å/Aa/g'

answered Sep 30 '22 21:09

DonCallisto
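As for why the single expression in the question behaves the way it does: inside a bracket expression, parentheses are ordinary set members rather than grouping operators, so [(æ)(ø)(å)] simply matches any one of those characters, including a literal ( or ). The replacement side of s/// is then inserted as-is for every match, and sed has no way to choose a different replacement per matched character. A small sketch with ASCII input (the strings `a(b)c` and `ab` are made up for illustration) shows both effects:

```shell
#!/bin/sh
# Parentheses inside [...] are literal: '(', 'b' and ')' each match on their own.
out=$(printf '%s\n' 'a(b)c' | sed 's/[(b)]/X/g')
printf '%s\n' "$out"

# The replacement is inserted verbatim for every match, parentheses included.
out2=$(printf '%s\n' 'ab' | sed 's/[ab]/(1)(2)/g')
printf '%s\n' "$out2"
```

The first command should print aXXXc and the second (1)(2)(1)(2), which is the same pattern as the output in the question.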
