I am comparing strings and need to replace umlauts in JavaScript, but JavaScript does not seem to recognize the umlauts in the strings. The text comes from the database, and the umlauts display fine in the browser.
function replaceUmlauts(string)
{
    // declare the variable locally instead of leaking an implicit global
    let value = string.toLowerCase();
    value = value.replace(/ä/g, 'ae');
    value = value.replace(/ö/g, 'oe');
    value = value.replace(/ü/g, 'ue');
    return value;
}
As search patterns I tried "ä", "ö" and "ü" (well, total despair ;-)).
To make sure that the problem is not in the replace function, I tried indexOf:
console.log(value.indexOf('ä'));
But the output with all patterns is: -1
So I guess it is some kind of encoding problem, but as I said, the umlauts look just fine on the page.
Any ideas? This seems so simple...
EDIT: Even though I found my answer, the problem was not really solved "at the root" (the encoding). This is my page encoding:
<meta content="text/html; charset=UTF-8" http-equiv="Content-Type">
The database has: utf8_general_ci
Seems totally alright to me.
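One way to check what the string really contains is to print its code points. The following is only a diagnostic sketch: codePoints is a hypothetical helper and the sample strings are made up, not taken from the database in question. It assumes one possible cause, namely that the text arrives in decomposed form ("a" followed by the combining diaeresis U+0308), which renders exactly like ä on the page but does not match a precomposed ä (U+00E4) in a pattern. If that turns out to be the case, normalize('NFC') composes the characters before matching.

```javascript
// Hypothetical helper: list the Unicode code points of a string.
function codePoints(str) {
  return Array.from(str, (ch) =>
    'U+' + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, '0'));
}

// Made-up samples: both render as "ä" in the browser.
const precomposed = '\u00e4';  // single code point
const decomposed = 'a\u0308';  // "a" + combining diaeresis

console.log(codePoints(precomposed).join(' ')); // U+00E4
console.log(codePoints(decomposed).join(' '));  // U+0061 U+0308

// The decomposed form does not contain U+00E4 ...
console.log(decomposed.indexOf('\u00e4'));                  // -1
// ... until it is normalized to the composed form (NFC).
console.log(decomposed.normalize('NFC').indexOf('\u00e4')); // 0
```

If the code points printed for the real data are already U+00E4, U+00F6 and U+00FC, this assumption does not apply and the script's own encoding is the more likely culprit.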
The "g" at the end of your regular expression is called a "modifier" (or flag). The "g" represents the "global modifier". This means that replace will substitute every match in the string with the replacement string you provide, not just the first one.
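A small sketch of the difference the global modifier makes (the sample text is made up for illustration):

```javascript
// Made-up sample: "Mädchen und Männer", written with \u escapes.
const text = 'M\u00e4dchen und M\u00e4nner';

// Without "g", only the first match is replaced.
const firstOnly = text.replace(/\u00e4/, 'ae');

// With "g", every match is replaced.
const allMatches = text.replace(/\u00e4/g, 'ae');

console.log(firstOnly);  // Maedchen und Männer
console.log(allMatches); // Maedchen und Maenner
```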
Either ensure that your script's encoding is correctly specified (in the <script> tag, or in the page's header/meta if it's embedded), or specify the characters with the \uNNNN syntax, which always resolves unambiguously to a specific Unicode code point.
For example:
str.replace(/\u00e4/g, "ae")
This will always replace ä with ae, no matter what encoding is set for your page/script, even if it is incorrect.
Here are the codes needed for Germanic languages:
// Ü, ü    \u00dc, \u00fc
// Ä, ä    \u00c4, \u00e4
// Ö, ö    \u00d6, \u00f6
// ß       \u00df
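Put together, the lower-casing function from the question could be written with these escapes roughly as follows. This is a sketch, not the asker's final code: the name replaceUmlautsEscaped is made up, and the ß replacement is an addition taken from the list of codes, not part of the original function.

```javascript
// Hypothetical name; same idea as the question's function,
// but the patterns no longer depend on the script's encoding.
function replaceUmlautsEscaped(input) {
  return input
    .toLowerCase()
    .replace(/\u00e4/g, 'ae')  // ä
    .replace(/\u00f6/g, 'oe')  // ö
    .replace(/\u00fc/g, 'ue')  // ü
    .replace(/\u00df/g, 'ss'); // ß (not in the original function)
}

// "Ärger über Größe", written with escapes as well
console.log(replaceUmlautsEscaped('\u00c4rger \u00fcber Gr\u00f6\u00dfe'));
// aerger ueber groesse
```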
If you want to replace the German umlauts while respecting the case, use this (open source, happy to share, all by me):
const umlautMap = {
    '\u00dc': 'UE',
    '\u00c4': 'AE',
    '\u00d6': 'OE',
    '\u00fc': 'ue',
    '\u00e4': 'ae',
    '\u00f6': 'oe',
    '\u00df': 'ss',
};
function replaceUmlaute(str) {
    return str
        // capital umlaut followed by a lowercase letter: "Ü" + "b" -> "Ueb"
        // (no "|" inside the character class, it would match a literal "|")
        .replace(/[\u00dc\u00c4\u00d6][a-z]/g, (a) => {
            const big = umlautMap[a.slice(0, 1)];
            return big.charAt(0) + big.charAt(1).toLowerCase() + a.slice(1);
        })
        // everything else: plain lookup in the map
        .replace(new RegExp('[' + Object.keys(umlautMap).join('') + ']', 'g'),
            (a) => umlautMap[a]
        );
}
const test = ['Übung', 'ÜBUNG', 'üben', 'einüben', 'EINÜBEN', 'Öde ätzende scheiß Übung'];
test.forEach((str) => console.log(str + " -> " + replaceUmlaute(str)));
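It will turn a capital umlaut that is followed by a lowercase letter into a capitalized pair (Übung -> Uebung), a capital umlaut anywhere else into two capitals (ÜBUNG -> UEBUNG), and lowercase umlauts and ß into their lowercase replacements (üben -> ueben, scheiß -> scheiss).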
Here's a function that replaces the most common characters to produce a Google-friendly SEO URL:
function deUmlaut(value){
    value = value.toLowerCase();
    value = value.replace(/ä/g, 'ae');
    value = value.replace(/ö/g, 'oe');
    value = value.replace(/ü/g, 'ue');
    value = value.replace(/ß/g, 'ss');
    value = value.replace(/ /g, '-');
    value = value.replace(/\./g, '');
    value = value.replace(/,/g, '');
    value = value.replace(/\(/g, '');
    value = value.replace(/\)/g, '');
    return value;
}
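The same steps could also be written table-driven, with \u escapes so the patterns do not depend on the script's encoding. This is just a sketch of one possible layout; slugReplacements and toSlug are made-up names, and the sample input is invented.

```javascript
// Hypothetical table of [pattern, replacement] pairs, applied in order.
const slugReplacements = [
  [/\u00e4/g, 'ae'], // ä
  [/\u00f6/g, 'oe'], // ö
  [/\u00fc/g, 'ue'], // ü
  [/\u00df/g, 'ss'], // ß
  [/ /g, '-'],       // spaces become hyphens
  [/[.,()]/g, ''],   // drop dots, commas and parentheses
];

// Hypothetical name: lower-case the input, then apply every pair in turn.
function toSlug(value) {
  return slugReplacements.reduce(
    (result, [pattern, replacement]) => result.replace(pattern, replacement),
    value.toLowerCase()
  );
}

// "Größe (München), Str."
console.log(toSlug('Gr\u00f6\u00dfe (M\u00fcnchen), Str.'));
// groesse-muenchen-str
```

Adding another character then only means adding one more row to the table.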