How do I replace the following special characters with their unaccented equivalents?
Vowels: ÁÉÍÓÚáéíóú with AEIOUaeiou, respectively, and the letter Ñ with N.
The expression:
str = regexprep(str,'[^a-zA-Z]','');
removes every character that is not an ASCII letter, but how do I replace the accented characters with their equivalents, as shown above?
Thanks
You could write a series of regular expressions like:
s = regexprep(s,'(?:À|Á|Â|Ã|Ä|Å)','A')
s = regexprep(s,'(?:Ì|Í|Î|Ï)','I')
and so on for the remaining accented characters, in both upper and lower case.
Warning: there are many accented variants, even for a small subset of the Latin alphabet, so the list of expressions gets long.
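As a sketch of how that series could be collapsed into one call: regexprep also accepts cell arrays of patterns and replacements and applies the pairs in order, and a character class such as [ÀÁÂÃÄÅ] matches the same characters as the alternation above. The pattern list below is illustrative and far from complete, the sample string is made up, and the code assumes the text uses precomposed characters (one code point per accented letter).

```matlab
% Parallel cell arrays: patterns{k} is replaced by replacements{k}.
% Illustrative subset only; extend both lists together as needed.
patterns     = {'[ÀÁÂÃÄÅ]', '[ÈÉÊË]', '[ÌÍÎÏ]', '[ÒÓÔÕÖ]', '[ÙÚÛÜ]', 'Ñ', ...
                '[àáâãäå]', '[èéêë]', '[ìíîï]', '[òóôõö]', '[ùúûü]', 'ñ'};
replacements = {'A', 'E', 'I', 'O', 'U', 'N', ...
                'a', 'e', 'i', 'o', 'u', 'n'};

s = 'ÁÉÍÓÚáéíóú Ñandú';                        % made-up sample input
s = regexprep(s, patterns, replacements);     % pairs are applied in order
disp(s)
```

Keeping the two lists side by side makes it easier to spot a missing replacement than a long run of separate regexprep lines.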
A simpler example:
chars_old = 'ÁÉÍÓÚáéíóú';
chars_new = 'AEIOUaeiou';
str = 'Ámró';
[tf,loc] = ismember(str, chars_old);
str(tf) = chars_new( loc(tf) )
The string before:
>> str
str =
Ámró
after:
>> str
str =
Amro
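To also cover the Ñ from the question, the same lookup can be extended by appending to both strings. A minimal sketch, assuming precomposed characters; the sample sentence is made up for illustration:

```matlab
% Position k in chars_old maps to position k in chars_new.
chars_old = 'ÁÉÍÓÚáéíóúÑñ';
chars_new = 'AEIOUaeiouNn';
assert(numel(chars_old) == numel(chars_new));  % guard against a lopsided table

str = 'El Niño llegó a Ávila';                 % made-up sample input
[tf, loc] = ismember(str, chars_old);          % tf: which chars to replace, loc: index into the table
str(tf) = chars_new(loc(tf));                  % replace only the matched positions
disp(str)
```

Characters that are not in chars_old are left untouched, so the table only needs to list the letters you actually want to change.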
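The following PowerShell function strips diacritics from characters such as ÅÄÖ: it decomposes the string (Unicode normalization form D) and drops the non-spacing marks.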
function inputWash {
    param(
        [string]$inputString
    )
    # Decompose each accented character into base letter + combining mark(s)
    [string]$formD = $inputString.Normalize(
        [System.Text.NormalizationForm]::FormD
    )
    $stringBuilder = New-Object System.Text.StringBuilder
    $nonSpacingMark = [System.Globalization.UnicodeCategory]::NonSpacingMark
    for ($i = 0; $i -lt $formD.Length; $i++) {
        $unicodeCategory = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($formD[$i])
        # Keep everything except the combining marks
        if ($unicodeCategory -ne $nonSpacingMark) {
            $stringBuilder.Append($formD[$i]) | Out-Null
        }
    }
    # Recompose whatever is left and lower-case the result
    $string = $stringBuilder.ToString().Normalize([System.Text.NormalizationForm]::FormC)
    return $string.ToLower()
}
Write-Host (inputWash "ÖÄÅÑÜ")
oaanu
Omit .ToLower() if you want to preserve the original case.