
Replace special characters by equivalent

How do I replace the following special characters by their equivalent?

Vowels: ÁÉÍÓÚáéíóú by AEIOUaeiou respectively, and the letter Ñ by N.

The expression:

str = regexprep(str,'[^a-zA-Z]','');

removes all characters that are not in the alphabet, but how do I replace them with an equivalent instead, as shown above?

Thanks

Jorge Zapata asked Dec 16 '22 18:12


2 Answers

You could write a series of regular expressions like:

s = regexprep(s,'(?:À|Á|Â|Ã|Ä|Å)','A')
s = regexprep(s,'(?:Ì|Í|Î|Ï)','I')

and so on for the rest of the accented characters, in both upper and lower case.

Warning: there are many accented variants to cover, even for a small subset of the Latin alphabet.
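One way to keep such a series manageable is to drive regexprep from a small table. The sketch below is only an illustration: the variable name map and the variant lists are my own choices, the lists are deliberately incomplete, and it assumes the text uses precomposed characters and that the source file is saved in an encoding that preserves them.

```matlab
% Each row pairs a character class of accented variants with its plain letter.
% The variant lists are illustrative and deliberately incomplete.
map = {
    '[ÀÁÂÃÄÅ]', 'A'
    '[ÈÉÊË]',   'E'
    '[ÌÍÎÏ]',   'I'
    '[ÒÓÔÕÖ]',  'O'
    '[ÙÚÛÜ]',   'U'
    '[àáâãäå]', 'a'
    '[èéêë]',   'e'
    '[ìíîï]',   'i'
    '[òóôõö]',  'o'
    '[ùúûü]',   'u'
    'Ñ',        'N'
    'ñ',        'n'
};

s = 'ÁÉÍÓÚ áéíóú Ñ';
% Apply the replacements one row at a time
for k = 1:size(map, 1)
    s = regexprep(s, map{k,1}, map{k,2});
end
disp(s)   % AEIOU aeiou N
```

Adding a new letter then means adding one row rather than another regexprep call.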


A simpler example:

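% Characters to replace and their plain equivalents (same order, same length)
chars_old = 'ÁÉÍÓÚáéíóú';
chars_new = 'AEIOUaeiou';

str = 'Ámró';
% tf marks the characters of str found in chars_old; loc gives their positions there
[tf,loc] = ismember(str, chars_old);
str(tf) = chars_new( loc(tf) )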

The string before:

>> str
str =
Ámró

after:

>> str
str =
Amro
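The same lookup extends to the Ñ from the question by adding it to both lists. The sketch below is a minimal variation under my own assumptions: the sample sentence and the variable names mapped and letters_only are made up for illustration, and the input is assumed to use precomposed characters. The last step reuses the expression from the question to drop whatever is still not a plain letter.

```matlab
% Lookup tables: position k in chars_old maps to position k in chars_new
chars_old = 'ÁÉÍÓÚáéíóúÑñ';
chars_new = 'AEIOUaeiouNn';

mapped = 'El niño comió en MÁLAGA';
[tf, loc] = ismember(mapped, chars_old);
mapped(tf) = chars_new(loc(tf));   % replace only the characters that were found
disp(mapped)                       % El nino comio en MALAGA

% Optional: remove everything that is still not a plain letter (spaces included)
letters_only = regexprep(mapped, '[^a-zA-Z]', '');
disp(letters_only)                 % ElninocomioenMALAGA
```

Doing the replacement before the removal matters: running the removal first would delete the accented letters instead of converting them.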
Amro answered Dec 28 '22 12:12


The following PowerShell code strips the diacritics from characters such as ÅÄÖ.

function inputWash {
    param(
        [string]$inputString
    )
    # Decompose each accented character into a base letter plus combining marks (form D)
    [string]$formD = $inputString.Normalize(
            [System.Text.NormalizationForm]::FormD
    )
    $stringBuilder = New-Object System.Text.StringBuilder
    $nonSpacingMark = [System.Globalization.UnicodeCategory]::NonSpacingMark
    for ($i = 0; $i -lt $formD.Length; $i++){
        $unicodeCategory = [System.Globalization.CharUnicodeInfo]::GetUnicodeCategory($formD[$i])
        # Keep everything except the combining marks
        if($unicodeCategory -ne $nonSpacingMark){
            $stringBuilder.Append($formD[$i]) | Out-Null
        }
    }
    # Recompose what remains (form C) and lower-case the result
    $string = $stringBuilder.ToString().Normalize([System.Text.NormalizationForm]::FormC)
    return $string.toLower()
}
# Parenthesize the call so that Write-Host prints the function's result
Write-Host (inputWash "ÖÄÅÑÜ")

oaanu

Omit .toLower() if you don't want the result lower-cased.

Otto Remse answered Dec 28 '22 14:12