Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

multibyte strtr() -> mb_strtr()

Does anyone have written multibyte variant of function strtr() ? I need this one.

Edit 1 (example of desired usage):

Example:
$from = 'ľľščťžýáíŕďňäô'; // these chars are in UTF-8
$to   = 'llsctzyaiŕdnao';

// input - in UTF-8
$str  = 'Kŕdeľ ďatľov učí koňa žrať kôru.';
$str  = mb_strtr( $str, $from, $to );

// output - str without diacritic
// $str = 'Krdel datlov uci kona zrat koru.';
like image 697
Martin Ille Avatar asked May 03 '10 14:05

Martin Ille


2 Answers

I believe strtr is multi-byte safe, either way since str_replace is multi-byte safe you could wrap it:

function mb_strtr($str, $from, $to)
{
  return str_replace(mb_str_split($from), mb_str_split($to), $str);
}

Since there is no mb_str_split function you also need to write your own (using mb_substr and mb_strlen), or you could just use the PHP UTF-8 implementation (changed slightly):

function mb_str_split($str) {
    return preg_split('~~u', $str, null, PREG_SPLIT_NO_EMPTY);;

}

However if you're looking for a function to remove all (latin?) accentuations from a string you might find the following function useful:

function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'));
}

echo Unaccent('ľľščťžýáíŕďňä'); // llsctzyairdna
echo Unaccent('Iñtërnâtiônàlizætiøn'); // Internationalizaetion
like image 121
Alix Axel Avatar answered Sep 28 '22 11:09

Alix Axel


function mb_strtr($str,$map,$enc){
$out="";
$strLn=mb_strlen($str,$enc);
$maxKeyLn=1;
foreach($map as $key=>$val){
    $keyLn=mb_strlen($key,$enc);
    if($keyLn>$maxKeyLn){
        $maxKeyLn=$keyLn;
    }
}
for($offset=0; $offset<$strLn; ){
    for($ln=$maxKeyLn; $ln>=1; $ln--){
        $cmp=mb_substr($str,$offset,$ln,$enc);
        if(isset($map[$cmp])){
            $out.=$map[$cmp];
            $offset+=$ln;
            continue 2;
        }
    }
    $out.=mb_substr($str,$offset,1,$enc);
    $offset++;
}
return $out;
}
like image 39
Taras Bogach Avatar answered Sep 28 '22 12:09

Taras Bogach