Has anyone written a multibyte variant of the strtr() function? I need one.
Edit 1 (example of desired usage):
// these chars are in UTF-8
$from = 'ľľščťžýáíŕďňäô';
$to = 'llsctzyairdnao';
// input, in UTF-8
$str = 'Kŕdeľ ďatľov učí koňa žrať kôru.';
$str = mb_strtr($str, $from, $to);
// output: the string without diacritics
// $str = 'Krdel datlov uci kona zrat koru.';
The three-argument form of strtr works byte by byte, so it is not multi-byte safe. str_replace compares whole byte sequences, which makes it safe for UTF-8, so you could wrap it:
function mb_strtr($str, $from, $to)
{
    return str_replace(mb_str_split($from), mb_str_split($to), $str);
}
PHP versions before 7.4 have no built-in mb_str_split function, so there you also need to write your own (using mb_substr and mb_strlen), or you could just use the PHP UTF-8 implementation (changed slightly):
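if (!function_exists('mb_str_split')) { // built in since PHP 7.4
    function mb_str_split($str) {
        // An empty pattern with the u modifier splits between UTF-8 characters.
        return preg_split('~~u', $str, -1, PREG_SPLIT_NO_EMPTY);
    }
}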
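One caveat with the str_replace wrapper is that str_replace applies its pairs one after another, so a character produced by an earlier pair can be replaced again by a later one. As a minimal sketch of an alternative, the array form of strtr() avoids this because it never rescans text it has already replaced. The helper name mb_strtr_pairs is hypothetical (it does not come from this thread), and the sketch assumes all three strings are valid UTF-8.

```php
<?php
// Hypothetical helper (name is an assumption, not from the thread):
// translate characters one-for-one, per UTF-8 character instead of per byte.
function mb_strtr_pairs(string $str, string $from, string $to): string
{
    // Split both lists into arrays of UTF-8 characters.
    $fromChars = preg_split('//u', $from, -1, PREG_SPLIT_NO_EMPTY);
    $toChars   = preg_split('//u', $to, -1, PREG_SPLIT_NO_EMPTY);
    if ($fromChars === false || $toChars === false) {
        return $str; // invalid UTF-8 in the lists: leave the input untouched
    }

    // If the lists differ in length, ignore the surplus characters.
    $n = min(count($fromChars), count($toChars));
    if ($n === 0) {
        return $str;
    }
    $pairs = array_combine(
        array_slice($fromChars, 0, $n),
        array_slice($toChars, 0, $n)
    );

    // The array form of strtr() matches whole byte sequences and does not
    // rescan text it has already replaced.
    return strtr($str, $pairs);
}

$from = 'ľľščťžýáíŕďňäô';
$to   = 'llsctzyairdnao';
echo mb_strtr_pairs('Kŕdeľ ďatľov učí koňa žrať kôru.', $from, $to), "\n";
// prints: Krdel datlov uci kona zrat koru.
```

The difference shows up when the two lists overlap: mb_strtr_pairs('áé', 'áé', 'éá') swaps the two characters and returns 'éá', whereas str_replace with the same pairs first turns 'á' into 'é' and then turns every 'é' back into 'á'.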
However, if you're looking for a function that removes all (Latin?) accents from a string, you might find the following function useful:
function Unaccent($string)
{
    return preg_replace('~&([a-z]{1,2})(?:acute|cedil|circ|grave|lig|orn|ring|slash|th|tilde|uml|caron);~i', '$1', htmlentities($string, ENT_QUOTES, 'UTF-8'));
}
echo Unaccent('ľľščťžýáíŕďňä'); // llsctzyairdna
echo Unaccent('Iñtërnâtiônàlizætiøn'); // Internationalizaetion
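Another way to strip accents, sketched here under the assumption that the intl extension is installed, is to decompose each character with Normalizer and then drop the combining marks. The function name strip_diacritics is hypothetical and not part of PHP or of this thread.

```php
<?php
// Hypothetical helper (name is an assumption); needs the intl extension,
// which provides the Normalizer class.
function strip_diacritics(string $str): string
{
    // NFD splits each accented letter into a base letter plus combining marks.
    $decomposed = Normalizer::normalize($str, Normalizer::FORM_D);
    if ($decomposed === false) {
        return $str; // normalization failed (e.g. invalid UTF-8): return input as-is
    }
    // Remove the combining marks (Unicode category Mn), keeping the base letters.
    return preg_replace('/\p{Mn}+/u', '', $decomposed) ?? $str;
}

if (class_exists('Normalizer')) {
    echo strip_diacritics('Kŕdeľ ďatľov učí koňa žrať kôru.'), "\n";
    // prints: Krdel datlov uci kona zrat koru.
}
```

Letters that have no canonical decomposition, such as ø or æ, pass through this sketch unchanged, so it is not a drop-in replacement for the entity-based Unaccent function.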
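// Array-map variant: replaces each key of $map found in $str with its value,
// trying the longest key first at every position.
function mb_strtr($str, $map, $enc) {
    $out = "";
    $strLn = mb_strlen($str, $enc);
    // Find the length (in characters) of the longest key.
    $maxKeyLn = 1;
    foreach ($map as $key => $val) {
        $keyLn = mb_strlen($key, $enc);
        if ($keyLn > $maxKeyLn) {
            $maxKeyLn = $keyLn;
        }
    }
    for ($offset = 0; $offset < $strLn; ) {
        // Try candidate substrings from the longest possible key down to one character.
        for ($ln = $maxKeyLn; $ln >= 1; $ln--) {
            $cmp = mb_substr($str, $offset, $ln, $enc);
            if (isset($map[$cmp])) {
                $out .= $map[$cmp];
                $offset += $ln;
                continue 2;
            }
        }
        // No key matches here: copy one character unchanged.
        $out .= mb_substr($str, $offset, 1, $enc);
        $offset++;
    }
    return $out;
}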