I'm trying to do accented character replacement in PHP but get funky results, my guess being because i'm using a UTF-8 string and str_replace can't properly handle multi-byte strings..
$accents_search = array('á','à','â','ã','ª','ä','å','Á','À','Â','Ã','Ä','é','è',
'ê','ë','É','È','Ê','Ë','í','ì','î','ï','Í','Ì','Î','Ï','œ','ò','ó','ô','õ','º','ø',
'Ø','Ó','Ò','Ô','Õ','ú','ù','û','Ú','Ù','Û','ç','Ç','Ñ','ñ');
$accents_replace = array('a','a','a','a','a','a','a','A','A','A','A','A','e','e',
'e','e','E','E','E','E','i','i','i','i','I','I','I','I','oe','o','o','o','o','o','o',
'O','O','O','O','O','u','u','u','U','U','U','c','C','N','n');
$str = str_replace($accents_search, $accents_replace, $str);
Results I get:
Ørjan Nilsen -> �orjan Nilsen
Expected Result:
Ørjan Nilsen -> Orjan Nilsen
Edit: I've got my internal character handler set to UTF-8 (according to mb_internal_encoding()), also the value of $str is UTF-8, so from what I can tell, all the strings involved are UTF-8. Does str_replace() detect char sets and use them properly?
The str_replace() function replaces some characters with some other characters in a string. This function works by the following rules: If the string to be searched is an array, it returns an array. If the string to be searched is an array, find and replace is performed with every array element.
Approach 1: Using the str_replace() and str_split() functions in PHP. The str_replace() function is used to replace multiple characters in a string and it takes in three parameters. The first parameter is the array of characters to replace.
Using str_replace() Method: The str_replace() method is used to remove all the special characters from the given string str by replacing these characters with the white space (” “).
A null-terminated multibyte string (NTMBS), or "multibyte string", is a sequence of nonzero bytes followed by a byte with value zero (the terminating null character). Each character stored in the string may occupy more than one byte.
According to php documentation str_replace function is binary-safe, which means that it can handle UTF-8
encoded text without any data loss.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With