Related questions:
As in the questions above, I'm looking for a reliable, robust way to reduce any Unicode character to its nearest ASCII equivalent using PHP. I really want to avoid rolling my own lookup table.
For example (stolen from 1st referenced question): Gračišće
becomes Gracisce
The iconv module can do this, more specifically, the iconv() function:
$str = iconv('Windows-1252', 'ASCII//TRANSLIT//IGNORE', "Gracišce");
echo $str;
//outputs "Gracisce"
The main hassle with iconv is that you have to watch your encodings, but it's definitely the right tool for the job. (I used 'Windows-1252' for the example due to limitations of the text editor I was working with; the first argument has to match the actual encoding of your input, e.g. 'UTF-8'.) The feature of iconv that you definitely want to use is the //TRANSLIT flag, which tells iconv to transliterate any characters that don't have an ASCII match into the closest approximation. The //IGNORE flag additionally drops characters that cannot be represented at all.
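As a hedged sketch of the same call for UTF-8 input: the helper name `to_ascii` is hypothetical, and the `setlocale()` call rests on the assumption that you are on a glibc-based system, where //TRANSLIT results depend on the LC_CTYPE locale. iconv() returns false on failure, so the sketch checks for that instead of assuming a string. The exact output varies between iconv implementations, so none is shown.

```php
<?php
// Hypothetical helper: transliterate UTF-8 text to ASCII with iconv().
// Returns null when iconv() fails (it returns false and raises a notice).
function to_ascii(string $utf8): ?string
{
    // Assumption: on glibc systems //TRANSLIT consults LC_CTYPE, and the
    // default "C" locale tends to produce "?" for accented letters.
    // setlocale() is process-wide and returns false if none of the
    // listed locales is installed.
    setlocale(LC_CTYPE, 'en_US.UTF-8', 'C.UTF-8');

    // @ suppresses the notice; failure is reported via the return value.
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $utf8);

    return $ascii === false ? null : $ascii;
}

$result = to_ascii('Gračišće');
echo $result ?? 'conversion failed', "\n";
```

Leaving out //IGNORE here is deliberate: combining it with //TRANSLIT is rejected by some iconv builds, so it is safer to add it only after confirming your platform accepts it.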
I found another solution, based on @zombat's answer.
The issue with his answer was that I was getting:
Notice: iconv() [function.iconv]: Wrong charset, conversion from `UTF-8' to `ASCII//TRANSLIT//IGNORE' is not allowed in D:\www\phpcommand.php(11) : eval()'d code on line 3
And after removing //IGNORE from the function, I got:
Gr'a'e~a~o^O"ucisce
So the š character was transliterated cleanly, but the accented vowels came out with stray ASCII accent marks (', ~, ^ and ") next to them.
The solution that worked for me is a mix of preg_replace (to remove everything except letters, digits, and dots, which also strips spaces) and @zombat's solution:
preg_replace('/[^a-zA-Z0-9.]/','',iconv('UTF-8', 'ASCII//TRANSLIT', "GráéãõÔücišce"));
Output:
GraeaoOucisce
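The two steps above can be wrapped in one function. This is a minimal sketch, not the answerer's exact code: the name `ascii_alnum` is hypothetical, and the false check is an addition, since handing a failed iconv() result straight to preg_replace() would typically just produce an empty string with no hint of what went wrong.

```php
<?php
// Hypothetical helper combining iconv() transliteration with a cleanup pass.
function ascii_alnum(string $utf8): ?string
{
    // Step 1: transliterate. Depending on the iconv implementation, accents
    // may come back as separate ASCII marks such as ' ~ ^ or ".
    $ascii = @iconv('UTF-8', 'ASCII//TRANSLIT', $utf8);
    if ($ascii === false) {
        return null; // conversion not supported or input not valid UTF-8
    }

    // Step 2: drop everything except letters, digits, and dots, which
    // removes the stray accent marks (and also spaces and "?").
    return preg_replace('/[^a-zA-Z0-9.]/', '', $ascii);
}

$clean = ascii_alnum('GráéãõÔücišce');
echo $clean ?? 'conversion failed', "\n";
```

Note that the cleanup pass is lossy: spaces and punctuation other than dots are removed too, so widen the character class if you need to keep word boundaries.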
My solution is to create two strings: the first with the unwanted letters and the second with the letters that replace them.
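$from = 'čšć';
$to = 'csc';
$text = 'Gračišće';
// str_split() splits by byte and would break UTF-8 characters apart;
// mb_str_split() (PHP 7.4+, mbstring) splits by character instead.
$result = str_replace(mb_str_split($from), mb_str_split($to), $text);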
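A related sketch, assuming the source file and the input are both UTF-8: strtr() with an associative array replaces whole substrings, so multibyte characters are matched intact and each pair stays visibly aligned. The map below is a hypothetical three-entry example, not a complete table.

```php
<?php
// Illustrative map only; extend it with whatever characters you need.
$map = [
    'č' => 'c',
    'š' => 's',
    'ć' => 'c',
];

$text = 'Gračišće';

// With an array argument, strtr() replaces whole keys (longest first)
// and never re-scans text it has already replaced.
$result = strtr($text, $map);

echo $result, "\n"; // prints "Gracisce"
```

Unlike the two-string form, this cannot go out of sync when one side gains or loses a character, and a replacement may be longer than one letter (for example 'ß' => 'ss').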