We have several database fields that contain Windows-1252 characters:
an example pain— if you’re
Those values map to the desired values from this list:
http://www.i18nqa.com/debug/utf8-debug.html
I've tried various permutations of htmlentities, mb_detect_encoding, utf8_decode, etc., but have not yet been able to transform those values to:
an example pain — if you're
How can I transform these characters to their listed values in PHP?
Just open the Windows-1252-encoded file in Notepad, then choose 'Save as' and set the encoding to UTF-8.
Windows-1252 uses the single bytes 128 to 255 for characters that UTF-8 encodes differently, as multi-byte sequences. Every character in the ASCII range (127 and below) is encoded 1:1 in UTF-8. So while you can convert between the two, a CP-1252 string is not guaranteed to be a valid UTF-8 string.
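As a rough illustration of that point (a sketch, assuming the mbstring extension is loaded; the variable names are only illustrative), the byte 0x97 is the em dash in Windows-1252 but is not a valid UTF-8 sequence on its own:

```php
<?php
// Byte 0x97 is the em dash in Windows-1252; on its own it is not valid UTF-8.
$cp1252 = "pain\x97";

var_dump(mb_check_encoding($cp1252, 'UTF-8')); // bool(false)

// Convert from Windows-1252 to UTF-8: 0x97 becomes the three bytes E2 80 94.
$utf8 = mb_convert_encoding($cp1252, 'UTF-8', 'Windows-1252');

var_dump(mb_check_encoding($utf8, 'UTF-8'));   // bool(true)
echo bin2hex($utf8), "\n";                     // 7061696ee28094
```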
utf8_encode() is a built-in PHP function that converts an ISO-8859-1 string to UTF-8. ISO-8859-1 is not the same as Windows-1252 (the two differ in bytes 0x80 to 0x9F), and the function is deprecated as of PHP 8.2. Unicode was developed to describe all possible characters of all languages, plus many symbols, and assigns one unique number (code point) to each character.
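A small sketch of why an ISO-8859-1 conversion does not help with Windows-1252 data, assuming mbstring is available. It uses mb_convert_encoding() with 'ISO-8859-1' as the source encoding, which performs the same conversion as utf8_encode() without triggering the deprecation notice:

```php
<?php
$byte = "\x97"; // em dash in Windows-1252, a C1 control character in ISO-8859-1

// Same conversion utf8_encode() performs: the input is treated as ISO-8859-1.
$asLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

// Here the input is treated as Windows-1252 instead.
$asCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');

echo bin2hex($asLatin1), "\n"; // c297   (U+0097, an invisible control character)
echo bin2hex($asCp1252), "\n"; // e28094 (U+2014, the em dash)
```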
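You can use mb_convert_encoding. The stored text is UTF-8 that was decoded as Windows-1252, so converting it back to Windows-1252 bytes recovers the original UTF-8: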
$str = "an example pain— if you’re";
$str = mb_convert_encoding($str, "Windows-1252", "UTF-8");
echo $str;
// an example pain— if you’re
DEMO:
http://ideone.com/NsIb5x
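If the database holds a mix of damaged and clean values, a guarded version of the same conversion may be safer. The following is only a sketch, assuming mbstring is loaded and that the damage is a single round of UTF-8 misread as Windows-1252; fix_mojibake is a hypothetical helper name, not a PHP built-in:

```php
<?php
/**
 * Hypothetical helper: undo one round of "UTF-8 read as Windows-1252".
 * Returns the input unchanged when the conversion would not be safe.
 */
function fix_mojibake(string $text): string
{
    // Map each character back to the single Windows-1252 byte it came from.
    $bytes = mb_convert_encoding($text, 'Windows-1252', 'UTF-8');

    // The conversion is lossless only if decoding those bytes gives the input back
    // (characters outside Windows-1252 would have been substituted).
    $lossless = mb_convert_encoding($bytes, 'UTF-8', 'Windows-1252') === $text;

    // Accept the result only if the recovered bytes are themselves valid UTF-8.
    if ($lossless && mb_check_encoding($bytes, 'UTF-8')) {
        return $bytes;
    }

    return $text;
}

echo fix_mojibake("an example pain— if you’re"), "\n"; // an example pain— if you’re
echo fix_mojibake("café"), "\n";                           // café (already clean, left as-is)
```

The validity check is what keeps already-correct values such as "café" from being converted a second time.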