
Convert UTF-8 characters to ISO-8859-1 and back in PHP

Some of my scripts use different encodings, and when I try to combine them, this becomes an issue.

I can't change the encoding they use. Instead, I want to change the encoding of the result from script A and use it as a parameter in script B.

So: is there a simple way to convert a string from UTF-8 to ISO-8859-1 in PHP? I have looked at utf8_encode and utf8_decode, but they don't do what I want. Why doesn't a "utf2iso()" function, or something similar, exist?

I don't think I have any characters that can't be written in ISO-8859-1, so that shouldn't be a huge issue.

qualbeen asked Dec 17 '08 12:12




2 Answers

Have a look at iconv() or mb_convert_encoding(). Just by the way: why don't utf8_encode() and utf8_decode() work for you?

utf8_decode — Converts a string with ISO-8859-1 characters encoded with UTF-8 to single-byte ISO-8859-1

utf8_encode — Encodes an ISO-8859-1 string to UTF-8

So essentially

    $utf8 = 'ÄÖÜ'; // file must be UTF-8 encoded
    $iso88591_1 = utf8_decode($utf8);
    $iso88591_2 = iconv('UTF-8', 'ISO-8859-1', $utf8);
    $iso88591_3 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

    $iso88591 = 'ÄÖÜ'; // file must be ISO-8859-1 encoded
    $utf8_1 = utf8_encode($iso88591);
    $utf8_2 = iconv('ISO-8859-1', 'UTF-8', $iso88591);
    $utf8_3 = mb_convert_encoding($iso88591, 'UTF-8', 'ISO-8859-1');

All three variants should produce the same result. utf8_encode() and utf8_decode() require no special extension, mb_convert_encoding() requires ext/mbstring, and iconv() requires ext/iconv.
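As a rough sketch of how the round trip could be wrapped, here are two small helpers built on mb_convert_encoding(). The function names utf8_to_latin1() and latin1_to_utf8() are made up for this illustration, and the code assumes ext/mbstring is loaded. It avoids utf8_encode() and utf8_decode() because they are deprecated as of PHP 8.2.

```php
<?php
// Hypothetical helpers for illustration; they assume ext/mbstring is available.

function utf8_to_latin1(string $utf8): string
{
    // Reject input that is not valid UTF-8 instead of converting garbage.
    if (!mb_check_encoding($utf8, 'UTF-8')) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

function latin1_to_utf8(string $latin1): string
{
    // Every byte sequence is formally valid ISO-8859-1, so there is
    // nothing to validate in this direction.
    return mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');
}

// "ÄÖÜ" written as explicit UTF-8 bytes, so the example does not depend
// on the encoding of this source file.
$utf8 = "\xC3\x84\xC3\x96\xC3\x9C";
$latin1 = utf8_to_latin1($utf8);

echo bin2hex($latin1), "\n";                 // c4d6dc
var_dump(latin1_to_utf8($latin1) === $utf8); // bool(true)
```

Writing the test string as byte escapes sidesteps the "file must be ... encoded" caveat from the snippet above, which is a common source of confusion when testing conversions.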

Stefan Gehrig answered Sep 23 '22 20:09


First of all, don't use different encodings. It leads to a mess, and UTF-8 is definitely the one you should be using everywhere.

Chances are your input is not ISO-8859-1, but something else (ISO-8859-15, Windows-1252). To convert from those, use iconv or mb_convert_encoding.
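To illustrate why the assumed source encoding matters, here is a minimal sketch that converts the same byte under different assumptions. It uses mb_convert_encoding() and therefore assumes ext/mbstring is loaded; the variable names are only for this example.

```php
<?php
// Byte 0x80 means different things depending on the assumed source encoding.
$byte = "\x80";

$fromCp1252 = mb_convert_encoding($byte, 'UTF-8', 'Windows-1252');
$fromLatin1 = mb_convert_encoding($byte, 'UTF-8', 'ISO-8859-1');

echo bin2hex($fromCp1252), "\n"; // e282ac (U+20AC, the euro sign)
echo bin2hex($fromLatin1), "\n"; // c280   (U+0080, a C1 control character)

// ISO-8859-15 places the euro sign at byte 0xA4 instead.
$fromLatin9 = mb_convert_encoding("\xA4", 'UTF-8', 'ISO-8859-15');
echo bin2hex($fromLatin9), "\n"; // e282ac
```

If characters such as the euro sign or curly quotes come out wrong after a conversion from ISO-8859-1, the input is probably one of these other encodings.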

Nevertheless, utf8_encode and utf8_decode should work for ISO-8859-1. It would be nice if you could post a link to a file or a uuencoded or base64 example string for which the conversion fails or yields unexpected results.
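One possible way to produce such an example string is to dump the raw bytes in a form that survives copy and paste. The helper describe_bytes() below is hypothetical, written only for this illustration, and it assumes ext/mbstring is loaded for the validity check.

```php
<?php
// Hypothetical helper: describe a string's raw bytes so that the exact
// input can be shared without the transport re-encoding it.
function describe_bytes(string $s): string
{
    return sprintf(
        'hex=%s base64=%s valid_utf8=%s',
        bin2hex($s),
        base64_encode($s),
        mb_check_encoding($s, 'UTF-8') ? 'yes' : 'no'
    );
}

echo describe_bytes("\xC3\x84"), "\n"; // "Ä" as UTF-8:      hex=c384 base64=w4Q= valid_utf8=yes
echo describe_bytes("\xC4"), "\n";     // "Ä" as ISO-8859-1: hex=c4 base64=xA== valid_utf8=no
```

Comparing the hex output before and after a conversion shows directly whether the bytes changed as expected.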

phihag answered Sep 19 '22 20:09