
iconv() Vs. utf8_encode()

When you have a charset different from UTF-8 and you need to put the data into JSON format to migrate it to a DB, there are two functions you can use in PHP: utf8_encode() and iconv(). I would like to know which one has better performance, and when it is appropriate to use one or the other.

Pedro Teran asked Feb 29 '12 12:02



1 Answer

"When you have a charset different from UTF-8"

Nope - utf8_encode() is suitable only for converting an ISO-8859-1 string to UTF-8. iconv() supports a vast number of source and target encodings.
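
To illustrate the difference, here is a minimal sketch. The sample byte strings are made up for the example, and it assumes a PHP build with the iconv extension and a PHP version that still ships utf8_encode() (it is deprecated as of PHP 8.2, so the sketch hides deprecation notices).

```php
<?php
// utf8_encode() is deprecated as of PHP 8.2; hide the notice for this demo.
error_reporting(E_ALL & ~E_DEPRECATED);

// "café" with "é" stored as the single ISO-8859-1 byte 0xE9.
$latin1 = "caf\xE9";

// utf8_encode() always assumes the input is ISO-8859-1.
$viaUtf8Encode = utf8_encode($latin1);

// iconv() names the source and target encodings explicitly.
$viaIconv = iconv('ISO-8859-1', 'UTF-8', $latin1);

// For ISO-8859-1 input, both produce the same UTF-8 bytes.
var_dump($viaUtf8Encode === $viaIconv);

// For any other source charset, only iconv() applies.
// 0x80 is the euro sign in Windows-1252 (CP1252).
$cp1252 = "\x80";
$euro = iconv('CP1252', 'UTF-8', $cp1252);
echo bin2hex($euro), "\n"; // e282ac, the UTF-8 encoding of U+20AC
```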

Re performance, I have no idea how utf8_encode() works internally and what libraries it uses, but my prediction is there won't be much of a difference - at least not on "normal" amounts of data in the bytes or kilobytes. If in doubt, do a benchmark.
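
A rough way to run such a benchmark might look like the sketch below. The payload, the number of rounds and the timeIt() helper are all hypothetical choices made for illustration; it assumes PHP 7.4 or later (for hrtime() and arrow functions) and a version that still ships utf8_encode().

```php
<?php
// utf8_encode() is deprecated as of PHP 8.2; hide the notice so it
// does not flood the output while timing.
error_reporting(E_ALL & ~E_DEPRECATED);

// Hypothetical test payload: roughly 1 MB of ISO-8859-1 text.
$payload = str_repeat("caf\xE9 cr\xE8me ", 100000);
$rounds = 20;

// Hypothetical helper: average wall-clock time per call, in milliseconds.
function timeIt(callable $fn, string $input, int $rounds): float
{
    $start = hrtime(true); // monotonic clock, nanoseconds
    for ($i = 0; $i < $rounds; $i++) {
        $fn($input);
    }
    return (hrtime(true) - $start) / 1e6 / $rounds;
}

$iconvMs = timeIt(fn(string $s) => iconv('ISO-8859-1', 'UTF-8', $s), $payload, $rounds);
$utf8EncodeMs = timeIt(fn(string $s) => utf8_encode($s), $payload, $rounds);

printf("iconv():       %.3f ms per call\n", $iconvMs);
printf("utf8_encode(): %.3f ms per call\n", $utf8EncodeMs);
```

The numbers will vary with the PHP version, the iconv implementation PHP is linked against, and the payload size, so it is worth running this on data that resembles your own.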

I tend to use iconv() because it's clearer that there is a conversion from character set A to character set B.

Also, iconv() provides more detailed control over what to do when it encounters invalid data. Appending //IGNORE to the target character set will cause it to silently drop invalid characters. This may be helpful in certain situations.
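
As a sketch of the //IGNORE flag, the snippet below feeds iconv() a made-up string containing a byte that is never valid in UTF-8. Treat the outcome as an assumption to verify on your own setup: the behaviour depends on the iconv implementation PHP is linked against, and on some builds iconv() still returns false even with //IGNORE, so the result should always be checked.

```php
<?php
// "abc" + a stray 0xFF byte (never valid in UTF-8) + "def".
$dirty = "abc\xFFdef";

// Without a flag, iconv() rejects the invalid byte: it raises a
// diagnostic (suppressed here with @) and normally returns false.
$strict = @iconv('UTF-8', 'UTF-8', $dirty);

// With //IGNORE appended to the target charset, iconv is asked to
// drop whatever it cannot convert instead of failing.
$lenient = @iconv('UTF-8', 'UTF-8//IGNORE', $dirty);

var_dump($strict);
var_dump($lenient);

// Always handle the failure case, since //IGNORE is not honoured everywhere.
if ($lenient === false) {
    echo "Conversion failed even with //IGNORE on this build\n";
}
```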

Pekka answered Sep 21 '22 07:09