Easy way of converting PHP serialized strings to UTF-8?

I'm trying to convert a Greek database to UTF-8. I've figured out how to do the conversion itself (via MySQL, not through the iconv() function), but I have a problem: the application stores lots of data in the database in PHP serialized format (via serialize()).

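As you may know, this format stores the length of each string inside the serialized string. Those lengths are counted in bytes (PHP 5 strings are plain byte strings with no native Unicode support), so they change after the conversion: a Greek letter that takes one byte in a single-byte encoding takes two bytes in UTF-8. Once the stored lengths no longer match the data, those strings can't be unserialized anymore.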
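To make the mismatch concrete, here is a minimal sketch. It assumes the legacy data is ISO-8859-7 (an assumption, the question does not name the source encoding) and that the iconv extension is available.

```php
<?php
// Greek "αβγ" in ISO-8859-7 (assumed legacy encoding): one byte per letter.
$legacy = "\xE1\xE2\xE3";

// The same text in UTF-8: two bytes per letter.
$utf8 = iconv('ISO-8859-7', 'UTF-8', $legacy);

echo strlen($legacy), "\n"; // 3
echo strlen($utf8), "\n";   // 6

// serialize() records the byte length, so the prefix differs:
// the legacy string serializes with s:3, the UTF-8 string with s:6.
$legacySerialized = serialize($legacy);
$utf8Serialized   = serialize($utf8);

// Converting the serialized string as a whole keeps the old "s:3" prefix
// while the payload grows to 6 bytes, so unserialize() fails.
$broken = iconv('ISO-8859-7', 'UTF-8', $legacySerialized);
var_dump(@unserialize($broken)); // bool(false)
```

This is what happens when the database converts the column in place: the payload is re-encoded but the embedded length prefixes are left untouched.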

So far, I'm considering using one of the following approaches to work around this:

  1. Use PHP to convert those strings to UTF-8: instead of converting the whole serialized string, unserialize it, convert every item in the array, and serialize the result again.
  2. Write a script to recalculate the string lengths inside the already converted serialized strings.
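Option 2 could be sketched roughly as follows. The function name fix_serialized_lengths is hypothetical, the code assumes PHP 7 or later, and the regex is a heuristic: it will miscount any stored string that itself contains the sequence `";`, so treat it as a starting point rather than a safe general tool.

```php
<?php
// Hypothetical helper (not from the original post): recompute the byte
// length in every s:N:"..."; token of an already converted serialized string.
function fix_serialized_lengths(string $serialized): string
{
    $fixed = preg_replace_callback(
        // No /u flag on purpose: the lengths must be counted in bytes.
        '/s:\d+:"(.*?)";/s',
        function (array $m): string {
            return 's:' . strlen($m[1]) . ':"' . $m[1] . '";';
        },
        $serialized
    );

    // preg_replace_callback() returns null on a PCRE error;
    // fall back to the input in that case.
    return $fixed ?? $serialized;
}

// Usage: simulate a column that was converted from ISO-8859-7 (assumed)
// to UTF-8 as a whole, leaving stale length prefixes behind.
$original  = serialize(['name' => "\xE1\xE2\xE3"]);        // "αβγ" in ISO-8859-7
$converted = iconv('ISO-8859-7', 'UTF-8', $original);      // lengths now wrong
$repaired  = fix_serialized_lengths($converted);

var_dump(unserialize($repaired)); // array with 'name' => "αβγ" in UTF-8
```

Because of the `";` caveat, it is worth checking that every repaired value unserializes successfully before writing it back.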

Option #2 seems easier, but I'm thinking there has to be a quicker way to do this. Maybe even a freely available script for converting them, since I'm definitely not the first one to face this problem. Any ideas?

Thanks in advance.

Lea Verou asked Dec 04 '25 00:12



1 Answer

Run SHOW CREATE TABLE and check the table's character set. Then connect to the database using that same character set (for example, by executing SET NAMES with that character set), so the stored bytes reach PHP unchanged.

Now, when you retrieve the serialized string, unserialize() it. The return value will be whatever your application originally passed to serialize().

Once you get here, you'll need to know which encoding the strings were originally inserted in (e.g. ISO-8859-1, CP1252), so you can convert them to UTF-8.

Now that you have your Greek (no pun intended) converted to UTF-8 strings, you can serialize() the data again and put it back into the database.
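The unserialize, convert, re-serialize round trip described above might look like this. It is a sketch under stated assumptions: PHP 7 or later, the iconv extension, and ISO-8859-7 as the source encoding (substitute whatever the data was really inserted as). The function names convert_to_utf8 and reserialize_as_utf8 are hypothetical, and objects are passed through unconverted.

```php
<?php
// Hypothetical helper: recursively convert every string (keys and values)
// in an unserialized structure from $from to UTF-8.
function convert_to_utf8($value, string $from)
{
    if (is_string($value)) {
        return iconv($from, 'UTF-8', $value);
    }
    if (is_array($value)) {
        $out = [];
        foreach ($value as $key => $item) {
            $newKey = is_string($key) ? iconv($from, 'UTF-8', $key) : $key;
            $out[$newKey] = convert_to_utf8($item, $from);
        }
        return $out;
    }
    // Integers, floats, booleans and null need no conversion;
    // objects are returned untouched in this sketch.
    return $value;
}

// Hypothetical helper: take a serialized string fetched in its original
// encoding and return a serialized string whose contents are UTF-8.
function reserialize_as_utf8(string $serialized, string $from): string
{
    $data = unserialize($serialized);
    // unserialize() returns false on failure, which is ambiguous with a
    // stored boolean false, so compare against serialize(false) as well.
    if ($data === false && $serialized !== serialize(false)) {
        throw new RuntimeException('Could not unserialize input');
    }
    // serialize() recomputes all byte lengths for the converted strings.
    return serialize(convert_to_utf8($data, $from));
}

// Usage with a row value read over a connection in the legacy character set.
$row    = serialize(['title' => "\xE1\xE2\xE3", 'views' => 7]); // "αβγ" in ISO-8859-7
$result = reserialize_as_utf8($row, 'ISO-8859-7');
var_dump(unserialize($result));
```

Since unserialize() can instantiate objects, this should only be run on data you trust, such as your own application's rows.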

I would highly recommend reorganizing the database so that it does NOT use serialized strings to store data. If you are storing BLOBs in your database, consider moving them out of the database and storing them on your file system.

Good luck.

Yzmir Ramirez answered Dec 06 '25 12:12
