I'm trying to convert a Greek database to UTF-8. At this point, I've figured out how to do it (via MySQL, not through the iconv() function), but I have a problem: the application stores lots of data in the database in PHP serialized format (via serialize()).
As you may know, this format stores the length of each string inside the serialized output, and PHP 5 counts that length in bytes, not characters. A Greek letter that took one byte in a single-byte encoding takes two bytes in UTF-8, so after the conversion the stored lengths no longer match the data and those strings can't be unserialized anymore.
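To illustrate the mismatch, here is a minimal sketch. It assumes the original data is ISO-8859-7 (an assumption; the real encoding may differ) and uses iconv() on the whole serialized string as a stand-in for a database-level conversion.

```php
<?php
// "αβγ" as raw ISO-8859-7 bytes: one byte per letter (assumed source encoding).
$greek = "\xE1\xE2\xE3";

// serialize() records the byte length: s:3:"...";
$serialized = serialize($greek);

// Convert the whole serialized blob, roughly what a database-level conversion does.
$converted = iconv('ISO-8859-7', 'UTF-8', $serialized);

// The prefix still claims 3 bytes, but the payload is now 6 bytes long,
// so unserialize() fails and returns false (the notice is suppressed with @).
$result = @unserialize($converted);

var_dump($serialized === "s:3:\"\xE1\xE2\xE3\";");
var_dump($result);
```

The length prefix is only recalculated if the value goes through serialize() again after its strings have been converted.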
So far, I'm considering using one of the following approaches to work around this:
Option #2 seems easier, but I'm thinking there has to be a quicker way to do this. Maybe even a freely available script for converting them, since I'm definitely not the first one to face this problem. Any ideas?
Thanks in advance.
Do a SHOW CREATE TABLE and check the table's character set (individual columns can override the table default, so check those too). Then connect to the database with that same encoding (execute a SET NAMES 'that encoding';).
Now, when you retrieve the serialized string, unserialize() it. The result will be whatever your application originally passed to serialize().
Once you get here you'll need to know which encoding the strings were originally inserted in (e.g. ISO-8859-1, CP1252, etc.; for Greek text, ISO-8859-7 or Windows-1253 are likely candidates), so you can convert them to UTF-8.
Now that you have your Greek, no pun intended, converted to UTF-8 strings, serialize() the value again so the string lengths are recalculated, and then put it back into the database.
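The unserialize, convert, and serialize steps could be sketched roughly like this. The function names convert_to_utf8() and reserialize_as_utf8() are hypothetical helpers, ISO-8859-7 is an assumed source encoding, the syntax assumes PHP 7 or later, and the database read and write are left out. Objects are passed through unchanged, so data containing serialized objects would need extra handling.

```php
<?php
// Hypothetical helper: re-encode every string (and string key) in a value.
function convert_to_utf8($value, string $from = 'ISO-8859-7')
{
    if (is_string($value)) {
        return iconv($from, 'UTF-8', $value);
    }
    if (is_array($value)) {
        $out = [];
        foreach ($value as $key => $item) {
            $newKey = is_string($key) ? iconv($from, 'UTF-8', $key) : $key;
            $out[$newKey] = convert_to_utf8($item, $from);
        }
        return $out;
    }
    // Integers, floats, booleans and null need no conversion;
    // objects are not handled in this sketch.
    return $value;
}

// Hypothetical helper: unserialize, convert, and serialize again so that
// serialize() recalculates the byte lengths for the UTF-8 strings.
function reserialize_as_utf8(string $serialized, string $from = 'ISO-8859-7'): string
{
    $value = @unserialize($serialized);
    if ($value === false && $serialized !== 'b:0;') {
        throw new InvalidArgumentException('Input is not a valid serialized string');
    }
    return serialize(convert_to_utf8($value, $from));
}

// "αβγ" in ISO-8859-7, wrapped in an array as an application might store it.
$old = serialize(['name' => "\xE1\xE2\xE3"]);
$new = reserialize_as_utf8($old);

var_dump(unserialize($new));
```

This only works on rows that are still in their original encoding, read over a connection that does not transcode them. Rows already converted in the database would fail at the unserialize() step.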
I would highly recommend you reorganize the database to NOT use serialized strings to store data. If you are storing BLOBs in your database, consider moving them out of the database and storing them on your file system.
Good luck.