I'm attempting to export our very large MySQL database (1.6GB - mostly BLOB) and import into a new server. I have worked through most of the problems and finally completed the import without any errors. Using MySQL Query Browser I ran a query on a table with BLOBs of images and saved one to disk (using the save icon in query browser). When I tried to open the file I received an "invalid image format" error. Uh oh.
Using query browser I inspected the value on the source database and the new, recently imported database. The values are different, I think. It could just be encoding issues or something. Here is what I see:
Source (valid data) server:
FF D8 FF E0 00 10 4A 46 49 46 00 01 01 01 00 60
00 60 00 00 FF DB 00 43 00 08 06 06 07 06 05 08
and so on...
New server:
C3 BF C3 98 C3 BF C3 A0 00 10 4A 46 49 46 00 01
01 01 00 60 00 60 00 00 C3 BF C3 9B 00 43 00 08
and so on...
In this example it appears to my newbie eye that there are 3 bytes of extra data in the front of the data on the "new" server.
I then checked out the sql dump file using 010 editor. I found the line for this particular example and here is what I see:
FF D8 FF E0 5C 30 10 4A 46 49 46 5C 30 01 01 01
5C 30 60 5C 30 60 5C 30 5C 30 FF DB 5C 30 43 5C
30 08 06 06 07 06 05 08 and so on...
Now I'm way over my head. I see the pattern, I do notice that the HEX pair 5C 30 appears to be the same as 00 but I don't understand WHY. At this point I have a source server that is about to be wiped and a new one that I'm afraid has corrupt data imported into it. I'm hoping this is some kind of encoding issue that can be solved by setting a global variable in MySQL but I really don't know.
I should also mention that when I save the files from the source (working) server and the new (corrupt) server the files size is about 40% larger for the corrupt file.
I checked the character set variables on both servers:
SHOW VARIABLES LIKE '%char%'
source server:
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
new server:
character_set_client utf8
character_set_connection utf8
character_set_database latin1
character_set_filesystem binary
character_set_results utf8
character_set_server latin1
character_set_system utf8
They are the same.
The corrupted data from the new database looks like the result of converting the source data from ISO‑8859‑1 to UTF‑8 (e.g. U+00FF — ÿ — is FF
in the former and C3 BF
in the latter).
Since BLOBs have no character set, the character encoding is not controlled by server variables; I suspect mysqldump
is outputting your BLOB data in a UTF-8–encoded file (which is the default), and it is encoded along the way somehow, through some combination of server settings and options passed to mysqldump
.
The best solution may be to use the --hex-blob
option when exporting BLOB fields, which would result in something like:
INSERT INTO `table` VALUES (0xFFD8FFE0...);
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With