The code:
<?php
$pdo = new \PDO("mysql:host=127.0.0.1;dbname=***", '***', '***');
$pdo->exec('SET CHARACTER SET utf8');
$sql = "INSERT INTO pdo_blob (the_blob) VALUES(:the_blob)";
$insertStm = $pdo->prepare($sql);
$blob = (binary) file_get_contents('/home/***/test.pdf');
$insertStm->bindParam(":the_blob", $blob, \PDO::PARAM_LOB);
$insertStm->execute();
$selectStm = $pdo->prepare("SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1");
$selectStm->execute();
$savedBlob = null;
$selectStm->bindColumn(1, $savedBlob, \PDO::PARAM_LOB);
$selectStm->fetch();
echo 'equal: ' . ((int) ($blob == $savedBlob));
In MySQL, we can use BLOB datatype to store the files. A BLOB is a binary large object that can hold a variable amount of data. We can represent the files in binary format and then store them in our database. The four BLOB types are TINYBLOB, BLOB, MEDIUMBLOB, and LONGBLOB.
BLOB, which stands for a Binary Large Object, is a MySQL data type that can store images, PDF files, multimedia, and other types of binary data.
The default MySQL server character set and collation are latin1 and latin1_swedish_ci , but you can specify character sets at the server, database, table, column, and string literal levels.
A BLOB (binary large object) is a varying-length binary string that can be up to 2,147,483,647 characters long. Like other binary types, BLOB strings are not associated with a code page. In addition, BLOB strings do not hold character data.
Simply delete or comment out line below, and it will always work, no matter which database encoding is really in use (utf8
, latin1
, etc):
$pdo->exec('SET CHARACTER SET utf8');
This is not PDO bug, this is MySQL bug.
When actual database encoding is latin1
, but you use:
SET CHARACTER SET utf8
(or vice versa: actual is utf8
, but you use latin1
- important part is that it is different), then, as far as I can tell, MySQL will try to perform charset conversion for all traffic between client and server (even for BLOB
!).
If you DO NOT use SET CHARACTER SET
statement, from what I see for scripts (PHP/PDO or Perl/DBI) connection charset by default is set to be the database charset, and in that case no implicit conversion takes place.
Obviously, this automatic conversion is what kills BLOBs, which do not want any conversion to happen.
I have tested this on both PHP/PDO and Perl/DBI, and issue is easily reproducible: both will fail if using database with latin1
encoding and using SET CHARACTER SET utf8
(or vice versa).
If you want to be fully UTF8
compatible, you should change encoding of your database using:
ALTER DATABASE mydb CHARSET utf8;
With this, everything will be using UTF8
, and BLOBs will also work fine.
Minimal file that causes this corruption problem is blob.bin
with single byte 0xFF
. On Linux, you can create this test file using printf
command:
printf "0xFF" > blob.bin
Now, test scripts that reproduce the problem:
<?php
$dbh = new PDO("mysql:host=127.0.0.1;dbname=test");
# If database encoding is NOT utf8, uncomment to break it:
# $dbh->exec("SET CHARACTER SET utf8");
$blob1 = file_get_contents("blob.bin");
$sth = $dbh->prepare(
"INSERT INTO pdo_blob (the_blob) VALUES(:the_blob)"
);
$sth->bindParam(":the_blob", $blob1, PDO::PARAM_LOB);
$sth->execute();
$sth = $dbh->prepare(
"SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1"
);
$sth->execute();
$blob2 = null;
$sth->bindColumn(1, $blob2, PDO::PARAM_LOB);
$sth->fetch();
if ($blob1 == $blob2) {
echo "Equal\n";
} else {
echo "Not equal\n";
$arr1 = str_split($blob1);
$arr2 = str_split($blob2);
$i=0;
for ($i=0; $i<count($arr1); $i++) {
if ($arr1[$i] != $arr2[$i]) {
echo "First diff: " . dechex(ord($arr1[$i])) . " != "
. dechex(ord($arr2[$i])) . "\n";
break;
}
}
}
?>
#!/usr/bin/perl -w
use strict;
use DBI qw(:sql_types);
my $dbh = DBI->connect("dbi:mysql:host=127.0.0.1;dbname=test");
# If database encoding is NOT utf8, uncomment to break it:
# $dbh->do("SET CHARACTER SET utf8");
open FILE, "blob.bin";
binmode FILE;
read(FILE, my $blob1, 100000000);
close FILE;
my $sth = $dbh->prepare(
"INSERT INTO pdo_blob (the_blob) VALUES(?)"
);
$sth->bind_param(1, $blob1, SQL_BLOB);
$sth->execute();
my ($blob2) = $dbh->selectrow_array(
"SELECT the_blob FROM pdo_blob ORDER BY id DESC LIMIT 1"
);
print ($blob1 eq $blob2 ? "Equal" : "Not equal") , "\n";
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With