I have code looking something like this:
$data = file_get_contents($tempFile); // perhaps 30MB of file data, now in PHP's memory
$hash = md5($data);
$query = "INSERT INTO some_table
SET BlobData = '" . mysql_real_escape_string($data) . "',
BlobHash = '$hash'
";
mysql_query($query);
I know this isn't very efficient as each of the '.' operators will reallocate a bigger memory block and the 30MB string will be copied several times.
Is there anything more efficient than the following solution?
$data = file_get_contents($tempFile); // perhaps 30MB of file data, now in PHP's memory
$hash = md5($data);
$query = "INSERT INTO some_table SET BlobData = '%s', BlobHash = '$hash'";
mysql_query(sprintf($query, mysql_real_escape_string($data)));
You have two issues here:
#1, there are several different ways you can compute the MD5 hash:
md5()
md5_file()
md5
or md5_file
to avoid loading entirely into memoryexec()
to call the system's md5sum
commandMD5()
function to compute the hashSince these are all trivial to implement it would be easy for you to implement and benchmark them all for memory usage and speed. Here are some benchmarks showing system md5 via exec
to be a lot faster than PHP's md5_file
as file size increases. Doing it your way is definitely the worst way as far as memory usage is concerned.
#2, mysql_real_escape_string
performs a database query, so you're actually transmitting your blob data to the database, getting it back as a string, and transmitting it again(!) with the INSERT query. So it's traveling to/from the DB server 3x instead of 1x and using 2x the memory in PHP.
It's going to be more efficient to use PHP5 prepared statements and only send this data to the database once. Read the linked article section, you'll see it mentions that when you are binding parameters, you can use the blob type to stream blob data to the DB in chunks. The PHP docs for mysqli_stmt::send_long_data
have a great simple example of this that INSERTs a file into a blob column just like you are.
By doing that, and by using either the streams API, md5_file
or exec
with the system md5 command, you can do your entire INSERT without ever loading the entire file into memory, which means memory usage for your series of operations can be as low as you want!
If you are using PDO, and prepared statments you can use the PDO::PARAM_LOB type. See example #2 on the LOB page showing how to insert an image to a database using the file pointer.
http://us2.php.net/manual/en/pdo.lobs.php
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With