I have a table in a SQLite db that stores blobs compressed with the LZ4 algorithm. I am trying to use the decompress/uncompress functions from Compress::LZ4, but I am not having any success with them.
The sample SQLite db can be downloaded from here.
Here is how I am connecting to the SQLite db and getting the blob:
use DBI;
use Data::Dump;
use MIME::Base64;
use Compress::LZ4;
my $dbh = DBI->connect("dbi:SQLite:dbname=$ARGV[0]","","");
$sth = $dbh->prepare("select blob_data from blob_parts where data_fk = 6");
$sth->execute();
$result = $sth->fetch;
$blob = $result->[0];
dd $blob;
dd (decompress($blob));
$sth->finish();
$dbh->disconnect;
For the particular blob that I am selecting in this sample code (data_fk=6), dd outputs the following:
"LZ4\1>\1\0\0\xF7\xD6df\xF1mBXML\1\xA1\aVersion\xA1\4Type\xA1\2Id\xA1\3Ref\xA1\4Size\xA1\3use\xA1\4expr\xA1\5value\xA1\4data\xA1/Serialization\xA1\aPoints3\xA1\tuser_ E\0\xF0\16\bvertices\xA1\6double\xA1\bhas_attr\xA1\16\n\0\xC7object_ids\xA1\n\f\0\xF1M\4item\xA1\tis_active\xA0~B\20\n\22\6\4\x8C\1\0\0\0\6\2\xAA\24\6\0\xA4\x82\x88\2\x80\x82\x82\xA6B\26\6\b\x80\1B\30 \6\b\x88\2\3B\32\1\x93\6\0\0\0`\xACu\xCF\xBF\0\0\0\0\xCC\xF8\xC2?\0\0\0\0\0\x004\@\0\0\0 \xAA\xEF\xA9\20\x001h\xC5\xB1\b\0\xD0\0\0\$\@\1B\34\x85B\36\x87B C\0\xF0\aB\"\6\0\x88\3B\$\x85\1B\"\6\0\x88\3B\ $\x85\1\1\1"
But the decompress/uncompress functions just return undef. The uncompressed data should look something like this (the following output was generated by an XML converter):
<?xml version="1.0" encoding="utf-8"?>
<MultiStreamDocument>
<!-- Stream 1 -->
<?xml version="1.0" encoding="utf-8"?>
<data xmlns="" Id="1" Type="Points3" Version="1 2 0 1 1">
<user_data Size="0"></user_data>
<vertices Size="2">
<double>-0.24577860534191132</double>
<double>0.14821767807006836</double>
<double>20</double>
<double>0.050656620413064957</double>
<double>0.069418430328369141</double>
<double>10</double>
</vertices>
<has_attr>false</has_attr>
<has_object_ids>true</has_object_ids>
<object_ids Size="2">
<item Version="3">
<is_active>false</is_active>
</item>
<item Version="3">
<is_active>false</is_active>
</item>
</object_ids>
</data><!-- Stream size: 126 bytes -->
</MultiStreamDocument>
What is the correct way to get uncompressed blob data from this SQLite database?
Your data looks like it is LZ4-compressed and prefixed with the four bytes "LZ4\1", presumably as a format indicator.
The next four bytes, ">\1\0\0", are a little-endian original-size field which evaluates to 318 bytes, which is reasonable. The decompress library function expects this field.
So in theory, you should be able to write
$blob = substr($blob, 4);
dd decompress($blob);
and get the correct result. However, this also results in a value of undef for me, which suggests that the data is corrupted somehow.
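The header layout described above can be checked with unpack. This is only a minimal sketch: the eight bytes are copied by hand from the dump in the question rather than read from the database, and the variable names are my own.

```perl
use strict;
use warnings;

# First eight bytes of the blob, copied from the dump in the question
# (">" is 0x3E). This is an assumption-free copy, not a database read.
my $header = "LZ4\x01\x3E\x01\x00\x00";

# a4 = four raw bytes (the magic), V = unsigned 32-bit little-endian integer
my ($magic, $size) = unpack 'a4 V', $header;

printf "magic ok: %s\n", $magic eq "LZ4\x01" ? 'yes' : 'no';
printf "original size: %d bytes\n", $size;
```

For these bytes the size comes out as 318, which matches the figure above.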
What is certain is that most of the data has ended up uncompressed. The two bytes following the length field are "\xF7\xD6", which indicates that the data following that is 229 bytes of literal data (the upper nibble of the first byte, 0xF, plus the second byte, 0xD6, is 0xE5 or 229). So this part of the data
"df\xF1mBXML\1\xA1\aVersion\xA1\4Type\xA1\2Id\xA1\3Ref\xA1\4Size\xA1\3use\xA1\4expr\xA1\5value\xA1\4data\xA1/http://www.slb.com/Petrel/2011/03/Serialization\xA1\aPoints3\xA1\tuser_E\0\xF0\16\bvertices\xA1\6double\xA1\bhas_attr\xA1\16\n\0\xC7object_ids\xA1\n\f\0\xF1M\4item\xA1\tis_active\xA0~B\20\n\22\6\4\x8C\1\0\0\0\6\2\xAA\24\6\0\xA4\x82\x88\2\x80\x82\x82\xA6B\26\6\b\x80\1"
is literal, as could be guessed by the amount of readable text it contains.
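The literal-length arithmetic can be reproduced with a few lines of Perl. This is a rough sketch of how the LZ4 block format encodes the length (upper nibble of the token, extended by further bytes while each one is 255). The helper name literal_length is hypothetical and not part of Compress::LZ4.

```perl
use strict;
use warnings;

# Hypothetical helper: read the literal length at the start of an LZ4 sequence.
# Returns the literal length and the number of bytes consumed by the length field.
sub literal_length {
    my ($data) = @_;
    my @bytes  = unpack 'C*', $data;
    my $pos    = 0;
    my $token  = $bytes[$pos++];
    my $len    = $token >> 4;    # upper nibble of the token byte

    if ($len == 15) {
        # 15 means more length bytes follow; add them until one is below 255
        while (1) {
            my $extra = $bytes[$pos++];
            die "truncated length field\n" unless defined $extra;
            $len += $extra;
            last if $extra != 255;
        }
    }
    return ($len, $pos);
}

# The two bytes that follow the size field in the blob
my ($len, $used) = literal_length("\xF7\xD6");
print "literal length: $len, length-field bytes: $used\n";
```

With "\xF7\xD6" this gives 15 + 214 = 229, the same value worked out by hand above.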
The following two bytes, "B\30", should indicate an offset within the decompressed buffer from which data should be copied. Unfortunately this evaluates to 6210, whereas, as we have seen, the buffer is only 229 bytes long so far. This is presumably where the data causes the decompress function to balk and return undef.
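The offset check can be sketched the same way. The two bytes and the count of 229 come from the analysis above, and the variable names are illustrative. In the LZ4 block format a match offset is a 16-bit little-endian value that points back into the output already produced, so it cannot exceed the number of bytes written so far.

```perl
use strict;
use warnings;

my $offset_bytes = "B\30";    # as shown in the dump: 0x42 0x18
my $written      = 229;       # literal bytes copied to the output so far

# v = unsigned 16-bit little-endian integer
my $offset = unpack 'v', $offset_bytes;

# A usable offset is at least 1 and no larger than the output produced so far
my $valid = ($offset >= 1 && $offset <= $written) ? 1 : 0;
printf "offset %d is %s\n", $offset, $valid ? 'valid' : 'out of range';
```

Here the offset decodes to 6210, well beyond the 229 bytes available, which is consistent with decompress giving up at this point.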
That's the best I can make of your data. I hope it helps.