
How to DEFLATE with a command line tool to extract a git object?

Tags:

git

blob

deflate

I'm looking for a command line wrapper for the DEFLATE algorithm.

I have a file (git blob) that is compressed using DEFLATE, and I want to uncompress it. The gzip command does not seem to have an option to directly use the DEFLATE algorithm, rather than the gzip format.

Ideally I'm looking for a standard Unix/Linux tool that can do this.

edit: This is the output I get when trying to use gzip for my problem:

$ cat .git/objects/c0/fb67ab3fda7909000da003f4b2ce50a53f43e7 | gunzip

gzip: stdin: not in gzip format
Felix Geisendörfer asked Jul 05 '10 09:07



4 Answers

Something like the following will print the raw content, including the "$type $length\0" header:

perl -MCompress::Zlib -e 'undef $/; print uncompress(<>)' \
     < .git/objects/27/de0a1dd5a89a94990618632967a1c86a82d577
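The "$type $length\0" header mentioned above can also be split off after inflating. The following is a minimal Python 3 sketch, not git's own implementation; the function name parse_loose_object and the in-memory sample object are made up for illustration, and it assumes a loose (unpacked) object.

```python
import zlib


def parse_loose_object(compressed: bytes):
    """Inflate a loose object and split it into (type, declared size, body)."""
    raw = zlib.decompress(compressed)              # loose objects are zlib streams
    header, _, body = raw.partition(b"\x00")       # header ends at the first NUL
    obj_type, _, size_text = header.partition(b" ")
    size = int(size_text)
    if size != len(body):
        raise ValueError("declared size does not match body length")
    return obj_type.decode("ascii"), size, body


# Hypothetical sample built in memory so the sketch runs without a repository.
sample = zlib.compress(b"blob 6\x00hello\n")
print(parse_loose_object(sample))
```

To try it on a real object, read the file under .git/objects in binary mode and pass its bytes to the function.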
araqnid answered Oct 16 '22 13:10


You can do this with the OpenSSL command line tool:

openssl zlib -d < $IN > $OUT

Unfortunately, at least on Ubuntu, the zlib subcommand is disabled in the default build configuration (--no-zlib --no-zlib-dynamic), so you would need to compile openssl from source to use it. But it is enabled by default on Arch, for example.

Edit: Seems like the zlib command is no longer supported on Arch either. This answer might not be useful anymore :(

Jack O'Connor answered Oct 16 '22 14:10


A Python one-liner:

$ python3 -c "import zlib,sys; \
           print(repr(zlib.decompress(sys.stdin.buffer.read())))" < $IN
akira answered Oct 16 '22 15:10


UPDATE: Mark Adler noted that git blobs are not raw DEFLATE streams, but zlib streams. These can be unpacked by the pigz tool, which comes pre-packaged in several Linux distributions:

$ cat foo.txt 
file foo.txt!

$ git ls-files -s foo.txt
100644 7a79fc625cac65001fb127f468847ab93b5f8b19 0   foo.txt

$ pigz -d < .git/objects/7a/79fc625cac65001fb127f468847ab93b5f8b19 
blob 14file foo.txt!
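The difference between a raw DEFLATE stream, a zlib stream and a gzip file can be sketched with Python's zlib module, where the wbits parameter selects the wrapper. This is only an illustration under assumptions: the payload is rebuilt in memory from the foo.txt example rather than read from the repository, and pack is a helper name invented here.

```python
import zlib

# Assumed payload: header plus content, as in the foo.txt example.
payload = b"blob 14\x00file foo.txt!\n"


def pack(data: bytes, wbits: int) -> bytes:
    """Compress data with the wrapper selected by wbits."""
    c = zlib.compressobj(wbits=wbits)
    return c.compress(data) + c.flush()


raw_deflate = pack(payload, -15)   # RFC 1951: no header, no checksum
zlib_stream = pack(payload, 15)    # RFC 1950: 2-byte header + Adler-32 trailer
gzip_stream = pack(payload, 31)    # RFC 1952: what gunzip expects

print(hex(zlib_stream[0]))         # zlib streams with a 32K window start with 0x78
print(gzip_stream[:2])             # gzip magic bytes

# A gzip-only decoder rejects the zlib stream, much like gunzip does.
try:
    zlib.decompress(zlib_stream, wbits=31)
    gzip_accepts_zlib = True
except zlib.error:
    gzip_accepts_zlib = False
print(gzip_accepts_zlib)

# Stripping the zlib header and trailer leaves a raw DEFLATE stream.
print(zlib.decompress(zlib_stream[2:-4], wbits=-15) == payload)
```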

Edit by kriegaex: Git Bash for Windows users will notice that pigz is unavailable by default. You can find precompiled 32/64-bit versions here. I tried the 64-bit version and it works nicely. You can e.g. copy pigz.exe directly to c:\Program Files\Git\usr\bin in order to put it on the path.

Edit by mjaggard: Homebrew and MacPorts both have pigz available, so you can install it with brew install pigz or sudo port install pigz (if you do not have Homebrew already, you can install it by following the instructions on its website).


My original answer, kept for historical reasons:

If I understand the hint in the Wikipedia article mentioned by Marc van Kempen, you can use puff.c from zlib directly.

This is a small example:

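#include <assert.h>
#include <string.h>
#include "puff.h"

int main( void ) {
    unsigned char dest[ 5 ];
    unsigned long destlen = 4;      /* in: capacity, out: bytes written */
    /* Raw DEFLATE stream (no zlib or gzip wrapper) that encodes "asdf". */
    const unsigned char source[] = { 0x4B, 0x2C, 0x4E, 0x49, 0x03, 0x00 };
    unsigned long sourcelen = sizeof source;

    assert( puff( dest, &destlen, source, &sourcelen ) == 0 );
    dest[ 4 ] = '\0';               /* terminate so strcmp() can be used */
    assert( strcmp( (char *)dest, "asdf" ) == 0 );
    return 0;
}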
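The six input bytes used with puff() can be cross-checked without compiling anything. This is a small Python 3 sketch using the standard zlib module; a negative wbits value asks it for a raw DEFLATE stream, which is the format puff() handles. The variable names are chosen here for illustration.

```python
import zlib

# The same bytes as in the C example: a raw DEFLATE stream.
raw = bytes([0x4B, 0x2C, 0x4E, 0x49, 0x03, 0x00])

# wbits=-15 means "no zlib or gzip wrapper", matching what puff() expects.
text = zlib.decompress(raw, wbits=-15)
print(text)

# With the default settings zlib expects a zlib header and rejects these bytes.
try:
    zlib.decompress(raw)
    has_zlib_header = True
except zlib.error:
    has_zlib_header = False
print(has_zlib_header)
```

The reverse also holds: a git object is a zlib stream, so its 2-byte header would have to be skipped before handing the data to puff().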
mkluwe answered Oct 16 '22 13:10