
How do I decompress a Git object properly in Raku (Perl 6)?

Tags:

git

raku

I have the following Python code snippet:

import zlib

def object_read(repo, sha):
    path = repo + "/objects/" + sha[0:2] +  "/" + sha[2:]

    with open (path, "rb") as f:
        raw = zlib.decompress(f.read())
        return len(raw)

print(object_read(".git", "1372c654fd9bd85617f0f8b949f1405b0bd71ee9"))

and one of its P6 counterparts:

#!/usr/bin/env perl6
use Compress::Zlib;

sub object-read( $repo, $sha ) {
    my $path = $repo ~ "/objects/" ~ $sha.substr(0, 2) ~ "/" ~
               $sha.substr(2, *);

    given slurp($path, :bin) -> $f {
        my $raw = uncompress($f).decode('utf8-c8'); # Probable error here?!
        return $raw.chars;
    }

}

put object-read(".git", "1372c654fd9bd85617f0f8b949f1405b0bd71ee9")

However, when I run them, they give me back off-by-one results:

$ python bin.py
75
$ perl6 bin.p6
74
Luis F. Uceta asked Mar 31 '19 14:03


1 Answer

@melpomene has hit the spot. The Python code never decodes: it counts the bytes of the decompressed data, while the Raku code decodes first and then counts characters, so the byte count can be a bit higher. Insert

say uncompress($f).elems;

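before decoding to $raw and you will see the byte count; for the file on my system it is 2 more than the character count. Decoding via utf8-c8 can merge two or more bytes into a single codepoint, so in general the number of codepoints is no greater than the number of bytes in an IO stream, and usually smaller once non-ASCII data is involved. To match the Python result, report uncompress($f).elems rather than $raw.chars.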
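The same effect can be reproduced on the Python side. The following is only a minimal sketch: the helper name byte_and_char_counts and the sample content "café" are made up for illustration, and the payload is built in memory in the usual loose-object shape (a "blob <size>" header, a NUL byte, then the content) instead of being read from the repository in the question.

```python
import zlib


def byte_and_char_counts(compressed):
    """Return (byte count, character count) of a zlib-compressed payload."""
    raw = zlib.decompress(compressed)
    # len(raw) counts bytes; len(raw.decode(...)) counts codepoints.
    return len(raw), len(raw.decode("utf-8"))


# Hypothetical sample content: 'é' takes two bytes in UTF-8.
content = "café\n".encode("utf-8")
# Loose-object style payload: "blob <size>", a NUL byte, then the content.
header = b"blob " + str(len(content)).encode("ascii") + b"\x00"
compressed = zlib.compress(header + content)

n_bytes, n_chars = byte_and_char_counts(compressed)
print(n_bytes, n_chars)  # 13 12
```

The single two-byte character is enough to make the two counts differ by one, which is the kind of gap seen in the question. Whether that is the exact cause for that particular object depends on its contents.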

jjmerelo answered Oct 22 '22 02:10