
How can I fetch git objects using the smart protocol (raw) over http?

Tags:

git

I'm trying to fetch the annotation of tag "v2.4.2" from github.com/git/git using the git smart protocol over http.

// Get the refs

curl -H "User-Agent: git/1.8.1" -v "https://github.com/git/git/info/refs?service=git-upload-pack"

Returns the refs:

.....
003e2be062dfcfd1fd4aca132ec02a40b56f63776202 refs/tags/v2.4.1
0041aaa7e0d7f8f003c0c8ab34f959083f6d191d44ca refs/tags/v2.4.1^{}
003e29932f3915935d773dc8d52c292cadd81c81071d refs/tags/v2.4.2
00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}

// Make the upload pack request

printf "0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d785\n0024have 003e2be062dfcfd1fd4aca132ec02a40b56f63776202\n0000" | curl -H "User-Agent: git/1.8.1" -v  -d @- https://github.com/git/git/git-upload-pack -H "Content-Type: application/x-git-upload-pack-request" --trace-ascii /dev/stdout

This returns nothing. I'm wondering what's wrong with the request (i.e., did I miscalculate the hex?)

Warning: --trace-ascii overrides an earlier trace/verbose option
== Info: Hostname was NOT found in DNS cache
== Info:   Trying 192.30.252.130...
== Info: Connected to github.com (192.30.252.130) port 443 (#0)
== Info: TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
== Info: Server certificate: github.com
== Info: Server certificate: DigiCert SHA2 Extended Validation Server CA
== Info: Server certificate: DigiCert High Assurance EV Root CA
=> Send header, 170 bytes (0xaa)
0000: POST /git/git/git-upload-pack HTTP/1.1
0028: Host: github.com
003a: Accept: */*
0047: User-Agent: git/1.8.1
005e: Content-Type: application/x-git-upload-pack-request
0093: Content-Length: 110
00a8: 
=> Send data, 110 bytes (0x6e)
0000: 0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d7850024have 00
0040: 3e2be062dfcfd1fd4aca132ec02a40b56f637762020000
== Info: upload completely sent off: 110 out of 110 bytes
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
== Info: Server GitHub Babel 2.0 is not blacklisted
<= Recv header, 26 bytes (0x1a)
0000: Server: GitHub Babel 2.0
<= Recv header, 52 bytes (0x34)
0000: Content-Type: application/x-git-upload-pack-result
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 40 bytes (0x28)
0000: Expires: Fri, 01 Jan 1980 00:00:00 GMT
<= Recv header, 18 bytes (0x12)
0000: Pragma: no-cache
<= Recv header, 53 bytes (0x35)
0000: Cache-Control: no-cache, max-age=0, must-revalidate
<= Recv header, 23 bytes (0x17)
0000: Vary: Accept-Encoding
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 5 bytes (0x5)
0000: 0
0003: 
== Info: Connection #0 to host github.com left intact

Why am I trying this?

  • I don't have write access to the file system
  • Avoid fetching unnecessary data (i.e., the commits)
  • Standard API/Protocol
Asked by themihai on May 31 '15 at 14:05


1 Answer

Commit hex

You didn't miscalculate the hex, but you're not passing the correct value. Remember that each line in the smart protocol is preceded by a length count:

<length><data>

So for a line that looks like this:

00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}

You need to discard the first four characters, which makes the actual commit hex:

9eabf5b536662000f79978c4d1b6e4eff5c8d785
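
If you want to double-check a line from the ref advertisement, here is a minimal shell sketch. The names ref_line_id and ref_line_len_ok are made up for illustration. It assumes ASCII input, that the trailing newline has already been stripped from the line, and that the line is not the first one of the advertisement (which also carries a capability list):

```shell
# Hypothetical helpers (names are illustrative, not part of git).

# Print the object id from an advertised ref line: drop the 4-character
# length prefix, then keep everything up to the first space.
ref_line_id() {
    rest=${1#????}
    printf '%s\n' "${rest%% *}"
}

# Succeed if the hex length prefix matches the line's real length.
# The prefix counts itself and the trailing newline, which is assumed
# to have been stripped from "$1" already, hence the +1.
ref_line_len_ok() {
    declared=$(( 0x$(printf '%.4s' "$1") ))
    actual=$(( ${#1} + 1 ))
    [ "$declared" -eq "$actual" ]
}

line='00419eabf5b536662000f79978c4d1b6e4eff5c8d785 refs/tags/v2.4.2^{}'
ref_line_id "$line"
ref_line_len_ok "$line" && echo "length prefix matches"
```

Here 0041 is 65 in decimal: 4 for the prefix, 40 for the id, 1 for the space, 19 for the ref name, and 1 for the newline.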

Request format

When POSTing a request, the have and want lines are supposed to be separated by a newline, but if you take a look at the output from curl, you can see that there is no newline:

=> Send data, 110 bytes (0x6e)
0000: 0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d7850024have 00
0040: 3e2be062dfcfd1fd4aca132ec02a40b56f637762020000

You need to use --data-binary instead of --data, because --data strips newlines and carriage returns when it reads the body from a file or stdin:

--data-binary @-
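
You can reproduce the effect without touching the network. This is a rough sketch, assuming that tr -d '\r\n' is a fair stand-in for what --data does to input read via @-. The names strip_like_data and question_body are made up for illustration:

```shell
# Stand-in for curl's --data @- handling: it drops CR and LF from the input.
# (An emulation for illustration; curl itself is not run here.)
strip_like_data() {
    tr -d '\r\n'
}

# The body from the question, exactly as passed to printf.
question_body() {
    printf "0031want 00419eabf5b536662000f79978c4d1b6e4eff5c8d785\n0024have 003e2be062dfcfd1fd4aca132ec02a40b56f63776202\n0000"
}

question_body | wc -c                    # bytes printf actually produced
question_body | strip_like_data | wc -c  # bytes left once newlines are gone
```

The second count comes out to 110, which matches the Content-Length: 110 in the trace from the question, so the two newlines were dropped before the POST.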

You need to prefix these lines with a length count, and you need to end with a line consisting of 0000:

0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785
0032have 2be062dfcfd1fd4aca132ec02a40b56f63776202
0000

Debugging tips

You can set GIT_TRACE_PACKET=1 in your environment if you want to get copious debugging information from git to see exactly what it's sending back and forth.

And that's all he wrote

I'm not able to get a response myself, even given the above information, but I figured it would help.

Update

So, this was fun.

I set up a git server locally (using git http-backend and thttpd) and ran tcpdump to grab the traffic generated by a git remote update operation. It turns out that you need to separate the want and have directives with a null command, which is 0000 (the protocol documentation calls this a flush packet). It is not followed by a newline, because a newline would have to be counted in the length. That is:

<length>want <commitid><newline>
0000<length>have <commitid><newline>
<length>done<newline>

E.g:

0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785
00000032have 2be062dfcfd1fd4aca132ec02a40b56f63776202
0009done

That gives me:

0000: POST /git/git/git-upload-pack HTTP/1.1
0028: Host: github.com
003a: Accept: */*
0047: Content-type: application/x-git-upload-pack-request
007c: User-agent: git/1.8
0091: Content-Length: 113
00a6: 
=> Send data, 113 bytes (0x71)
0000: 0032want 9eabf5b536662000f79978c4d1b6e4eff5c8d785.00000032have 2
0040: be062dfcfd1fd4aca132ec02a40b56f63776202.0009done.
== Info: upload completely sent off: 113 out of 113 bytes
<= Recv header, 17 bytes (0x11)
0000: HTTP/1.1 200 OK
<= Recv header, 26 bytes (0x1a)
0000: Server: GitHub Babel 2.0
<= Recv header, 52 bytes (0x34)
0000: Content-Type: application/x-git-upload-pack-result
<= Recv header, 28 bytes (0x1c)
0000: Transfer-Encoding: chunked
<= Recv header, 40 bytes (0x28)
0000: Expires: Fri, 01 Jan 1980 00:00:00 GMT
<= Recv header, 18 bytes (0x12)
0000: Pragma: no-cache
<= Recv header, 53 bytes (0x35)
0000: Cache-Control: no-cache, max-age=0, must-revalidate
<= Recv header, 23 bytes (0x17)
0000: Vary: Accept-Encoding
<= Recv header, 2 bytes (0x2)
0000: 
<= Recv data, 4 bytes (0x4)
0000: 31
<= Recv data, 51 bytes (0x33)
0000: 0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202.
<= Recv data, 6 bytes (0x6)
0000: 1fff
<= Recv data, 1370 bytes (0x55a)
0000: PACK.......[..x...An.0...z.?`..d.*[email protected]..(.tu......>~B.....]..8
0040: 2...j).OQ}..#.....'......[..8K..t..,%[email protected]......'...
[....]
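
If you'd rather not count the lengths by hand, the request body shown above can be generated with a small shell sketch. pkt_line and build_request are hypothetical helper names. This assumes ASCII payloads (so ${#1} is a byte count) and exactly one want and one have:

```shell
# Hypothetical helpers for building a git-upload-pack request body.

# Emit one pkt-line: 4 hex digits of length, the payload, and a newline.
# The length covers the 4-digit prefix and the newline, hence the +5.
pkt_line() {
    printf '%04x%s\n' $(( ${#1} + 5 )) "$1"
}

# want, flush packet (0000, no newline), have, done.
build_request() {
    pkt_line "want $1"
    printf '0000'
    pkt_line "have $2"
    pkt_line "done"
}

build_request 9eabf5b536662000f79978c4d1b6e4eff5c8d785 \
              2be062dfcfd1fd4aca132ec02a40b56f63776202
```

Piping that through wc -c should give 113, the same as the Content-Length in the trace. To send it, pipe the output into curl with --data-binary @- as described earlier.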

Double-bonus update

You can use the git unpack-objects command to extract the packfile. As you can see from the above trace, you first get back a length-encoded response (0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202) followed by the pack data, so you need to discard that first line:

$ git init tmprepo
$ cd tmprepo
$ tail -n +2 output_from_curl | git unpack-objects
Unpacking objects: 100% (91/91), done.
$ find .git/objects -type f | head -3
$ git cat-file -p dc940e63c453199dd9a7285533fbf2355bab03d1
/*
 * GIT - The information manager from hell
 *
 * Copyright (C) Linus Torvalds, 2005
 *
 * This handles basic git sha1 object files - packing, unpacking,
 * creation etc.
 */
[...]
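
tail -n +2 works here because the ACK line happens to end with a newline. A length-based alternative, as a rough sketch, is to read the 4-digit hex prefix and skip that many bytes. skip_first_pkt is a made-up name. This assumes the file starts with exactly one pkt-line before the PACK data (as in the trace above) and that head -c is available on your system:

```shell
# Hypothetical helper: print a file minus its leading pkt-line.
# The 4 hex digits at the start give the total length of that first
# packet (prefix included), so output starts at byte len + 1.
skip_first_pkt() {
    len=$(( 0x$(head -c 4 "$1") ))
    tail -c +$(( len + 1 )) "$1"
}

# Small self-check with a fake payload standing in for real pack data.
tmp=$(mktemp)
printf '0031ACK 2be062dfcfd1fd4aca132ec02a40b56f63776202\nPACKfake' > "$tmp"
skip_first_pkt "$tmp"
rm -f "$tmp"
```

With the real response saved from curl, the equivalent of the tail command would be skip_first_pkt output_from_curl | git unpack-objects.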
Answered by larsks on Oct 26 '22 at 22:10