What exactly is the git:// protocol?

Tags: git, ssh, https, ssl

I was checking the ISO OSI chart, where you can see the other two protocols Git uses:

https (HTTP over SSL/TLS)

and

ssh

but there is no mention of git://.

Here is the ISO OSI model:

https://en.wikipedia.org/wiki/OSI_model

Asked by cade galt, Nov 21 '15 18:11


1 Answer

The git protocol is served by a special daemon that comes packaged with Git; it listens on a dedicated port (9418) and provides a service similar to the SSH protocol, but with absolutely no authentication.

It was introduced at the very beginning of Git, in commit 2386d65 (July 2005, Git 0.99.1):

Add first cut at "git protocol" connect logic.

Useful for pulling stuff off a dedicated server. Instead of connecting with ssh or just starting a local pipeline, we connect over TCP to the other side and try to see if there's a git server listening.

Of course, since I haven't written the git server yet, that will never happen. But the server really just needs to listen on a port, and execute a "git-upload-pack" when somebody connects.

(It should read one packet-line, which should be of the format

"git-upload-pack directoryname\n" 

and eventually we might have other commands the server might accept).

The protocol was first described in the next commit, 9b011b2:

There are two Pack push-pull protocols.

  • upload-pack (S) | fetch/clone-pack (C) protocol
  • send-pack | receive-pack protocol

Nowadays, the git daemon server is fully described in Documentation/git-daemon.txt:

A really simple TCP Git daemon that normally listens on port "DEFAULT_GIT_PORT" aka 9418.
It waits for a connection asking for a service, and will serve that service if it is enabled.
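
As a rough illustration of the default port, here is a minimal Python sketch that splits a git:// URL into host, port, and path, falling back to 9418 when the URL names no port. The function name parse_git_url is hypothetical and the example URLs are made up; this is not how Git itself parses URLs.

```python
from urllib.parse import urlsplit

# Port named as DEFAULT_GIT_PORT in the git-daemon documentation quoted above.
DEFAULT_GIT_PORT = 9418


def parse_git_url(url):
    """Split a git:// URL into (host, port, path).

    Hypothetical helper for illustration only; Git's own URL parsing is
    written in C and handles many more cases.
    """
    parts = urlsplit(url)
    if parts.scheme != "git":
        raise ValueError(f"not a git:// URL: {url!r}")
    # urlsplit reports port as None when the URL does not specify one.
    port = parts.port if parts.port is not None else DEFAULT_GIT_PORT
    return parts.hostname, port, parts.path


if __name__ == "__main__":
    print(parse_git_url("git://example.com/repo.git"))
    print(parse_git_url("git://localhost:1234/project.git"))
```

The first call falls back to port 9418, while the second keeps the explicit 1234.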

Note that even though it isn't listed in the OSI model, port 9418 has been registered with IANA (Internet Assigned Numbers Authority) since the very beginning.

See commit ba8a497 (Sept. 2005, Git 0.99.7a):

[PATCH] Add note about IANA confirmation

The git port (9418) is officially listed by IANA now.
So document it.


With Git 2.31 (Q1 2021), newline characters in the host and path part of git:// URL are now forbidden.

See commit 6aed567, commit a02ea57 (07 Jan 2021) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit c7b1aaf, 25 Jan 2021)

git_connect_git(): forbid newlines in host and path

Reported-by: Harold Kim
Signed-off-by: Jeff King

When we connect to a git:// server, we send an initial request that looks something like:

002dgit-upload-pack repo.git\0host=example.com 

If the repo path contains a newline, then it's included literally, and we get:

002egit-upload-pack repo
.git\0host=example.com

This works fine if you really do have a newline in your repository name; the server side uses the pktline framing to parse the string, not newlines.
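
The framing described above can be sketched in Python. This is a simplified illustration with hypothetical function names, not Git's implementation: a pkt-line is a four-digit hexadecimal length (counting the four digits themselves) followed by the payload, so a reader never needs to look for newlines. The request layout copies the example in the commit message; a real client may append further fields.

```python
def pkt_line(payload: bytes) -> bytes:
    """Frame payload as a pkt-line: 4 hex digits of total length, then payload."""
    total = len(payload) + 4  # the length prefix counts itself
    if total > 0xFFFF:
        raise ValueError("payload too long for a single pkt-line")
    return b"%04x" % total + payload


def read_pkt_line(data: bytes) -> bytes:
    """Return the payload of the first pkt-line in data, using only the length."""
    total = int(data[:4].decode("ascii"), 16)
    return data[4:total]


def upload_pack_request(path: str, host: str) -> bytes:
    # Same shape as the example request quoted above (an assumption for
    # illustration; real clients may send additional fields).
    return pkt_line(b"git-upload-pack " + path.encode() + b"\0host=" + host.encode())


if __name__ == "__main__":
    plain = upload_pack_request("repo.git", "example.com")
    odd = upload_pack_request("repo\n.git", "example.com")
    print(plain[:4], odd[:4])
    print(read_pkt_line(odd))
```

The two prefixes come out as 002d and 002e, matching the commit message, and read_pkt_line recovers the payload with the embedded newline intact because it relies only on the length prefix.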

However, there are many other protocols in the wild that do parse on newlines, such as HTTP.

So a carefully constructed git:// URL can actually turn into a valid HTTP request.
For example:

git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 %0d%0aHost:localhost%0d%0a%0d%0a 

becomes:

0050git-upload-pack /
GET / HTTP/1.1
Host:localhost

host=localhost:1234

on the wire.
Again, this isn't a problem for a real Git server, but it does mean that feeding a malicious URL to Git (e.g., through a submodule) can cause it to make unexpected cross-protocol requests.
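
To make the cross-protocol effect concrete, the following Python sketch percent-decodes the example URL and builds the request the way an unpatched client conceptually would. The function name request_on_wire is hypothetical, and the request layout is taken from the commit message rather than from Git's source.

```python
from urllib.parse import unquote, urlsplit


def request_on_wire(url: str) -> bytes:
    """Build the initial git:// request for url, with no newline check at all."""
    parts = urlsplit(url)
    # Percent-escapes such as %0d%0a turn into real CR LF bytes here.
    path = unquote(parts.path)
    payload = f"git-upload-pack {path}\0host={parts.netloc}".encode()
    return b"%04x" % (len(payload) + 4) + payload


# The example URL from the commit message, split across two string literals.
URL = ("git://localhost:1234/%0d%0a%0d%0aGET%20/%20HTTP/1.1 "
       "%0d%0aHost:localhost%0d%0a%0d%0a")

if __name__ == "__main__":
    wire = request_on_wire(URL)
    print(wire[:4])
    # Show the bytes as a newline-parsing server (for example HTTP) would see them.
    print(wire.decode().replace("\0", "\n"))
```

The length prefix is 0050, as in the commit message, and the decoded bytes contain a blank line followed by what looks like an ordinary HTTP GET request.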

Since repository names with newlines are presumably quite rare (and indeed, we already disallow them in git-over-http), let's just disallow them over this protocol.

Hostnames could likewise inject a newline, but this is unlikely a problem in practice; we'd try resolving the hostname with a newline in it, which wouldn't work.
Still, it doesn't hurt to err on the side of caution there, since we would not expect them to work in the first place.
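
The kind of check described here can be sketched in a few lines of Python. This only illustrates the idea; the actual fix lives in Git's C code, and the function name and error message below are assumptions.

```python
def check_git_url_component(name: str, value: str) -> str:
    """Reject a git:// host or path that contains a newline.

    Illustrative sketch only; the name and the error text are hypothetical.
    """
    if "\n" in value:
        raise ValueError(f"newline is not allowed in git:// URL {name}")
    return value


if __name__ == "__main__":
    print(check_git_url_component("path", "/repo.git"))
    try:
        check_git_url_component("path", "/\r\n\r\nGET / HTTP/1.1")
    except ValueError as err:
        print("rejected:", err)
```

Running the check on both the host and the path before any bytes are sent keeps a crafted URL from ever reaching the network.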

The ssh and local code paths are unaffected by this patch.
In both cases we're trying to run upload-pack via a shell, and will quote the newline so that it makes it intact.
An attacker can point an ssh url at an arbitrary port, of course, but unless there's an actual ssh server there, we'd never get as far as sending our shell command anyway.
We could similarly restrict newlines in those protocols out of caution, but there seems little benefit to doing so.

The new test here is run alongside the git-daemon tests, which cover the same protocol, but it shouldn't actually contact the daemon at all.
In theory we could make the test more robust by setting up an actual repository with a newline in it (so that our clone would succeed if our new check didn't kick in).
But a repo directory with newline in it is likely not portable across all filesystems.
Likewise, we could check git-daemon's log that it was not contacted at all, but we do not currently record the log (and anyway, it would make the test racy with the daemon's log write).
We'll just check the client-side stderr to make sure we hit the expected code path.

Answered by VonC, Oct 08 '22 21:10