Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Why Isn't There A Git Clone Specific Commit Option?

Tags:

git

git-clone

In light of a recent question on SO, I am wondering why isn't there an option in git clone such that the HEAD pointer of the newly created branch will point to a specified commit? In say question above, OP is trying to provide instructions on the specific commit his users should clone.

Note that this question is not about How To Clone To A Particular Version using reset; but about why isn't there?

like image 696
ivan.sim Avatar asked Oct 01 '14 06:10

ivan.sim


People also ask

Can you git clone a specific commit?

Git Clone From Specific Commit ID There is no direct way to clone directly using the commit ID. But you can clone from a git tag. However, you can do the following workaround to perform a clone if it is really necessary. The above steps will make your current HEAD pointing to the specific commit id SHA.

How do I clone a specific tag in git?

git clone If you only need the specific tag, you can pass the --single-branch flag, which prevents fetching all the branches in the cloned repository. With the --single-branch flag, only the branch/tag specified by the --branch option is cloned. $ git clone -b <tagname> –single-branch <repository> .

How do I fetch a specific commit?

How do I pull a specific commit? The short answer is: you cannot pull a specific commit from a remote. However, you may fetch new data from the remote and then use git-checkout COMMIT_ID to view the code at the COMMIT_ID .

How do I clone only a specific branch?

There are two ways to clone a specific branch. You can either: Clone the repository, fetch all branches, and checkout to a specific branch immediately. Clone the repository and fetch only a single branch.


2 Answers

Two answers so far (at the time I wrote this, now there are more) are correct in what they say, but don't really answer the "why" question. Of course, the "why" question is really hard to answer, except by the authors of the various bits of Git (and even then, what if two frequent Git contributors gave two different answers?).

Still, considering Git's "philosophy" as it were, in general, the various transfer protocols work by naming a reference. If they provide an SHA-1, it's the SHA-1 of that reference. For someone who does not already have direct (e.g., command-line) access to the repository, none1 of the built in commands allow one to refer to commits by ID. The closest thing I can find to a reason for this—and it is actually a good reason2—is this bit in the git upload-archive documentation:

SECURITY

In order to protect the privacy of objects that have been removed from history but may not yet have been pruned, git-upload-archive avoids serving archives for commits and trees that are not reachable from the repository's refs. However, because calculating object reachability is computationally expensive, git-upload-archive implements a stricter but easier-to-check set of rules ...

However, it goes on to say:

If the config option uploadArchive.allowUnreachable is true, these rules are ignored, and clients may use arbitrary sha1 expressions. This is useful if you do not care about the privacy of unreachable objects, or if your object database is already publicly available for access via non-smart-http.

which is particularly interesting since git clone gets all reachable objects in the first place, after which your local clone could trivially check out a commit by SHA-1 ID (and create a local branch name pointing to that ID if desired, or just leave your clone in "detached HEAD" mode).

Given these two cross-currents, I think the real answer to "why", at this point, is "nobody cares enough to add it". :-) The privacy argument is valid, but there is no reason that git clone could not check out a commit by ID after cloning, just as it can be told to check out some branch other than master3 with git clone -b .... The only drawback to allowing -b sha1 is that Git cannot check up front (before the cloning process begins) whether sha1 will be received. It can check reference names, since those are transferred (along with their branch tips or other SHA-1 values) up front, so git clone -b nonexistentbranch ssh://... terminates quickly and does not create the copy:

fatal: Remote branch nonexistentbranch not found in upstream origin fatal: The remote end hung up unexpectedly 

If -b allowed an ID, you'd get the whole clone, then it would have to tell you: "oh gosh, sorry, can't check out that ID, I'll leave you on master instead" or whatever. (Which is more or less what happens now with a busted submodule.)


1While git upload-archive now enforces this "privacy" rule, this was not always the case (it was introduced in version 1.7.8.1); and many (most?) git-web servers, including the one distributed with Git itself, allow viewing by arbitrary ID. This is probably why allowUnreachable was added to upload-archive a few years after the "only by ref name" code was added (but note that releases of Git after 1.7.8 and before 2.0.0 have no way to loosen the rules). Hence, while the "security" idea is valid, there was a period (pre 1.7.8.1) when it was not enforced.

2There are numerous ways to "leak" ostensibly private data out of a Git repository. A new file, Documentation/transfer-data-leaks, is about to appear in Git 2.11.1, while Git 2.11.0 added some internal features (see commit 722ff7f87 among others) to immediately drop objects pushed but not accepted. Such objects are eventually garbage-collected, but that leaves them exposed for the duration.

3Actually, by default git clone makes a local check-out of the branch it thinks goes with the remote's HEAD reference. Usually that's master anyway, though.

like image 100
torek Avatar answered Sep 26 '22 23:09

torek


Cloning a repo is a different operation than checkout. You don't "clone a specific commit". For convenience you can clone and then checkout a particular pre-existing branch at the same time, since that is what most people want. If that doesn't meet your needs (no branch for the particular SHA you want) just use or alias some form of

git clone -n <some repo> && cd <some repo> && git checkout SHA 
like image 41
Andrew C Avatar answered Sep 26 '22 23:09

Andrew C