
git shallow clone - how do I remove the "grafted tag" and what is it?

Tags:

git

git-clone

So we have created a template project "template_proj.git".

Update: my Git version is 2.14.1 on Windows 7 Professional.

We have new projects that are empty except for one commit containing a .gitignore file. Let's say one of these projects is called "projectA.git".

So my method is:

  1. Clone template_proj.git into a folder called "Project_A". For this I use: git clone template_proj.git --depth=1 --recursive
  2. Remove the remote: git remote rm origin
  3. Add the new remote: git remote add origin projectA.git
  4. Forcefully merge the projects: git pull origin master --allow-unrelated-histories

This works well. Note: The main reason that I don't just delete my .git folder from the template clone is that it has submodules.

This gives me a repo with 3 commits (which are exactly what I want):

  • the tip of template_proj.git
  • the tip (and only commit) of projectA.git
  • the commit that contains new merge of the two.
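For anyone who wants to reproduce the steps and the three resulting commits locally, here is a minimal sketch. The two throwaway repositories stand in for template_proj.git and projectA.git, and the demo identity, the mktemp work directory and the file:// URLs are assumptions made only so the script is self-contained. The --no-rebase and --no-edit options are not part of my original command; newer Git versions want an explicit choice for a pull that has to merge, and --no-edit avoids opening an editor.

```shell
#!/bin/sh
# Sketch only: local stand-ins for template_proj.git and projectA.git.
set -e
work=$(mktemp -d)
cd "$work"

# Helper (hypothetical, for this demo): commit in repo $1 with message $2.
commit() {
    git -C "$1" -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "$2"
}

# Stand-in for template_proj.git: two commits on master.
git init -q template_proj
git -C template_proj symbolic-ref HEAD refs/heads/master
echo "v1" > template_proj/README
git -C template_proj add README
commit template_proj "template v1"
echo "v2" > template_proj/README
git -C template_proj add README
commit template_proj "template v2"

# Stand-in for projectA.git: a single commit holding .gitignore.
git init -q projectA
git -C projectA symbolic-ref HEAD refs/heads/master
echo "*.o" > projectA/.gitignore
git -C projectA add .gitignore
commit projectA "add .gitignore"

# Step 1: shallow clone of the template (file:// so that --depth is honoured).
git clone -q --depth=1 --recursive "file://$work/template_proj" Project_A
cd Project_A

# Steps 2 and 3: swap the remote.
git remote rm origin
git remote add origin "file://$work/projectA"

# Step 4: merge the unrelated project history.
git -c user.name=demo -c user.email=demo@example.com \
    pull -q --no-rebase --no-edit --allow-unrelated-histories origin master

total=$(git rev-list --count HEAD)
merges=$(git rev-list --count --merges HEAD)
echo "commits=$total merges=$merges"
git log --oneline --decorate --graph
```

The final git log is where the "grafted" decoration shows up next to the template tip.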

However, there is a special tag/branch "grafted" associated with the tip commit of template_proj.git. I don't really want that.

So my questions:

  • Is this an efficient way to do this operation (i.e. is there a better way)?
  • How do I get rid of the grafted tag?
  • What is the grafted tag?

I have not been able to fully understand what grafted really is/means - I did search for it and found some information but still not really sure. As a keyword in a git search it got over-ruled by more common items (or my google-fu is weak) :(

Update: This other question does not quite answer it either: What exactly is a "grafted" commit in a shallow clone? It does not really say why grafted is there or what to do about it (if anything). Also, I don't have a .git/info/grafts file in my repo.

asked Apr 09 '18 11:04 by code_fodder




2 Answers

It's not a tag, and you cannot remove it without making the repository non-shallow (e.g., git fetch --unshallow). It's the marker that indicates that this is the point at which the history cuts off.

You can, however, move the marker, by deepening the history. Since the marker exists at each commit at which history is cut off, if the history cuts off below the point you care about, you will not see the marker. For instance, using a depth of 2 will put the marker below the commit you get.
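To watch the marker move, here is a small sketch that reads the cut-off commits straight out of the .git/shallow file, which is where a shallow repository records them. The directory names src and shallow, the demo identity and the temporary work directory are all made up for the example.

```shell
#!/bin/sh
# Sketch only: "src" and "shallow" are made-up directory names.
set -e
work=$(mktemp -d)
cd "$work"

# Build a throwaway source repository with three commits.
git init -q src
for n in 1 2 3; do
    echo "$n" > src/file.txt
    git -C src add file.txt
    git -C src -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $n"
done

# --depth is ignored for plain local paths, so clone through a file:// URL.
git clone -q --depth=1 "file://$work/src" shallow
cd shallow

tip=$(git rev-parse HEAD)
boundary1=$(cat .git/shallow)   # depth 1: the cut-off is the tip itself

# Deepen by one commit: the cut-off moves to the tip's parent.
git fetch -q --depth=2
boundary2=$(cat .git/shallow)
parent=$(git rev-parse HEAD~1)

# Fetch everything: no cut-off is left.
git fetch -q --unshallow
commits=$(git rev-list --count HEAD)

echo "tip:               $tip"
echo "cut-off, depth 1:  $boundary1"
echo "cut-off, depth 2:  $boundary2"
echo "commits after --unshallow: $commits"
```

At depth 1 the cut-off and the tip are the same commit, which is why the marker is always visible there.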

Background explanation

Note that computer scientists like to draw their trees upside down: instead of leaves at the top, branches towards the ground, and then a trunk sticking into the ground, computer science theory people start with the trunk:

    |

and then add branches below it:

    |
   / \

and put the leaves on the bottom.

For StackOverflow purposes, I like to draw my trees with the root / trunk at the left and the leaves at the right:

        o--o
       /
o--o--o
       \
        o--o

Here we have a simple tree with two branches. Let's label the branches:

        o--o   <-- master
       /
o--o--o
       \
        o--o   <-- develop

Congratulations, you now understand Git's branches! cough OK, maybe not quite yet. :-) There's a lot more to it, including that commits form a graph rather than a simple tree, but this is all we need at this point: we have what we need to say what a shallow clone is. Let's draw a shallow clone of depth 2, made from this same repository:

        o--o   <-- master
       X

       X
        o--o   <-- develop

Here, we still have our same two branch names, master and develop, which still point to two different commits. Each of those commits points to a second (earlier) commit. Each of those two would point to the (shared) third-back commit, but we have hit our depth limit, so each has a marker on it: an X crossing out the links going back to the earlier commits.

It's this marker that you see when you run git log or anything that shows you the commits. Git needs to know that it should not try to look for more commits—the commits it does have say "my previous commit is ..." but the previous commit is missing. Without the marker, Git would tell you that your repository is broken.

If we set the --depth to 3, the marker is even further back:

        o--o   <-- master
       /
    X-o
       \
        o--o   <-- develop

but if the --depth is set to 1, the marker is right at each tip commit, and you will always see it.
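The "my previous commit is ..." part can be checked directly. The sketch below, again with made-up directory names and a demo identity, shows that the tip commit of a depth-1 clone still records a parent in its raw object, that the parent object itself is not in the repository, and that Git nonetheless walks the history without complaint because the cut-off is recorded.

```shell
#!/bin/sh
# Sketch only: "src" and "shallow" are made-up directory names.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway source repository with two commits.
git init -q src
for n in 1 2; do
    echo "$n" > src/file.txt
    git -C src add file.txt
    git -C src -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $n"
done

# file:// so that --depth is honoured for a local source.
git clone -q --depth=1 "file://$work/src" shallow
cd shallow

# The raw commit object still names its parent...
recorded_parent=$(git cat-file -p HEAD | sed -n 's/^parent //p')

# ...but that object was never transferred.
if git cat-file -e "$recorded_parent" 2>/dev/null; then
    parent_present=yes
else
    parent_present=no
fi

# Because the cut-off is recorded, the history walk stops at the tip.
visible=$(git rev-list --count HEAD)

echo "recorded parent: $recorded_parent (present: $parent_present)"
echo "commits visible: $visible"
git log --oneline --decorate
```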

answered Oct 07 '22 14:10 by torek


After looking around I finally found what I needed - I got to it after following a long chain of question --> answer --> link-to-question --> answer --> 12th-comment. Anyway here are some options:

  • git fetch --unshallow - This un-shallows your clone and basically gets back the full history. Not what I want, but I could have used this to undo the --depth=1 clone.
  • git filter-branch -f -- --all - This seems to trim off the grafted bits. Note: Without the -f option it does the job fine, but it leaves the old commits kicking about, so you end up with 2 trees (for lack of a better word): one which starts from the grafted point and another which is all new. This is messy to a casual onlooker, so use the force option to trim all that away.

Source of my info: how-do-i-remove-the-old-history-from-a-git-repository - 9th comment highlights the -f option (you have to expand the comments).

So this is all to do with git grafting. I did not get a .git/info/grafts file, but I did create one manually: echo <SOME-COMMIT-SHA> > .git/info/grafts. When I did this I got a second grafted label on the commit SHA I selected. So I guess you can use that to pick a point to chop off the history, along with git filter-branch....

Really need to read up more on grafting, but it's not really a feature I am interested in at the moment, other than in this case to get rid of it :o

Update:

I had to run:

git filter-branch -- --all

then

git filter-branch -f -- --all

As a two step process... not quite sure how/why. The first one splits them and the second one removes unreachable commits or something?

Update from torek's comments

Now I do the following:

git clone <url> --recursive --depth=1
cd <folder>
git remote rm origin
git remote add origin <new url>
git filter-branch -- --all
rm -rf .git/refs/original/*

Now I can operate normally to get the empty proj/merge and then push my changes up:

git pull origin master
# commit anything here if needed...
git push origin master
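
Here is a self-contained sketch of the filter-branch part, so the effect can be checked on a throwaway repository before trying it on a real one. The directory names, the demo identity and the file:// URL are assumptions for the demo. FILTER_BRANCH_SQUELCH_WARNING only silences the warning that newer Git versions print before filter-branch starts, and the backup refs are deleted with git update-ref rather than rm, which should be equivalent here.

```shell
#!/bin/sh
# Sketch only: "src" and "proj" are made-up directory names.
set -e
work=$(mktemp -d)
cd "$work"

# Throwaway source repository with two commits.
git init -q src
for n in 1 2; do
    echo "$n" > src/file.txt
    git -C src add file.txt
    git -C src -c user.name=demo -c user.email=demo@example.com \
        commit -q -m "commit $n"
done

git clone -q --depth=1 "file://$work/src" proj
cd proj
git config user.name demo
git config user.email demo@example.com
git remote rm origin

old_tip=$(git rev-parse HEAD)
parents_before=$(git cat-file -p HEAD | grep -c '^parent' || true)

# Rewrite every ref. The cut-off commit is re-created as a true root commit.
FILTER_BRANCH_SQUELCH_WARNING=1 git filter-branch -- --all >/dev/null

new_tip=$(git rev-parse HEAD)
parents_after=$(git cat-file -p HEAD | grep -c '^parent' || true)

# Drop the backup refs that filter-branch leaves under refs/original/.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -n 1 git update-ref -d

echo "old tip: $old_tip (parent lines: $parents_before)"
echo "new tip: $new_tip (parent lines: $parents_after)"
git log --oneline --decorate
```

The rewritten tip has a different hash from the commit listed as the cut-off, so the grafted label no longer appears on anything reachable from the branch.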

answered Oct 07 '22 14:10 by code_fodder