
Effects of git remote update origin --prune on local changes

Tags:

git

Please note: I first took a look at this question hoping it would answer my question, but I think mine is slightly different!


I was told that running git remote update origin --prune locally would force my local repo to have the exact same branches as the origin repo. However, I can find loads of info/documentation on git remote prune origin, but nothing on git remote update origin --prune...

Is my understanding of this command correct? If not, how am I misled? And if I am correct, then what happens if I have local feature branches that haven't been pushed remotely yet? What happens if I have changes to branches that do exist remotely/on the origin, but I have not pushed those changes yet?

Thanks in advance!

asked Apr 17 '19 16:04 by hotmeatballsoup



1 Answer

TL;DR

The git remote update remote command is the same as the git fetch remote command. (There were some bugs in some versions of Git that made them different, but they are supposed to do the same thing.) With --prune, the update process simply removes any remote-tracking names that exist in your repository, but no longer correspond to a branch name in the repository at remote.
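To make the TL;DR concrete for the two cases in the question (unpushed local branches, and unpushed commits on a branch that does exist at origin), here is a minimal sketch. Everything in it is made up for illustration: the temporary directory, the repository names origin.git and mine, the branch local-only, and the demo identity. It uses the documented option order, git remote update --prune origin.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# "Their" repository, standing in for origin (a local bare repo here).
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master

# "Your" repository, with master and feature/A pushed to origin.
git init -q mine
cd mine
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "H"
git branch feature/A
git push -q origin master feature/A
git fetch -q origin

# Local work that origin has never seen.
git branch local-only
git commit -q --allow-empty -m "I (unpushed)"
before=$(git rev-parse master)

# Someone deletes feature/A in the origin repository.
git -C "$work/origin.git" branch -q -D feature/A

# The command from the question, with the option before the remote name.
git remote update --prune origin >/dev/null 2>&1

after=$(git rev-parse master)
if git show-ref --verify -q refs/remotes/origin/feature/A; then
    stale=kept
else
    stale=pruned
fi
local_branches=$(git for-each-ref --format='%(refname:short)' refs/heads | tr '\n' ' ')
echo "origin/feature/A: $stale"
echo "local branches: $local_branches"
```

The only name that goes away is the remote-tracking name origin/feature/A. The local feature/A, the never-pushed local-only, and the unpushed commit on master are all still there.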

Long

I was told that running git remote update origin --prune locally would force my local repo to have the exact same branches as the origin repo.

That's not quite right. Well, that's not at all right, really. And this does tie in to the way you've phrased your question title:

Effects ... on local changes

When using Git, it's important to recognize several things that are odd, or at least different from most other version control systems. The first is that branches, or branch names more precisely, don't really matter that much. Branch names like master and develop are mostly just for use by humans, and you can do pretty much anything you want with them, including changing them around arbitrarily. (There are a few exceptions that we'll get to in a bit.)

What you can't change—and what really matters to Git—are commit hash IDs. You will have seen these things in git log output: they're big ugly number-and-letter strings like b5101f929789889c2e536d915698f58d5c5c6b7a, which is a commit in the Git repository for Git. These numbers (these are actually hexadecimal representations) are unique and universal: every Git everywhere will use that number for that commit. If you're not working with a clone of this particular Git repository, you won't have that commit. If you are working with a clone of this particular Git repository and don't have this commit, your Git just needs to connect to another Git that does have this commit, and you'll get this commit.

These numbers are how Git identifies and finds commits. They're what Git really cares about: the commits themselves, and these unique hash IDs to identify them.

Each commit contains some number of other, earlier commits' hash IDs. Usually each commit has just one earlier commit hash ID. That one earlier commit ID is the parent of the commit. So given some commit hash ID, your Git can tell if you have the commit. If so, your Git can fish out the parent hash ID, and tell if you have that commit, and then fish out its parent ID, and so on. What this means is that given any one commit (by hash ID), your Git can find all the commits, one by one, working backwards. This makes for an easy way to draw them:

... <-F <-G <-H

where H stands in for some commit hash. Git will find the actual commit, read out its parent hash ID G, read the commit, find its parent F, and so on. The action stops when Git gets back to the very first commit ever made, which has no parent hash.
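The backwards walk can be sketched by hand. This is only an illustration in a throwaway repository (the directory, the commit subjects F, G, H, and the demo identity are assumptions); real Git commands such as git log do this walk for you.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

for name in F G H; do
    git commit -q --allow-empty -m "$name"
done

# Start at the tip and follow parent hash IDs back to the root commit.
commit=$(git rev-parse HEAD)
walked=""
while [ -n "$commit" ]; do
    subject=$(git log -1 --format=%s "$commit")
    walked="$walked$subject "
    # %P prints the parent hash ID(s); it is empty for the root commit.
    commit=$(git log -1 --format=%P "$commit")
done
echo "$walked"
```

The loop only works on a linear chain like this one; a merge commit has more than one parent, so %P would print several hash IDs.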

Nothing in any commit can ever change. (In part, that's because the actual hash ID is a cryptographic checksum of all of the contents of the commit. If you were to change anything, even a single bit, you'd get a new, different hash for a new, different commit.) So since H stores G's hash ID, H will always, forever, point back to G. The one-way-ness of the connecting arrows is therefore mostly irrelevant and we can stop drawing them; we just need to remember that Git itself has to work backwards.
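One way to watch this happen is git commit --amend, which looks like it edits a commit but really writes a new one. A small sketch, with a throwaway repository, file name, and identity that are all assumptions for the demo:

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master

echo g >file
git add file
git commit -q -m "G"
echo h >>file
git commit -q -a -m "H"
old=$(git rev-parse HEAD)

# "Changing" H (here, only its message) writes a new commit instead.
git commit -q --amend -m "H, reworded"
new=$(git rev-parse HEAD)

# The original H is still in the repository, exactly as it was.
old_subject=$(git log -1 --format=%s "$old")
old_parent=$(git rev-parse "$old^")
new_parent=$(git rev-parse "$new^")
echo "old: $old ($old_subject)"
echo "new: $new"
```

Both commits point back to the same parent G; only the name master moved from the old hash ID to the new one.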

A branch name, like master or develop, is just a way your Git offers to help you remember one of these hash IDs. It only remembers one of them! The one it remembers is, by definition, the latest one. If you change the number associated with your master, you've told your Git: Use a different commit as the latest one. Hence we augment the picture a bit:

...--F--G--H   <-- master

Thus, the hash ID in H is that of the last commit that Git should treat as being "on" branch master.
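You can check that a name holds exactly one hash ID, and that moving a name touches no commits. A sketch in a scratch repository (the repository, the subjects F, G, H, and the identity are assumptions for the demo):

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
for name in F G H; do
    git commit -q --allow-empty -m "$name"
done

# The name master resolves to a single hash ID: the tip commit H.
tip=$(git rev-parse refs/heads/master)
head=$(git rev-parse HEAD)

# A second name is just a second pointer; no commits are copied.
git branch develop
# Moving a name means writing a different hash ID into it (here, G's).
git branch -f develop master~1

dev_subject=$(git log -1 --format=%s develop)
master_subject=$(git log -1 --format=%s master)
echo "master -> $master_subject, develop -> $dev_subject"
```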

When you make a new commit, starting with git checkout master and doing all the usual stuff up to the point where you run git commit, Git writes out a new commit. This new commit gets a new, unique hash ID, which we can call I:

...--F--G--H   <-- master
            \
             I

Having written out the commit and found its cryptographic-checksum hash ID, Git now writes that hash ID into the name master, so that master points to I:

...--F--G--H
            \
             I   <-- master

and now I is the tip commit of master. You / your Git can still find commit H, by starting at I and walking back one step.
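The same step, as a sketch you can run in a scratch repository (repository, subjects, and identity are assumptions for the demo): record what master holds, commit, and compare.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git symbolic-ref HEAD refs/heads/master
for name in F G H; do
    git commit -q --allow-empty -m "$name"
done

before=$(git rev-parse master)     # hash ID of H
git commit -q --allow-empty -m "I"
after=$(git rev-parse master)      # hash ID of I
parent=$(git rev-parse master^)    # walk back one step from I

echo "master was $before"
echo "master is  $after"
echo "parent of the new tip: $parent"
```

The parent of the new tip is the old value of master, which is how H stays findable.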

If something were to overwrite your master and make it point back to H, it would become very difficult to find commit I, because the internal arrows all point backwards. I points to H—the child knows its parent—but H has no idea that I exists at all. (It didn't when whoever made H, made it, and H can't change now!)

For this reason (among other reasons), your branch names are yours. No one but you and your Git can write new numbers into them. The hash IDs of actual commits are unique and universal: if someone else makes a new commit, that new commit gets a new hash ID that no other commit will ever have, and your Git will know if you have that commit or not. If you need it, your Git will get it from their Git when you connect your Git to their Git. But your branch names are yours.

So git fetch or git remote update—again, both do the same thing—will call up some other Git, and get any new commits from them that you don't have. Let's say you git fetch origin to call up a Git at the URL you're calling origin. If you added commit I to your master, and they added commit J to theirs, your Git now has:

...--F--G--H--J   <-- ???
            \
             I   <-- master

How will your Git find commit J? Writing that hash ID into your master is obviously a bad idea: that loses I! So where can your Git write that hash ID?

Git's answer to this is remote-tracking names (which most people call remote-tracking branches, but I think Git already uses the word branch too much). Your Git's name, with which your Git remembers origin's master, is origin/master, and the picture should read:

...--F--G--H--J   <-- origin/master
            \
             I   <-- master
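Here is a sketch of that picture, built from scratch. The bare repository origin.git plays their Git, and theirs is a second clone used only to publish J; those names, the temporary directory, and the demo identity are all assumptions for the illustration.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Their Git, with commit H on master.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/master
git -C theirs remote add origin "$work/origin.git"
git -C theirs commit -q --allow-empty -m "H"
git -C theirs push -q origin master

# Your Git starts as a clone, then you add I while they add J.
git clone -q origin.git mine
git -C mine commit -q --allow-empty -m "I"
git -C theirs commit -q --allow-empty -m "J"
git -C theirs push -q origin master

# Fetching picks up J and records it under origin/master only.
git -C mine fetch -q origin

mine_tip=$(git -C mine log -1 --format=%s master)
theirs_tip=$(git -C mine log -1 --format=%s origin/master)
base=$(git -C mine merge-base master origin/master)
base_subject=$(git -C mine log -1 --format=%s "$base")
echo "master -> $mine_tip, origin/master -> $theirs_tip, fork point -> $base_subject"
```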

If you want to add commit I—or another different commit that's just as good—to their master, you now have to decide, do you really want I itself, or another commit that's just as good? If you want to keep I itself, you'd merge I and J to make a new commit that has two parents:

...--F--G--H--J   <-- origin/master
            \  \
             I--M   <-- master

Since merge commit M has arrows back to both J and I, you can now send them commit M and ask their Git to set their master to point to commit M.
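The merge-and-send step might look like the sketch below. As before, origin.git, theirs, mine, and the identity are stand-ins invented for the demo, and empty commits are used so there is nothing to conflict.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Build the diverged state: you have I, they have J, both on top of H.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/master
git -C theirs remote add origin "$work/origin.git"
git -C theirs commit -q --allow-empty -m "H"
git -C theirs push -q origin master
git clone -q origin.git mine
git -C mine commit -q --allow-empty -m "I"
git -C theirs commit -q --allow-empty -m "J"
git -C theirs push -q origin master
git -C mine fetch -q origin

# Keep I itself: merge it with J to make M, which has two parents.
git -C mine merge -q -m "M" origin/master
parents=$(git -C mine log -1 --format=%P master)
parent_count=$(echo "$parents" | wc -w | tr -d ' ')

# Send M and ask their Git to set their master to it.
git -C mine push -q origin master
their_master=$(git -C origin.git rev-parse master)
my_master=$(git -C mine rev-parse master)
echo "M has $parent_count parents; their master is now $their_master"
```

Their Git accepts the push because M has J in its history, so nothing of theirs is lost.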

Your Git has a remote-tracking name for each of their branches

When you run git fetch origin or git remote update origin, your Git calls up the other Git at origin and asks for a list of all of its branches. Their branches are of course theirs, but your Git would like to get any new commits they've made and remember the latest one for you. So if they have master, develop, feature/A, and feature/B, your Git gets any commits they have that you don't and remembers their branch tips under your origin/* names, corresponding to their branch names:

                L--N   <-- origin/feature/A
               /
...--F--G--H--J   <-- origin/master
            \
             I   <-- master
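To list the names your Git keeps for their branches, git for-each-ref (or git branch -r) works. A sketch with a made-up origin that has the four branches mentioned above; repository names and identity are assumptions:

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Their Git, with four branch names.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/master
git -C theirs remote add origin "$work/origin.git"
git -C theirs commit -q --allow-empty -m "J"
for b in develop feature/A feature/B; do
    git -C theirs branch "$b"
done
git -C theirs push -q origin master develop feature/A feature/B

# Your Git: a fresh clone.
git clone -q origin.git mine
cd mine

# One remote-tracking name per branch of theirs (origin/HEAD filtered out).
remote_names=$(git for-each-ref --format='%(refname)' refs/remotes/origin |
    grep -v '/HEAD$' | tr '\n' ' ')
# Your own branch names: the clone only created master.
local_names=$(git for-each-ref --format='%(refname)' refs/heads | tr '\n' ' ')
echo "theirs, as remembered by you: $remote_names"
echo "yours: $local_names"
```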

If they've deliberately removed some commit(s) from some of their branches by writing an older hash ID into their branch name(s), your Git will adjust your remote-tracking names correspondingly. In some cases, this may result in your Git forgetting their commits (if you've never bothered to make your own names to save them):

                  N   [abandoned]
                 /
                L   <-- origin/feature/A
               /
...--F--G--H--J   <-- origin/master
            \
             I   <-- master

Commit N is no longer findable, and eventually git gc will remove it entirely from your repository. (This is where branch names, or other names, become important: they not only let you find the commits, they also protect the tip commit from the Grim Reaper, or rather the Grim Collector. Those tip commits protect their parents, who protect commits further back in the chain, all the way to the root commit.)
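A sketch of the abandoned-commit case follows. The repositories and identity are invented for the demo. Note the commit object does not vanish at once: it stays in the object store until git gc decides to drop it, and by default reflog entries can keep it around for a while as well.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Their Git: master at J, feature/A at N (on top of L).
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
cd theirs
git symbolic-ref HEAD refs/heads/master
git remote add origin "$work/origin.git"
git commit -q --allow-empty -m "J"
git checkout -q -b feature/A
git commit -q --allow-empty -m "L"
git commit -q --allow-empty -m "N"
git push -q origin master feature/A

# Your Git clones while origin/feature/A still means N.
git clone -q "$work/origin.git" "$work/mine"
n=$(git -C "$work/mine" rev-parse origin/feature/A)

# They rewind feature/A to L and force that onto origin.
git reset -q --hard HEAD~1
git push -q --force origin feature/A

# Your next fetch moves origin/feature/A back to L.
cd "$work/mine"
git fetch -q origin
tip_subject=$(git log -1 --format=%s origin/feature/A)
names_with_n=$(git for-each-ref --contains "$n")   # names that can reach N
object_type=$(git cat-file -t "$n")                # still stored, for now
echo "origin/feature/A -> $tip_subject; names reaching N: '$names_with_n'"
```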

Suppose, though, that at some point they decide that feature A is finished. They have merged it back into their master, or fast-forwarded their master to point to it directly:

...--F--G--H--J--L   <-- master, feature/A   [no origin/ -- this is THEIR Git!]

which in your Git is:

...--F--G--H--J--L   <-- origin/master, origin/feature/A
            \
             I   <-- master

They may now delete their feature/A name entirely. When they do, your Git stops seeing new values for your origin/feature/A.

Without --prune, however, your Git doesn't remove your origin/feature/A. So you continue to remember origin/feature/A as specifying commit L. This doesn't do any real harm; you'll just think, based on looking at your origin/* names, that they still have feature/A and that it means commit L.
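The difference is easy to see side by side. In this sketch (repository names and identity are assumptions), the first fetch passes -c fetch.prune=false only to make sure a personal configuration doesn't switch pruning on behind the demo's back.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Their Git has master and feature/A, both at L.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/master
git -C theirs remote add origin "$work/origin.git"
git -C theirs commit -q --allow-empty -m "L"
git -C theirs branch feature/A
git -C theirs push -q origin master feature/A

# You clone, then they delete their feature/A name.
git clone -q origin.git mine
git -C theirs push -q origin --delete feature/A
cd mine

# Plain fetch: your origin/feature/A stays, still naming L.
git -c fetch.prune=false fetch -q origin
if git show-ref --verify -q refs/remotes/origin/feature/A; then
    without_prune=kept
else
    without_prune=pruned
fi

# Fetch with --prune: the stale name is removed.
git fetch -q --prune origin
if git show-ref --verify -q refs/remotes/origin/feature/A; then
    with_prune=kept
else
    with_prune=pruned
fi
echo "without --prune: $without_prune; with --prune: $with_prune"
```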

Whether to use --prune is up to you. I turn it on automatically in my own Git configuration; this keeps my repositories less-cluttered.
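The answer doesn't say which setting is meant; one setting that does this is fetch.prune, so the sketch below assumes that one. It sets it only inside a scratch clone (repository names and identity are made up); adding --global to the git config call would apply it to every repository instead.

```shell
set -eu
# Placeholder identity so commits work on a machine with no Git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Their Git has master and feature/A; you clone; they delete feature/A.
git init -q --bare origin.git
git -C origin.git symbolic-ref HEAD refs/heads/master
git init -q theirs
git -C theirs symbolic-ref HEAD refs/heads/master
git -C theirs remote add origin "$work/origin.git"
git -C theirs commit -q --allow-empty -m "L"
git -C theirs branch feature/A
git -C theirs push -q origin master feature/A
git clone -q origin.git mine
git -C theirs push -q origin --delete feature/A

# Turn pruning on for this repository only (assumed setting: fetch.prune).
git -C mine config fetch.prune true
setting=$(git -C mine config --get fetch.prune)

# A plain fetch, with no --prune on the command line, now prunes.
git -C mine fetch -q origin
if git -C mine show-ref --verify -q refs/remotes/origin/feature/A; then
    stale=kept
else
    stale=pruned
fi
echo "fetch.prune=$setting; origin/feature/A: $stale"
```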

answered Nov 15 '22 02:11 by torek