
git add multiple times without commit

Tags:

git

I notice that if I edit a file in my repo, stage it without committing, edit it again, stage it again without committing, and so on, then each time I do this a new snapshot is taken and the repo's disk usage increases.

Furthermore, if I stage 5 times following tiny edits and finally commit once after all the stagings, the disk usage of the repo still increases by approximately 5x the file size.

My question is, why doesn't git just forget about the other staged versions if only the latest one has a commit sha1 reference to the state? The other 4 staged versions will be garbage collected? Is there a way to checkout a staged state which was never committed?

asked Jan 26 '19 07:01 by sphere




1 Answer

TL;DR

See git fsck --lost-found.

Longer, point-by-point

My question is, why doesn't git just forget about the other staged versions if only the latest one has a commit sha1 reference to the state?

It does ... eventually.

The other 4 staged versions will be garbage collected?

Yes, when git gc eventually runs automatically. If you want this to happen sooner, you can run git gc yourself, but there's only rarely any reason to bother (the common case being "oops, I did not mean to git add 10terabytes.db"). [1]

Is there a way to checkout a staged state which was never committed?

Sort of. The git checkout command cannot do it because git checkout works by file names, and these staged content-only blobs have no file name. They have only a hash ID. To extract their data, you must first find their hash ID. This is easy to do: you just checksum the data the way Git would, which just means that you need to have the data available first, in order to get the data. :-)
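To make "checksum the data the way Git would" concrete, here is a minimal Python sketch. It assumes a repository using the default SHA-1 object format, and the function name git_blob_id is made up for illustration. Git hashes a blob as the header "blob <size>", a NUL byte, and then the raw file bytes, which is the same value git hash-object reports for a file.

```python
import hashlib


def git_blob_id(data: bytes) -> str:
    """Return the ID Git would assign to a blob holding `data`.

    Assumption: the repository uses the default SHA-1 object format
    (not the newer SHA-256 format).
    """
    # Git prepends "blob <size-in-bytes>\0" to the content before hashing.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()


# The empty blob has a well-known ID.
print(git_blob_id(b""))
```

This only helps if you still have the exact bytes of the staged version, which is the catch described above.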

Alternatively, you can do much of what git gc does, which is:

  • Enumerate every object ID in the object database.
  • Enumerate every reachable object ID. For details on reachability, see Think Like (a) Git. Note that reachability here includes all reflog entries for all references, and all index and HEAD entries from all active work-trees. [2]
  • Subtract the second set of object IDs (reachable) from the first set of IDs (all). The resulting IDs are unreferenced, i.e., objects that are candidates for garbage collection.

(This is a bit slow, but git fsck does it for you, so that you do not have to write code to do it.)
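The three steps above can be sketched on a toy object graph. This is only an illustration of the set arithmetic, not how Git itself is implemented: the object names (commit1, tree1, blob1 and so on), the edges mapping, and both function names are hypothetical stand-ins for real hash IDs and Git's object database.

```python
def reachable_from(roots, edges):
    """Collect every object ID reachable from the given root IDs.

    `edges` maps an object ID to the IDs it refers to (a commit to its
    tree and parents, a tree to its blobs and subtrees).
    """
    seen = set()
    stack = list(roots)
    while stack:
        oid = stack.pop()
        if oid in seen:
            continue
        seen.add(oid)
        stack.extend(edges.get(oid, ()))
    return seen


def unreferenced(all_ids, roots, edges):
    """All object IDs minus the reachable ones: the gc candidates."""
    return set(all_ids) - reachable_from(roots, edges)


# Toy database: one commit, its tree, the committed blob (blob5), and
# four blobs that were staged and then replaced in the index.
edges = {"commit1": ["tree1"], "tree1": ["blob5"]}
all_ids = {"commit1", "tree1", "blob1", "blob2", "blob3", "blob4", "blob5"}
roots = {"commit1", "blob5"}  # HEAD plus the current index entry

print(sorted(unreferenced(all_ids, roots, edges)))
```

In this toy setup the four superseded staged versions are exactly the objects left over after the subtraction.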

From the set of all unreachable objects, select those that have type blob, i.e., file contents that were git added but never committed. Inspect each blob, using its hash ID to access it, to see if it is the one you wanted. Here git cat-file -p is useful, or use git fsck --lost-found, which takes each such blob, decompresses it, and writes the data to an ordinary file in .git/lost-found/other/.
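Selecting the blobs can be scripted. The sketch below assumes the report lines have the shape "dangling <type> <id>" or "unreachable <type> <id>", as printed by git fsck and git fsck --unreachable. The function name unreachable_blobs is hypothetical, and the sample text is made up (the commit ID in particular is a placeholder).

```python
from typing import List


def unreachable_blobs(fsck_output: str) -> List[str]:
    """Pick the blob IDs out of text captured from `git fsck`.

    Assumption: each relevant line looks like
    "<dangling|unreachable> <type> <object-id>"; other lines are ignored.
    """
    ids = []
    for line in fsck_output.splitlines():
        parts = line.split()
        if (
            len(parts) == 3
            and parts[0] in ("dangling", "unreachable")
            and parts[1] == "blob"
        ):
            ids.append(parts[2])
    return ids


# Made-up sample output; the commit ID is a placeholder.
sample = """\
dangling blob e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
dangling commit 0123456789abcdef0123456789abcdef01234567
unreachable blob ce013625030ba8dba906f756967f9e9ca394464a
"""

for blob_id in unreachable_blobs(sample):
    # Each ID can then be inspected with: git cat-file -p <id>
    print(blob_id)
```

Each ID that comes back is a candidate to inspect by hand, since the blob carries no file name to tell you which file it once was.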


[1] Note that you may also need --prune= options: git gc defaults to giving other Git processes 14 days to complete the job of hooking up objects. If you use --prune=all, make sure no other Git activity is occurring.

[2] When you remember to include work-trees added via git worktree add, you will be doing something the Git folks forgot to do. This is a particularly nasty bug, present in Git version 2.5 through 2.14.*: work being done in an added work-tree can be pruned via an automatic git gc, if you've left that work-tree idle for 2 weeks or more. If you are using git worktree add, I recommend making sure your Git is at least version 2.15.

answered Sep 28 '22 14:09 by torek