In my repo, git diff and git stash both run quickly, in less than a second. However, git stash -p takes a good 20 seconds before showing the first hunk. Why could this be?
You can have as many stashes as you want. Get rid of old ones when you feel like it by running git stash drop or git stash clear (read the docs for those).
If you want to run git stash pop twice because you want both stashes in the same commit, but the second git stash pop fails with "error: Your local changes to the following files would be overwritten by merge:", then you can: 1) git stash pop, 2) git add ., and 3) git stash pop.
git stash temporarily shelves (or stashes) changes you've made to your working copy so you can work on something else, and then come back and re-apply them later on.
A safer option is to run git stash --all to remove everything but save it in a stash. Assuming you do want to remove cruft files or clean your working directory, you can do so with git clean .
This should improve with Git 2.25.2 (March 2020), which adds code simplification.
See discussion.
See commit 26f924d (07 Jan 2020) by Elijah Newren (newren).
(Merged by Junio C Hamano -- gitster -- in commit a3648c0, 22 Jan 2020)
unpack-trees: exit check_updates() early if updates are not wanted
Signed-off-by: Elijah Newren
check_updates() has a lot of code that repeatedly checks whether o->update or o->dry_run are set.
(Note that o->dry_run is a near-synonym for !o->update, but not quite as per commit 2c9078d05bf2 ("unpack-trees: add the dry_run flag to unpack_trees_options", 2011-05-25, Git v1.7.6-rc0).)
In fact, this function almost turns into a no-op whenever the condition !o->update || o->dry_run is met.
Simplify the code by checking this condition at the beginning of the function, and when it is true, do the few things that are relevant and return early.
There are a few things that make the conversion not quite obvious:
- The fact that check_updates() does not actually turn into a no-op when updates are not wanted may be slightly surprising. However, commit 33ecf7eb61 (Discard "deleted" cache entries after using them to update the working tree, 2008-02-07, Git v1.5.5-rc0) put the discarding of unused cache entries in check_updates() so we still need to keep the call to remove_marked_cache_entries(). It's possible this call belongs in another function, but it is certainly needed as tests will fail if it is removed.
- The original called remove_scheduled_dirs() unconditionally. Technically, commit 7847892716 (unlink_entry(): introduce schedule_dir_for_removal(), 2009-02-09, Git v1.6.3-rc0) should have made that call conditional, but it didn't matter in practice because remove_scheduled_dirs() becomes a no-op when all the calls to unlink_entry() are skipped. As such, we do not need to call it.
- When (o->dry_run && o->update), the original would have two calls to git_attr_set_direction() surrounding a bunch of skipped updates. These two calls to git_attr_set_direction() cancel each other out and thus can be omitted when o->dry_run is true just as they already are when !o->update.
- The code would previously call setup_collided_checkout_detection() and report_collided_checkout() even when o->dry_run. However, this was just an expensive no-op because setup_collided_checkout_detection() merely cleared the CE_MATCHED flag for each cache entry, and report_collided_checkout() reported which ones had it set. Since a dry-run would skip all the checkout_entry() calls, CE_MATCHED would never get set and thus no collisions would be reported. Since we can't detect the collisions anyway without doing updates, skipping the collisions detection setup and reporting is an optimization.
- The code previously would call get_progress() and display_progress() even when (!o->update || o->dry_run). This served to show how long it took to skip all the updates, which is somewhat useless. Since we are skipping the updates, we can skip showing how long it takes to skip them.
I notice the same problem. It started at least a year ago and has not improved since then. I also use git on a very big repo. Unfortunately, in my case there is also a lot of binary data in it, since it's just a mirror of an SVN repo using git-svn and my colleagues think it's a good idea to place binary test data into the repo.
No answer, just hints and guesses where to search:
It seems the big difference is that with stash -p the function stash_patch is called; otherwise stash_working_tree is called.
In stash_patch, child processes are spawned to execute other git commands. One of these is read-tree (see: man git-read-tree). The final command looks like this: GIT_INDEX_FILE=index.stash.<PID> git read-tree HEAD. This actually takes no time.
The next step is another child process calling GIT_INDEX_FILE=index.stash.<PID> git add--interactive --patch=stash -- <PATH>. This is where all the reads come from and what takes up all the time.
Interestingly, calling just GIT_INDEX_FILE=index.stash.<PID> git status after GIT_INDEX_FILE=index.stash.<PID> git read-tree HEAD is as expensive as git add--interactive. add--interactive is actually a Perl script implementing add -p. I don't know Perl and had a hard time reading it, but it probably checks the working directory state using the same code as git status.
The basic idea seems to be:
The expensive part seems to be getting the state of the working dir with respect to the temporary index. Why it's so expensive, I don't know. Probably some cached data is invalidated and git has to read all the files in the working copy, at least to some extent, to compare them with the temporary index. To understand this, one has to dive deeper into the internals of git status.
I tried measuring it like this (st is my alias for status, as the trace shows):
GIT_INDEX_FILE=.git/index.stash.test git read-tree HEAD
GIT_TRACE_PERFORMANCE=/tmp/trace_status GIT_INDEX_FILE=.git/index.stash.test git st .
Result looks like this:
20:31:20.439868 read-cache.c:2290 performance: 0.000269090 s: read cache .git/index.stash.test
20:31:20.441368 preload-index.c:147 performance: 0.001419629 s: preload index
20:32:15.568433 read-cache.c:1605 performance: 55.128484420 s: refresh index
20:32:15.568611 diff-lib.c:251 performance: 0.000054503 s: diff-files
20:32:15.568847 unpack-trees.c:1546 performance: 0.000004362 s: traverse_trees
20:32:15.568868 unpack-trees.c:447 performance: 0.000008189 s: check_updates
20:32:15.568874 unpack-trees.c:1643 performance: 0.000040807 s: unpack_trees
20:32:15.568879 diff-lib.c:537 performance: 0.000079322 s: diff-index
20:32:15.569115 name-hash.c:600 performance: 0.000197074 s: initialize name hash
20:32:15.573785 dir.c:2326 performance: 0.004883714 s: read directory
20:32:15.574904 read-cache.c:3017 performance: 0.001083674 s: write index, changed mask = 82
20:32:15.575125 trace.c:475 performance: 55.135763475 s: git command: /usr/lib/git-core/git status .
20:32:15.575421 trace.c:475 performance: 55.136831211 s: git command: git st .
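To pick the dominant step out of a trace like the one above without reading it line by line, something along these lines could help. It is only a sketch: slowest_steps is a made-up helper name, and it assumes every relevant line has the shape "<time> <file:line> performance: <seconds> s: <label>" as in the output shown. The "git command:" lines are skipped because they are whole-command totals rather than individual steps.

```shell
# Hypothetical helper: list the N slowest steps of a GIT_TRACE_PERFORMANCE log.
# Assumed line shape: <time> <file:line> performance: <seconds> s: <label...>
slowest_steps() {
    trace_file=$1
    count=${2:-3}
    tab=$(printf '\t')
    awk '
        $3 != "performance:" { next }               # not a performance line
        $6 == "git" && $7 == "command:" { next }    # whole-command totals
        {
            label = $6
            for (i = 7; i <= NF; i++) label = label " " $i
            printf "%s\t%s\n", $4, label
        }' "$trace_file" |
        # LC_ALL=C so that "." is treated as the decimal point in any locale
        LC_ALL=C sort -t "$tab" -k1,1nr |
        head -n "$count"
}
```

Usage would be something like slowest_steps /tmp/trace_status 3 after a traced run; for the trace above, refresh index should come out on top by a wide margin.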
My repo looks like this:
>$ du -hd 1
1,1M ./.idea
74M ./code
3,0G ./.git
2,4G ./test-data
5,5G .
The picture is similar when the trace is applied directly to git stash -p:
20:43:55.968088 read-cache.c:1605 performance: 59.716998605 s: refresh index
20:43:55.969584 trace.c:475 performance: 59.719061140 s: git command: git update-index --refresh
The man page for git update-index --refresh states:
USING --REFRESH
--refresh does not calculate a new sha1 file or bring the index up to date for mode/content changes. But what it does do is to "re-match" the stat information of a file with the index, so that you can refresh the index for a file that hasn’t been changed but where the stat entry is out of date.
For example, you’d want to do this after doing a git read-tree, to link up the stat index details with the proper files.
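My reading of the man page excerpt is that an index freshly filled by read-tree carries no stat information at all, so the refresh has to look at the content of every file. A small experiment in a throwaway repository can check this. This is a sketch under assumptions: demo_refresh is a made-up name, the index file name just mimics the one used above, and I rely on git ls-files --debug printing the cached stat fields (including size) for each entry.

```shell
# Sketch: compare the cached "size" of an index entry right after read-tree
# and again after update-index --refresh. Runs in a temporary repository.
# demo_refresh is a hypothetical name; requires git on PATH.
demo_refresh() {
    repo=$(mktemp -d) || return 1
    (
        cd "$repo" || exit 1
        git init -q .
        printf 'hello\n' > file.txt
        git add file.txt
        git -c user.name=demo -c user.email=demo@example.invalid \
            commit -q -m init || exit 1

        # Same kind of temporary index that stash -p works with.
        GIT_INDEX_FILE="$repo/.git/index.stash.test"
        export GIT_INDEX_FILE
        git read-tree HEAD || exit 1

        # ls-files --debug prints the stat fields stored for every entry.
        printf 'before: %s\n' \
            "$(git ls-files --debug | awk '$1 == "size:" { print $2 }')"
        git update-index --refresh || exit 1
        printf 'after: %s\n' \
            "$(git ls-files --debug | awk '$1 == "size:" { print $2 }')"
    )
    status=$?
    rm -rf "$repo"
    return "$status"
}
```

If the explanation holds, the size reported before the refresh is 0 and the one after it is the real file size. On a 2,4G test-data directory, that re-matching means touching every file once, which would fit the roughly 55 seconds spent in refresh index.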