I have a 10 GB repo on a Linux machine, hosted on NFS. The first `git status` takes 36 minutes and subsequent runs take 8 minutes, so Git apparently depends on the OS to cache file metadata. Any git command that has to scan or pack the whole repo, such as `git commit` or `git status`, takes a very long time on a repo this size. Has anyone run `git status` on such a large repo and come across this issue?
I have tried `git gc`, `git clean`, and `git repack`, but the time taken is still almost the same.
Would submodules, or breaking the repo into smaller ones, help? If so, which approach is best for splitting a large repo? Is there any other way to reduce the time git commands take on a large repo?
To be more precise, Git depends on the efficiency of the `lstat(2)` system call, so tweaking your NFS client's attribute-cache timeout might do the trick.
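As an illustration, the attribute-cache timeout is controlled by the `actimeo` NFS mount option; the server path, mount point, and 600-second value below are made-up examples, not recommendations:

```shell
# Example NFS mount with a longer attribute-cache timeout, so repeated
# lstat(2) calls are answered from the client cache instead of the server.
# actimeo sets acregmin/acregmax/acdirmin/acdirmax all at once.
mount -t nfs -o actimeo=600 server:/export/repos /mnt/repos
```

The trade-off is staleness: with a long timeout, changes made on another client can go unnoticed for up to that many seconds.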
The manual for `git-update-index` (essentially a manual mode for `git-status`) describes what you can do to alleviate this: use the `--assume-unchanged` flag to suppress the normal stat behavior and manually update the paths you have changed. You might even program your editor to unset this flag every time you save a file.
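A minimal sketch of that flag in a throwaway repo (the file name is invented for illustration):

```shell
# Set up a disposable repo to demonstrate --assume-unchanged.
repo=$(mktemp -d) && cd "$repo"
git init -q .
git config user.email you@example.com && git config user.name you
echo hello > generated.txt
git add . && git commit -qm init

# Tell Git to assume this path is unchanged (it skips the lstat check):
git update-index --assume-unchanged generated.txt

# Edits to the file no longer show up in status:
echo changed >> generated.txt
git status --porcelain    # prints nothing

# Undo when you want Git to notice the path again:
git update-index --no-assume-unchanged generated.txt
git status --porcelain    # now lists generated.txt as modified
```

Note that `--assume-unchanged` is a promise to Git, not protection: if you edit a flagged file and forget to unset the flag, commands like `git checkout` may silently ignore or overwrite your changes.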
The alternative, as you suggest, is to reduce the size of your checkout (the size of the packfiles doesn’t really come into play here). The options are a sparse checkout, submodules, or Google’s repo tool.
(There’s a mailing list thread about using Git with NFS, but it doesn’t answer many questions.)
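To illustrate the sparse-checkout route (Git 2.25+ for `git sparse-checkout`, 2.28+ for `init -b`), here is a sketch against a local throwaway repo; the directory names are invented:

```shell
# Build a toy "big" repo with a few top-level directories.
src=$(mktemp -d) && cd "$src"
git init -q -b main .
git config user.email you@example.com && git config user.name you
mkdir -p src docs vendor
echo code > src/main.c
echo text > docs/readme.txt
echo blob > vendor/huge.bin
git add . && git commit -qm init

# Clone without materializing the working tree, then check out
# only the directories you actually work in:
work=$(mktemp -d)
git clone -q --no-checkout "$src" "$work/clone"
cd "$work/clone"
git sparse-checkout init --cone
git sparse-checkout set src docs
git checkout -q main

ls    # only src/ and docs/ are present; vendor/ is never stat'd
```

Since `git status` only walks the materialized part of the working tree, this directly cuts the number of NFS operations per command.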
I'm also seeing this problem on a large project shared over NFS.
It took me some time to discover the `-uno` flag, which can be given to both `git commit` and `git status`.
This flag disables the search for untracked files, which reduces the number of NFS operations significantly: to discover untracked files, Git has to scan every subdirectory, so a deep directory tree hurts badly. Disabling that search eliminates all of those NFS operations.
Combine this with the `core.preloadindex` setting and you can get reasonable performance even on NFS.
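A short sketch of both tweaks; the repo and file names are made up:

```shell
# Disposable repo with some untracked clutter.
repo=$(mktemp -d) && cd "$repo"
git init -q .
touch untracked_scratch.log

# Normal status walks every directory looking for untracked files:
git status --porcelain        # lists "?? untracked_scratch.log"

# -uno skips that walk entirely:
git status -uno --porcelain   # prints nothing

# Make both tweaks the default for this repo:
git config status.showUntrackedFiles no   # config equivalent of -uno
git config core.preloadindex true         # parallel lstat()s (default since Git 2.1)
```

The obvious caveat is that with untracked files hidden, `git status` will no longer remind you about files you forgot to `git add`.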