What algorithm does git use to detect changes on your working tree?

Tags:

git

This is about the internals of git.

I've been reading the great 'Pro Git' book and learning a little about how git works internally (SHA1, blobs, references, trees, commits, etc.). Pretty clever architecture, by the way.

For context: git identifies the content of a file by its SHA1 value, so it can tell whether a given piece of content has changed just by comparing hash values. My question is specifically about how git checks whether the content in the working tree has changed.

The naive approach would be that each time you run git status or a similar command, git walks through all the files in the working directory, calculates the SHA1 of each one and compares it with the one recorded in the last commit. But that seems very inefficient for big projects such as the Linux kernel.
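To make the naive approach concrete, here is a minimal Python sketch of it. It relies on the fact that git hashes a blob as SHA1 over the header "blob <size>\0" followed by the content. The function names and the `committed` mapping (path to blob hash) are hypothetical, made up for illustration, and the sketch ignores things like line-ending conversion and ignored files, so it is not what git actually runs.

```python
import hashlib
import os


def git_blob_sha1(content: bytes) -> str:
    """Hash content the way git hashes a blob: header + raw bytes."""
    header = b"blob %d\0" % len(content)
    return hashlib.sha1(header + content).hexdigest()


def naive_changed_files(root: str, committed: dict) -> list:
    """Re-read and re-hash every file under root (hypothetical helper).

    `committed` maps a path relative to root (with forward slashes)
    to the blob hash recorded for it. Returns the sorted paths whose
    current hash differs or that are not in the mapping at all.
    """
    changed = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip the repository metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            full = os.path.join(dirpath, name)
            rel = os.path.relpath(full, root).replace(os.sep, "/")
            with open(full, "rb") as fh:
                digest = git_blob_sha1(fh.read())
            if committed.get(rel) != digest:
                changed.append(rel)
    return sorted(changed)
```

Every call reads every byte of every file, which is exactly the cost I'm worried about.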

Another idea would be to check the last modification date of each file, but I think git does not store that information (when you clone a repository, all the files get a new timestamp).

I'm sure it does this in an efficient way (git is really fast). Does anyone know how that is achieved?

PS: Just to add an interesting link about the git index, which specifically states that the index keeps information about file timestamps, even though tree objects do not.


asked Nov 02 '10 06:11

Khelben

1 Answer

Git’s index maintains timestamps of when git last wrote each file into the working tree (and updates these whenever files are cached from the working tree or from a commit). You can see the metadata with git ls-files --debug. In addition to the timestamp, it records the size, inode, and other information from lstat to reduce the chance of a false positive.

When you run git status, it simply calls lstat on every file in the working tree and compares the result with the metadata stored in the index in order to quickly determine which files are unchanged. This is described in the documentation under racy-git and update-index.
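The comparison can be sketched roughly like this in Python. This is only an illustration of the idea, not git's actual code: the CachedStat type and both function names are hypothetical, and the four fields are an assumed subset of what the index records from lstat. Files flagged here are only candidates; whether they really differ still has to be confirmed by looking at the content.

```python
import os
from typing import NamedTuple


class CachedStat(NamedTuple):
    """Hypothetical subset of the lstat fields an index entry keeps."""
    mtime_ns: int
    size: int
    ino: int
    mode: int


def snapshot(path: str) -> CachedStat:
    """Record the current lstat metadata for path."""
    st = os.lstat(path)
    return CachedStat(st.st_mtime_ns, st.st_size, st.st_ino, st.st_mode)


def maybe_changed(path: str, cached: CachedStat) -> bool:
    """Return True if the file might have changed since `cached`.

    Only metadata is compared; the file content is never read.
    A missing file counts as changed.
    """
    try:
        return snapshot(path) != cached
    except FileNotFoundError:
        return True
```

Usage would be to call snapshot when a file is written or added, store the result, and call maybe_changed on each later status check. The cost per file is one lstat call rather than a full read and hash.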

answered Oct 02 '22 14:10

Josh Lee
