Git: get a hash of the current state of the working tree?

Tags:

git

version-control

I would like to ensure that my executable is built with the most up to date version of the code.

For example, I can take the current git commit at the time of compile and bake it into the executable; then when the executable is run, it compares this with the current git commit and if they don't match it complains that the code has been modified and that it is out of date.

However, sometimes I recompile without making a commit, after making small changes to the code. Then this method doesn't work, as it only accounts for committed changes.

Is there any convenient way to programmatically get a hash of the current commit PLUS the state of the working directory, using git or otherwise?

Also, is there a name for this practice?

asked May 29 '14 18:05

user2664470

2 Answers

If all you want to do is determine whether there are any uncommitted modifications, that's easy; just run git diff --quiet HEAD and check whether the return code is non-zero.
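
A minimal sketch of that check wrapped in a shell function. The name `is_dirty` is hypothetical, not part of Git. Note that this only looks at tracked files, so untracked files do not count as modifications here.

```shell
#!/bin/sh
# is_dirty (hypothetical helper name): succeeds when tracked files differ from HEAD.
is_dirty() {
    # --quiet implies --exit-code: git diff exits 1 when there are differences
    # and 0 when there are none. Any other status is treated as "not dirty".
    git diff --quiet HEAD
    test $? -eq 1
}
```

It could be used in a build script as `if is_dirty; then echo "uncommitted changes"; fi`.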

If you actually need a hash of the changes, so that two users with the same starting commit and the same local modifications will get the same hash, that's trickier. My first thought is to pipe the output of git diff HEAD into sha1sum, and concatenate it to the commit hash, but the output of git diff might vary for different Git versions and config options.
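
A rough sketch of the sha1sum idea follows. The function name `state_id` and the `<commit>-<diff hash>` output format are assumptions made for illustration. The `--no-ext-diff` and `--no-color` options remove some config-dependent variation, but the diff text can still differ between Git versions, so treat the result as a local identifier rather than a portable one.

```shell
#!/bin/sh
# state_id (hypothetical helper name): prints "<commit hash>-<sha1 of the diff against HEAD>".
state_id() {
    commit=$(git rev-parse HEAD) || return 1
    # Hash the textual diff of tracked files against HEAD.
    diff_hash=$(git diff --no-ext-diff --no-color HEAD | sha1sum | cut -d ' ' -f 1)
    printf '%s-%s\n' "$commit" "$diff_hash"
}
```

With a clean working tree the second half is simply the hash of empty input, so it stays constant until a tracked file changes.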

Alternatively, you could use git add -u . && git write-tree to get an honest-to-goodness Git tree object for the current working tree. But that's a destructive operation; it clobbers any partially-staged changes that were already in your index.
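
One way to sidestep the clobbering, which is a variation and not what the answer itself proposes, is to run the same two commands against a throwaway copy of the index selected through the `GIT_INDEX_FILE` environment variable. A minimal sketch, assuming it is run from the top of the working tree; the function name `worktree_tree` is hypothetical:

```shell
#!/bin/sh
# worktree_tree (hypothetical helper name): prints the tree hash for the tracked
# files as they are on disk, leaving the real index untouched.
worktree_tree() {
    index_path=$(git rev-parse --git-path index) || return 1
    tmp_index=$(mktemp) || return 1
    # Work on a copy of the real index so staged changes are not disturbed.
    cp "$index_path" "$tmp_index" || { rm -f "$tmp_index"; return 1; }
    (
        GIT_INDEX_FILE=$tmp_index
        export GIT_INDEX_FILE
        git add -u . && git write-tree
    )
    status=$?
    rm -f "$tmp_index"
    return $status
}
```

Blobs for the modified files are still written to the object database, but whatever was partially staged stays exactly as it was.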

answered Oct 21 '22 12:10

David

It is possible to create and store most of the changes in the current working tree, including all staged, unstaged and untracked files, while respecting .gitignore. Roughly, one needs to run:

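#!/bin/sh
# List tracked files that differ from HEAD, then untracked, non-ignored files.
{   git diff-index --name-only HEAD
    git ls-files -o --exclude-standard
} \
| while IFS= read -r path; do
    # Regular file: write its content as a blob and emit an ls-tree style entry.
    test -f "$path" && printf '100644 blob %s\t%s\n' "$(git hash-object -w "$path")" "$path"
    # Directory (submodule): emit a gitlink entry for its checked-out commit.
    test -d "$path" && printf '160000 commit %s\t%s\n' "$(cd "$path" && git rev-parse HEAD)" "$path"
done | sed 's,/,\\,g' | git mktree --missing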

The first diff lists all tracked files different from HEAD.

Then we find the untracked ones, but exclude the ignored.

We then pipe the output of these two commands into a loop that constructs git mktree input for all the files.

The output of the loop goes through sed because git mktree doesn't construct trees recursively, so every / in a path is replaced with a backslash to make each entry look like a top-level name. The actual paths don't matter here: we just want a hash, and the content is never meant to be retrieved through this tree.

Finally, we pass this ls-tree-formatted output to mktree, which constructs the specified tree and stores it in Git, outputting the hash to us.

With a bit of extra effort one can also keep information about permissions and possibly even file deletions. After all, this is what Git does when you do an actual commit.
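
As a rough sketch of the permissions part: for regular files Git records only the executable bit, as mode 100755 versus 100644. The helper name `entry_mode` is hypothetical, and symlinks and deletions are deliberately left out of this sketch.

```shell
#!/bin/sh
# entry_mode (hypothetical helper name): prints the tree-entry mode Git would
# use for a regular file, based only on its executable bit.
entry_mode() {
    if test -x "$1"; then
        echo 100755
    else
        echo 100644
    fi
}
```

In the loop from the script, the hard-coded 100644 for regular files could then be replaced with the output of `entry_mode "$path"`.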

One can argue that all these hoops are useful when you do want to store your changes for future reference but don't want to pollute the history with unnecessary commits for every little change. As such, it may be useful for internal testing with micro-releases, where you can log the local hash as the actual version of your code instead of just the non-descriptive -dirty flag, to see exactly where your code failed when you forgot to tag or commit each working version. Some may consider this a bad habit and say you should instead make a commit for every successful build, however small. It's hard to argue with that, but then again it's all about convenience.

answered Oct 21 '22 13:10

dinvlad
