I want to use git to keep a historical record of the actual dependencies an application has used over time, with higher fidelity than I can get from the package manager.
I am using these branches:
.gitignore
And this script, build-release.sh:
DEV_MODULES="mocha chai bower coffeelint"
BUILT_FILES="node_modules build"
DATE=$(date)
TIMESTAMP=$(date +"%s")
BRANCH=$(git rev-parse --abbrev-ref HEAD)
# create a temporary branch with the current dependencies and binaries
npm uninstall $DEV_MODULES
git checkout -b build-$TIMESTAMP
git add --all --force $BUILT_FILES
git commit -m "copy $BUILT_FILES from $BRANCH"
# merge the temporary branch into the build branch
git branch build || echo "build branch already exists"
git checkout build --force
git merge build-$TIMESTAMP --strategy=subtree -m "Build as of $DATE"
git branch -D build-$TIMESTAMP
# restore the original branch
git checkout $BRANCH
git checkout build -- $BUILT_FILES
git rm -r --cached $BUILT_FILES
Which works, and gives me a useful view of changes to source, dependencies, and binaries from one release to the next:
But it takes twice as many commits as necessary. I want the tree to look like this:
How can I combine the "copy built files" commit with the "build as of" commit?
When I try to git merge --squash, it ends up with the state that was on build instead of the state that was on build-$TIMESTAMP, which is incorrect (I want to import changes to ignored files, but merge seems to have no language to do this). When I try to git rebase --onto build build-$TIMESTAMP, I lose the parentage of the new commit.
I just want to record the exact files I get on the build-$TIMESTAMP branch, but with both the build and master branches as parents, then point the build branch to that commit.
This is straightforward plumbing territory. You're using "porcelain" commands, the source-control system built on top of git's content-tracker core, in ways that happen to winkle that porcelain into doing what you want, but it's much simpler to just talk to the content tracker directly.
For the simplest reading of what's in your question, namely that you want your "build" branch to record snapshots of the current checkout along with a selection of what's in the "$BUILT_FILES" paths/directories, it's:
# knobs
DEV_MODULES="mocha chai bower coffeelint"
BUILT_FILES="node_modules build"
DATE=$(date)
# clean out stuff we don't care about
npm uninstall $DEV_MODULES
# record current checkout plus "$BUILT_FILES" to `build` branch
git add --all --force $BUILT_FILES
build=`git rev-parse -q --verify build`
git update-ref refs/heads/build $(
    git commit-tree ${build:+-p $build} -p HEAD \
        -m "build as of $DATE" \
        `git write-tree`
)
# reset index to HEAD
git reset # `git read-tree HEAD` will have the same effect, perhaps more quietly
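To check that a snapshot made this way really gets both parents, here is a minimal sketch that runs the same plumbing twice in a throwaway repository. The repository, the file names (app.txt, build/out.txt), the identity values and the snapshot helper are all made up for the demonstration; only the git commands themselves come from the script above.

```shell
set -e
# Throwaway repository; every name below is a placeholder for this demo.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# One tracked source file, and an ignored build directory.
printf 'build\n' > .gitignore
printf 'src\n' > app.txt
git add .gitignore app.txt
git commit -q -m "source"
mkdir build
printf 'binary\n' > build/out.txt

# Hypothetical helper wrapping the plumbing from the script above.
snapshot() {
    git add --all --force build
    # Empty on the first run, when refs/heads/build does not exist yet.
    build=$(git rev-parse -q --verify refs/heads/build || true)
    git update-ref refs/heads/build "$(
        git commit-tree ${build:+-p $build} -p HEAD \
            -m "build snapshot" "$(git write-tree)"
    )"
    git reset -q    # put the index back to HEAD
}

snapshot    # first run: the only parent is HEAD
snapshot    # second run: parents are the previous build commit and HEAD

# Count the parents of the tip of the build branch.
# The trailing -- keeps git from confusing the ref with the build directory.
parents=$(git rev-list --parents -n 1 refs/heads/build -- | awk '{print NF - 1}')
echo "build tip has $parents parents"
```

The ignored file ends up in the snapshot's tree while the index and the checked-out branch are left as they were.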
As a quick overview or reminder as the case may be, git's object database is a hashcode-indexed key-value store. You ask git for anything by its type and hash, and it obligingly regurgitates exactly that from its object db. The index is nothing more than a flat file, a path-indexed manifest, showing what content in the object db goes at what path, along with some metadata and notes for tracking in-flight operations. Running git add on content at a path just puts the content in the object db and the content's hash in the index entry for that path.
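A small sketch of that, run in a scratch repository so nothing real is touched. The file name greeting.txt and its content are placeholders; the commands are ordinary git plumbing.

```shell
set -e
# Scratch repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q .

# A hypothetical file, used only here.
printf 'hello\n' > greeting.txt

# The hash this content would be stored under (nothing is written yet).
expected=$(git hash-object greeting.txt)

# git add writes the blob to the object db and records its hash in the index.
git add greeting.txt

# The index entry for the path: mode, blob hash, stage number, path.
git ls-files --stage greeting.txt
index_hash=$(git ls-files --stage greeting.txt | awk '{print $2}')

# Ask the object db for that hash: its type and its exact content come back.
obj_type=$(git cat-file -t "$index_hash")
obj_body=$(git cat-file -p "$index_hash")
echo "$obj_type: $obj_body"
```

The hash computed before the add and the hash recorded in the index are the same value, which is all the index entry really is: a path pointing at content.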
(somewhat of a rant here, skip if not in the mood to be preached at) The thing to understand is that git is utterly, brutally concrete. Everything about a repository beyond the object db is pure convention. git checkout makes HEAD refer to the commit you checked out purely by convention. You can implement git checkout as an extremely thin wrapper around git read-tree -um, the chief extra operation being to set HEAD to the commit you got that tree from. git commit makes HEAD the parent of what you're committing purely by convention. You can implement git commit yourself as an extremely thin wrapper around git commit-tree and git write-tree, the chief extra operation being to supply HEAD as a parent and to update HEAD to the new commit. The name HEAD is itself purely conventional. The content tracker on which those are built couldn't care less about HEAD, or the distinction between branches and tags, or anything of the sort. The conventions are intentionally, aggressively and brutally simple, because (a) there's no need for abstraction, the content model already matches the requirements perfectly, and (b) the whole point of git is the lack of abstraction at the core: it's "stupid".
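To make the "thin wrapper" claim concrete, here is a rough sketch of a hand-rolled git commit in a scratch repository. It leaves out everything the real command does beyond the core (hooks, the editor, the reflog message and so on), and the file name, commit messages and identity values are placeholders.

```shell
set -e
# Scratch repository; identity values are placeholders so commits can be made.
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Example"
git config user.email "example@example.invalid"

# An ordinary first commit, made the usual way.
printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"
parent=$(git rev-parse HEAD)

# Stage a change, then commit it by hand.
printf 'two\n' >> file.txt
git add file.txt

tree=$(git write-tree)                      # snapshot the index as a tree object
new=$(git commit-tree -p "$parent" \
        -m "second, by hand" "$tree")       # wrap the tree in a commit whose parent is HEAD
git update-ref HEAD "$new"                  # move the current branch to the new commit

git log --oneline
```

Supplying a different or an additional -p to git commit-tree, and updating a ref other than HEAD, is exactly what the build script above does.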