Git

Question

I want to use git to keep a historical record of the actual dependencies an application has used over time, with higher fidelity than I can get from the package manager.

I am using these branches:

master: source code only. dependencies in .gitignore
build: source code and dependencies
build-$TIMESTAMP: temporary branch used to force commit of ignored files

And this script, build-release.sh:

DEV_MODULES="mocha chai bower coffeelint"
BUILT_FILES="node_modules build"
DATE=$(date)
TIMESTAMP=$(date +"%s")
BRANCH=$(git rev-parse --abbrev-ref HEAD)

# create a temporary branch with the current dependencies and binaries
npm uninstall $DEV_MODULES
git checkout -b build-$TIMESTAMP
git add --all --force $BUILT_FILES
git commit -m "copy $BUILT_FILES from $BRANCH"

# merge the temporary branch into the build branch
git branch build || echo "build branch already exists"
git checkout build --force
git merge build-$TIMESTAMP --strategy=subtree -m "Build as of $DATE"
git branch -D build-$TIMESTAMP

# restore the original branch
git checkout $BRANCH
git checkout build -- $BUILT_FILES
git rm -r --cached $BUILT_FILES

Which works, and gives me a useful view of changes to source, dependencies, and binaries from one release to the next:

[Image: screenshot showing merge bubbles]

But it takes twice as many commits as necessary. I want the tree to look like this:

[Image: artist's concept]

How can I combine the "copy built files" commit with the "build as of" commit?

When I try to git merge --squash, it ends up with the state that was on build instead of the state that was on build-$TIMESTAMP, which is incorrect (I want to import changes to ignored files, but merge seems to have no language to do this). When I try to git rebase --onto build build-$TIMESTAMP I lose the parentage of the new commit.

I just want to record the exact files I get on the build-$TIMESTAMP branch, but with both the build and master branches as parents, then point the build branch to that commit.

jthill · Accepted Answer

This is straightforward plumbing territory. You're using "porcelain" commands, the source-control system built on top of git's content-tracker core, in ways that happen to winkle that porcelain into doing what you want, but it's much simpler to just talk to the content tracker directly.

For the simplest reading of what's in your question, namely that you want your "build" branch to record snapshots of the current checkout along with a selection of what's in the "$BUILT_FILES" paths/directories, it's:

# knobs
DEV_MODULES="mocha chai bower coffeelint"
BUILT_FILES="node_modules build"
DATE=$(date)

# clean out stuff we don't care about
npm uninstall $DEV_MODULES

# record current checkout plus "$BUILT_FILES" to `build` branch
git add --all --force $BUILT_FILES
build=`git rev-parse -q --verify build`
git update-ref refs/heads/build $(
        git commit-tree ${build:+-p $build} -p HEAD \
                -m "build as of $DATE" \
                `git write-tree`
)

# reset index to HEAD
git reset  # `git read-tree HEAD` will have the same effect, perhaps more quietly
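
If you want to convince yourself the result has the shape you asked for, here's a rough sketch that runs the same sequence twice in a throwaway repository and then looks at the parents of the second build commit. The scratch directory, the placeholder identity, the file names and the `snapshot` helper are all made up for the demo; only the git commands themselves come from the snippet above.

```shell
#!/bin/sh
set -e

# throwaway repository, so nothing real gets touched
demo=$(mktemp -d)
cd "$demo"
git init -q .

# placeholder identity (an assumption) so commits work without global config
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# a source-only commit, with node_modules ignored
printf 'node_modules\n' > .gitignore
printf 'console.log(1)\n' > app.js
git add .gitignore app.js
git commit -q -m "source"

# pretend the package manager put something here
mkdir -p node_modules/dep
printf 'v1\n' > node_modules/dep/index.js

# hypothetical helper wrapping the sequence from the answer
snapshot() {
    git add --all --force node_modules
    # empty when the build branch does not exist yet
    build=$(git rev-parse -q --verify refs/heads/build || true)
    git update-ref refs/heads/build "$(
        git commit-tree ${build:+-p $build} -p HEAD \
            -m "build snapshot" "$(git write-tree)"
    )"
    git reset -q    # put the index back to HEAD
}

snapshot
first=$(git rev-parse build)

printf 'v2\n' > node_modules/dep/index.js
snapshot

# prints the commit followed by each of its parents
parents=$(git rev-list --parents -n 1 build)
echo "$parents"
git log --graph --oneline build
```

The second snapshot should come out as a single commit whose first parent is the previous tip of build and whose second parent is the HEAD you were on, with no intermediate "copy built files" commit.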

As a quick overview or reminder as the case may be, git's object database is a hashcode-indexed key-value store. You ask git for anything by its type and hash, and it obligingly regurgitates exactly that from its object db. The index is nothing more than a flat file, a path-indexed manifest, showing what content in the object db goes at what path, along with some metadata and notes for tracking in-flight operations. Running git add on a path just puts the content in the object db and the content's hash in the index entry for that path.
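
To make that concrete, here's a small sketch you can run in a scratch directory. It assumes nothing beyond a working git install; the temporary directory and the file name greeting.txt are invented for the demo.

```shell
#!/bin/sh
set -e

# throwaway repository, so nothing real gets touched
demo=$(mktemp -d)
cd "$demo"
git init -q .

printf 'hello\n' > greeting.txt

# the hash this content would get as a blob (nothing is stored yet)
expected=$(git hash-object greeting.txt)

# git add stores the blob and records its hash in the index
git add greeting.txt

# an index entry reads: mode, blob hash, stage number, then the path
git ls-files -s greeting.txt
indexed=$(git ls-files -s greeting.txt | cut -d' ' -f2)

# ask the object db for that hash: first its type, then its content
kind=$(git cat-file -t "$indexed")
stored=$(git cat-file -p "$indexed")

echo "hash-object: $expected"
echo "index entry: $indexed"
echo "type:        $kind"
echo "content:     $stored"
```

The hash reported by git hash-object and the one sitting in the index should be the same, since both are derived from the file's content alone.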

(somewhat of a rant here, skip if not in the mood to be preached at) The thing to understand is that git is utterly, brutally concrete. Everything about a repository beyond the object db is pure convention. git checkout makes HEAD refer to the commit you checked out purely by convention. You can implement git checkout as an extremely thin wrapper around git read-tree -um, the chief extra operation being to set HEAD to the commit you got that tree from. git commit makes HEAD the parent of what you're committing purely by convention. You can implement git commit yourself as an extremely thin wrapper around git commit-tree and git write-tree, the chief extra operations being to supply HEAD as a parent and to update HEAD to the new commit. The name HEAD is itself purely conventional. The content tracker on which those are built couldn't care less about HEAD, or the distinction between branches and tags, or anything of the sort. The conventions are intentionally, aggressively and brutally simple, because (a) there's no need for abstraction, the content model already matches the requirements perfectly, and (b) the whole point of git is the lack of abstraction at the core: it's "stupid".
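
As an illustration of the "thin wrapper" point, here's a minimal sketch of doing by hand roughly what git commit does, in a scratch repository. It leaves out everything else the real command handles (hooks, the editor, reflog messages and so on), and the directory, identity and file names are placeholders invented for the demo.

```shell
#!/bin/sh
set -e

# throwaway repository, so nothing real gets touched
demo=$(mktemp -d)
cd "$demo"
git init -q .

# placeholder identity (an assumption) so commits work without global config
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"

# one ordinary commit, so that HEAD points at something
printf 'one\n' > file.txt
git add file.txt
git commit -q -m "first"
parent=$(git rev-parse HEAD)

# stage a change, then build the next commit from plumbing
printf 'two\n' > file.txt
git add file.txt
tree=$(git write-tree)                                    # index -> tree object
new=$(git commit-tree -p "$parent" -m "second, by hand" "$tree")
git update-ref HEAD "$new"                                # move the current branch

recorded_parent=$(git rev-parse HEAD^)
git log --oneline
```

The only things supplied beyond write-tree and commit-tree are the parent (the old HEAD) and the ref update at the end, which is the convention being described.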

Git - How do I squash changes to ignored files without losing those changes?

Phssthpok
