I'm writing a git post-receive hook using Python and Git-Python that gathers information about the commits contained in a push, then updates our bug tracker and IM with a summary. I'm having trouble in the case where a push creates a branch (i.e. the fromrev parameter to post-receive is all zeroes) and also spans several commits on that branch. I'm walking the list of parents backwards from the torev commit, but I can't figure out how to tell which commit is the first one in the branch, i.e. when to stop looking.
On the command line I can do
git rev-list this-branch ^not-that-branch ^master
which will give me exactly the list of commits in this-branch, and no others. I've tried to replicate this using the Commit.iter_parents method which is documented to take the same parameters as git-rev-list but it doesn't like positional parameters as far as I can see, and I can't find a set of keyword params that work.
I read the doco for Dulwich but it wasn't clear whether it would do anything very differently from Git-Python.
My (simplified) code looks like this. When a push starts a new branch it currently only looks at the first commit and then stops:
import sys
import git
repo = git.Repo('.')
for line in sys.stdin:
    (fromrev, torev, refname) = line.rstrip().split(' ')
    commit = repo.commit(torev)
    maxdepth = 25    # just so we don't go too far back in the tree
    if fromrev == ('0' * 40):
        maxdepth = 1
    depth = 0
    while depth < maxdepth:
        if commit.hexsha == fromrev:
            # Reached the start of the push
            break
        print('{sha} by {name}: {msg}'.format(
            sha=commit.hexsha[:7], name=commit.author.name, msg=commit.summary))
        if not commit.parents:
            # Reached the root commit; nothing further back
            break
        commit = commit.parents[0]
        depth += 1
This can also be done with pure GitPython. I have not found a set of kwargs that does it in one go either, but you can build a set of the SHAs on the master branch, then use iter_commits on the branch being examined to find its first (oldest) commit that does not appear in the parent branch:
from git import Repo
repo_path = '.'
repo = Repo(repo_path)
parent_branch = repo.branches.master
examine_branch = repo.branches.test_feature_branch
other_shas = set()
for parent_commit in repo.iter_commits(rev=parent_branch):
    other_shas.add(parent_commit.hexsha)
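# iter_commits yields newest first, so the last match is the branch's first commit
first_commit = None
for commit in repo.iter_commits(rev=examine_branch):
    if commit.hexsha not in other_shas:
        first_commit = commit
if first_commit is not None:
    print('%s by %s: %s' % (first_commit.hexsha[:7],
            first_commit.author.name, first_commit.summary))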
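The same set lookup can return every commit that belongs only to the branch, which is what a push summary needs, rather than just the oldest one. A minimal sketch: the helper name commits_unique_to_branch is made up for illustration, and it only assumes that each commit object has a hexsha attribute, as GitPython commits do.

```python
def commits_unique_to_branch(branch_commits, other_commits):
    """Return the commits found only in branch_commits, oldest first.

    Both arguments are iterables of objects with a .hexsha attribute,
    for example the result of repo.iter_commits(rev=some_branch).
    """
    other_shas = {c.hexsha for c in other_commits}
    unique = [c for c in branch_commits if c.hexsha not in other_shas]
    # iter_commits yields newest first by default; flip to chronological order
    unique.reverse()
    return unique
```

With the names used above it could be called as commits_unique_to_branch(repo.iter_commits(rev=examine_branch), repo.iter_commits(rev=parent_branch)).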
And if you really want to be sure to exclude all commits on all other branches, you can wrap the loop that builds other_shas in another for-loop over repo.branches:
other_shas = set()
for branch in repo.branches:
    if branch != examine_branch:
        for commit in repo.iter_commits(rev=branch):
            other_shas.add(commit.hexsha)
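If shelling out to git is acceptable, the rev-list command from the question can also be run through GitPython's command wrapper: repo.git.rev_list(...) passes its arguments to git rev-list and returns the output as a string. This is only a sketch; the helper names rev_list_args and commits_only_on_branch are invented here, and repo is assumed to be a git.Repo.

```python
def rev_list_args(torev, other_refs):
    """Build the arguments for `git rev-list torev ^ref1 ^ref2 ...`."""
    return [torev] + ['^' + ref for ref in other_refs]


def commits_only_on_branch(repo, torev, other_refs):
    """Return the SHAs reachable from torev but from none of other_refs.

    Assumes repo is a git.Repo, so repo.git.rev_list(...) runs
    `git rev-list` and returns its stdout as one string.
    """
    output = repo.git.rev_list(*rev_list_args(torev, other_refs))
    return output.splitlines()
```

For the example in the question this would be commits_only_on_branch(repo, 'this-branch', ['not-that-branch', 'master']). In a post-receive hook, other_refs would have to be the names of all branches except the one just pushed.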
I just played around with Dulwich; there may be a much better way to do this (with a built-in walker?). Assuming there's just one new branch (or multiple new branches with nothing in common):
#!/usr/bin/env python
import sys
from dulwich.repo import Repo
from dulwich.objects import ZERO_SHA
def walk(repo, sha, shas, callback=None, depth=100):
    # Depth-limited walk over ancestors; shas is both the visited set and the stop set
    if sha not in shas and depth > 0:
        shas.add(sha)
        if callback:
            callback(sha)
        for parent in repo[sha].parents:
            walk(repo, parent, shas, callback, depth - 1)
def reachable_from_other_branches(repo, this_branch):
    shas = set()
    for branch in repo.refs.keys():
        if branch.startswith("refs/heads") and branch != this_branch:
            walk(repo, repo.refs[branch], shas)
    return shas
def branch_commits(repo, fromrev, torev, branchname):
    if fromrev == ZERO_SHA:
        ends = reachable_from_other_branches(repo, branchname)
    else:
        ends = set([fromrev])
    def print_callback(sha):
        commit = repo[sha]
        msg = commit.message.split("\n")[0]
        print('{sha} by {author}: {msg}'
              .format(sha=sha[:7], author=commit.author, msg=msg))
    print(branchname)
    walk(repo, torev, ends, print_callback)
repo = Repo(".")
for line in sys.stdin:
    fromrev, torev, refname = line.rstrip().split(' ')
    branch_commits(repo, fromrev, torev, refname)
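On the question of a built-in walker: Dulwich's Repo.get_walker takes include and exclude lists of SHAs and leaves out the excluded commits together with their ancestors, which resembles the ^ref arguments to git rev-list. A rough sketch, not a tested replacement for the script above; commits_via_walker is a made-up helper name and repo is assumed to be a dulwich.repo.Repo.

```python
def commits_via_walker(repo, torev, exclude_shas):
    """Return the ids of commits reachable from torev but from none of exclude_shas.

    Assumes repo is a dulwich.repo.Repo. Recent Dulwich releases expect
    SHAs as byte strings, so values read from stdin may need encoding.
    """
    walker = repo.get_walker(include=[torev], exclude=list(exclude_shas))
    # Each walk entry wraps one commit object; keep only its id
    return [entry.commit.id for entry in walker]
```

For a newly created branch, exclude_shas could be the heads of every other ref under refs/heads, since excluding a head also excludes everything reachable from it.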