
Using git filter-branch to rewrite authors/committers and commit messages simultaneously

I have a Git repository originally imported from Subversion. Parts of the author/committer information are wrong, which is not Git's fault but mostly due to sloppy committing with Subversion.

I would like to use git filter-branch to rewrite the history of the repository, fixing the committer and author information.

The trouble is... I need to slurp author information out of the commit messages. As far as I can tell, git filter-branch allows you to filter and alter the author information (with --env-filter) and/or to filter the commit messages (with --msg-filter), but not to do both simultaneously, with information shared between the different filters.

So I'm kind of stumped about how to do this. The best I can think of is to do it in multiple passes: first collect all the commit messages, then write a script that goes through and rewrites all the author/committer info. This seems quite inelegant and error-prone, so I'm wondering if anyone else has figured out a way to do this kind of work more smoothly.

Dan Lenski asked Oct 15 '22 02:10



1 Answer

The only thing I can think of to get it done in one pass is to use a commit filter. Like the message filter, it takes the log message on stdin, so you will be able to parse it and find out what you need to. You can then set the appropriate variables yourself and call git commit-tree yourself. (The commit filter is essentially a drop-in replacement for commit-tree, taking the same arguments and producing the same output.)

In bash, it'd be something like this:

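# The commit filter receives the log message on stdin; read_from_stdin could simply be `cat`.
message=$(read_from_stdin)

# Must run in this shell (not a subshell) and export GIT_AUTHOR_NAME, GIT_AUTHOR_EMAIL, etc.
modify_env_vars "$message"

# "$@" holds the arguments filter-branch would have passed to git commit-tree.
printf '%s\n' "$message" | git commit-tree "$@"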

I've never tried this, but I can't see why it wouldn't work, assuming you write those two shell functions properly!
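To make the idea a bit more concrete, here is a minimal sketch of what such a commit filter could look like. It assumes, purely for illustration, that each imported message contains a line of the form "Author: Name <email>"; that convention, the function names, and the choice to drop the line from the rewritten message are all hypothetical, so adjust the patterns to whatever your Subversion messages really contain. Like the snippet above, it is untested against a real history.

```shell
# ASSUMPTION: the commit message carries a line such as
#   Author: Jane Doe <jane@example.com>
# This convention and all function names below are hypothetical.

# Print the name from the first "Author:" line of the message in $1.
author_name() {
    printf '%s\n' "$1" | sed -n 's/^Author: *\(.*[^ ]\) *<.*>.*$/\1/p' | head -n 1
}

# Print the email from the first "Author:" line of the message in $1.
author_email() {
    printf '%s\n' "$1" | sed -n 's/^Author: .*<\([^>]*\)>.*$/\1/p' | head -n 1
}

# Print the message in $1 with every "Author:" line removed.
strip_author_line() {
    printf '%s\n' "$1" | sed '/^Author: /d'
}

# Commit filter body: read the message from stdin, fix the identity,
# then create the commit with the arguments filter-branch passed in.
rewrite_commit() {
    message=$(cat)
    name=$(author_name "$message")
    email=$(author_email "$message")
    # Only override the identity when both parts were found.
    if [ -n "$name" ] && [ -n "$email" ]; then
        GIT_AUTHOR_NAME=$name
        GIT_AUTHOR_EMAIL=$email
        GIT_COMMITTER_NAME=$name
        GIT_COMMITTER_EMAIL=$email
        export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
        export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
    fi
    strip_author_line "$message" | git commit-tree "$@"
}
```

If this were saved as, say, /path/to/fix-authors.sh (a made-up path), it could be wired in with something like: git filter-branch --commit-filter '. /path/to/fix-authors.sh; rewrite_commit "$@"' -- --all. Use an absolute path, since filter-branch does its work from a temporary directory, and try it on a throwaway clone first.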

(And just a small note - it's not so much that --env-filter and --msg-filter can't influence each other, it's that they're always run in that order. So, the first filter could leave behind side-effects in files or environment for the other to see, but they're in an order that keeps you from doing what you want.)

Cascabel answered Oct 27 '22 00:10
