
Git: changing committers info

Tags:

git

bash

I'm using this script to modify commits:

rm -rf repo

echo "cloning $1"
git clone "$1" repo

cd repo
git checkout dev

echo "setting remote origin to $2"
git remote set-url origin $2

array=( '[email protected]' '[email protected]' )
for OLD_EMAIL in "${array[@]}"
do
  echo $OLD_EMAIL
  git filter-branch -f --env-filter '
  CORRECT_NAME="New name"
  CORRECT_EMAIL="[email protected]"
  if [ "$GIT_COMMITTER_EMAIL" = '$OLD_EMAIL' ]
  then
      export GIT_COMMITTER_NAME="$CORRECT_NAME"
      export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
  fi
  if [ "$GIT_AUTHOR_EMAIL" = '$OLD_EMAIL' ]
  then
      export GIT_AUTHOR_NAME="$CORRECT_NAME"
      export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
  fi
  ' --tag-name-filter cat -- --tags
done
echo "Authors list:"
git log --format='%cE' | sort -u
echo -n "Push to destination (y/n)? "
read answer
if echo "$answer" | grep -iq "^y" ;then
    git push
else
    echo Aborted
fi

cd ../

It pulls data from the first repo, modifies the committer info, and pushes to a second repo.

The problem arises if someone commits directly to the second repo. How do I apply those changes to the first repo?

stkvtflw asked Sep 14 '17 10:09



1 Answer

If I'm understanding your question correctly (after reading the comments), your repo currently looks something like this:

Initial State

The commits in the first repo (a-d) have been modified to create the alternate commits (a'-d') which were pushed into a second repo and then had additional commits added, (e-g).

Re-editing Your History

Because the identity information in the two repos no longer maps 1:1, using filter-branch on a'-d' to restore the original history is theoretically possible, but it needs a way to positively identify each 'original commit' without the one piece of information that actually identifies a commit: its hash.

A commit is basically made up of a few pieces of information:

  1. The hash of the tree
  2. The hash(es) of the commit's parent(s)
  3. The author's identity information
  4. The timestamp of the authoring
  5. The committer's identity information
  6. The timestamp of the commit
  7. The commit message
  8. The size of all that information

All this is hashed to create the unique identifier for your commit. Having altered 2, 3, 5, and 8, we're left with the tree, which is not necessarily unique, the timestamps, which are not necessarily unique, and the commit message, which is not necessarily unique.
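To see this in practice, here is a minimal sketch that builds a throwaway repo, prints a raw commit object, and re-hashes it with git hash-object. The temp directory, the "Example" identity and the "Someone Else" name are made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
git init -q "$repo"
cd "$repo"
# Hypothetical identity, passed inline so no global config is needed.
git -c user.name="Example" -c user.email="example@example.invalid" \
    commit -q --allow-empty -m "demo commit"

# Raw commit object: tree, parent(s) if any, author, committer, message.
git cat-file commit HEAD

actual=$(git rev-parse HEAD)
# Re-hashing the unchanged content reproduces the commit ID.
recomputed=$(git cat-file commit HEAD | git hash-object -t commit --stdin)
# Changing only the committer name produces a different ID.
altered=$(git cat-file commit HEAD \
    | sed 's/^committer Example/committer Someone Else/' \
    | git hash-object -t commit --stdin)

echo "actual:     $actual"
echo "recomputed: $recomputed"
echo "altered:    $altered"
```

Because every child records its parent's hash, a changed identity in one commit changes the hash of every commit after it as well.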

Odds are you could get a decent match from just comparing the tree and one of the timestamps, so let's write a little pseudo-code for that scenario.

# create a variable to hold the information from the current commit
# ($TREE and $AUTHOR_TIMESTAMP stand for values read from the commit being rewritten)
pseudoidentifier="$TREE $AUTHOR_TIMESTAMP"

# go to the first repo
cd /path/to/firstrepo

# output the log | grep to search | sed to remove everything after the delimiter
oldhash=$(git log --format="%H~%T %at" | grep -F "~$pseudoidentifier" | sed 's/~.*$//')

# get the original identity using custom formatted show commands
CORRECT_NAME=$(git show -q --format="%cn" "$oldhash")
CORRECT_EMAIL=$(git show -q --format="%ce" "$oldhash")

# go to the second repo
cd /path/to/secondrepo

export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"

Unfortunately, this would be slow to write and difficult and time-consuming to test, and you would probably have to re-run the entire thing multiple times. Since your ultimate goal is to re-unite the code, there are several other options that will likely cause far less headache and be a lot faster, especially if you do need to keep the second repo with the identity updates intact.
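If you do want to gauge how well tree plus author timestamp identifies commits, the matching step can be tried on its own before touching filter-branch. This is only a sketch under made-up assumptions: two throwaway repos with identical content and fixed author dates, hypothetical emails, and helper names (make_repo, keyed_log) invented for the example.

```shell
set -eu
export LC_ALL=C   # keep sort and join collation consistent
work=$(mktemp -d)

make_repo() {  # $1 = repo dir, $2 = identity email (hypothetical)
    git init -q "$1"
    git -C "$1" config user.name "Demo"
    git -C "$1" config user.email "$2"
    n=0
    for msg in a b c; do
        n=$((n + 1))
        echo "$msg" >> "$1/file.txt"
        git -C "$1" add file.txt
        # Fixed author dates so both repos share author timestamps.
        GIT_AUTHOR_DATE="$((1500000000 + n)) +0000" git -C "$1" commit -q -m "$msg"
    done
}

keyed_log() {  # prints "<tree>-<author timestamp> <commit hash> <committer email>"
    git -C "$1" log --format='%T-%at %H %ce' | sort
}

make_repo "$work/first" "old@example.invalid"
make_repo "$work/second" "new@example.invalid"
keyed_log "$work/first" > "$work/first.keys"
keyed_log "$work/second" > "$work/second.keys"

# Join on the key; columns: key, first hash, first email, second hash, second email.
join "$work/first.keys" "$work/second.keys" > "$work/matches"
awk '{ print $4 " -> " $2 " (restore " $3 ")" }' "$work/matches"
```

Any key that shows up more than once in either file is a commit this approach cannot identify unambiguously, which is exactly the risk described above.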

Alternate Methods

Without a common history, you can still bring the two into sync using somewhat more manual means. Here are three methods I would recommend in this situation.

A little pre-work

Before we begin, we can check to see if the code at d and d' are indeed identical. We can do this by using the git show command:

$ git show -q --format="%T" d
a017285da45ec06fc744815f33a2e22627f4a799
$ git show -q --format="%T" d'
a017285da45ec06fc744815f33a2e22627f4a799

This command outputs the tree object the commit points to. If the two trees match, you're dealing with identical code. It is entirely possible to perform the following procedures without a matching code base, but you're likely to have to resolve conflicts in that situation. This step really just tells you how easily the two will come together.
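A small sketch of that check as a script, assuming two throwaway repos that hold the same file but were committed under different (made-up) identities:

```shell
set -eu
work=$(mktemp -d)

make_repo() {  # $1 = repo dir, $2 = name, $3 = email (all hypothetical)
    git init -q "$1"
    echo "hello" > "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" -c user.name="$2" -c user.email="$3" commit -q -m "add file"
}

make_repo "$work/first" "Old Name" "old@example.invalid"
make_repo "$work/second" "New name" "new@example.invalid"

tree_first=$(git -C "$work/first" show -q --format=%T HEAD)
tree_second=$(git -C "$work/second" show -q --format=%T HEAD)

if [ "$tree_first" = "$tree_second" ]; then
    echo "trees match: the code is identical"
else
    echo "trees differ: expect conflicts"
fi
```

The commit hashes of the two repos differ because the identities differ, but the tree hash depends only on the content.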

The Cherry-Pick method

If the repo you used to originally modify the commits is intact, you can fetch the branches from both into a single repo and attempt to use cherry-pick to copy the commits.

git checkout <branch at d>
git cherry-pick d'..g

(Note that the range uses 2 dots.) This will apply the changes from each commit after (but not including) d' up to and including g onto d, creating new commits e'-g'.

History after cherry-pick
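Here is a rough end-to-end sketch of the cherry-pick route on throwaway repos. Everything in it is made up for the demo: the temp paths, the emails, the helper names, and the use of FETCH_HEAD~3..FETCH_HEAD (a two-dot range) to stand in for d'..g.

```shell
set -eu
work=$(mktemp -d)

commit_line() {  # $1 = repo dir, $2 = line to append (also the commit message)
    echo "$2" >> "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" commit -q -m "$2"
}

new_repo() {  # $1 = repo dir, $2 = identity email (hypothetical)
    git init -q "$1"
    git -C "$1" config user.name "Demo"
    git -C "$1" config user.email "$2"
    for msg in a b c d; do commit_line "$1" "$msg"; done
}

new_repo "$work/first" "old@example.invalid"     # a-d
new_repo "$work/second" "new@example.invalid"    # a'-d'
for msg in e f g; do commit_line "$work/second" "$msg"; done

cd "$work/first"
# Bring the second repo's history into this repo without merging it.
git fetch -q "$work/second" HEAD
# Copy the three commits after d' (e, f, g) onto d.
git cherry-pick FETCH_HEAD~3..FETCH_HEAD > /dev/null
git log --reverse --format='%s author=<%ae> committer=<%ce>'
```

Cherry-pick keeps each commit's author but records whoever runs it as the committer, which is worth checking against what you want the first repo to show.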

The Patch Method

If you don't have an easy way to bring the changes from both branches into a single repo, you can create a series of patches for the commits on the second repo and apply them to the first.

In the second repo

git checkout <branch of g>
git format-patch --output-directory <dir> d'..g

(Again, the range uses 2 dots.) This will output one patch file for each commit after (and not including) d' up to and including g. Then copy these files to somewhere you can reach them from the first repo to apply the patches.

In the first repo

git checkout <branch of d>
git am /path/to/patches/*

You'll end up in the same place you did from the cherry pick method.

History after patch
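The same scenario via patches, again as a sketch on throwaway repos with made-up paths, emails and helper names. HEAD~3..HEAD in the second repo stands in for d'..g.

```shell
set -eu
work=$(mktemp -d)

commit_line() {  # $1 = repo dir, $2 = line to append (also the commit message)
    echo "$2" >> "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" commit -q -m "$2"
}

new_repo() {  # $1 = repo dir, $2 = identity email (hypothetical)
    git init -q "$1"
    git -C "$1" config user.name "Demo"
    git -C "$1" config user.email "$2"
    for msg in a b c d; do commit_line "$1" "$msg"; done
}

new_repo "$work/first" "old@example.invalid"     # a-d
new_repo "$work/second" "new@example.invalid"    # a'-d'
for msg in e f g; do commit_line "$work/second" "$msg"; done

# In the second repo: one patch file per commit after d'.
mkdir "$work/patches"
git -C "$work/second" format-patch -q --output-directory "$work/patches" HEAD~3..HEAD

# In the first repo: apply the patches in order on top of d.
git -C "$work/first" am -q "$work/patches"/*.patch
git -C "$work/first" log --reverse --format='%s author=<%ae> committer=<%ce>'
```

The patch files are plain text, so they can be copied between machines that cannot reach each other's repos.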

Create a Graft

If there are a lot of conflicts and you don't need to keep the identity altered information, you can also use git replace to perform a graft.

git replace --graft e d

This creates a copy of commit e with d as its parent (call it e') and adds a replacement reference that tells Git to use e' whenever it accesses e. This effectively makes d the common ancestor of both histories, so you can perform a traditional merge (h).
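A sketch of the graft on throwaway repos follows; the paths, emails and helper names are made up, and the variables d, e and g are just shell names for the commits in the diagram. In this toy setup the first repo has no commits after d, so the merge has nothing to reconcile.

```shell
set -eu
work=$(mktemp -d)

commit_line() {  # $1 = repo dir, $2 = line to append (also the commit message)
    echo "$2" >> "$1/file.txt"
    git -C "$1" add file.txt
    git -C "$1" commit -q -m "$2"
}

new_repo() {  # $1 = repo dir, $2 = identity email (hypothetical)
    git init -q "$1"
    git -C "$1" config user.name "Demo"
    git -C "$1" config user.email "$2"
    for msg in a b c d; do commit_line "$1" "$msg"; done
}

new_repo "$work/first" "old@example.invalid"     # a-d
new_repo "$work/second" "new@example.invalid"    # a'-d'
for msg in e f g; do commit_line "$work/second" "$msg"; done

cd "$work/first"
git fetch -q "$work/second" HEAD
d=$(git rev-parse HEAD)
e=$(git rev-parse FETCH_HEAD~2)
g=$(git rev-parse FETCH_HEAD)

# Tell Git to treat e as if its parent were d instead of d'.
git replace --graft "$e" "$d"
# With d now an ancestor of g, an ordinary merge is possible.
git merge -q --no-edit "$g"
git log --format='%h %s <%ce>'
```

Replacement refs live under refs/replace/ and are local unless you push them explicitly, so other clones will not see the graft by default.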


Then what?

Keeping two repos without a common history in sync will consistently cause you problems like this, and they will get worse as the two slowly diverge (for example, as you resolve conflicts). Over time, all of these methods will require more and more resources to maintain the two repos.

I would recommend that once the two repos are synchronized, pick one of them and use that one exclusively from then on. If you require two remotes, just push that repo to both of them. You can then easily use any of the many tried and true workflows to maintain the two repos.
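One way to push a single repo to two remotes is to give origin two push URLs. The sketch below uses local bare repos as stand-ins for the two remotes; the paths and the identity are made up, and the branch name dev is taken from the script in the question.

```shell
set -eu
work=$(mktemp -d)
# Two bare repos stand in for the two remote servers.
git init -q --bare "$work/first.git"
git init -q --bare "$work/second.git"

git init -q "$work/repo"
cd "$work/repo"
git -c user.name="Demo" -c user.email="demo@example.invalid" \
    commit -q --allow-empty -m "initial"

git remote add origin "$work/first.git"
# Once a push URL is set explicitly, the fetch URL is no longer used for
# pushing, so both destinations are listed.
git remote set-url --add --push origin "$work/first.git"
git remote set-url --add --push origin "$work/second.git"

# A single push now updates both remotes.
git push -q origin HEAD:refs/heads/dev
git remote -v
```

Fetches still come from the first URL only, which fits the advice to treat one repo as the source of truth.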

If this is not an option, I'd recommend frequently and meticulously checking the trees at the heads of your two repos to verify that they are bit-for-bit identical.

LightBender answered Oct 13 '22 00:10