I'm using this script to modify commits:
rm -rf repo
echo "cloning $1"
git clone $1 repo
cd repo
git checkout dev
echo "setting remote origin to $2"
git remote set-url origin $2
array=( '[email protected]' '[email protected]' )
for OLD_EMAIL in "${array[@]}"
do
echo $OLD_EMAIL
git filter-branch -f --env-filter '
CORRECT_NAME="New name"
CORRECT_EMAIL="[email protected]"
if [ "$GIT_COMMITTER_EMAIL" = '$OLD_EMAIL' ]
then
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
fi
if [ "$GIT_AUTHOR_EMAIL" = '$OLD_EMAIL' ]
then
export GIT_AUTHOR_NAME="$CORRECT_NAME"
export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
fi
' --tag-name-filter cat -- --tags
done
echo "Authors list:"
git log --format='%cE' | sort -u
echo -n "Push to destination (y/n)? "
read answer
if echo "$answer" | grep -iq "^y" ;then
git push
else
echo Aborted
fi
cd ../
It pulls data from the first repo, modifies the committer info, and pushes to the second repo.
The problem arises if someone commits directly to the second repo. How do I apply those changes to the first repo?
If I'm understanding your question correctly (after reading the comments), your repo currently looks something like this:
The commits in the first repo (a-d) have been modified to create the alternate commits (a'-d') which were pushed into a second repo and then had additional commits added, (e-g).
Because you don't have a 1:1 relationship between the identity information in both repos, attempting to modify a'-d' with filter-branch in order to restore the original history, while theoretically possible, will require a method that will positively identify the 'original commit' without the one piece of information required to positively identify a commit (its hash).
A commit is basically made up of a few pieces of information:
All this is hashed to create the unique identifier for your commit. Having altered 2, 3, 5, and 8, we're left with the tree, which is not necessarily unique, the timestamps, which are not necessarily unique, and the commit message, which is not necessarily unique.
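To see the pieces that go into that hash, you can print a raw commit object with git cat-file -p. This is only a minimal sketch: the throwaway repo, the "Old Name" identity, and the commit message are all made up for illustration.

```shell
# Hypothetical identity and throwaway repo, for illustration only.
export GIT_AUTHOR_NAME="Old Name" GIT_AUTHOR_EMAIL="old@example.com"
export GIT_COMMITTER_NAME="Old Name" GIT_COMMITTER_EMAIL="old@example.com"
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"

# Print the raw commit object: a tree line, any parent lines, the author and
# committer lines (name, email, timestamp), a blank line, then the message.
commit_object=$(git cat-file -p HEAD)
echo "$commit_object"
```

Changing any of those lines, including only the name or email, produces a different commit hash.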
Odds are you could get a decent match from just comparing the tree and one of the timestamps, so let's write a little pseudo-code for that scenario.
# create a variable to hold the information from the current commit
# ($TREE and $AUTHOR_TIMESTAMP are placeholders for its tree hash and author timestamp)
pseudoidentifier="$TREE $AUTHOR_TIMESTAMP"
# go to the first repo
cd /path/to/firstrepo
# output the log | grep to search | sed to remove everything after the delimiter
oldhash=$(git log --format='%H~%T %at' | grep "~$pseudoidentifier\$" | sed 's/~.*$//')
# get the original identity using custom-formatted show commands
CORRECT_NAME=$(git show -q --format='%cn' "$oldhash")
CORRECT_EMAIL=$(git show -q --format='%ce' "$oldhash")
# go to the second repo
cd /path/to/secondrepo
export GIT_COMMITTER_NAME="$CORRECT_NAME"
export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
Unfortunately, this would be slow to write and difficult and time-consuming to test, probably requiring you to re-run the entire thing multiple times. Since your ultimate goal is to re-unite the code, there are several other options that will likely cause a lot less headache and be a lot faster, especially if you do need to keep the second repo with the identity updates intact.
Without a common history, you can still bring the two into sync using somewhat more manual means. Here are three methods I would recommend in this situation.
Before we begin, we can check to see if the code at d and d' are indeed identical. We can do this by using the git show command:
$ git show -q --format="%T" d
a017285da45ec06fc744815f33a2e22627f4a799
$ git show -q --format="%T" d'
a017285da45ec06fc744815f33a2e22627f4a799
This command will output the tree object the commit points to. If the two trees match, you're dealing with identical code. It is entirely possible to perform the following procedure without a matching code base, but you're likely to have to resolve conflicts in that situation. This step really just tells you how easily the two will come together.
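To illustrate why this check works: rewriting only the identity changes the commit hash but leaves the tree hash alone. The following is a minimal sketch, assuming two throwaway repos named first and second that stand in for your real ones; the make_repo helper, names, and emails are hypothetical.

```shell
work=$(mktemp -d)
cd "$work"

# Hypothetical helper: create a repo with one commit of the same content
# under the given identity. Usage: make_repo <dir> <name> <email>
make_repo() {
    git init -q "$1"
    echo "same content" > "$1/file.txt"
    git -C "$1" add file.txt
    GIT_AUTHOR_NAME="$2" GIT_AUTHOR_EMAIL="$3" \
    GIT_COMMITTER_NAME="$2" GIT_COMMITTER_EMAIL="$3" \
        git -C "$1" commit -q -m "add file"
}
make_repo first "Old Name" old@example.com
make_repo second "New name" new@example.com

# %H is the commit hash, %T is the hash of the tree the commit points to.
commit_first=$(git -C first show -q --format=%H HEAD)
commit_second=$(git -C second show -q --format=%H HEAD)
tree_first=$(git -C first show -q --format=%T HEAD)
tree_second=$(git -C second show -q --format=%T HEAD)

echo "commits: $commit_first $commit_second"   # these differ
echo "trees:   $tree_first $tree_second"       # these are identical
```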
If the repo you used to originally modify the commits is intact, you can fetch the branches from both into a single repo and attempt to use cherry-pick to copy the commits.
git checkout <branch at d>
git cherry-pick d'...g
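(Note that the syntax uses three dots. Because d' is an ancestor of g, this selects the same commits as the more common two-dot form d'..g.) This will apply the changes from each commit after (but not including) d' up to and including g onto d, creating new commits e'-g'.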
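The whole cherry-pick route, including fetching the second repo into the first, might look roughly like this. It is a sketch under assumptions: the repos are throwaway stand-ins, the remote name second and the Demo identity are made up, and it uses the two-dot range form, which selects the same commits here because d' is an ancestor of the branch tip.

```shell
# Throwaway identity and repos, all hypothetical.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# "first" ends at commit d.
git init -q first
echo base > first/file.txt
git -C first add file.txt
git -C first commit -q -m "d"

# "second" has an unrelated history: d' (same tree as d), then e and f.
git init -q second
echo base > second/file.txt
git -C second add file.txt
git -C second commit -q -m "d'"
git -C second checkout -q -b dev
echo "change from e" >> second/file.txt
git -C second commit -q -a -m "e"
echo "change from f" >> second/file.txt
git -C second commit -q -a -m "f"

# Fetch the second repo's branches into the first repo.
cd first
git remote add second "$work/second"
git fetch -q second

# second/dev~2 is d', so this range means "after d' up to and including the tip".
git cherry-pick second/dev~2..second/dev >/dev/null
picked=$(git log --format=%s --reverse)
echo "$picked"
```

The first repo now contains d followed by copies of e and f, with new hashes because their parent changed.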
If you don't have an easy way to bring the changes from both branches into a single repo, you can create a series of patches for the commits on the second repo and apply them to the first.
git checkout <branch of g>
git format-patch --output-directory <dir> d'...g
(Again, the syntax is three dots.) This will output one patch file for each commit after (and not including) d' up to and including g. Then copy these files to where you can get at them from the first repo to apply the patches.
git checkout <branch of d>
git am /path/to/patches/*
You'll end up in the same place you did from the cherry pick method.
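End to end, the patch route can be sketched as follows. The repos, the patches directory, and the Demo identity are hypothetical stand-ins, and the range is written in the two-dot form with HEAD~2 playing the role of d'.

```shell
# Throwaway identity and repos, all hypothetical.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# "first" ends at commit d.
git init -q first
echo base > first/file.txt
git -C first add file.txt
git -C first commit -q -m "d"

# "second" has d' (same tree as d), then e and f.
git init -q second
echo base > second/file.txt
git -C second add file.txt
git -C second commit -q -m "d'"
echo "change from e" >> second/file.txt
git -C second commit -q -a -m "e"
echo "change from f" >> second/file.txt
git -C second commit -q -a -m "f"

# Write one patch file per commit after d' (HEAD~2 here).
git -C second format-patch -q --output-directory "$work/patches" HEAD~2..HEAD
ls "$work/patches"

# Apply the patches, in filename order, on top of d in the first repo.
git -C first am -q "$work/patches"/*.patch
applied=$(git -C first log --format=%s --reverse)
echo "$applied"
```

Unlike cherry-pick, this needs no connection between the repos; only the patch files have to be copied across.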
If there are a lot of conflicts and you don't need to keep the identity altered information, you can also use git replace to perform a graft.
git replace --graft e d
This will create a copy of commit e with d as its parent and add a replacement reference that tells Git to use that copy whenever it accesses e, effectively making d the common ancestor of both histories and allowing you to perform a traditional merge (h).
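A rough sketch of the graft followed by the merge, assuming throwaway repos, a made-up remote name second, and an extra commit in the first repo so that the merge is not a fast-forward:

```shell
# Throwaway identity and repos, all hypothetical.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# "first" ends at commit d.
git init -q first
echo base > first/file.txt
git -C first add file.txt
git -C first commit -q -m "d"

# "second" has an unrelated history: d' (same tree as d), then e.
git init -q second
echo base > second/file.txt
git -C second add file.txt
git -C second commit -q -m "d'"
git -C second checkout -q -b dev
echo "change from e" >> second/file.txt
git -C second commit -q -a -m "e"

cd first
d=$(git rev-parse HEAD)
# Extra work in the first repo, so the merge below creates a merge commit.
echo "first-only work" > other.txt
git add other.txt
git commit -q -m "work in first repo"

git remote add second "$work/second"
git fetch -q second
e=$(git rev-parse second/dev)

# Record a replacement for e whose parent is d instead of d'.
git replace --graft "$e" "$d"
base=$(git merge-base HEAD second/dev)   # now resolves to d

# With a common ancestor in place, an ordinary merge works.
git merge -q -m "h" second/dev
```

The replacement lives under refs/replace/ in this repo only; it is not transferred by a normal push or fetch unless you transfer those refs explicitly.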
Keeping two repos without a common history in sync will consistently cause you problems like this, and they will get worse as the two slowly diverge (for example, as you resolve conflicts). Over time these methods will require more and more resources to maintain the two repos.
I would recommend that once the two repos are synchronized, pick one of them and use that one exclusively from then on. If you require two remotes, just push that repo to both of them. You can then easily use any of the many tried and true workflows to maintain the two repos.
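Pushing one repo to two remotes could look like the sketch below. The two bare repos stand in for your real remotes, and the remote name mirror and the Demo identity are made up for illustration.

```shell
# Throwaway identity, repo, and remotes, all hypothetical.
export GIT_AUTHOR_NAME="Demo" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo" GIT_COMMITTER_EMAIL="demo@example.com"
work=$(mktemp -d)
cd "$work"

# Two bare repos standing in for the two remotes.
git init -q --bare remote-one.git
git init -q --bare remote-two.git

# The single repo you keep working in.
git init -q repo
cd repo
echo hello > file.txt
git add file.txt
git commit -q -m "first commit"
git checkout -q -b dev

# Register both remotes and push the same branch to each.
git remote add origin "$work/remote-one.git"
git remote add mirror "$work/remote-two.git"
git push -q origin dev
git push -q mirror dev

# Both remotes now point at the same commit, so they share one history.
head_one=$(git -C "$work/remote-one.git" rev-parse dev)
head_two=$(git -C "$work/remote-two.git" rev-parse dev)
echo "$head_one $head_two"
```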
If this is not an option, I'd recommend frequently and meticulously checking the trees at the heads of your two repos to verify that they are bit-for-bit identical.