How can I fix a corrupted Git repository?

Tags:

git

Note:

If your repository has submodules, this process will break them. The only fix I've found so far is deleting them and then running git submodule update --init (or re-cloning the repository, but that seems too drastic).
The script tries to choose between 'main' and 'master' based on the local init.defaultBranch setting. This can pick the wrong branch when the repository uses 'master' but the machine's default branch is 'main' (or vice versa).
The script uses wget to check that the URL is reachable before doing anything. This is not necessarily the best way to test reachability. If wget isn't available, the check can likely be replaced with ping -c 1 "${url_base}" (Linux), ping -n 1 "${url_base}" (Windows), or curl -Is "${url_base}".
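
One possible alternative to both the init.defaultBranch guess and the wget check is to ask the remote itself which branch its HEAD points at. The sketch below is not part of the script in Appendix A; the function name remote_default_branch is made up for illustration. It assumes the remote advertises its HEAD symref (which git ls-remote --symref reports), and it fails if the remote cannot be reached at all.

```shell
# Hypothetical helper, not part of the original script.
# Prints the branch that the remote's HEAD points at (e.g. "main" or "master").
# Returns non-zero if the remote is unreachable or advertises no HEAD symref.
remote_default_branch() (
    url="$1"
    # The line of interest looks like: "ref: refs/heads/main<TAB>HEAD"
    refs="$(git ls-remote --symref "${url}" HEAD)" || exit 1
    branch="$(printf '%s\n' "${refs}" |
        awk '$1 == "ref:" { sub("^refs/heads/", "", $2); print $2; exit }')"
    [ -n "${branch}" ] || exit 1
    printf '%s\n' "${branch}"
)

# Possible use inside the script (an assumption about how it could be wired in):
#   branch_default="$(remote_default_branch "${url}")" || exit 4
```

Because the same call both contacts the remote and reads its default branch, it could stand in for the separate reachability test, at the cost of requiring valid credentials for private remotes.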

Appendix A - Full script

Also published as a gist, though it is now out of date.

#!/bin/bash

# Usage: fix-git [REMOTE-URL]
#   Must be run from the root directory of the repository.
#   If a remote is not supplied, it will be read from .git/config
#
# For when you have a corrupted local repo, but a trusted remote.
# This script replaces all your history with that of the remote.
# If there is a .git, it is backed up as .git_old, removing the last backup.
# This does not affect your working tree.
#
# This does not currently work with submodules!
# This will abort if a suspected submodule is found.
# You will have to delete them first
# and re-clone them after (with `git submodule update --init`)
#
# Error codes:
# 1: If a URL is not supplied, and one cannot be read from .git/config
# 4: If the URL cannot be reached
# 5: If a Git submodule is detected


if [[ "$(find . -name .git -not -path ./.git | wc -l)" -gt 0 ]] ;
then
    echo "It looks like this repo uses submodules" >&2
    echo "You will need to remove them before this script can safely execute" >&2
    echo "Then use \`git submodule update --init\` to re-clone them" >&2
    exit 5
fi

if [[ $# -ge 1 ]] ;
then
    url="$1"
else
    if ! url="$(git config --local --get remote.origin.url)" ;
    then
        echo "Unable to find remote 'origin': missing in '.git/config'" >&2
        exit 1
    fi
fi

if ! branch_default="$(git config --get init.defaultBranch)" ;
then
    # if the defaultBranch config option isn't present, then it's likely an old version of git that uses "master" by default
    branch_default="master"
fi

url_base="$(echo "${url}" | sed -E 's;^([^/]*://)?([^/]*)(/.*)?$;\2;')"
echo "Attempting to access ${url_base} before continuing"
if ! wget -p "${url_base}" -O /dev/null -q --dns-timeout=5 --connect-timeout=5 ;
then
    echo "Unable to reach ${url_base}: Aborting before any damage is done" >&2
    exit 4
fi

echo
echo "This operation will replace the local repo with the remote at:"
echo "${url}"
echo
echo "This will completely rewrite history,"
echo "but will leave your working tree intact"
echo -n "Are you sure? (y/N): "

read -r confirm
if ! [ -t 0 ] ; # i'm open in a pipe
then
    # print the piped input
    echo "${confirm}"
fi
if echo "${confirm}"|grep -Eiq '^(y|yes)$' ; # it is exactly "y" or "yes" (any case)
then
    if [[ -e .git ]] ;
    then
        # remove old backup
        rm -vrf .git_old | tail -n 1 &&
        # backup .git iff it exists
        mv -v .git .git_old
    fi &&
    git init &&
    git remote add origin "${url}" &&
    git config --local --get remote.origin.url | sed 's/^/Added remote origin at /' &&
    git fetch &&
    git reset "origin/${branch_default}" --mixed
else
    echo "Aborting without doing anything"
fi

TL;DR

Git doesn't really store history the way you think it does. It computes history at run time by following each commit's links to its ancestors. If that ancestry is missing blobs, trees, or commits, you may not be able to fully recover your history.
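
To make the ancestor chain concrete: every commit object records the id of its parent(s), and commands such as git log simply follow those links. The helper below (walk_first_parents is a name invented here) is a minimal sketch that follows first-parent links by reading the raw commit objects. In a damaged repository it stops at the first commit it cannot read.

```shell
# Hypothetical helper: print the first-parent chain starting at a commit.
walk_first_parents() (
    commit="$(git rev-parse --verify -q "${1:-HEAD}^{commit}")" || exit 1
    while [ -n "${commit}" ]; do
        if ! git cat-file -e "${commit}" 2>/dev/null; then
            echo "cannot read commit ${commit}: the chain is broken here" >&2
            exit 1
        fi
        echo "${commit}"
        # The "parent" header of the raw commit object names the previous commit.
        # Stop at the first blank line so the commit message is never parsed.
        commit="$(git cat-file -p "${commit}" |
            awk '/^$/ { exit } $1 == "parent" { print $2; exit }')"
    done
)

# Usage: walk_first_parents HEAD
```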

Restore Missing Objects from Backups

The first thing you can try is to restore the missing items from backup. For example, see if you have a backup of the commit stored as .git/objects/98/4c11abfc9c2839b386f29c574d9e03383fa589. If so you can restore it.
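
As a rough sketch of that restore: a loose object lives at .git/objects/<first two hex digits>/<remaining digits>, so copying the matching file from the backup is enough. The function name restore_loose_object and the backup location are assumptions for illustration; the second argument is the .git directory of whatever backup you have.

```shell
# Hypothetical helper: copy one loose object from a backup .git directory
# into the current repository, then print the object's type as a sanity check.
restore_loose_object() (
    sha="$1"
    backup_git_dir="$2"
    dir="$(printf '%s' "${sha}" | cut -c1-2)"    # first two hex digits
    file="$(printf '%s' "${sha}" | cut -c3-)"    # the rest of the id
    src="${backup_git_dir}/objects/${dir}/${file}"
    if [ ! -f "${src}" ]; then
        echo "not present in the backup as a loose object: ${sha}" >&2
        exit 1
    fi
    mkdir -p ".git/objects/${dir}" &&
    cp "${src}" ".git/objects/${dir}/${file}" &&
    git cat-file -t "${sha}"
)

# Usage (the backup path is a placeholder):
#   restore_loose_object 984c11abfc9c2839b386f29c574d9e03383fa589 /path/to/backup/.git
```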

You may also want to look into git-verify-pack and git-unpack-objects in the event that the commit has already been packed up and you want to return it to a loose object for the purposes of repository surgery.
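
A minimal sketch of turning packed objects back into loose ones, assuming the packs live in the usual .git/objects/pack directory. The function name explode_packs is invented here. The packs are moved out first because git unpack-objects skips any object the repository can already read.

```shell
# Hypothetical helper: explode every pack in the current repository into
# loose objects. The original pack files are kept in a temporary directory.
explode_packs() (
    tmp="$(mktemp -d)" || exit 1
    # Move packs and their index files out of the object database.
    mv .git/objects/pack/pack-* "${tmp}/" || exit 1
    for pack in "${tmp}"/*.pack; do
        # Report damage but carry on: -r below salvages what it can.
        git verify-pack "${pack%.pack}.idx" || echo "damaged pack: ${pack}" >&2
        git unpack-objects -q -r < "${pack}"
    done
    echo "Original packs kept in ${tmp}"
)

# Usage, from the top level of the repository: explode_packs
```

Running git fsck afterwards shows whether every object made it out of the packs.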

Surgical Resection

If you can't replace the missing items from a backup, you may be able to excise the missing history. For example, you might examine your history or reflog to find an ancestor of commit 984c11abfc9c2839b386f29c574d9e03383fa589. If you find one intact, then:

Copy your Git working directory to a temporary directory somewhere.
Do a hard reset to the uncorrupted commit.
Copy your current files back into the Git work tree, but make sure you don't copy the .git folder back!
Commit the current work tree, and do your best to treat it as a squashed commit of all the missing history.
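
The four steps above could be sketched as follows. This is only an illustration under stated assumptions: squash_onto_good_commit is a made-up name, its argument is the intact commit you found, and it must be run from the top level of the repository.

```shell
# Hypothetical helper: rebuild the current work tree as one commit on top
# of the last intact commit.
squash_onto_good_commit() (
    good="$1"
    tmp="$(mktemp -d)" || exit 1
    # Step 1: copy the whole working directory somewhere safe.
    cp -a . "${tmp}/tree" || exit 1
    # Set the copied .git aside so it can never be copied back by accident.
    mv "${tmp}/tree/.git" "${tmp}/git-backup" || exit 1
    # Step 2: hard reset to the uncorrupted commit.
    git reset -q --hard "${good}" || exit 1
    # Step 3: copy the current files back into the work tree.
    cp -a "${tmp}/tree/." . || exit 1
    # Step 4: commit everything as a single squashed commit.
    git add -A &&
    git commit -q -m "Squash of history lost to repository corruption" &&
    echo "Copies of the files and of .git are kept in ${tmp}"
)

# Usage: squash_onto_good_commit <id-of-intact-commit>
```

One caveat of this approach: files that existed at the intact commit but were deleted later reappear after the reset, so they have to be removed by hand.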

If it works, you will of course lose the intervening history. At this point, if you have a working history log, then it's a good idea to prune your history and reflogs of all unreachable commits and objects.
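
A common way to do that pruning, shown here as a small sketch (the function name prune_unreachable is invented): expire every reflog entry, then let git gc delete whatever is no longer reachable. This is irreversible, so only run it once the repaired history looks right.

```shell
# Hypothetical helper: drop all reflog entries, then delete every object
# that is no longer reachable from a branch, tag, or the index.
prune_unreachable() {
    git reflog expire --expire=now --all &&
    git gc --quiet --prune=now
}

# Usage, from inside the repository: prune_unreachable
```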

Full Restores and Re-Initialization

If your repository is still broken, then hopefully you have an uncorrupted backup or clone you can restore from. If not, but your current working directory contains valid files, then you can always re-initialize Git. For example:

rm -rf .git
git init
git add .
git commit -m 'Re-initialize repository without old history.'

It's drastic, but it may be your only option if your repository history is truly unrecoverable. YMMV.

Before trying any of the fixes described on this page, I advise making a copy of your repository and working only on that copy. At the end, if the repair succeeds, compare the copy with the original to make sure no files were lost in the process.
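
For instance, the copy and the final comparison might look like this. Both function names and the directory names in the usage lines are placeholders.

```shell
# Hypothetical helpers for working on a scratch copy of a repository.

# Copy a repository, including its .git directory. The destination must
# not exist yet; -a preserves permissions, timestamps and symlinks.
copy_repo() {
    cp -a "$1" "$2"
}

# List files that differ between two work trees, ignoring .git itself.
# Exits non-zero when any difference is found.
compare_worktrees() {
    diff -r -q -x .git "$1" "$2"
}

# Usage:
#   copy_repo my_repo my_repo.scratch
#   ... repair my_repo.scratch ...
#   compare_worktrees my_repo my_repo.scratch
```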

Another alternative that worked for me was to reset the Git HEAD and index to their previous state using:

git reset --keep

You can also do the same manually by opening the Git GUI, selecting each entry under "Staged changes", and clicking "Unstage the change". Once everything is unstaged, you should be able to compress your database, check your database, and commit.

I also tried the following commands. They did not work for me, but they might for you, depending on the exact issue you have:

git reset --mixed
git fsck --full
git gc --auto
git prune --expire now
git reflog --all

Finally, to avoid this problem of synchronization damaging your Git index (which can happen with Dropbox, SpiderOak, or any other cloud disk), you can do the following:

Convert your .git folder into a single "bundle" Git file with git bundle create my_repo.git --all. Everything then lives in a single file, so a partial synchronization can no longer leave the repository half-updated. You can clone or pull from the bundle as you would from a read-only remote.
Disable instantaneous synchronization: SpiderOak lets you set the schedule for checking changes to "automatic", which means as soon as possible, using the OS's file-change notifications. This is bad, because it starts uploading a change while you are still making it and then downloads it again, which can overwrite the latest edits you were just making. A fix is to set the change-monitoring delay to 5 minutes or more. This also fixes issues with note-taking applications that save instantly (such as Notepad++).
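
For the bundle option, a minimal sketch of creating the bundle, checking it, and getting a working repository back out of it. The function name bundle_repo is made up; my_repo.git is the file name used above, and the other paths are placeholders.

```shell
# Hypothetical helper: write every ref of the current repository into a
# single bundle file, then check that the bundle is usable.
bundle_repo() (
    out="$1"
    git bundle create "${out}" --all &&
    git bundle verify "${out}"
)

# Usage, from inside the repository:
#   bundle_repo /path/to/synced/folder/my_repo.git
# Later, to get a normal repository back from the bundle:
#   git clone /path/to/synced/folder/my_repo.git restored_repo
```

Only the bundle file would sit in the synchronized folder; the working repository itself stays outside it and is re-bundled whenever you want a fresh snapshot.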

Here's a script (Bash) to automate the first solution by @CodeGnome to restore from a backup (run from the top level of the corrupted repository). The backup doesn't need to be complete; it only needs to have the missing objects.

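# BACKUP must point at the .git directory of the backup, e.g. BACKUP=/path/to/backup/.git
git fsck 2>&1 | grep -e missing -e invalid | awk '{print $NF}' | sort -u |
    while read -r entry; do
        # a loose object is stored as objects/<first two hex digits>/<remaining digits>
        mkdir -p ".git/objects/${entry:0:2}"
        cp "${BACKUP}/objects/${entry:0:2}/${entry:2}" ".git/objects/${entry:0:2}/${entry:2}"
    done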
