How can I recover from a missing blob in a Git repository?

Tags: git, database

I'm running Git 1.6.4.2. Garbage collection is failing with "error: unable to find <SHA1>".

I've managed to determine that the missing object is a blob, and there is no way that I can get the blob file back. It seems that two scripts that run "git add" and "git commit" were running at the same time and managed to interfere with each other so that one committed a newer version of a file than the other, and the older version's blob vanished.

So I'm trying to roll back my repository to take out the commit that refers to the tree that refers to the missing blob.

I know which branch the commit was on, so I ran "git reset" on it to rewind to the parent of the duff commit. And I know that the branch was merged somewhere else, so I rewound that branch too. So as far as I know, the duff commit/tree/blob are not referenced by anything. But if I run git prune --expire=now followed by git gc then I still get an error about the missing object.

How can I query the Git database to find every tree object that contains the duff blob ID? And how do I then find out what is causing "git prune" to retain it?

kbro asked Sep 08 '11 11:09

2 Answers

After a bit more digging, it turns out that my question is answered in "How to delete a blob from a Git repo". git prune wasn't pruning the commits I'd rewound because the reflog was still referring to them. Running

git reflog expire --expire=now --all

fixed that. Also, the referenced post gives a mechanism for running git ls-tree on every commit to find the ones that reference the blob.
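A minimal sketch of that ls-tree search, not the exact commands from the referenced post. The function name find_blob_commits is made up for illustration; it takes the blob's SHA-1 (or a prefix of it) and prints each commit and path that refers to it.

```shell
#!/bin/sh
# find_blob_commits is a hypothetical helper name, not a Git command.
# Usage: find_blob_commits <blob-sha1-or-prefix>
find_blob_commits() {
    blob="$1"
    # Walk every commit reachable from any ref.
    git rev-list --all | while read -r commit; do
        # ls-tree -r recurses into subtrees and prints
        # "<mode> <type> <sha>\t<path>" for each entry. It only reads
        # tree objects, so it still works when the blob itself is missing.
        git ls-tree -r "$commit" | while read -r mode type sha path; do
            case "$type:$sha" in
                blob:"$blob"*) printf '%s %s\n' "$commit" "$path" ;;
            esac
        done
    done
}
```

Since the reflog turned out to be what kept the objects alive here, it may be worth changing the rev-list call to `git rev-list --all --reflog`, so that commits reachable only from reflog entries are searched as well.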

kbro answered Oct 22 '22 09:10

I had the same problem (missing blob) and the solution with

git reflog expire --expire=now --all

didn't do the trick. I found my solution in "How can I fix a broken repository?".

This simple line,

git hash-object -w <file>

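fixed the missing blob. It works because hash-object -w writes the file's contents into the object database as a blob, so it helps only if you still have a copy of the file with exactly the contents of the missing version.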
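Before writing anything, it may be worth checking that the candidate file really hashes to the missing ID. The following is a rough sketch; restore_blob is a hypothetical helper name, and it assumes you pass the full 40-character SHA-1 from the error message.

```shell
#!/bin/sh
# restore_blob is a hypothetical helper name, not a Git command.
# Usage: restore_blob <full-missing-sha1> <candidate-file>
restore_blob() {
    missing="$1"
    file="$2"
    # Without -w, hash-object only computes the object ID; nothing is written.
    actual=$(git hash-object "$file") || return 1
    if [ "$actual" = "$missing" ]; then
        # Same contents as the lost version: write the blob into .git/objects.
        git hash-object -w "$file" >/dev/null || return 1
        echo "restored $missing"
    else
        echo "hash mismatch: $file hashes to $actual" >&2
        return 1
    fi
}
```

If the hashes differ, the file is not the version Git is looking for, and writing it would add an unrelated blob without fixing the error.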

lolo101 answered Oct 22 '22 08:10