
How can I corrupt a Git repository?

What are some ways to create a corrupt git repository? Are there interesting ways to permanently damage a git repository? Can you cripple a git repository so that it behaves mostly normally but does strange things?

My interest comes from cases where someone worries that they've created a truly unrecoverable state. It usually turns out to be something easy to fix, or at least to piece back together. Are there hidden (evil) gems in git?

Kyle Kelley asked Nov 03 '13 23:11



1 Answer

Well, the most straightforward corruption that can happen is the loss of data or data integrity inside the .git/objects directory. The object store is designed to be immutable and append-only, so once you violate that assumption, lots of other things will fall apart. Most commonly this would be caused by, say, packfiles that were corrupted in network transmission. Unless you're very (read: astronomically) unlucky, though, git will detect this as a matter of course and complain loudly. To get a silent failure this way, you'd need to corrupt a blob in such a way that it preserves its SHA1 hash... under deflate compression... with an accurate type-and-size header.
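To see that detection in action, here is a minimal sketch that damages one object in a throwaway repository and asks git fsck for a verdict. It assumes git and a POSIX shell are available; the file name, the demo identity, and the helper obj_path are made up for the illustration. The damaged file is still valid deflate data with a valid header (it is simply another blob's file), so the only thing wrong with it is that the content no longer matches the name it is stored under.

```shell
#!/bin/sh
# Sketch: swap one loose object's bytes for another's and let git fsck notice.
set -eu

repo=$(mktemp -d)                 # scratch repository; nothing else is touched
cd "$repo"
git init -q .

# Hypothetical identity so the commit works without any global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo 'important data' > file.txt
git add file.txt
git commit -q -m 'add file'

victim=$(git rev-parse HEAD:file.txt)                         # blob used by the commit
other=$(printf 'something else\n' | git hash-object -w --stdin)  # unrelated blob

# Loose objects live at .git/objects/<first 2 hex digits>/<remaining digits>.
obj_path() {
    printf '.git/objects/%s/%s' \
        "$(printf '%s' "$1" | cut -c1-2)" "$(printf '%s' "$1" | cut -c3-)"
}

# Healthy repository: fsck exits 0 (a dangling blob is not an error).
if git fsck >/dev/null 2>&1; then before=clean; else before=corrupt; fi

# Object files are written read-only, so make the victim writable first,
# then overwrite it with the other blob's (perfectly valid) compressed bytes.
chmod u+w "$(obj_path "$victim")"
cp "$(obj_path "$other")" "$(obj_path "$victim")"

# The stored content no longer hashes to its own name, so fsck exits non-zero.
if git fsck >/dev/null 2>&1; then after=clean; else after=corrupt; fi

echo "before: $before, after: $after"
```

Only the exit status of git fsck is used here, because the exact wording of its error messages differs between git versions.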

So, git is pretty good at verifying its own data integrity. What else can we do? To really make state unrecoverable, you need:

  1. The commits and other objects associated with that state to be unreferenced (that is, not reachable by any named ref under .git/refs or any reflog); then
  2. Garbage collection to actually delete those objects forever (or, to the same effect, a fresh clone, which copies only reachable objects, followed by deleting the original).

Otherwise, you'll always be able to run git checkout <sha> && git branch recovered and get all of your work back, no matter what else you've done. Commits are orphaned like this during normal git usage when you rebase, cherry-pick, or filter-branch, all of which create new commit objects based on the old ones, or when you move a branch around with git reset --hard. By default you have a grace period before they get deleted: reflog entries for unreachable commits expire after 30 days (gc.reflogExpireUnreachable), and garbage collection only prunes unreferenced objects older than two weeks (gc.pruneExpire). You can always truncate your reflog and prune manually to nuke something early, though.
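The two conditions can be walked through in a scratch repository. This is a sketch, assuming git and a POSIX shell; the file name and demo identity are invented for the example. It orphans a commit with git reset --hard, recovers it by its hash, and then deliberately destroys it by removing every remaining reference and pruning.

```shell
#!/bin/sh
# Sketch: an orphaned commit is recoverable until it is unreferenced AND pruned.
set -eu

repo=$(mktemp -d)                 # scratch repository
cd "$repo"
git init -q .

# Hypothetical identity so commits work without any global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo one > file.txt
git add file.txt
git commit -q -m 'first'
echo two > file.txt
git commit -q -am 'second'
lost=$(git rev-parse HEAD)        # remember the commit we are about to orphan

# Orphan it: no branch points at 'second' any more.
git reset -q --hard HEAD~1

# Still recoverable, because the object exists and the reflog remembers it.
git branch recovered "$lost"
recovered_msg=$(git log -1 --format=%s recovered)

# Now destroy it on purpose: drop every reference, then prune immediately.
git branch -q -D recovered
git update-ref -d ORIG_HEAD       # reset left this pointer behind; drop it too
git reflog expire --expire=now --all
git gc -q --prune=now

# cat-file -e exits non-zero when the object no longer exists.
if git cat-file -e "$lost" 2>/dev/null; then after_gc=present; else after_gc=gone; fi

echo "recovered: $recovered_msg, after gc: $after_gc"
```

Without the last three git commands, the commit would simply sit in the object store until the default expiry periods ran out.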

Far more often, I've seen data loss when users never add their data to git in the first place. New users are sometimes hesitant to commit frequently, and attempt to use commands with a dirty working copy, for example. If you never record a state in git, git can't bring it back for you!

If you're okay with recoverable but hard-to-notice chicanery, you can do some evilness with git replace or graft points to fool git into operating on a fake history with merges or filter-branch operations. Replaced commits still count as reachable, though, so it won't be permanent damage.
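As a rough illustration of the git replace trick, the sketch below swaps a commit for a look-alike with a different message. It assumes git and a POSIX shell; the file name, commit messages, and demo identity are invented for the example. Ordinary commands then show the forged version, while --no-replace-objects reveals that the original is still there.

```shell
#!/bin/sh
# Sketch: make git show a forged commit in place of the real one.
set -eu

repo=$(mktemp -d)                 # scratch repository
cd "$repo"
git init -q .

# Hypothetical identity so commits work without any global configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

echo honest > file.txt
git add file.txt
git commit -q -m 'honest commit'
real=$(git rev-parse HEAD)

# Build a parentless look-alike: same tree, different message.
fake=$(git commit-tree "${real}^{tree}" -m 'forged commit')

# From now on, git substitutes $fake wherever $real is looked up.
git replace "$real" "$fake"

seen=$(git log -1 --format=%s "$real")
truth=$(git --no-replace-objects log -1 --format=%s "$real")

echo "git shows: $seen"
echo "actually stored: $truth"
```

Running git replace -d with the original hash removes the substitution again, which is why this counts as recoverable mischief and not permanent damage.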

Ash Wilson answered Oct 31 '22 15:10