 

git fetch fails due to pack-object failure

When I add our remote repository as upstream and try to fetch it, it fails as below:

    $ git fetch upstream
    remote: Counting objects: 11901, done.
    remote: aborting due to possible repository corruption on the remote side.
    error: pack-objects died of signal 9
    error: git upload-pack: git-pack-objects died with error.
    fatal: git upload-pack: aborting due to possible repository corruption on the remote side.
    fatal: protocol error: bad pack header

I understand that it fails due to having huge files in the repository (which we do have), but why does it not fail when I clone the same repository? I am able to clone the repository successfully. Shouldn't the same objects be packed at the time of a clone request?

asked Jan 26 '14 by nikel



2 Answers

To expand a bit on VonC's answer...

First, it may help to note that signal 9 refers to SIGKILL and tends to occur because the remote in question is a Linux host and the process is being destroyed by the Linux "OOM killer" (although some non-Linux systems behave similarly).
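If you have shell access to the server, you can usually confirm this from the kernel log. A quick check (this assumes a Linux host; reading the kernel ring buffer may require root on some systems):

```shell
# Run on the *server*: look for OOM-killer activity in the kernel log.
dmesg 2>/dev/null | grep -iE 'out of memory|killed process' \
  || echo "no OOM events visible (or no permission to read dmesg)"
```

A typical hit looks like "Out of memory: Killed process 1234 (git-pack-objects)", which maps directly to the signal 9 the client reports.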

Next, let's talk about objects and pack-files. A git "object" is one of the four types of items found in a git repository: a "blob" (a file); a "tree" (a list of blobs, their modes, and their names-as-stored-in-a-directory, i.e., what will become a directory or folder when a commit is checked out); a "commit" (which gives the commit author, message, and top-level tree, among other data); and a "tag" (an annotated tag). Objects can be stored as "loose objects", with one object in a file all by itself; but these can take up a lot of disk space, so they can instead be "packed", many objects into one file with extra compression added.
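You can inspect these object types directly with git cat-file -t; here is a minimal sketch in a throwaway repository (the path, identity, and file names are just for illustration):

```shell
# Build a tiny repository and ask git what type each object is.
rm -rf /tmp/objdemo
git init -q /tmp/objdemo
cd /tmp/objdemo
echo hello > file.txt
git add file.txt
git -c user.email=demo@example.com -c user.name=demo commit -qm "demo"
git cat-file -t HEAD            # commit
git cat-file -t 'HEAD^{tree}'   # tree
git cat-file -t HEAD:file.txt   # blob
```

Freshly created objects like these start life as loose files under .git/objects/ until git repack (or git gc) bundles them into a pack.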

Making a pack out of a lot of loose objects, doing this compression, is (or at least can be) a cpu- and memory-intensive operation. The amount of memory required depends on the number of objects and their underlying sizes: large files take more memory. Many large files take a whole lot of memory.
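A common mitigation on a memory-constrained server is to cap how much memory pack-objects is allowed to use. These are real git configuration settings, but the values below are illustrative starting points, not a prescription; tune them to the host:

```shell
# Run on the server whose pack-objects keeps getting OOM-killed.
git config --global pack.windowMemory 100m   # cap delta-search window memory
git config --global pack.packSizeLimit 100m  # cap the size of each output pack
git config --global pack.threads 1           # fewer threads -> less total memory
```

The trade-off is worse compression and slower packing in exchange for a bounded memory footprint.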

Next, as VonC noted, git clone skips the attempt to use "thin" packs (well, normally anyway). This means the server just delivers the pack-files it already has. This is a "memory-cheap" operation: the files already exist and the server need only deliver them.

On the other hand, git fetch tries, if it can, to avoid sending a lot of data that the client already has. Using a "smart" protocol, the client and server engage in a sort of conversation, which you can think of as going something like this:

  • "I have object A, which needs B and C; do you have B and C? I also have D, E, and F."
  • "I have B but I need C, and I have D and E; please send me A, C, and F."
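You can watch this negotiation happen for real with GIT_TRACE_PACKET, which dumps the raw protocol exchange to stderr. A self-contained local demo (all paths and names are illustrative):

```shell
# Create an "upstream" repo, clone it, then advance upstream by one commit
# so the subsequent fetch has something to negotiate about.
rm -rf /tmp/up /tmp/dn
git init -q /tmp/up
git -C /tmp/up -c user.email=d@e -c user.name=demo commit -q --allow-empty -m one
git clone -q /tmp/up /tmp/dn
git -C /tmp/up -c user.email=d@e -c user.name=demo commit -q --allow-empty -m two
# Trace the want/have exchange during the fetch.
GIT_TRACE_PACKET=1 git -C /tmp/dn fetch origin 2> /tmp/fetch-trace.log
grep -E 'want |have ' /tmp/fetch-trace.log | head -n 4
```

In the trace you can see the client announcing "want" for the new tip and "have" for the commit it already holds, which is exactly the conversation sketched above.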

Thus informed, the server extracts the "interesting" / "wanted" objects out of the original packs, and then attempts to compress them into a new (but "thin") pack. This means the server will invoke git-pack-objects.

If the server is low on memory (with "low" being relative to the amount that git-pack-objects is going to need), it's likely to invoke the "OOM killer". Since git-pack-objects is memory-intensive, that process is a likely candidate for the "OOM killer" to kill. You then see, on your client end, a message about git-pack-objects dying from signal 9 (SIGKILL).

(Of course it's possible the server's OOM killer kills something else entirely, such as the bug database server. :-) )

answered Oct 22 '22 by torek


It can depend on the protocol, but Documentation/technical/pack-heuristics.txt points out a first difference between clone and fetch.

In the other direction, fetch, git-fetch-pack and git-clone-pack invokes git-upload-pack on the other end (via ssh or by talking to the daemon).

There are two cases:

  • clone-pack and fetch-pack with -k will keep the downloaded packfile without expanding it, so we do not use thin pack transfer.
  • Otherwise, the generated pack will have delta without base object in the same pack.

But fetch-pack without -k will explode the received pack into individual objects, so we automatically ask upload-pack to give us a thin pack if upload-pack supports it.
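On the client side, whether received data is exploded into loose objects or kept as a packfile is governed by the fetch.unpackLimit setting (or the more general transfer.unpackLimit); the threshold below is illustrative. Note that this tunes how the client stores what it receives, not the work the server has to do:

```shell
# Packs containing at least this many objects are kept as packfiles
# (as in a clone) instead of being exploded into loose objects.
git config --global fetch.unpackLimit 1
```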

So in terms of protocol, Documentation/technical/pack-protocol.txt illustrates that a fetch can return a lot more data than a git clone.

answered Oct 22 '22 by VonC