
Jenkins git plugin - soooo slow sometimes

The following is taken from the Jenkins log:

00:00:03.135  > git fetch --tags --progress [email protected]:some_org/some_repo.git +refs/heads/*:refs/remotes/origin/*
00:03:49.659  > git rev-parse origin/master^{commit} # timeout=10

I'm confused as to why this delay occurs: judging by the timestamps, the fetch above took almost four minutes, yet running git fetch manually on the same machine, with the same user, takes about 5 to 10 seconds.

I'm using the latest (as of this writing) version of Git (2.1.2) and the latest version of the git plugin.

Thoughts?

JAR.JAR.beans asked Nov 04 '14 09:11

2 Answers

At least in our case, the issue was the Git version: we upgraded from 1.9 to 2.1.2 and the issue was resolved. When I first posted the question, I was under the wrong impression that the upgrade had already taken place.

JAR.JAR.beans answered Sep 28 '22 08:09

Note: git fetch speed should improve again with Git 2.2+ (November 2014)

See commit cbe7333, by Jeff King (peff):

refs: speed up is_refname_available

Our filesystem ref storage does not allow D/F (directory/file) conflicts; so if "refs/heads/a/b" exists, we do not allow "refs/heads/a" to exist (and vice versa).
This falls out naturally for loose refs, where the filesystem enforces the condition. But for packed-refs, we have to make the check ourselves.

We do so by iterating over the entire packed-refs namespace and checking whether each name creates a conflict. If you have a very large number of refs, this is quite inefficient, as you end up doing a large number of comparisons with uninteresting bits of the ref tree (e.g., we know that all of "refs/tags" is uninteresting in the example above, yet we check each entry in it).
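As an illustration only (Git's real implementation is in C inside refs.c; the names below are made up for this sketch), the naive check described above amounts to comparing the proposed name against every packed ref, including entirely unrelated ones:

```python
def conflicts(existing: str, proposed: str) -> bool:
    """A D/F conflict: one refname is a directory prefix of the other,
    e.g. 'refs/heads/a' vs 'refs/heads/a/b'."""
    return (existing.startswith(proposed + "/")
            or proposed.startswith(existing + "/"))

def is_refname_available_naive(refname: str, packed_refs: list[str]) -> bool:
    # O(total refs): every packed ref is scanned, even refs in
    # unrelated parts of the namespace such as refs/tags.
    return not any(conflicts(r, refname) for r in packed_refs)

refs = ["refs/heads/a/b", "refs/tags/v1.0"]
print(is_refname_available_naive("refs/heads/a", refs))  # False: D/F conflict
print(is_refname_available_naive("refs/heads/c", refs))  # True
```

With millions of packed refs, this linear scan is exactly what made the fetch in the question so slow.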

Instead, let's take advantage of the fact that we have the packed refs stored as a trie of ref_entry structs.
We can find each component of the proposed refname as we walk through the tree, checking for D/F conflicts as we go. For a refname of depth N (i.e., 4 in the above example), we only have to visit N nodes. And at each visit, we can binary search the M names at that level, for a total complexity of O(N lg M). ("M" is different at each level, of course, but we can take the worst-case "M" as a bound).
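The trie walk can be sketched as follows (again an illustrative Python model, not Git's actual C code; `Node`, `insert`, and the exact-match handling are assumptions of this sketch): each level keeps its child component names sorted, so locating a component is a binary search, and only the N nodes along the proposed name's path are ever visited.

```python
import bisect

class Node:
    def __init__(self):
        self.names = []      # sorted child-component names at this level
        self.children = {}   # component name -> Node
        self.is_ref = False  # an existing ref terminates at this node

def insert(root: Node, refname: str) -> None:
    node = root
    for part in refname.split("/"):
        if part not in node.children:
            bisect.insort(node.names, part)
            node.children[part] = Node()
        node = node.children[part]
    node.is_ref = True

def is_refname_available(root: Node, refname: str) -> bool:
    # Visit at most N nodes (N = number of path components),
    # binary-searching the M names at each level: O(N lg M).
    node = root
    parts = refname.split("/")
    for i, part in enumerate(parts):
        idx = bisect.bisect_left(node.names, part)
        if idx == len(node.names) or node.names[idx] != part:
            return True  # path diverges: no existing ref shares this prefix
        node = node.children[part]
        if node.is_ref:
            # An existing ref ends here; that is a D/F conflict unless we
            # have consumed the whole proposal (updating an existing ref
            # is treated as available in this sketch).
            return i == len(parts) - 1 and not node.children
    # All components matched an interior node: refs exist below it.
    return not node.children

root = Node()
insert(root, "refs/heads/a/b")
insert(root, "refs/tags/v1.0")
print(is_refname_available(root, "refs/heads/a"))  # False: refs/heads/a/b exists
print(is_refname_available(root, "refs/heads/c"))  # True
```

For "refs/heads/a" (depth 3) the walk touches three nodes and never looks inside refs/tags, which is the saving the commit message describes.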

In a pathological case of fetching 30,000 fresh refs into a repository with 8.5 million refs, this dropped the time to run "git fetch" from tens of minutes to ~30s.

This may also help smaller cases in which we check against loose refs (which we do when renaming a ref), as we may avoid a disk access for unrelated loose directories.

Note that the tests we add appear at first glance to be redundant with what is already in t3210. However, the early tests are not robust; they are run with reflogs turned on, meaning that we are not actually testing is_refname_available at all!
The operations will still fail because the reflogs will hit D/F conflicts in the filesystem.
To get a true test, we must turn off reflogs (but we don't want to do so for the entire script, because the point of turning them on was to cover some other cases).

VonC answered Sep 28 '22 08:09