Inside the .git/objects directory there is an info subdirectory. What is it used for? I know what the .git/objects directory is used for and what the .git/objects/pack directory is, but I can't find any information on the .git/objects/info directory. It may be documented somewhere obvious, but "info" is too generic a name to search for on Google: there are too many irrelevant results.
Repository layout documentation:
objects/info
Additional information about the object store is recorded in this directory.
objects/info/packs
This file is to help dumb transports discover what packs are available in this object store. Whenever a pack is added or removed, git update-server-info should be run to keep this file up-to-date if the repository is published for dumb transports. git repack does this by default.
objects/info/alternates
This file records paths to alternate object stores that this object store borrows objects from, one pathname per line. Note that not only native Git tools use it locally, but the HTTP fetcher also tries to use it remotely; this will usually work if you have relative paths (relative to the object database, not to the repository!) in your alternates file, but it will not work if you use absolute paths unless the absolute path in filesystem and web URL is the same. See also objects/info/http-alternates.
objects/info/http-alternates
This file records URLs to alternate object stores that this object store borrows objects from, to be used when the repository is fetched over HTTP.
So it's purely internal to git.
For example:
$ cat .git/objects/info/packs
P pack-fac58f9273f12d454896cdc6070b9607e271e530.pack
$ ls -1 .git/objects/pack/
pack-597bfea331852c930d2cd014e0328c458417ea05.pack
pack-d5589be9a1ca818d38efb0e9f185cc816f4749ad.pack
pack-fac58f9273f12d454896cdc6070b9607e271e530.idx
pack-fac58f9273f12d454896cdc6070b9607e271e530.pack
The info/packs file is read by http.c#http_get_info_packs, which is called from http-push.c#fetch_indices.
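To make the role of info/packs concrete, here is a minimal Python sketch (not Git's own code) that reads the file and compares it with the packs actually present on disk. The "P <pack name>" line format is taken from the sample output above; the function names read_info_packs and unlisted_packs are hypothetical helpers invented for this illustration.

```python
from pathlib import Path


def read_info_packs(objects_dir):
    """Return the pack file names listed in <objects_dir>/info/packs."""
    info_packs = Path(objects_dir) / "info" / "packs"
    if not info_packs.exists():
        return []
    names = []
    for line in info_packs.read_text().splitlines():
        # Pack entries look like "P pack-<hash>.pack"; skip anything else
        # (for example blank lines).
        if line.startswith("P "):
            names.append(line[2:].strip())
    return names


def unlisted_packs(objects_dir):
    """Return packs present in <objects_dir>/pack but absent from info/packs."""
    on_disk = {p.name for p in (Path(objects_dir) / "pack").glob("pack-*.pack")}
    return sorted(on_disk - set(read_info_packs(objects_dir)))
```

Calling unlisted_packs(".git/objects") from the top of a work tree would list packs that info/packs does not mention. Since the file exists to let dumb transports discover packs, a non-empty result suggests that git update-server-info has not been run since the last pack change.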
The notion of info/alternates goes back to Git 1.4 (Q2 2006):
See commit 0438402, commit dd05ea1, commit c2f493a, commit 178613c, commit cf9dc65 (07 May 2006) by Martin Waitz (tali).
See commit fd60aca, commit 6fe31e2 (07 May 2006) by Junio C Hamano (gitster).
See commit d92f1dc (07 May 2006) by Peter Hagervall (phagervall).
See commit 5d8ee9c (07 May 2006) by Pavel Roskin (proski).
See commit 245f102 (07 May 2006) by Matthias Lederhofer (matled).
See commit be65e7d (07 May 2006) by Johannes Schindelin (dscho).
(Merged by Junio C Hamano -- gitster -- in commit 7f49806, 07 May 2006)
c2f493a4ae: Transitively read alternatives
Signed-off-by: Martin Waitz
When adding an alternate object store then add entries from its info/alternates files, too.
Relative entries are only allowed in the current repository.
Loops and duplicate alternates through multiple repositories are ignored.
Just to be sure that nothing breaks it is not allow to build deep nesting levels using info/alternates.
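The rules in that commit message (follow alternates transitively, honour relative entries only in the current repository, ignore loops and duplicates, refuse deep nesting) can be sketched in Python as follows. This is an illustration, not Git's implementation: the function name collect_alternates and the default nesting limit of 5 are assumptions made for this example.

```python
import os

MAX_DEPTH = 5  # assumed nesting limit for this sketch


def collect_alternates(objects_dir, max_depth=MAX_DEPTH):
    """Return the alternate object directories reachable from objects_dir."""
    root = os.path.realpath(objects_dir)
    seen = {root}  # resolved paths already visited: catches loops and duplicates
    found = []

    def visit(odb, depth):
        if depth > max_depth:
            return  # nesting too deep: ignore alternates at this level
        alt_file = os.path.join(odb, "info", "alternates")
        try:
            with open(alt_file) as fh:
                entries = fh.read().splitlines()  # one pathname per line
        except FileNotFoundError:
            return
        for entry in entries:
            entry = entry.strip()
            if not entry:
                continue
            if not os.path.isabs(entry):
                if depth > 0:
                    continue  # relative entries only count in the current repository
                # Relative to the object database, not to the repository.
                entry = os.path.join(odb, entry)
            path = os.path.realpath(entry)
            if path in seen or not os.path.isdir(path):
                continue
            seen.add(path)
            found.append(path)
            visit(path, depth + 1)  # read that store's own info/alternates too

    visit(root, 0)
    return found
```

For example, collect_alternates(".git/objects") would return every object store the current repository borrows from, directly or indirectly, each listed once.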
More recently, Git 2.33 (Q3 2021) added an optimization for repositories with many alternate object stores.
See commit 92d8ed8, commit 90e07f0, commit 33f379e, commit 407532f, commit cf2dc1c (07 Jul 2021) by Eric Wong (ele828).
(Merged by Junio C Hamano -- gitster -- in commit e5cc59c, 28 Jul 2021)
oidtree: a crit-bit tree for odb_loose_cache
Signed-off-by: Eric Wong
This saves 8K per struct object_directory, meaning it saves around 800MB in my case involving 100K alternates (half or more of those alternates are unlikely to hold loose objects).
This is implemented in two parts: a generic, allocation-free cbtree and the oidtree wrapper on top of it.
The latter provides allocation using alloc_state as a memory pool to improve locality and reduce free(3) overhead.
Unlike oid-array, the crit-bit tree does not require sorting.
Performance is bound by the key length, for oidtree that is fixed at sizeof(struct object_id).
There's no need to have 256 oidtrees to mitigate the O(n log n) overhead like we did with oid-array.
Being a prefix trie, it is natively suited for expanding short object IDs via prefix-limited iteration in find_short_object_filename.
On my busy workstation, p4205 performance seems to be roughly unchanged (+/-8%).
Startup with 100K total alternates with no loose objects seems around 10-20% faster on a hot cache.
(800MB in memory savings means more memory for the kernel FS cache).
The generic cbtree implementation does impose some extra overhead for oidtree in that it uses memcmp(3) on "struct object_id" so it wastes cycles comparing 12 extra bytes on SHA-1 repositories.
I've not yet explored reducing this overhead, but I expect there are many places in our code base where we'd want to investigate this.
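To illustrate why a crit-bit tree needs no sorting step and lends itself to prefix-limited iteration, here is a small Python sketch of the data structure over fixed-length byte keys. It is not Git's cbtree or oidtree code: the class and method names are hypothetical, and for simplicity prefixes are whole bytes, whereas abbreviated object IDs can also end on a half byte (an odd number of hex digits).

```python
class _Node:
    """Internal node: splits keys on one 'critical' bit."""
    __slots__ = ("byte", "mask", "child")

    def __init__(self, byte, mask, child):
        self.byte = byte    # index of the byte holding the critical bit
        self.mask = mask    # single-bit mask selecting that bit
        self.child = child  # [subtree with bit 0, subtree with bit 1]


class CritBitTree:
    def __init__(self, key_len):
        self.key_len = key_len
        self.root = None  # None, a bytes leaf, or a _Node

    def insert(self, key):
        """Insert key; return False if it was already present."""
        assert len(key) == self.key_len
        if self.root is None:
            self.root = key
            return True
        best = self.root  # walk to the leaf that best matches key
        while isinstance(best, _Node):
            best = best.child[1 if key[best.byte] & best.mask else 0]
        for i in range(self.key_len):  # locate the first differing byte
            diff = key[i] ^ best[i]
            if diff:
                break
        else:
            return False
        mask = 1 << (diff.bit_length() - 1)  # most significant differing bit
        direction = 1 if key[i] & mask else 0
        # Descend again, stopping at the first node that tests a later bit.
        parent, side, node = None, 0, self.root
        while isinstance(node, _Node):
            if (node.byte, -node.mask) > (i, -mask):
                break
            parent, side = node, (1 if key[node.byte] & node.mask else 0)
            node = node.child[side]
        children = [None, None]
        children[direction], children[1 - direction] = key, node
        new = _Node(i, mask, children)
        if parent is None:
            self.root = new
        else:
            parent.child[side] = new
        return True

    def with_prefix(self, prefix):
        """Yield, in sorted order, every key that starts with prefix."""
        top = self.root
        if top is None:
            return
        # Follow the prefix while the tested bit lies inside it.
        while isinstance(top, _Node) and top.byte < len(prefix):
            top = top.child[1 if prefix[top.byte] & top.mask else 0]
        leaf = top  # all leaves under top share the prefix-length bytes
        while isinstance(leaf, _Node):
            leaf = leaf.child[0]
        if leaf[:len(prefix)] != prefix:
            return
        stack = [top]  # depth-first walk, 0-branch first, gives sorted order
        while stack:
            node = stack.pop()
            if isinstance(node, _Node):
                stack.append(node.child[1])
                stack.append(node.child[0])
            else:
                yield node
```

Each insert or lookup follows at most one node per key bit, so the cost is bounded by the key length rather than by the number of stored keys, which matches the commit message's point about performance being bound by sizeof(struct object_id).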
And, still with Git 2.33 (Q3 2021), a build fix:
See commit 581a3bb (06 Aug 2021) by René Scharfe (rscharfe).
See commit dd3c8a7, commit 1482594 (08 Aug 2021) by Carlo Marcelo Arenas Belón (carenas).
(Merged by Junio C Hamano -- gitster -- in commit 7cfaa86, 11 Aug 2021)
object-file: use unsigned arithmetic with bit mask
Signed-off-by: René Scharfe
33f379e ("make object_directory.loose_objects_subdir_seen a bitmap", 2021-07-07, Git v2.33.0-rc0 -- merge listed in batch #7) replaced a wasteful 256-byte array with a 32-byte array and bit operations.
The mask calculation shifts a literal 1 of type int left by anything between 0 and 31.
UndefinedBehaviorSanitizer doesn't like that and reports:
object-file.c:2477:18: runtime error: left shift of 1 by 31 places cannot be represented in type 'int'
Make sure to use an unsigned 1 instead to avoid the issue.
size_t mask = 1 << (subdir_nr % word_bits);
size_t mask = 1u << (subdir_nr % word_bits); <==
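The arithmetic behind that report can be checked with a short Python sketch. Python integers never overflow, so this only models the bitmap and the value ranges; it assumes a 32-bit int and 32-bit bitmap words, and the names subdir_mask and mark_seen are hypothetical, not Git's.

```python
WORD_BITS = 32          # assumed width of one bitmap word
INT_MAX = 2**31 - 1     # largest value of a 32-bit signed int
UINT_MAX = 2**32 - 1    # largest value of a 32-bit unsigned int


def subdir_mask(subdir_nr):
    """Mask selecting the bit for one of the 256 loose-object subdirectories."""
    return 1 << (subdir_nr % WORD_BITS)


def mark_seen(bitmap, subdir_nr):
    """Set the bit for subdir_nr; return True if it was already set."""
    word = subdir_nr // WORD_BITS
    mask = subdir_mask(subdir_nr)
    already = bool(bitmap[word] & mask)
    bitmap[word] |= mask
    return already


# 256 subdirectories (00..ff) fit in 8 words of 32 bits, i.e. 32 bytes.
seen = [0] * (256 // WORD_BITS)
```

Here subdir_mask(31) is 2**31, which is larger than INT_MAX but not larger than UINT_MAX: the value cannot be represented in a signed 32-bit int, while the unsigned literal 1u can hold it.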