I noticed a Git configuration option, core.repositoryFormatVersion, which defaults to 0. What are "repository format versions", and what functional difference do they make?
It's for future compatibility: if the Git developers ever find it necessary to change the way that repos are stored on disk to enable some new feature, they can make upgraded repos have a core.repositoryformatversion of 1. Newer versions of Git that know about the new format will then trigger the code to deal with it, and older versions of Git that don't will fail gracefully with "Expected git repo version <= 0, found 1. Please upgrade Git".
As of now, the only repo format version defined or recognized is 0, which denotes the format that every public release of git has used.
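The gate described above can be pictured with a small sketch. This is a toy model, not Git's actual code: the function name, the dict-based config, and the choice to treat a missing key as version 0 are all assumptions made for illustration.

```python
# Toy model of the version gate; not Git's actual implementation.
SUPPORTED_FORMAT_VERSION = 0  # highest on-disk format this hypothetical client understands


def check_repository_format(config: dict[str, str]) -> None:
    """Raise if the repository declares a format newer than this client supports."""
    # Assumption of this sketch: a missing key is treated as version 0.
    found = int(config.get("core.repositoryformatversion", "0"))
    if found > SUPPORTED_FORMAT_VERSION:
        raise RuntimeError(
            f"Expected git repo version <= {SUPPORTED_FORMAT_VERSION}, found {found}"
        )


# A version-0 repository is accepted silently.
check_repository_format({"core.repositoryformatversion": "0"})
```

The point of the design is that the old client fails before it touches any data it might misread.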
git 2.7 (Nov. 2015) adds a lot more information in the new Documentation/technical/repository-version.txt.
See commit 067fbd4, commit 00a09d5 (23 Jun 2015) by Jeff King (peff).
(Merged by Junio C Hamano -- gitster -- in commit fa46579, 26 Oct 2015)
You can now define "extensions", and use core.repositoryformatversion as a "marker" to signal the existence of said extensions, instead of having to bump the format version number itself for every change:
If we were to bump the repository version for every such change, then any implementation understanding version X would also have to understand X-1, X-2, and so forth, even though the incompatibilities may be in orthogonal parts of the system, and there is otherwise no reason we cannot implement one without the other (or more importantly, that the user cannot choose to use one feature without the other, weighing the tradeoff in compatibility only for that particular feature).
This patch documents the existing repositoryformatversion strategy and introduces a new format, "1", which lets a repository specify that it must run with an arbitrary set of extensions.
Extracts from the doc:
Every git repository is marked with a numeric version in the core.repositoryformatversion key of its config file. This version specifies the rules for operating on the on-disk repository data.
Note that this applies only to accessing the repository's disk contents directly. An older client which understands only format 0 may still connect via git:// to a repository using format 1, as long as the server process understands format 1.
Version 0
This is the format defined by the initial version of git, including but not limited to the format of the repository directory, the repository configuration file, and the object and ref storage.
Version 1
This format is identical to version 0, with the following exceptions:
- When reading the core.repositoryformatversion variable, a git implementation which supports version 1 MUST also read any configuration keys found in the extensions section of the configuration file.
- If a version-1 repository specifies any extensions.* keys that the running git has not implemented, the operation MUST NOT proceed. Similarly, if the value of any known key is not understood by the implementation, the operation MUST NOT proceed.
This can be used, for example:
- to inform git that the objects should not be pruned based only on the reachability of the ref tips (e.g., because it has "clone --shared" children)
- that the refs are stored in a format besides the usual "refs" and "packed-refs" directories
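As a rough illustration of the two version-1 rules quoted above, here is a minimal sketch. It is not Git's code: the function name, the dict-based config, and the table of known extensions with their accepted values are hypothetical.

```python
# Hypothetical table: extensions this toy implementation knows, with accepted values.
KNOWN_EXTENSIONS = {
    "preciousobjects": {"true", "false"},
}


def verify_v1_extensions(config: dict[str, str]) -> list[str]:
    """Return the reasons a version-1 repository must be refused (empty if none)."""
    problems = []
    for key, value in config.items():
        lowered = key.lower()  # config section and variable names are case-insensitive
        if not lowered.startswith("extensions."):
            continue
        name = lowered[len("extensions."):]
        if name not in KNOWN_EXTENSIONS:
            # Rule 1: an unimplemented extension key stops the operation.
            problems.append(f"unknown extension: {name}")
        elif value not in KNOWN_EXTENSIONS[name]:
            # Rule 2: a known key with a value we do not understand also stops it.
            problems.append(f"unsupported value for {name}: {value}")
    return problems
```

A caller would refuse to operate on the repository whenever the returned list is non-empty.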
That is a genuinely original approach compared with the usual policy of bumping a release version number, semver-style.
Because we bump to format "1", and because format "1" requires that a running git knows about any extensions mentioned, we know that older versions of the code will not do something dangerous when confronted with these new formats.
For example, if the user chooses to use database storage for refs, they may set the "extensions.refbackend" config to "db".
Older versions of git will not understand format "1" and bail.
Versions of git which understand "1" but do not know about "refbackend", or which know about "refbackend" but not about the "db" backend, will refuse to run.
This is annoying, of course, but much better than the alternative of claiming that there are no refs in the repository, or writing to a location that other implementations will not read.
Note that we are only defining the rules for format 1 here.
We do not ever write format 1 ourselves; it is a tool that is meant to be used by users and future extensions to provide safety with older implementations.
The first extension, available with Git 2.7, is preciousObjects:
If this extension is used in a repository, then no operations should run which may drop objects from the object storage. This can be useful if you are sharing that storage with other repositories whose refs you cannot see.
The doc mentions:
When the config key extensions.preciousObjects is set to true, objects in the repository MUST NOT be deleted (e.g., by git-prune or git repack -d).
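The effect of that rule can be sketched as a guard in front of any operation that may drop objects. This is only a toy model under stated assumptions: the operation names, the function names, and the error type are illustrative, not Git internals.

```python
# Hypothetical set of operations that can delete objects (names are illustrative).
DESTRUCTIVE_OPERATIONS = {"prune", "repack -d"}


def is_precious(config: dict[str, str]) -> bool:
    """True when extensions.preciousObjects is set to true (keys compared case-insensitively)."""
    lowered = {key.lower(): value for key, value in config.items()}
    return lowered.get("extensions.preciousobjects", "false").lower() == "true"


def run_operation(config: dict[str, str], operation: str) -> str:
    """Refuse object-dropping operations in a precious-objects repository."""
    if is_precious(config) and operation in DESTRUCTIVE_OPERATIONS:
        raise PermissionError(f"refusing '{operation}': objects are marked precious")
    return f"ran {operation}"
```

Operations that never delete objects pass through unchanged, so only the risky ones are affected.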
That is:
For instance, if you do:
$ git clone -s parent child
$ git -C parent config extensions.preciousObjects true
$ git -C parent config core.repositoryformatversion 1
you now have additional safety when running git in the parent repository.
Prunes and repacks will bail with an error, and git gc will skip those operations (it will continue to pack refs and do other non-object operations).
Older versions of Git, when run in the repository, will fail on every operation.
Note that we do not set the preciousObjects extension by default when doing a "clone -s", as doing so breaks backwards compatibility. It is a decision the user should make explicitly.
Note that this core.repositoryformatversion business is old. Really old: commit ab9cb76, Nov. 2005, Git 0.99.9l.
It was done initially for the db version:
This makes init-db repository version aware.
It checks if an existing config file says the repository being reinitialized is of a wrong version and aborts before doing further harm.
Git 2.22 (Q2 2019) will avoid leaks around the repository_format structure.
See commit e8805af (28 Feb 2019), and commit 1301997 (22 Jan 2019) by Martin Ågren.
(Merged by Junio C Hamano -- gitster -- in commit 6b5688b, 20 Mar 2019)
setup: fix memory leaks with struct repository_format
After we set up a struct repository_format, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space.
In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory.
Introduce an initialization macro REPOSITORY_FORMAT_INIT and a function clear_repository_format(), to be used on each side of read_repository_format(). To have a clear and simple memory ownership, let all users of struct repository_format duplicate the strings that they take from it, rather than stealing the pointers.
Call clear_...() at the start of read_...() instead of just zeroing the struct, since we sometimes enter the function multiple times.
Thus, it is important to initialize the struct before calling read_...(), so document that.
It's also important because we might not even call read_...() before we call clear_...(), see, e.g., builtin/init-db.c.
Teach read_...() to clear the struct on error, so that it is reset to a safe state, and document this. (In setup_git_directory_gently(), we look at repo_fmt.hash_algo even if repo_fmt.version is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.)
With Git 2.28 (Q3 2020), the runtime itself can upgrade the repository format version automatically, for example on an unshallow fetch.
See commit 14c7fa2, commit 98564d8, commit 01bbbbd, commit 16af5f1 (05 Jun 2020) by Xin Li (livid).
(Merged by Junio C Hamano -- gitster -- in commit 1033b98, 29 Jun 2020)
fetch: allow adding a filter after initial clone
Signed-off-by: Xin Li
Retroactively adding a filter can be useful for existing shallow clones as they allow users to see earlier change histories without downloading all git objects in a regular --unshallow fetch.
Without this patch, users can make a clone partial by editing the repository configuration to convert the remote into a promisor, like:
git config core.repositoryFormatVersion 1
git config extensions.partialClone origin
git fetch --unshallow --filter=blob:none origin
Since the hard part of making this work is already in place and such edits can be error-prone, teach Git to perform the required configuration change automatically instead.
Note that this change does not modify the existing Git behavior which recognizes setting extensions.partialClone without changing repositoryFormatVersion.
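The automatic configuration change can be pictured roughly as follows. This is a sketch of the idea only, assuming a hypothetical dict-based config and a made-up function name; it does not reproduce Git's upgrade code or its error handling.

```python
def add_partial_clone_filter(config: dict[str, str], remote: str) -> dict[str, str]:
    """Return a copy of config with the remote marked as a promisor, upgrading to format 1 if needed."""
    updated = dict(config)  # leave the caller's config untouched
    # Assumption of this sketch: a missing key is treated as version 0.
    if int(updated.get("core.repositoryformatversion", "0")) < 1:
        updated["core.repositoryformatversion"] = "1"
    updated["extensions.partialclone"] = remote
    return updated
```

It mirrors the two manual `git config` edits from the commit message, performed in one step so that the version and the extension cannot get out of sync.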
Warning: In 2.28-rc0, we corrected a bug where some repository extensions were honored by mistake even in version 0 repositories (these configuration variables in the extensions.* namespace were supposed to have special meaning only in repositories whose version numbers are 1 or higher), but this was a bit too big a change.
See commit 62f2eca, commit 1166419 (15 Jul 2020) by Jonathan Nieder (artagnon).
(Merged by Junio C Hamano -- gitster -- in commit d13b7f2, 16 Jul 2020)
Revert "check_repository_format_gently(): refuse extensions for old repositories"
Reported-by: Johannes Schindelin
Signed-off-by: Jonathan Nieder
This reverts commit 14c7fa269e42df4133edd9ae7763b678ed6594cd.
The core.repositoryFormatVersion field was introduced in ab9cb76f661 ("Repository format version check.", 2005-11-25, Git v0.99.9l -- merge), providing a welcome bit of forward compatibility, thanks to some welcome analysis by Martin Atukunda.
The semantics are simple: a repository with core.repositoryFormatVersion set to 0 should be comprehensible by all Git implementations in active use; and Git implementations should error out early instead of trying to act on Git repositories with higher core.repositoryFormatVersion values representing new formats that they do not understand.
A new repository format did not need to be defined until 00a09d57eb8 (introduce "extensions" form of core.repositoryformatversion, 2015-06-23).
This provided a finer-grained extension mechanism for Git repositories.
In a repository with core.repositoryFormatVersion set to 1, Git implementations can act on "extensions.*" settings that modify how a repository is interpreted.
In repository format version 1, unrecognized extensions settings cause Git to error out.
What happens if a user sets an extension setting but forgets to increase the repository format version to 1?
The extension settings were still recognized in that case; worse, unrecognized extensions settings do not cause Git to error out.
So combining repository format version 0 with extensions settings produces in some sense the worst of both worlds.
To improve that situation, since 14c7fa269e4 (check_repository_format_gently(): refuse extensions for old repositories, 2020-06-05) Git instead ignores extensions in v0 mode. This way, v0 repositories get the historical (pre-2015) behavior and maintain compatibility with Git implementations that do not know about the v1 format.
Unfortunately, users had been using this sort of configuration and this behavior change came to many as a surprise:
- users of "git config --worktree" that had followed its advice to enable extensions.worktreeConfig (without also increasing the repository format version) would find their worktree configuration no longer taking effect
- tools such as copybara that had set extensions.partialClone in existing repositories (without also increasing the repository format version) would find that setting no longer taking effect
The behavior introduced in 14c7fa269e4 might be a good behavior if we were traveling back in time to 2015, but we're far too late.
For some reason I thought that it was what had been originally implemented and that it had regressed.
Apologies for not doing my research when 14c7fa269e4 was under development.
Let's return to the behavior we've had since 2015: always act on extensions.* settings, regardless of repository format version.
While we're here, include some tests to describe the effect on the "upgrade repository version" code path.
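The restored behavior, as this commit message describes it, can be modeled in a few lines. This sketch is only a reading of the quoted text, not Git's implementation: the function name, the dict-based config, and the set of known extensions are assumptions for illustration.

```python
# Hypothetical set of extensions the toy implementation recognizes.
KNOWN = {"preciousobjects", "partialclone", "worktreeconfig"}


def effective_extensions(config: dict[str, str]) -> dict[str, str]:
    """Return the extensions acted upon; raise only for unknown ones in format >= 1."""
    version = int(config.get("core.repositoryformatversion", "0"))
    active, unknown = {}, []
    for key, value in config.items():
        lowered = key.lower()
        if not lowered.startswith("extensions."):
            continue
        name = lowered[len("extensions."):]
        if name in KNOWN:
            active[name] = value  # honored regardless of the format version
        else:
            unknown.append(name)
    if version >= 1 and unknown:
        # Only format 1 makes unrecognized extensions fatal.
        raise RuntimeError("unknown repository extensions: " + ", ".join(unknown))
    return active
```

In this model a version-0 repository with extensions.worktreeConfig set keeps working, while an unrecognized extension is silently tolerated in version 0 and fatal in version 1, which is the "worst of both worlds" trade-off the message accepts for compatibility.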