
Where does "git update-index --assume-unchanged file" actually save this information to?

Tags:

git

git-index

I like to modify config files directly (like .gitignore and .git/config) instead of remembering arbitrary commands, but I don't know where Git stores the file references that get passed to "git update-index --assume-unchanged file".

If you know, please do tell!

Mauvis Ledford asked Aug 18 '11 22:08



2 Answers

The command name tells you where: git update-index writes to the index.

The index is a binary file, not a text file, so you can't really edit it by hand.

For more detail on what git update-index --assume-unchanged stores, see the Using “assume unchanged” bit section of the git update-index manual.

manojlds answered Oct 13 '22 10:10


As others said, it's stored in the index, which is located at .git/index.

After some detective work, I found that it is stored in the "assume valid" bit of each index entry.

Therefore, before reading what follows, you should first understand the overall format of the index, as explained in my other answer.
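For orientation, here is a minimal Python sketch of that layout, assuming a version 2 or 3 index with SHA-1 object IDs (version 4 compresses path names and is not handled). The function name read_index_entries is hypothetical, and the offsets follow the index format documentation rather than anything specific to this answer.

```python
import struct

def read_index_entries(data: bytes):
    """Yield (path, flags) for each entry of a version 2 or 3 index."""
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"DIRC" or version not in (2, 3):
        raise ValueError("unsupported index file")
    pos = 12  # entries start right after the 12-byte header
    for _ in range(count):
        # Fixed part: ten 32-bit stat fields (40 bytes), a 20-byte SHA-1,
        # then the 16-bit big-endian flags at offset 60 of the entry.
        flags = struct.unpack(">H", data[pos + 60:pos + 62])[0]
        name_start = pos + 62
        if flags & 0x4000:  # extended flag: 16 more flag bits follow
            name_start += 2
        name_end = data.index(b"\x00", name_start)
        yield data[name_start:name_end].decode("utf-8", "replace"), flags
        # Each entry is NUL-padded (1 to 8 bytes) to a multiple of 8 bytes.
        entry_len = name_end - pos
        pos += (entry_len // 8 + 1) * 8

# Possible usage against a real repository:
#   data = open(".git/index", "rb").read()
#   print([path for path, flags in read_index_entries(data) if flags & 0x8000])
```

In day-to-day use, git ls-files -v gives the same information without parsing anything: it prints a lowercase status letter for files marked assume-unchanged.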

Next, I will explain how I verified that the "assume valid" bit is the culprit:

  • empirically
  • by reading the source

Empirical

Time to hd it up.

Setup:

git init
echo a > b
git add b

Then:

hd .git/index

Gives:

00000000  44 49 52 43 00 00 00 02  00 00 00 01 54 e9 b6 f3  |DIRC........T...|
00000010  2d 4f e1 2f 54 e9 b6 f3  2d 4f e1 2f 00 00 08 05  |-O./T...-O./....|
00000020  00 de 32 ff 00 00 81 a4  00 00 03 e8 00 00 03 e8  |..2.............|
00000030  00 00 00 00 e6 9d e2 9b  b2 d1 d6 43 4b 8b 29 ae  |...........CK.).|
00000040  77 5a d8 c2 e4 8c 53 91  00 01 62 00 c9 a2 4b c1  |wZ....S...b...K.|
00000050  23 00 1e 32 53 3c 51 5d  d5 cb 1a b4 43 18 ad 8c  |#..2S<Q]....C...|
00000060

Now:

git update-index --assume-unchanged b
hd .git/index

Gives:

00000000  44 49 52 43 00 00 00 02  00 00 00 01 54 e9 b6 f3  |DIRC........T...|
00000010  2d 4f e1 2f 54 e9 b6 f3  2d 4f e1 2f 00 00 08 05  |-O./T...-O./....|
00000020  00 de 32 ff 00 00 81 a4  00 00 03 e8 00 00 03 e8  |..2.............|
00000030  00 00 00 00 e6 9d e2 9b  b2 d1 d6 43 4b 8b 29 ae  |...........CK.).|
00000040  77 5a d8 c2 e4 8c 53 91  80 01 62 00 17 08 a8 58  |wZ....S...b....X|
00000050  f7 c5 b3 e1 7d 47 ac a2  88 d9 66 c7 5c 2f 74 d7  |....}G....f.\/t.|
00000060

Comparing the two dumps against the overall structure of the index shows that the only differences are:

  • the byte at offset 0x48 (the 9th byte of the row starting at 00000040) changed from 00 to 80. That is our flag: the most significant bit of the 16-bit cache entry flags.
  • the 20 bytes from 0x4C to 0x5F. This is expected, since they hold a SHA-1 checksum of all the index content that precedes them.
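The comparison can be reproduced mechanically. The following Python sketch embeds the two dumps from this answer and lists the offsets that differ; the helper names differing_offsets and trailer_matches are hypothetical. The checksum helper assumes the trailer is a SHA-1 of everything before it, as the index format documentation describes.

```python
import hashlib

# Bytes 0x00-0x47 are identical in both dumps.
COMMON = bytes.fromhex(
    "44 49 52 43 00 00 00 02 00 00 00 01 54 e9 b6 f3 "
    "2d 4f e1 2f 54 e9 b6 f3 2d 4f e1 2f 00 00 08 05 "
    "00 de 32 ff 00 00 81 a4 00 00 03 e8 00 00 03 e8 "
    "00 00 00 00 e6 9d e2 9b b2 d1 d6 43 4b 8b 29 ae "
    "77 5a d8 c2 e4 8c 53 91"
)
BEFORE = COMMON + bytes.fromhex(
    "00 01 62 00 c9 a2 4b c1 "
    "23 00 1e 32 53 3c 51 5d d5 cb 1a b4 43 18 ad 8c"
)
AFTER = COMMON + bytes.fromhex(
    "80 01 62 00 17 08 a8 58 "
    "f7 c5 b3 e1 7d 47 ac a2 88 d9 66 c7 5c 2f 74 d7"
)

def differing_offsets(a: bytes, b: bytes):
    """Return the offsets at which two equal-length byte strings differ."""
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

def trailer_matches(index: bytes) -> bool:
    """Check that the last 20 bytes are the SHA-1 of everything before them."""
    return hashlib.sha1(index[:-20]).digest() == index[-20:]

# Prints 0x48 followed by 0x4c through 0x5f.
print([hex(i) for i in differing_offsets(BEFORE, AFTER)])
```

Calling trailer_matches(BEFORE) and trailer_matches(AFTER) should confirm the checksum claim, provided the dumps were transcribed faithfully; that has not been verified here.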

This has also taught me that the SHA-1 stored in the index entry, in bytes 0x34 to 0x47, does not take the flags into account, since it did not change between the two indexes. That is to be expected: it is the object ID of the file's blob, which is computed from the file content alone, not a hash of the index entry.
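To see why the entry's SHA-1 cannot depend on the flags, here is a small Python sketch of how Git derives a blob ID: a SHA-1 over a "blob <size>" header, a NUL byte, and the raw content. The function name git_blob_id is hypothetical. Incidentally, the 20 bytes in the dumps equal the ID of the empty blob, which matches the size field of 0 at offset 0x30, so the dumped b appears to have been empty rather than containing "a".

```python
import hashlib

def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    header = b"blob %d\x00" % len(content)
    return hashlib.sha1(header + content).hexdigest()

# The 20 bytes at offsets 0x34-0x47 in both dumps, as hex.
ENTRY_SHA = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"

# Only the content goes into the hash; no index flags are involved.
print(git_blob_id(b"") == ENTRY_SHA)  # True
```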

Source code

Now let's check that this is consistent with the source code of Git 2.3.

First look at the source of update-index and grep for assume-unchanged.

This leads to the following lines:

{OPTION_SET_INT, 0, "assume-unchanged", &mark_valid_only, NULL,
  N_("mark files as \"not changing\""),
  PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, MARK_FLAG},
{OPTION_SET_INT, 0, "no-assume-unchanged", &mark_valid_only, NULL,
  N_("clear assumed-unchanged bit"),
  PARSE_OPT_NOARG | PARSE_OPT_NONEG, NULL, UNMARK_FLAG},

So the value is stored in mark_valid_only. Grep for it, and you will find that it is used in only one place:

if (mark_valid_only) {
  if (mark_ce_flags(path, CE_VALID, mark_valid_only == MARK_FLAG))
    die("Unable to mark file %s", path);
  return;
}

CE means Cache Entry.

By quickly inspecting mark_ce_flags, we see that:

if (mark)
  active_cache[pos]->ce_flags |= flag;
else
  active_cache[pos]->ce_flags &= ~flag;

So the function basically sets or clears the CE_VALID bit, depending on mark_valid_only, which is a tri-state:

  • mark: --assume-unchanged
  • unmark: --no-assume-unchanged
  • do nothing: mark_valid_only keeps its initial value 0 when neither option is given (the 0 in {OPTION_SET_INT, 0 is the short option name, not a default value)
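The tri-state logic can be mirrored in a few lines of Python. This is only an illustration of the C excerpts above, not Git's code: apply_mark is a hypothetical name, and the numeric values chosen for MARK_FLAG and UNMARK_FLAG are assumptions (only that they are distinct and nonzero matters here).

```python
CE_VALID = 0x8000
MARK_FLAG = 1    # assumed value for illustration
UNMARK_FLAG = 2  # assumed value for illustration

def apply_mark(ce_flags: int, mark_valid_only: int) -> int:
    """Return ce_flags after applying the tri-state option."""
    if mark_valid_only == 0:
        # Neither option was given: leave the entry alone.
        return ce_flags
    if mark_valid_only == MARK_FLAG:
        # --assume-unchanged: set the bit.
        return ce_flags | CE_VALID
    # --no-assume-unchanged: clear the bit.
    return ce_flags & ~CE_VALID

print(hex(apply_mark(0x0001, MARK_FLAG)))    # 0x8001
print(hex(apply_mark(0x8001, UNMARK_FLAG)))  # 0x1
```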

Next, by grepping under builtin/, we see that nothing else there sets CE_VALID, so among the builtin commands update-index --assume-unchanged appears to be the only one that sets it.

The flag is, however, read in many places in the source code, which is to be expected since it has many side effects. Every time, it is used like:

ce->ce_flags & CE_VALID

So we conclude that it is part of the ce_flags field of struct cache_entry.

The index structures are declared in cache.h, so named because one of the functions of the index is to act as a cache that makes creating commits faster.

Looking at the definition of CE_VALID in cache.h and the surrounding lines, we have:

#define CE_STAGEMASK (0x3000)
#define CE_EXTENDED (0x4000)
#define CE_VALID (0x8000)
#define CE_STAGESHIFT 12

So we conclude that it is the most significant bit of that 16-bit integer (0x8000), right next to CE_EXTENDED. This is consistent with my earlier experiment: 0x8000 combined with the name length 1 gives the bytes 80 01 seen at offset 0x48.
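As a cross-check, the two flag values from the hex dumps can be decoded with these masks. This is a Python sketch with a hypothetical decode_flags helper; CE_NAMEMASK is not part of the excerpt above, and its value 0x0FFF is taken from the index format documentation, which assigns the low 12 bits to the name length.

```python
CE_NAMEMASK = 0x0FFF   # low 12 bits: path name length (not in the excerpt)
CE_STAGEMASK = 0x3000
CE_EXTENDED = 0x4000
CE_VALID = 0x8000
CE_STAGESHIFT = 12

def decode_flags(flags: int) -> dict:
    """Split the 16-bit on-disk flags field into its parts."""
    return {
        "assume_valid": bool(flags & CE_VALID),
        "extended": bool(flags & CE_EXTENDED),
        "stage": (flags & CE_STAGEMASK) >> CE_STAGESHIFT,
        "name_length": flags & CE_NAMEMASK,
    }

# Bytes 00 01 (before) and 80 01 (after) at offset 0x48, read big-endian.
print(decode_flags(0x0001))
print(decode_flags(0x8001))
```

Only assume_valid differs between the two results; the name length stays 1, matching the single-character file name b.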