
How do I manipulate (dump and load) Git index as text?

Tags: git

I find myself needing to perform a complex Git history rewrite. Specifically, I need to remove most of the files and move the few remaining ones around. E.g.:

$ pr -m -t <(/bin/tree a) <(/bin/tree b)
a                                   b
├── 5                               ├── renamed-3
├── 6                               └── renamed-4
└── foo                             
    └── bar                         0 directories, 2 files
        ├── 3                       
        ├── 4                       
        └── baz                     
            ├── 1                           
            └── 2                           

3 directories, 6 files

This can easily be done with git filter-branch --tree-filter. However, --tree-filter is slow; --index-filter is a much faster alternative.

I tried to express my desired operations in terms of git rm --cached and git mv --cached, but it turned out to be quite ugly. It would be a lot easier if I could manipulate the contents of the index as text, using standard Unix tools (sed/grep/awk).

Is there a pair of Git commands that would allow me to

  • dump current contents of Git index to stdout as text in some format (one entry per line, including full path to the file);
  • read text in the above format from stdin into Git index, fully replacing its current contents?

Desired usage example:

git filter-branch --index-filter 'git magic-index-dump | awk "..." | git magic-index-replace'

asked Oct 19 '25 04:10 by intelfx


2 Answers

Using git ls-files --stage and git update-index --index-info can get you all the way there, though it's a bit clumsy in spots. Removing a file means setting its mode to zero. Renaming a file means duplicating its line with the new name, moving the copy to the end of the instructions, and setting the mode to zero in the original line. The point of putting the new entry at the end is that your desired new name might match the name of some existing file you plan to delete; if that can't happen, you don't need to be quite this tricky.

Inside a filter-branch operation there should never be any ongoing merge, so all stage numbers should be (and remain) zero all the way through the operation.
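
As a rough sketch of the recipe above, the following builds a throwaway repository with the layout from the question and rewrites its index in one pipeline. The file contents, the temporary directory, and the awk program are made up for illustration, and the 40-zero null ID assumes a SHA-1 repository. Paths with unusual characters would also need ls-files -z and update-index -z, which this sketch does not handle.

```shell
#!/bin/sh
set -e

# Throwaway repository with the layout from the question (contents are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p foo/bar/baz
for f in 5 6 foo/bar/3 foo/bar/4 foo/bar/baz/1 foo/bar/baz/2; do
    echo "content of $f" > "$f"
done
git add .

# Turn the index dump into instructions for update-index:
#  - every existing path is re-emitted with mode 0, i.e. removed;
#  - entries to keep are remembered under their new name and printed last,
#    so a new name can never be undone by a later removal line.
git ls-files --stage | awk -F'\t' '
    BEGIN { zero = "0 0000000000000000000000000000000000000000" }
    {
        split($1, meta, " ")    # meta[1] = mode, meta[2] = sha1, meta[3] = stage
        print zero "\t" $2
        if ($2 == "foo/bar/3") keep[++n] = meta[1] " " meta[2] " 0\trenamed-3"
        if ($2 == "foo/bar/4") keep[++n] = meta[1] " " meta[2] " 0\trenamed-4"
    }
    END { for (i = 1; i <= n; i++) print keep[i] }
' | git update-index --index-info

# Show what is left in the index.
git ls-files
```

Inside filter-branch, only the pipeline itself would go into --index-filter; the repository setup is there just to make the sketch runnable on its own.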

answered Oct 21 '25 23:10 by torek


[This is an improvement on torek's answer, which was rejected as an edit.]

It is possible to use git ls-files --stage and git update-index --index-info to dump the index and to add entries to it, respectively, using the same text format.

The format is described in git-update-index(1):

mode SP sha1 SP stage TAB path

This format <...> matches git ls-files --stage output.

The "stage" field is used to represent a conflicting merge in the index. When there is no merge in progress, it is always 0.
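
To make the format concrete, here is a small sketch that writes one index entry by hand. The temporary repository, the file name hello.txt, and the blob content are arbitrary choices for the example.

```shell
#!/bin/sh
set -e

# Empty scratch repository (location is arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q .

# Store a blob in the object database without creating a work tree file.
sha=$(echo 'hello' | git hash-object -w --stdin)

# One line in the format: mode SP sha1 SP stage TAB path.
# Stage is 0 because no merge is in progress.
printf '100644 %s 0\thello.txt\n' "$sha" | git update-index --index-info

# The dump prints the entry back in the same format.
git ls-files --stage
```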

Note that git update-index only adds or updates the entries it is given; entries not mentioned in its input stay in the index. To replace the index contents, either clear the index before updating or, better, write to a different index file and move it into place afterwards. The latter avoids any data races caused by reading and updating/deleting the same file in the same pipeline.
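
The first option (clearing before updating) might look like the sketch below. It saves the dump to a temporary file first so that nothing reads the index while it is being emptied. The two-file repository and the awk filter are invented for the demonstration; git read-tree --empty is used here to empty the index.

```shell
#!/bin/sh
set -e

# Scratch repository with two tracked files (names are arbitrary).
repo=$(mktemp -d)
cd "$repo"
git init -q .
echo one > a
echo two > b
git add a b

# Finish dumping before the index is modified.
dump=$(mktemp)
git ls-files --stage > "$dump"

# Feeding back only the entry for "a" leaves "b" untouched.
awk -F'\t' '$2 == "a"' "$dump" | git update-index --index-info
before=$(git ls-files | tr '\n' ' ')

# Emptying the index first turns the same input into a full replacement.
git read-tree --empty
awk -F'\t' '$2 == "a"' "$dump" | git update-index --index-info
after=$(git ls-files | tr '\n' ' ')

rm -f "$dump"
echo "without clearing: $before"
echo "after clearing:   $after"
```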

Putting this into a one-liner would look rather messy, so we'll write a custom Git command, git-replace-index (an executable script of that name somewhere on PATH):

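#!/bin/sh -e

# Respect GIT_DIR and GIT_INDEX_FILE when the caller has set them;
# otherwise fall back to the defaults.
: "${GIT_DIR:=$(git rev-parse --git-dir)}"
: "${GIT_INDEX_FILE:=$GIT_DIR/index}"
GIT_INDEX_NEW="$GIT_INDEX_FILE.new"

# Build a fresh index next to the real one; drop leftovers of a failed run.
rm -f "$GIT_INDEX_NEW"
GIT_INDEX_FILE="$GIT_INDEX_NEW" git update-index "$@"

# Swap it in. If update-index produced no file, the index is left absent,
# which Git treats as empty.
rm -f "$GIT_INDEX_FILE"
if [ -e "$GIT_INDEX_NEW" ]; then
    mv "$GIT_INDEX_NEW" "$GIT_INDEX_FILE"
fi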

With this helper in place, the resulting pipeline would be:

git ls-files --stage | ... | git replace-index --index-info
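
With replacement semantics the filter only has to print the entries to keep, already renamed; no mode-0 removal lines are needed. The sketch below applies this to the layout from the question. To stay runnable on its own it does not call git replace-index; instead it inlines the same idea (write to index.new, then move it over the real index). The file contents and the awk rules are assumptions made for the example.

```shell
#!/bin/sh
set -e

# Throwaway repository with the layout from the question (contents are made up).
repo=$(mktemp -d)
cd "$repo"
git init -q .
mkdir -p foo/bar/baz
for f in 5 6 foo/bar/3 foo/bar/4 foo/bar/baz/1 foo/bar/baz/2; do
    echo "content of $f" > "$f"
done
git add .

index="$(git rev-parse --git-dir)/index"
new="$index.new"

# $1 is "mode sha1 stage", $2 is the path. Print only the entries to keep,
# under their new names; everything else is dropped by not being printed.
git ls-files --stage | awk -F'\t' -v OFS='\t' '
    $2 == "foo/bar/3" { print $1, "renamed-3" }
    $2 == "foo/bar/4" { print $1, "renamed-4" }
' | GIT_INDEX_FILE="$new" git update-index --index-info

# Replace the old index with the freshly written one.
mv "$new" "$index"

git ls-files
```

In a real rewrite the pipeline would go into --index-filter with git replace-index --index-info as its last stage.
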

answered Oct 21 '25 23:10 by intelfx


