How do I git add only lines matching a pattern?

Tags: git, git-add

I'm tracking some configuration files with git. I usually do an interactive git add -p, but I'm looking for a way to automatically add all new/modified/deleted lines that match a pattern. Otherwise it's going to take me ages to do all the interactive splitting and adding. git add supports pattern matching on filenames, but I can't find anything that matches on content.

asked Apr 29 '16 07:04

Benoît

2 Answers

Here's a way:

1. Use git diff > patch to make a patch of the current diff.
2. Use gawk to make a second patch containing only the +/- lines that match the pattern: turn deleted lines that don't match into context lines (replace the leading - with a space), drop added lines that don't match, adjust the hunk header line counts, and output each modified hunk, but don't output any modified hunks that no longer have any changes in them.
3. Use git stash save, git apply, git add -u, and git stash pop to apply and stage the modified patch while leaving the rest of the changes unstaged.

This worked for several test cases. It works on the entire diff at once (all files), and it's quick.

    #!/bin/sh

    diff=`mktemp`
    git diff > $diff
    [ -s $diff ] || exit

    patch=`mktemp`

    gawk -v pat="$1" '
    function hh(){
      if(keep && n > 0){
        for(i=0;i<n;i++){
          if(i==hrn){
            printf "@@ -%d,%d +%d,%d @@\n", har[1],har[2],har[3],har[4];
          }
          print out[i];
        }
      }
    }
    {
      if(/^diff --git a\/.* b\/.*/){
        hh();
        keep=0;
        dr=NR;
        n=0;
        out[n++]=$0
      }
      else if(NR == dr+1 && /^index [0-9a-f]+\.\.[0-9a-f]+ [0-9]+$/){
        ir=NR;
        out[n++]=$0
      }
      else if(NR == ir+1 && /^\-\-\- a\//){
        mr=NR;
        out[n++]=$0
      }
      else if(NR == mr+1 && /^\+\+\+ b\//){
        pr=NR;
        out[n++]=$0
      }
      else if(NR == pr+1 && match($0, /^@@ \-([0-9]+),?([0-9]+)? \+([0-9]+),?([0-9]+)? @@/, har)){
        hr=NR;
        hrn=n
      }
      else if(NR > hr){
        if(/^\-/ && $0 !~ pat){
          har[4]++;
          sub(/^\-/, " ", $0);
          out[n++] = $0
        }
        else if(/^\+/ && $0 !~ pat){
          har[4]--;
        }
        else{
          if(/^[+-]/){
            keep=1
          }
          out[n++] = $0
        }
      }
    }
    END{
      hh()
    }' $diff > $patch

    git stash save &&
      git apply $patch &&
      git add -u &&
      git stash pop

    rm $diff
    rm $patch
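A note on the hunk header step: in unified diff format, the count after the comma is omitted when that side of the hunk covers exactly one line, so a header can look like @@ -5 +5,2 @@. The following is a minimal sketch, separate from the script above, of parsing a header with that default applied. The function name parse_hunk_header is hypothetical, and only POSIX sh and awk are assumed.

```shell
#!/bin/sh
# parse_hunk_header HEADER
# Prints "old_start old_count new_start new_count" for a unified diff
# hunk header. A missing count is treated as 1.
# (Hypothetical helper for illustration only.)
parse_hunk_header() {
    printf '%s\n' "$1" | awk '
        /^@@ -[0-9]+(,[0-9]+)? [+][0-9]+(,[0-9]+)? @@/ {
            # Field 2 is "-start[,count]" and field 3 is "+start[,count]".
            n_old = split(substr($2, 2), o, ",")
            n_new = split(substr($3, 2), w, ",")
            if (n_old < 2) o[2] = 1    # count omitted on the old side
            if (n_new < 2) w[2] = 1    # count omitted on the new side
            print o[1], o[2], w[1], w[2]
        }'
}
```

For example, parse_hunk_header '@@ -2,10 +2,14 @@ Lorem ipsum dolor sit amet,' prints 2 10 2 14, and parse_hunk_header '@@ -5 +5,2 @@' prints 5 1 5 2.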

answered Oct 06 '22 01:10

webb

I cranked out this experimental and poorly tested program in TXR.

Sample run: first, where are we at in the repo?

    $ git diff
    diff --git a/lorem.txt b/lorem.txt
    index d5d20a4..58609a7 100644
    --- a/lorem.txt
    +++ b/lorem.txt
    @@ -2,10 +2,14 @@ Lorem ipsum dolor sit amet,
     consectetur adipiscing elit,
     sed do eiusmod tempor
     incididunt ut labore et dolore
    -magna aliqua. Ut enim ad minim
    +minim
    +minim
     veniam, quis nostrud
     exercitation ullamco laboris
    +maxim
    +maxim
     nisi ut aliquip ex ea commodo
    +minim
     consequat.  Duis aute irure
     dolor in reprehenderit in
     voluptate velit esse cillum

And:

    $ git diff --cached  # nothing staged in the index

The goal is to stage just the lines containing a match for min:

    $ txr addmatch.txr min lorem.txt
    patching file .merge_file_BilTfQ

Now what is the state?

    $ git diff
    diff --git a/lorem.txt b/lorem.txt
    index 7e1b4cb..58609a7 100644
    --- a/lorem.txt
    +++ b/lorem.txt
    @@ -6,6 +6,8 @@ minim
     minim
     veniam, quis nostrud
     exercitation ullamco laboris
    +maxim
    +maxim
     nisi ut aliquip ex ea commodo
     minim
     consequat.  Duis aute irure

And:

    $ git diff --cached
    diff --git a/lorem.txt b/lorem.txt
    index d5d20a4..7e1b4cb 100644
    --- a/lorem.txt
    +++ b/lorem.txt
    @@ -2,10 +2,12 @@ Lorem ipsum dolor sit amet,
     consectetur adipiscing elit,
     sed do eiusmod tempor
     incididunt ut labore et dolore
    -magna aliqua. Ut enim ad minim
    +minim
    +minim
     veniam, quis nostrud
     exercitation ullamco laboris
     nisi ut aliquip ex ea commodo
    +minim
     consequat.  Duis aute irure
     dolor in reprehenderit in
     voluptate velit esse cillum

The matching stuff is in the index, and the nonmatching +maxim lines are still unstaged.

Code in addmatch.txr:

    @(next :args)
    @(assert)
    @pattern
    @file
    @(bind regex @(regex-compile pattern))
    @(next (open-command `git diff @file`))
    diff @diffjunk
    index @indexjunk
    --- a/@file
    +++ b/@file
    @(collect)
    @@@@ -@bfline,@bflen +@afline,@aflen @@@@@(skip)
    @  (bind (nminus nplus) (0 0))
    @  (collect)
    @    (cases)
     @line
    @      (bind zerocol " ")
    @    (or)
    +@line
    @      (bind zerocol "+")
    @      (require (search-regex line regex))
    @      (do (inc nplus))
    @    (or)
    -@line
    @      (bind zerocol "-")
    @      (require (search-regex line regex))
    @      (do (inc nminus))
    @    (or)
    -@line
    @;;    unmatched - line becomes context line
    @      (bind zerocol " ")
    @    (end)
    @  (until)
    @/[^+\- ]/@(skip)
    @  (end)
    @  (set (bfline bflen afline aflen)
            @[mapcar int-str (list bfline bflen afline aflen)])
    @  (set aflen @(+ bflen nplus (- nminus)))
    @(end)
    @(output :into stripped-diff)
    diff @diffjunk
    index @indexjunk
    --- a/@file
    +++ b/@file
    @  (repeat)
    @@@@ -@bfline,@bflen +@afline,@aflen @@@@
    @    (repeat)
    @zerocol@line
    @    (end)
    @  (end)
    @(end)
    @(next (open-command `git checkout-index --temp @file`))
    @tempname@\t@file
    @(try)
    @  (do
         (with-stream (patch-stream (open-command `patch -p1 @tempname` "w"))
           (put-lines stripped-diff patch-stream)))
    @  (next (open-command `git hash-object -w @tempname`))
    @newsha
    @  (do (sh `git update-index --cacheinfo 100644 @newsha @file`))
    @(catch)
    @  (fail)
    @(finally)
    @  (do
         (ignerr [mapdo remove-path #`@tempname @tempname.orig @tempname.rej`]))
    @(end)

Basically the strategy is:

1. Do some pattern matching on the git diff output to filter the hunks down to the matching lines. We must recompute the "after" line count in the hunk header, and preserve the context lines.
2. Output the filtered diff into a variable.
3. Obtain a pristine copy of the file from the index using git checkout-index --temp. This command outputs the temporary name it has generated, and we capture it.
4. Send the filtered/reduced diff to patch -p1, targeting this temporary file holding the pristine copy from the index. We now have just the changes we wanted, applied to the original file.
5. Create a Git object out of the patched file, using git hash-object -w. Capture the hash which this command outputs.
6. Lastly, use git update-index --cacheinfo ... to enter this new object into the index under the original file name, effectively staging a change for the file.
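The filtering and header recomputation in the first step can be sketched for a single hunk in plain awk. This is a minimal illustration, not a translation of the TXR program: the function name filter_hunk and its argument order are hypothetical, the input is assumed to be the body of one hunk (no file headers), and the "after" start line is passed through unchanged.

```shell
#!/bin/sh
# filter_hunk PATTERN OLD_START NEW_START < hunk_body
# Keeps +/- lines whose text matches PATTERN, turns unmatched deletions
# into context, drops unmatched additions, then prints a header with
# recomputed line counts followed by the filtered body.
# (Hypothetical helper for illustration only.)
filter_hunk() {
    awk -v pat="$1" -v os="$2" -v ns="$3" '
        # Unmatched deletion: the line stays in the file, so it is context.
        /^-/ && substr($0, 2) !~ pat { $0 = " " substr($0, 2) }
        # Unmatched addition: not part of the filtered patch at all.
        /^[+]/ && substr($0, 2) !~ pat { next }
        { body[n++] = $0 }
        /^ /   { nold++; nnew++ }   # context counts on both sides
        /^-/   { nold++ }           # deletion counts on the old side
        /^[+]/ { nnew++ }           # addition counts on the new side
        END {
            printf "@@ -%d,%d +%d,%d @@\n", os, nold, ns, nnew
            for (i = 0; i < n; i++) print body[i]
        }'
}
```

Fed the body of the first hunk from the sample run with pattern min and both start lines set to 2, this prints the header @@ -2,10 +2,12 @@, which agrees with the staged diff shown earlier. The new count is equivalently the old count plus kept additions minus kept deletions: 10 + 3 - 1 = 12.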

If this screws up, we can just do git reset to wipe the index, fix our broken scriptology and try again.

Just blindly matching through + and - lines has obvious issues. It should work in the case when the patterns match variable names in config files, rather than content. E.g., a replacement:

    -CONFIG_VAR=foo
    +CONFIG_VAR=bar

Here, if we match on CONFIG_VAR, then both lines are included. If we match on foo in the right-hand side, we break things: we end up with a patch that just removes the CONFIG_VAR=foo line!

Obviously, this could be made clever, taking into account the syntax and semantics of the config file.
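One small step in that direction, as a sketch only: match the pattern against the variable name (the text before the first =) instead of the whole line, so a replacement pair is either selected together or not at all. The function name select_by_key is hypothetical, and the input is assumed to be hunk body lines of simple KEY=VALUE files, not the --- and +++ file headers.

```shell
#!/bin/sh
# select_by_key PATTERN < hunk_body
# Prints only the +/- lines whose key (the text before the first "=")
# matches PATTERN. Values are never consulted.
# (Hypothetical helper for illustration only.)
select_by_key() {
    awk -v pat="$1" '
        /^[-+]/ {
            line = substr($0, 2)              # drop the +/- marker
            eq = index(line, "=")
            key = (eq > 0) ? substr(line, 1, eq - 1) : line
            if (key ~ pat) print
        }'
}
```

With the replacement above as input, select_by_key CONFIG_VAR prints both lines, while select_by_key foo prints nothing, so the pair stays unstaged as a whole instead of being split into a lone deletion.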

How I would solve this "for real" would be to write a robust config file parser and re-generator (which preserves comments, whitespace and all). Then parse the new and original pristine file to config objects, migrate the matching changes from one object to the other, and generate an updated file to go to the index. No messing around with patches.
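A toy version of that idea, for flat KEY=VALUE files only, might look like the sketch below. It is nowhere near the robust parser described: the function name migrate_matching is hypothetical, keys are assumed to be unique, comments are assumed to start with #, and newly added keys are simply appended at the end. The output is what would go to the index.

```shell
#!/bin/sh
# migrate_matching PATTERN OLD_FILE NEW_FILE
# Prints OLD_FILE with every key matching PATTERN brought in line with
# NEW_FILE (changed, deleted, or added). Everything else in OLD_FILE,
# including comments and blank lines, is passed through untouched.
# (Hypothetical helper for illustration only.)
migrate_matching() {
    awk -v pat="$1" '
        function key_of(s,   eq) {
            eq = index(s, "=")
            return (eq > 1 && s !~ /^[ \t]*#/) ? substr(s, 1, eq - 1) : ""
        }
        FILENAME == ARGV[1] {              # first file read: NEW_FILE
            k = key_of($0)
            if (k != "") { newline[k] = $0; order[++n] = k }
            next
        }
        {                                  # second file read: OLD_FILE
            k = key_of($0)
            if (k != "") seen[k] = 1
            if (k != "" && k ~ pat) {
                # Take the new line; a key missing from NEW_FILE is dropped.
                if (k in newline) print newline[k]
            } else {
                print
            }
        }
        END {                              # matching keys that are new
            for (i = 1; i <= n; i++) {
                k = order[i]
                if (k ~ pat && !(k in seen)) print newline[k]
            }
        }' "$3" "$2"
}
```

Because whole assignments are moved between the two versions, a replacement such as CONFIG_VAR=foo to CONFIG_VAR=bar can never be split into a bare deletion.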

answered Oct 06 '22 00:10

Kaz
