Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Definitive retroactive .gitignore (how to make Git completely/retroactively forget about a file now in .gitignore)

Tags:

git

gitignore

Preface

This question attempts to clear the confusion regarding applying .gitignore retroactively, not just to the present/future.1

Rationale

I've been searching for a way to make my current .gitignore be retroactively enforced, as if I had created .gitignore in the first commit.

The solution I am seeking:

  • Will not require manually specifying files
  • Will not require a commit
  • Will apply retroactively to all commits of all branches
  • Will ignore .gitignore-specified files in working dir, not delete them (just like an originally root-committed .gitignore file would)
  • Will use git, not BFG
  • Will apply to .gitignore exceptions like:
 *.ext
 !*special.ext

Not solutions

git rm --cached *.ext
git commit

This requires 1. manually specifying files and 2. an additional commit, which will result in newly-ignored file deletion when pulled by other developers. (It is effectively just a git rm - which is a deletion from git tracking - but it leaves the file alone in the local (your) working directory. Others who git pull afterwards will receive the file deletion commit)

git filter-branch --index-filter 'git rm --cached *.ext'

While this does purge files retroactively, it 1. requires manually specifying files and 2. deletes the specified files from the local working directory just like plain git rm (and so also for others who git pull)!


Footnotes

1There are many similar posts here on SO, with less-than-specifically-defined questions and even more less-than-accurate answers. See this question with 23 answers where the accepted answer with ~4k votes is incorrect according to the standard definition of "forget" as noted by one mostly-correct answer, and only 2 answers include the required git filter-branch command.

This question with 21 answers is was marked as a duplicate of the previous one, but the question is defined differently (ignore vs forget), so while the answers may be appropriate, it is not a duplicate.

This question is the closest I've found to what I'm looking for, but the answers don't work in all cases (paths with spaces...) and perhaps are a bit more complex than necessary regarding creating an external-to-repository .gitignore file and copying it into every commit.

like image 933
goofology Avatar asked Aug 08 '19 18:08

goofology


People also ask

How can I make Git forget about a file that was tracked?

To stop tracking a file you need to remove the file from index. Note: This command will not delete a physical file in local it will delete the file from other developers from the next $git pull command. git rm -r -- cached. git add .

How do I ignore an already committed file in Git?

If you want to ignore a file that you've committed in the past, you'll need to delete the file from your repository and then add a .gitignore rule for it. Using the --cached option with git rm means that the file will be deleted from your repository, but will remain in your working directory as an ignored file.

Does Gitignore work retroactively?

gitignore be retroactively enforced, as if I had created . gitignore in the first commit. The solution I am seeking: Will not require manually specifying files.


2 Answers

EDIT: I've recently found git-filter-repo. It may be a better choice. Perhaps a good idea to investigate the rationale and filter-branch gotchas for yourself, but they wouldn't have affected my use-case below.


This method makes Git completely forget ignored files (past/present/future), but does not delete anything from working directory (even when re-pulled from remote).

This method requires usage of /.git/info/exclude (preferred) OR a pre-existing .gitignore in all the commits that have files to be ignored/forgotten. 1

This method avoids removing the newly-ignored files from other developers machines on the next git pull 2

All methods of enforcing Git ignore behavior after-the-fact effectively re-write history and thus have significant ramifications for any public/shared/collaborative repos that might be pulled after this process. 3

General advice: start with a clean repo - everything committed, nothing pending in working directory or index, and make a backup!

Also, the comments/revision history of this answer (and revision history of this question) may be useful/enlightening.

#commit up-to-date .gitignore (if not already existing)
#these commands must be run on each branch
#these commands are not strictly necessary if you don't want/need a .gitignore file.  .git/info/exclude can be used instead

git add .gitignore
git commit -m "Create .gitignore"

#apply standard git ignore behavior only to current index, not working directory (--cached)
#if this command returns nothing, ensure /.git/info/exclude AND/OR .gitignore exist
#this command must be run on each branch
#if using .git/info/exclude, it will need to be modified per branch run, if the branches have differing (per-branch) .gitignore requirements.

git ls-files -z --ignored --exclude-standard | xargs -r0 git rm --cached

#Commit to prevent working directory data loss!
#this commit will be automatically deleted by the --prune-empty flag in the following command
#this command must be run on each branch
#optionally use the --amend flag to merge this commit with the previous one instead of creating 2 commits.

git commit -m "ignored index"

#Apply standard git ignore behavior RETROACTIVELY to all commits from all branches (--all)
#This step WILL delete ignored files from working directory UNLESS they have been dereferenced from the index by the commit above
#This step will also delete any "empty" commits.  If deliberate "empty" commits should be kept, remove --prune-empty and instead run git reset HEAD^ immediately after this command

git filter-branch --tree-filter 'git ls-files -z --ignored --exclude-standard | xargs -r0 git rm -f --ignore-unmatch' --prune-empty --tag-name-filter cat -- --all

#List all still-existing files that are now ignored properly
#if this command returns nothing, it's time to restore from backup and start over
#this command must be run on each branch

git ls-files --other --ignored --exclude-standard

Finally, follow the rest of this GitHub guide (starting at step 6) which includes important warnings/information about the commands below.

git push origin --force --all
git push origin --force --tags
git for-each-ref --format="delete %(refname)" refs/original | git update-ref --stdin
git reflog expire --expire=now --all
git gc --prune=now

Other devs that pull from now-modified remote repo should make a backup and then:

#fetch modified remote

git fetch --all

#"Pull" changes WITHOUT deleting newly-ignored files from working directory
#This will overwrite local tracked files with remote - ensure any local modifications are backed-up/stashed

git reset FETCH_HEAD

Footnotes

1 Because /.git/info/exclude can be applied to all historical commits using the instructions above, perhaps details about getting a .gitignore file into the historical commit(s) that need it is beyond the scope of this answer. I wanted a proper .gitignore to be in the root commit, as if it was the first thing I did. Others may not care since /.git/info/exclude can accomplish the same thing regardless where the .gitignore exists in the commit history, and clearly re-writing history is a very touchy subject, even when aware of the ramifications.

FWIW, potential methods may include git rebase or a git filter-branch that copies an external .gitignore into each commit, like the answers to this question

2 Enforcing git ignore behavior after-the-fact by committing the results of a standalone git rm --cached command may result in newly-ignored file deletion in future pulls from the force-pushed remote. The --prune-empty flag in the git filter-branch command (or git reset HEAD^ afterwards) avoids this problem by automatically removing the previous "delete all ignored files" index-only commit.

3 Re-writing git history also changes commit hashes, which will wreak havoc on future pulls from public/shared/collaborative repos. Please understand the ramifications fully before doing this to such a repo. This GitHub guide specifies the following:

Tell your collaborators to rebase, not merge, any branches they created off of your old (tainted) repository history. One merge commit could reintroduce some or all of the tainted history that you just went to the trouble of purging.

Alternative solutions that do not affect the remote repo are git update-index --assume-unchanged </path/file> or git update-index --skip-worktree <file>, examples of which can be found here.

like image 113
goofology Avatar answered Sep 21 '22 05:09

goofology


This may be only a partial answer but here is how I accomplished retroactively removing files from previous git commits based on my current .gitignore file:

  1. Make a backup of the repo folder you are working on. I just made a .7z archive of the entire folder.
  2. Install git-filter-repo
  3. Copy your .gitignore file somewhere else temporarily. Since I'm on Windows and using Command Prompt, I ran copy .gitignore ..\ and just made the temp copy only directory level up
  4. If your .gitignore file has wildcard filters (like nbproject/Makefile-*), you'll need to edit your temp copied .gitignore file so those lines read glob:nbproject/Makefile-*
  5. Run git filter-repo --invert-paths --paths-from-file ..\.gitignore. My understanding is that this uses the temp copy as a list of files/directories to remove. Note: if you receive an error regarding your repo not being a clean clone, search for "FRESH CLONE SAFETY CHECK AND --FORCE" in the git-filter-repo help. Be careful.

For more info see: git-filter-repo help (Search for "Filtering based on many paths")

Disclaimer: I have no idea what I'm doing but this worked for me.

like image 31
wreckfix Avatar answered Sep 22 '22 05:09

wreckfix