
Why did --cached option on filter-branch remove files from working directory?

I needed to remove some Xcode files from an old repo that should have been ignored, so I ran the following command:

git filter-branch --index-filter 'git rm -f --cached --ignore-unmatch *mode1v3 *pbxuser' HEAD

My understanding was that adding --cached would leave the current working directory untouched, but Git deleted the matching files there too. Luckily I had a backup(!), but I'm curious why it does this. Or am I misunderstanding what --cached does?

martinjbaker asked Aug 07 '11 10:08


2 Answers

The culprit is not the git rm command. Its --cached option does work as you describe: it removes files from the index only and leaves the working directory alone. You can easily verify that in a small Git repo.
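As a minimal sketch of that check, the following creates a scratch repository in a temporary directory and runs git rm --cached on a single file. The file name demo.pbxuser and the commit identity are made up for illustration; nothing here comes from the original repository.

```shell
# Scratch repository in a temporary directory (hypothetical file name and identity).
repo=$(mktemp -d)
cd "$repo"
git init -q
echo "user settings" > demo.pbxuser
git add demo.pbxuser
git -c user.name=demo -c user.email=demo@example.com commit -q -m "Add demo file"

# Remove the file from the index only.
git rm -q --cached demo.pbxuser

ls demo.pbxuser      # the file is still present on disk
git ls-files         # prints nothing: the file is no longer tracked in the index
git status --short   # shows a staged deletion and an untracked copy of the file
```

Run outside of filter-branch, --cached leaves the file on disk, which suggests the deletion in the question comes from somewhere else.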

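Although the man page does not mention it, git filter-branch does not preserve your working tree. Once the rewrite finishes, it updates the working tree to match the rewritten branch, so files that no longer exist in the new HEAD disappear from disk as well. The command also refuses to run if your working tree is not clean, which is already a hint that it intends to touch it.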

But even though the files are gone from the working tree, they are not gone from the repo. They are just no longer in any commit reachable from your current branch. filter-branch stores a reference to your branch as it was before the rewrite in the refs/original/ namespace.

Use git show-ref to see it.

You could check out the old version to access your removed files. Alternatively, git cat-file blob refs/original/refs/heads/master:foo prints the contents of a file without checking anything out (use the reference shown by show-ref; foo is the name of the desired file). There are plenty of other possibilities.
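The recovery steps above can be sketched end to end in a throwaway repository. The file names, the commit identity and the recovered.pbxuser output name are assumptions for illustration. FILTER_BRANCH_SQUELCH_WARNING only matters on newer Git versions, where it skips the warning pause that filter-branch prints.

```shell
# Assumption: a throwaway repo; file names and identity are hypothetical.
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the warning pause in newer Git versions
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

repo=$(mktemp -d)
cd "$repo"
git init -q
echo "settings" > demo.pbxuser
echo "code" > main.c
git add .
git commit -q -m "Initial commit"
branch=$(git symbolic-ref --short HEAD)  # master or main, depending on configuration

# Same kind of rewrite as in the question, limited to one pattern.
git filter-branch --index-filter 'git rm -q --cached --ignore-unmatch "*pbxuser"' HEAD

ls             # demo.pbxuser is gone from the working tree
git show-ref   # includes refs/original/refs/heads/<branch>

# Read the file back from the pre-rewrite branch without checking it out.
git cat-file blob "refs/original/refs/heads/$branch:demo.pbxuser" > recovered.pbxuser
```

Writing the blob to a new file name keeps the rewritten branch untouched; git checkout "refs/original/refs/heads/$branch" -- demo.pbxuser would instead restore the file under its original name and stage it again.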

You can use gitk --all to navigate through both your original and your rewritten branches, and you will see that nothing is really gone.

Uwe Geuder answered Oct 17 '22 21:10


The behaviour of git-filter-branch can be surprising, as you've discovered - and it won't protect you from unintended consequences when you run it.

Instead I'd recommend using the BFG Repo-Cleaner, a simpler, faster alternative specifically designed for deleting files from Git history. One way in which it makes your life easier here is that it will not delete, or change in any way, files in your latest commit.

You should follow the usage instructions - but the core bit is just this: download the BFG's jar (requires Java 6 or above) and run this command:

$ java -jar bfg.jar --delete-files '*{mode1v3,pbxuser}' my-repo.git

Any file matching that expression in your repository history - which isn't also in your latest commit - will be deleted. You can then use git gc to clean away the dead data:

$ git gc --prune=now --aggressive

The BFG is generally much simpler to use than git-filter-branch - the options are tailored around these two common use-cases:

  • Removing Crazy Big Files
  • Removing Passwords, Credentials & other Private data

Full disclosure: I'm the author of the BFG Repo-Cleaner.

Roberto Tyley answered Oct 17 '22 21:10