
Remove all binary files recursively from a Git repo and its commit history

I have read several threads on removing large binary files from Git commit history, but my problem is slightly different, so I am asking here to understand and confirm the steps.

My git repo is ~/foo. I want to remove all *.jpg, *.png, *.mp4, *.ogv (and so on) from one of the directories inside the repo, specifically from ~/foo/public/data.

Step 1. Remove the files

~/foo/data > find -E . -regex ".*\.(jpg|png|mp4|m4v|ogv|webm)" \
    -exec git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch {}' \
    --prune-empty --tag-name-filter cat -- --all \;

Step 2. Add the binary file extensions to .gitignore and commit .gitignore

~/foo/data > cd ..
~/foo > git add .gitignore
~/foo > git commit -m "added binary files to .gitignore"
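
Step 2 could be sketched roughly as follows. This is a minimal example in a throwaway repository, assuming the media files live under public/data; the scratch directory and the file names passed to git check-ignore are made up for illustration.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)     # scratch repo, standing in for ~/foo
cd "$demo"
git init -q .

# Ignore the media extensions from Step 1, but only below public/data.
# In .gitignore, "/**/" matches zero or more directory levels.
cat >> .gitignore <<'EOF'
public/data/**/*.jpg
public/data/**/*.png
public/data/**/*.mp4
public/data/**/*.m4v
public/data/**/*.ogv
public/data/**/*.webm
EOF

# Ask git which of these (hypothetical) paths the new rules would ignore.
# check-ignore exits non-zero when nothing matches, hence the "|| true".
ignored=$(git check-ignore public/data/photo.jpg \
                           public/data/sub/clip.mp4 \
                           public/data/notes.txt || true)
echo "$ignored"
```

Running git check-ignore before committing is a cheap way to confirm that the patterns catch the media files and nothing else.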

Step 3. Push everything

~/foo > git push origin master --force

Am I on the right track above? I want to measure twice before I cut once, so to speak.

Update: Well, the above gives me the error

You need to run this command from the toplevel of the working tree.
You need to run this command from the toplevel of the working tree.
..

So I went up the tree to the top level and re-ran the command, and it all worked.

asked Jul 02 '13 06:07 by punkish



1 Answer

The process seems right.
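
One possible refinement, given the error in the update: filter-branch has to be started from the top level of the working tree, and a single pass with quoted pathspecs can stand in for one filter-branch run per file found. The sketch below assumes a throwaway repository with made-up file names and the public/data location from the question; it is an illustration, not the exact command that was run.

```shell
#!/bin/sh
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip the deprecation notice in newer Git

demo=$(mktemp -d)                        # scratch repo, standing in for ~/foo
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p public/data
echo "text"       > public/data/notes.txt
echo "fake image" > public/data/photo.jpg
echo "fake video" > public/data/clip.mp4
git add .
git commit -q -m "add data"

# Run once, from the top level. The pathspecs are single-quoted so that
# git, not the shell, expands them; in a git pathspec "*" also matches "/".
git filter-branch --force --index-filter \
    "git rm -r --cached --ignore-unmatch --quiet -- \
        'public/data/*.jpg' 'public/data/*.png' 'public/data/*.mp4' \
        'public/data/*.m4v' 'public/data/*.ogv' 'public/data/*.webm'" \
    --prune-empty --tag-name-filter cat -- --all

# List every path still present in branch history. filter-branch keeps
# backups under refs/original/, so inspect --branches rather than --all.
remaining=$(git log --branches --pretty=format: --name-only | sort -u | sed '/^$/d')
echo "$remaining"
```

The old objects stay reachable through refs/original/ and the reflog until those are removed and the repository is garbage-collected.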

You can also test your cleanup with a tool like BFG Repo-Cleaner, as in this answer:

java -jar bfg.jar --delete-files '*.{jpg,png,mp4,m4v,ogv,webm}' ${bare-repo-dir};

(Note that BFG, by default, does not delete anything in your latest commit, so you first need to remove those files from the current index and make a "clean" commit. BFG then cleans all earlier commits.)
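
The "clean" commit can be made with plain git before BFG is run. Here is a minimal sketch, assuming a throwaway repository with made-up file names and the public/data location from the question; the BFG step itself appears only as a comment because it needs Java and the BFG jar.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)                        # scratch repo, standing in for ~/foo
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p public/data
echo "text"       > public/data/notes.txt
echo "fake image" > public/data/photo.jpg
git add .
git commit -q -m "add data"

# Stop tracking the media files; --cached leaves them on disk.
git rm -r -q --cached --ignore-unmatch -- \
    'public/data/*.jpg' 'public/data/*.png' 'public/data/*.mp4' \
    'public/data/*.m4v' 'public/data/*.ogv' 'public/data/*.webm'
git commit -q -m "stop tracking binary files"

# The latest commit should now contain no media files.
head_files=$(git ls-tree -r --name-only HEAD)
echo "$head_files"

# Next steps (not run here): push, run BFG against a fresh mirror clone,
# then expire the reflog and garbage-collect, for example:
#   git reflog expire --expire=now --all && git gc --prune=now --aggressive
```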

Update 2020: to remove files, you would now use git filter-repo (Git 2.22+, Q4 2019), since git filter-branch and BFG are now, 7 years later, obsolete.

git filter-repo --path fileToRemove --invert-paths
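
For the extension-based case in the question, --path-glob can replace --path. The sketch below assumes git filter-repo is installed separately (it is not bundled with Git) and skips the rewrite when it is missing; the scratch repository and file names are made up for illustration.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)                        # scratch repo, standing in for ~/foo
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

mkdir -p public/data
echo "text"       > public/data/notes.txt
echo "fake image" > public/data/photo.jpg
echo "fake video" > public/data/clip.mp4
git add .
git commit -q -m "add data"

ran=no
if git filter-repo --version >/dev/null 2>&1; then
    # --invert-paths drops everything matched by the --path-glob options.
    # --force is needed here only because this scratch repo is not a
    # fresh clone; on a real repository, work on a fresh clone instead.
    git filter-repo --force --invert-paths \
        --path-glob 'public/data/*.jpg'  --path-glob 'public/data/*.png' \
        --path-glob 'public/data/*.mp4'  --path-glob 'public/data/*.m4v' \
        --path-glob 'public/data/*.ogv'  --path-glob 'public/data/*.webm'
    ran=yes
else
    echo "git filter-repo not installed; skipping the rewrite"
fi

# Paths still present in branch history.
remaining=$(git log --branches --pretty=format: --name-only | sort -u | sed '/^$/d')
echo "$remaining"
```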
answered Oct 20 '22 15:10 by VonC