
How to list all text (non-binary) files in a git repository?

I have a repository with a lot of autogenerated source files I've marked as "binary" in .gitattributes (they are checked in because not everyone has access to the generator tools). Additionally, the repo has a lot of source-ish files in ignored directories (again, generated as part of the build processes), and a number of actual binary files (e.g. little resource files like icons).

I'd now like to find all the non-auto-generated and non-ignored files in the repo. I thought I'd just do this with find and a bunch of exclusions, but now I have a horrendous find statement with a dozen clauses (and it still doesn't perfectly do the job). git ls-files works but shows me all the binary files without differentiation, which I have to filter out.

So, I'm wondering: is there a simple command I can run that lists every file checked into the repo which git considers a "text" file?

asked by nneonneo on Sep 24 '13 at 04:09



1 Answer

git grep --cached -Il '' 

lists all non-empty regular (no symlinks) text files:

  • -I: don't match the pattern in binary files
  • -l: only show the matching file names, not matching lines
  • '': the empty string makes git grep match any non-empty file
  • --cached: search the index instead of the working tree, so files added with git add but not yet committed are also found (optional)
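
To see the flags above in action, here is a minimal sketch that builds a throwaway repository and runs the command against it. The file names (notes.txt, icon.bin, empty.txt, gen.c) and the .gitattributes rule are made up for illustration; they are not from the question's repository.

```shell
#!/bin/sh
# Demo in a temporary repository; every file name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .

printf 'hello\n' > notes.txt               # ordinary text file
printf '\000\001\002' > icon.bin           # NUL bytes: git detects this as binary
: > empty.txt                              # empty file
printf 'generated\n' > gen.c               # text content, but ...
printf 'gen.c binary\n' > .gitattributes   # ... marked binary via attributes

git add .                                  # staged only; --cached needs no commit

# List the staged files that git considers non-empty text files.
text_files=$(git grep --cached -Il '')
printf '%s\n' "$text_files"
```

In this sketch only notes.txt and .gitattributes should be listed: icon.bin and gen.c are skipped by -I, and empty.txt is skipped because the empty pattern has nothing to match in an empty file.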

Or you could use the technique from "How to determine if Git handles a file as binary or as text?" in a for loop with git ls-files.
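
A loop over git ls-files could look roughly like the sketch below. It assumes the per-file check is git diff --numstat against the empty tree, which prints "-" in place of line counts for files git treats as binary; the answer does not say which check it has in mind, so that choice is an assumption. The function name list_text_files and the demo file names are made up, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Demo in a temporary repository; every file name here is hypothetical.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
printf 'hello\n' > notes.txt       # ordinary text file
printf '\000\001\002' > icon.bin   # NUL bytes: git detects this as binary
: > empty.txt                      # empty file
git add .

# ID of the empty tree, computed rather than hard-coded.
empty_tree=$(git hash-object -t tree /dev/null)

# Print every tracked file that git does not treat as binary.
list_text_files() {
    git ls-files | while IFS= read -r f; do
        # --numstat prints "-<TAB>-<TAB>path" for binary files,
        # and "<added><TAB><deleted><TAB>path" for text files.
        stat=$(git diff --cached --numstat "$empty_tree" -- "$f")
        case $stat in
            -*) ;;                      # binary: skip
            *)  printf '%s\n' "$f" ;;   # text (empty files show up as 0 0)
        esac
    done
}

loop_files=$(list_text_files)
printf '%s\n' "$loop_files"
```

Unlike the git grep one-liner, this variant also lists empty files, at the cost of one git diff call per file, which will be slow on large repositories.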

TODO: handle empty files, which the command above does not list.

To find all binary files instead, see "Find all binary files in git HEAD".

Tested on Git 2.16.1 with this test repo.