
zsh glob qualifier to exclude binary files

Tags: zsh, glob

I am looking for files that contain the string "abc" in the current directory and all subdirectories:

grep abc **/*(.)

The output contains lines such as:

...
Binary file test.pdf matches
...

Is it possible to exclude binary files in the glob qualifier?

EDIT: The use of grep here is just an example. I am interested in excluding binary files by zsh globbing qualifiers, rather than in the appropriate grep options.

asked May 01 '14 12:05 by sieste


2 Answers

The message "Binary file test.pdf matches" is not printed by zsh but by grep itself.

The reason is that, most of the time, printing the matching line of a binary file would also print "garbage" (i.e. non-printable characters, really long lines, etc.).

In your example, **/*(.) is a zsh glob expansion. You can check what it expands to with echo:

$ echo **/*(.)

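Please note that **/*(.) doesn't match files whose names start with a dot (unless the GLOB_DOTS option is set or the D qualifier is added), as this example in an otherwise empty directory shows: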

$ mkdir test
$ cd test
$ touch .mytest
$ echo **/*(.)
zsh: no matches found: **/*(.)

Now, if you want to search recursively for a pattern (here "abc") in the current directory while skipping binary files, there's a very easy way:

$ grep -rI abc .

If you also want to ignore files that start with a dot in the current directory:

$ grep -rI abc *

As for using zsh globbing to filter out binary files, this is the relevant part of zshexpn(1):

A qualifier may be any one of the following:

   /      directories
   F      `full'  (i.e.  non-empty)  directories.  
   .      plain files
   @      symbolic links
   =      sockets
   (...)

Please note that although the manual says "plain files" it doesn't mean "plain text files". It means regular files.

AFAIK, zsh has no built-in glob qualifier that selects files based on whether their content is binary.

Zsh doesn't read the content of files when globbing; instead, it works with the filesystem metadata available.

Because of that, were zsh to implement this feature, globbing would become considerably slower than it currently is (unless, of course, filesystems implement a way to "tag" binary files, which IMO is unlikely).

You could try filtering out files with the executable bit set, but that would be hugely imprecise (i.e. executable scripts would be excluded, and non-executable binary files would be included).

This task is better suited to grep itself, since it will be reading the files anyway.
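
To illustrate letting grep do the filtering, here is a minimal sketch that runs in a throwaway directory. The file names notes.txt and blob.dat are made up for the demonstration, and it assumes a grep that supports -I (GNU and BSD grep do):

```shell
# Create a scratch directory with one text file and one file containing a NUL byte.
demo=$(mktemp -d)
printf 'abc in plain text\n' > "$demo/notes.txt"
printf 'abc\000binary payload\n' > "$demo/blob.dat"

# -r recurses, -I treats binary files as non-matching, -l prints only file names.
matches=$(cd "$demo" && grep -rIl abc .)
printf '%s\n' "$matches"

# Clean up the scratch directory.
rm -rf "$demo"
```

Both files contain "abc", but only the text file should be listed, because grep decides a file is binary after reading it (here, because of the NUL byte).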

answered Jan 03 '23 10:01 by diogovk


You can execute arbitrary code as a glob qualifier. Look for estring and +cmd in zshexpn(1).

Without any setup:

ls **/*(.e:'file --mime $REPLY | grep -iqv binary':)

or to make it less awkward:

notbinary() { file --mime $REPLY | grep -iqv binary }
ls **/*(.+notbinary)
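
A possible variation on the function above, sketched here rather than taken from the answer: since file --mime prints the file name before the MIME type, a name that itself contains "binary" would be filtered out by mistake. The hypothetical helper is_text below asks grep instead, assuming a grep that supports -I (GNU and BSD grep do):

```shell
# Hypothetical helper (not part of the original answer): succeed only if grep
# considers the file to be text. -I makes grep treat a binary file as having
# no match, -q suppresses output, and the pattern "." requires at least one
# non-empty line, so empty files are rejected as well.
is_text() {
    grep -Iq -- . "$1"
}
```

In zsh it could then be plugged into the estring qualifier, for example ls **/*(.e:'is_text $REPLY':). Like the file-based version, this reads every candidate file, so it will be slower than a plain glob.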

answered Jan 03 '23 11:01 by wilywampa