 

Recursively find all files that match a certain pattern

I need to find (or more specifically, count) all files that match this pattern:

*/foo/*.doc

Where the first wildcard asterisk includes a variable number of subdirectories.
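
For example (a hypothetical layout), both ./reports/foo/summary.doc and ./a/b/c/foo/notes.doc should be counted, but ./reports/bar/summary.doc should not.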

asked Apr 21 '14 by pw222


People also ask

How do you recursively find files that contain specific words in their filename?

You can use the grep command or the find command to search all files for a string or for specific words recursively.
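
For example (invoice is just a placeholder word), find can match it against file names anywhere under the current directory:

find . -type f -name '*invoice*'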

How do I find all files containing specific text?

Without a doubt, grep is the best command to search a file (or files) for specific text. By default, it returns all the lines of a file that contain a certain string. This behavior can be changed with the -l option, which instructs grep to return only the names of the files that contain the specified text.
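
For instance (the text and path here are placeholders), this lists only the names of files under /path/to/search that contain the text:

grep -rl "specific text" /path/to/search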

How do you search recursively for a pattern in Unix?

To search recursively for a pattern, invoke grep with the -r option (or --recursive). With this option, grep searches through all files under the specified directory, skipping any symlinks it encounters while recursing.
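
For example (placeholder pattern and directory):

grep -r "pattern" /path/to/directory    # with GNU grep, use -R instead to also follow symlinked directories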


4 Answers

With GNU find you can use -regex, which (unlike -name) matches the entire path:

find . -regex '.*/foo/[^/]*\.doc'

To just count the number of files:

find . -regex '.*/foo/[^/]*\.doc' -printf '%i\n' | wc -l

(The %i format code causes find to print the inode number instead of the filename; unlike the filename, the inode number is guaranteed not to contain characters such as a newline, so counting lines is more reliable. Thanks to @tripleee for the suggestion.)
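
An equivalent counting trick (still GNU find, not from the original answer) is to print a single character per match and count bytes instead of lines:

find . -regex '.*/foo/[^/]*\.doc' -printf '.' | wc -c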

I don't know if that will work on OSX, though.
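
For what it's worth, the BSD find shipped with macOS also has a -regex primary (basic regular expressions by default, extended ones with the -E option), so something like this should behave the same there; untested:

find -E . -regex '.*/foo/[^/]*\.doc'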

answered Oct 03 '22 by rici


how about:

find BASE_OF_SEARCH/*/foo -name \*.doc -type f | wc -l

What this is doing:

  • start at directory BASE_OF_SEARCH/
  • look in every directory named foo that sits exactly one level of subdirectory down (BASE_OF_SEARCH/<something>/foo); see the expansion example after this list
  • look for files named like *.doc
  • count the lines of the result (one per file)
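
To see which directories that glob actually covers, you can expand it by hand (projA and projB are made-up names); note it only reaches foo directories exactly one level down:

echo BASE_OF_SEARCH/*/foo    # e.g. prints: BASE_OF_SEARCH/projA/foo BASE_OF_SEARCH/projB/foo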

The benefits of this method:

  • neither recursive nor iterative (no explicit shell loops)
  • it's easy to read, and if you include it in a script it's fairly easy to decipher (a regex sometimes is not).

UPDATE: you want variable depth? ok:

find BASE_OF_SEARCH -name \*.doc -type f | grep foo | wc -l

  • start at directory BASE_OF_SEARCH
  • look for files named like *.doc
  • only show the lines of this result that include "foo"
  • count the lines of the result (one per file)

Optionally, you could filter out results that merely have "foo" in the filename, because this pipeline counts those too.
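
One way to avoid that (my addition, not part of the original answer) is to let find itself require a foo directory in the path with -path, so a file like myfoo.doc outside any foo directory is not counted:

find BASE_OF_SEARCH -type f -name '*.doc' -path '*/foo/*' | wc -l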

answered Oct 03 '22 by MonkeyWidget


Based on the answers on this page and on other pages, I managed to put together the following: it searches the current folder and everything under it for files with the pdf extension, then filters for those whose name contains test_text.

find . -name "*.pdf" | grep test_text | wc -l
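
A slightly shorter variant, if you prefer, lets grep do the counting with -c instead of piping through wc:

find . -name "*.pdf" | grep -c test_text
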
answered Oct 03 '22 by Tsitsi_Catto


Untested, but try:

find . -type d -name foo -print | while read -r d; do printf '%s\n' "$d"/*.doc; done | wc -l

find all the "foo" directories (at varying depths; this ignores symlinks, but if that's part of the problem you can handle them too); then use shell globbing to expand each one's ".doc" files, printed one per line, and count them.
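
One caveat with the globbing approach (bash-specific, my addition): a foo directory with no .doc files leaves the pattern unexpanded and adds one spurious line to the count; enabling nullglob first avoids that:

shopt -s nullglob    # make empty globs expand to nothing instead of the literal pattern
find . -type d -name foo -print | while read -r d; do printf '%s\n' "$d"/*.doc; done | wc -l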

answered Oct 03 '22 by mpez0