
What is the best way to count "find" results?

Tags: find, bash

My current solution is find <expr> -exec printf '.' \; | wc -c, but it takes far too long when there are more than 10000 results. Is there a faster or better way to do this?

asked Mar 27 '13 16:03 by MechMK1


3 Answers

Why not

find <expr> | wc -l

as a simple, portable solution? Your original solution spawns a new printf process for every single file found, and that's very expensive (as you've just found).

Note that this will overcount if you have filenames with newlines embedded, but if you have that then I suspect your problems run a little deeper.
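To make the newline caveat concrete, here is a small sketch that builds a scratch directory (the names tmp, by_lines and by_dots are made up for the demo) and compares the line count with a variant that keeps the external printf from the question but batches the results with -exec ... {} +, so printf runs once per batch instead of once per file. It assumes mktemp -d is available and that printf reuses its format string for each argument, which POSIX specifies.

```shell
# Scratch directory so the demo touches no real data.
tmp=$(mktemp -d)

# A literal newline, written portably (no $'\n' needed).
nl='
'
# Three files; the last one has a newline embedded in its name.
touch "$tmp/plain" "$tmp/with space" "$tmp/two${nl}lines"

# Line-based count: the embedded newline makes one file look like two lines.
by_lines=$(( $(find "$tmp" -type f | wc -l) ))

# '%.0s.' prints zero characters of each argument followed by one dot,
# so printf emits exactly one dot per path it receives; {} + passes many
# paths to each printf invocation.
by_dots=$(( $(find "$tmp" -type f -exec printf '%.0s.' {} + | wc -c) ))

echo "wc -l: $by_lines, dots: $by_dots"   # wc -l: 4, dots: 3

rm -rf "$tmp"
```

The dot count stays correct because it never depends on what the filenames contain.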

answered Oct 17 '22 12:10 by Brian Agnew


Try this instead (requires find's -printf support, which is a GNU extension):

find <expr> -type f -printf '.' | wc -c

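It is more reliable than counting lines, because a filename containing a newline is still counted once, and it is faster. The -type f test restricts the count to regular files; drop it to count every result.

Note that this uses find's built-in -printf action, not the external printf command, so no extra process is spawned per file.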


Let's benchmark it:

$ ls -1
a
e
l
ll.sh
r
t
y
z

My snippet:

$ time find -type f -printf '.' | wc -c
8

real    0m0.004s
user    0m0.000s
sys     0m0.007s

With full lines:

$ time find -type f | wc -l
8

real    0m0.006s
user    0m0.003s
sys     0m0.000s

So my solution is faster here =) (the important part is the real line), although with only 8 files the difference is tiny.

answered Oct 17 '22 10:10 by Gilles Quenot


This solution is certainly slower than some of the other find -> wc solutions here, but if you were inclined to do something else with the file names in addition to counting them, you could read from the find output.

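n=0
# IFS= and -r keep each name intact; -d '' splits on the NUL bytes from -print0
while IFS= read -r -d '' file; do
    n=$((n + 1)) # count (unlike ((n++)), this never returns a failing status)
    # maybe perform another action on "$file"
done < <(find <expr> -print0)
echo "$n"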

It is just a modification of a solution found in the BashGuide that properly handles files with nonstandard names: find's -print0 makes the NUL byte the output delimiter, and read -d '' makes the NUL byte the delimiter for the loop.
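If only the count is needed, the same NUL-delimited output can be counted without a loop. This is a sketch of a different technique, not the BashGuide snippet: tr deletes every byte except NUL and wc -c counts what is left. The scratch directory and the names tmp and count are made up for the demo, and it assumes a find that supports -print0.

```shell
# Scratch directory so the demo touches no real data.
tmp=$(mktemp -d)

# A literal newline, written portably.
nl='
'
touch "$tmp/plain" "$tmp/with space" "$tmp/two${nl}lines"

# -print0 ends every result with a NUL byte, which cannot occur in a path.
# tr -dc '\000' deletes everything that is not NUL, so wc -c sees exactly
# one byte per result.
count=$(( $(find "$tmp" -type f -print0 | tr -dc '\000' | wc -c) ))

echo "$count"   # 3

rm -rf "$tmp"
```

Because no shell loop runs per file, this should scale better than the while read version when nothing else has to be done with the names.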

answered Oct 17 '22 12:10 by John B