Quickest way to count the number of files in a directory containing hundreds of thousands of files

I work on a Solaris system that processes large numbers of files and stores their information in a database. (Yes, I know that querying the database is the quickest way to find out how many files we have.) I need a fast way to monitor the files as they progress through the system on their way to being stored in the database.

Currently I use a Perl script that reads the directory into an array, takes the size of the array, and sends it to a monitoring script. Unfortunately, as our system grows, this monitor is getting slower and slower.

I am looking for a method that operates much more quickly, instead of pausing and updating every 15-20 seconds after performing the count operation on all the directories involved.

I am relatively certain that my bottleneck is the read-directory-into-array operation.

I don't need any information about the files, I don't need sizes or file names, just the number of files in the directory.

In my code I do not count hidden files or the text files I use to hold configuration information. It would be great if this functionality were preserved, but it is certainly not mandatory.

I have found some references to counting inodes with C code or something along those lines, but I am not very experienced in that area.

I would like to make this monitor as real-time as possible.

The Perl code I use looks like this:

opendir (DIR, $currentDir) or die "Cannot open directory: $!";
@files = grep ! m/^\./ && ! /config_file/, readdir DIR; # skip hidden files and config files
closedir(DIR);
$count = @files;
asked Jul 18 '13 19:07 by Andrew



1 Answer

What you do right now reads the whole directory (more or less) into memory only to discard that content for its count. Avoid that by streaming the directory instead:

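my $count = 0;                      # start at 0 so an empty directory reports 0, not undef
opendir(my $dh, $curDir) or die "opendir($curDir): $!";
while (defined(my $de = readdir($dh))) {   # defined() so a file named "0" does not end the loop
  next if $de =~ /^\./ or $de =~ /config_file/;   # skip hidden files and config files
  $count++;
}
closedir($dh);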

Importantly, don't use glob() in any of its forms. glob() will expensively stat() every entry, which is not overhead you want.

Now, you might have much more sophisticated and lighter weight ways of doing this depending on OS capabilities or filesystem capabilities (Linux, by way of comparison, offers inotify), but streaming the dir as above is about as good as you'll portably get.
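
As a minimal sketch of the streaming approach above, the loop can be wrapped in a small helper and applied to each monitored directory in turn. The name count_entries and the use of command-line arguments for the directory list are assumptions made for illustration; they are not part of the original code.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for illustration): count the entries in
# one directory without building a list of them. Dotfiles and any name
# containing "config_file" are skipped, mirroring the filter in the question.
sub count_entries {
    my ($dir) = @_;
    opendir(my $dh, $dir) or die "opendir($dir): $!";
    my $count = 0;
    while (defined(my $entry = readdir($dh))) {
        next if $entry =~ /^\./ or $entry =~ /config_file/;
        $count++;
    }
    closedir($dh);
    return $count;
}

# Report one count per directory given on the command line.
for my $dir (@ARGV) {
    printf "%s: %d\n", $dir, count_entries($dir);
}
```

It could be run as, for example, perl count.pl /path/to/dir1 /path/to/dir2 (the script name and paths are placeholders). Because no list of names is accumulated, memory use stays flat however large each directory is.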

answered Nov 04 '22 00:11 by pilcrow