What is the fastest / easiest way to count large number of files in a directory (in Linux)?

I had a directory with a large number of files. Every time I tried to list its contents, I either could not do it at all or there was a significant delay. I tried the ls command on the Linux command line, and the web interface from my hosting provider did not help either.

The problem is that when I just run ls, it takes a significant amount of time to even start displaying anything, so ls | wc -l would not help either.

After some research I came up with this code (in this example it counts the number of new emails on a server):

from os import walk
# walk() yields (root, dirs, files) for each directory under the given path
print sum([len(files) for (root, dirs, files) in walk('/home/myname/Maildir/new')])

The above code is written in Python. I ran it in Python's interactive interpreter and it worked pretty fast (it returned the result almost instantly).

I am interested in the answer to the following question: is it possible to count the files in a directory (without descending into subdirectories) any faster? What is the fastest way to do that?

asked May 21 '11 by Tadeck


3 Answers

I'm not sure about speed, but if you want to just use shell builtins this should work:

#!/bin/sh
# Count the entries matched by the glob, incrementing once per name.
COUNT=0
for file in /path/to/directory/*
do
    COUNT=$((COUNT+1))
done
echo "$COUNT"
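
One caveat worth noting: in plain POSIX sh, an empty directory leaves the glob unexpanded, so the loop runs once over the literal pattern and reports 1. A minimal guarded variant (a sketch, using the same placeholder path):

#!/bin/sh
# Same counting loop, but skip the literal pattern that a non-matching
# glob leaves behind; both -e and -L fail when nothing actually matched.
COUNT=0
for file in /path/to/directory/*
do
    [ -e "$file" ] || [ -L "$file" ] || continue
    COUNT=$((COUNT+1))
done
echo "$COUNT"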
answered Oct 06 '22 by Shea Levy


Total number of files in the given directory

find . -maxdepth 1 -type f | wc -l

Total number of files in the given directory and all subdirectories under it

find . -type f | wc -l

For more details, drop into a terminal and run man find.

answered Oct 06 '22 by Praveen Lobo


ls does a stat(2) call for every file. Other tools, like find(1) and shell wildcard expansion, may avoid this call and just do a readdir. One shell command combination that might work is find dir -maxdepth 1 | wc -l, but it will happily count the directory itself and miscount any filename that contains a newline.
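
A hedged variant (assuming GNU find, with a placeholder path) that sidesteps both pitfalls: -mindepth 1 excludes the directory itself, and printing one fixed character per entry means newlines in filenames cannot skew the count:

# GNU find: print one dot per directory entry and count bytes, not lines
find /path/to/directory -mindepth 1 -maxdepth 1 -printf '.' | wc -c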

From Python, the straightforward way to get just these names is os.listdir(directory). Unlike os.walk and os.path.walk, it does not need to recurse, check file types, or make further Python function calls.

Addendum: It seems ls doesn't always stat. At least on my GNU system, it can do only a getdents call when further information (such as which names are directories) is not requested. getdents is the underlying system call used to implement readdir in GNU/Linux.

Addendum 2: One reason for a delay before ls outputs results is that it sorts and tabulates the listing. ls -U1 may avoid this.
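
If you want to check this on your own system (assuming strace is available; the path is a placeholder), comparing syscall summaries of the two invocations should make the difference visible:

# -c prints a per-syscall summary: ls -l should show a stat-family call per
# entry, while ls -U1 should be dominated by getdents/getdents64.
strace -c ls -U1 /path/to/directory > /dev/null
strace -c ls -l  /path/to/directory > /dev/null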

answered Oct 06 '22 by Yann Vernier