
Quick file access in a directory with 500,000 files

I have a directory with 500,000 files in it. I would like to access them as quickly as possible. The algorithm requires me to repeatedly open and close them (I can't have 500,000 files open simultaneously).

How can I do that efficiently? I had originally thought that I could cache the inodes and open the files that way, but *nix doesn't provide a way to open files by inode (security or some such).

The other option is to just not worry about it and hope the FS does a good job of file lookup in a directory. If that is the best option, which filesystems would work best? Do certain filename patterns look up faster than others, e.g. 01234.txt vs. foo.txt?

BTW this is all on Linux.

deft_code asked Nov 21 '08 22:11




1 Answer

Assuming your file system is ext3, your directory is indexed with a hashed B-tree (HTree) if the dir_index feature is enabled. That's going to give you as much of a boost as anything you could code into your app.
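To find out whether dir_index is actually on, one option is to look at the "Filesystem features:" line that `tune2fs -l` prints. The sketch below is only an illustration: the function names are made up for this example, the device path is a placeholder, and running `tune2fs` normally needs root and the e2fsprogs package.

```python
import subprocess


def has_dir_index(tune2fs_output: str) -> bool:
    """Return True if 'dir_index' is listed on the 'Filesystem features:' line."""
    for line in tune2fs_output.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "Filesystem features":
            # Features are printed as a whitespace-separated list.
            return "dir_index" in value.split()
    return False


def device_has_dir_index(device: str) -> bool:
    """Run `tune2fs -l` on a device (hypothetical helper; usually needs root)."""
    result = subprocess.run(
        ["tune2fs", "-l", device],
        capture_output=True,
        text=True,
        check=True,
    )
    return has_dir_index(result.stdout)


if __name__ == "__main__":
    # "/dev/sdXN" is a placeholder; substitute the real block device.
    print(device_has_dir_index("/dev/sdXN"))
```

If the feature is missing, it can usually be turned on for an existing ext3 filesystem with `tune2fs -O dir_index <device>`, followed by `e2fsck -fD <device>` on the unmounted filesystem so that existing directories get indexed.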

If the directory is indexed, your file naming scheme shouldn't matter.
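If you would rather measure than take that on faith, a rough benchmark is easy to write. The sketch below is an assumption-laden illustration, not a rigorous test: the function names, the file count, and the two naming patterns are invented for the example, and because the files are freshly created, the lookups will mostly hit the kernel's in-memory caches rather than the on-disk index. Running `time_open_close` against the real directory gives a more honest number.

```python
import os
import random
import tempfile
import time


def make_files(directory, names):
    """Create an empty file for each name."""
    for name in names:
        with open(os.path.join(directory, name), "w"):
            pass


def time_open_close(directory, names, rounds=1):
    """Open and close every file in random order; return elapsed seconds."""
    order = list(names) * rounds
    random.shuffle(order)
    start = time.perf_counter()
    for name in order:
        fd = os.open(os.path.join(directory, name), os.O_RDONLY)
        os.close(fd)
    return time.perf_counter() - start


def compare_naming(count=2000):
    """Time two illustrative naming schemes in separate temporary directories."""
    schemes = {
        "numeric": [f"{i:06d}.txt" for i in range(count)],
        "worded": [f"foo_{i:x}_bar.txt" for i in range(count)],
    }
    results = {}
    for label, names in schemes.items():
        with tempfile.TemporaryDirectory() as directory:
            make_files(directory, names)
            results[label] = time_open_close(directory, names)
    return results


if __name__ == "__main__":
    for label, seconds in compare_naming().items():
        print(f"{label}: {seconds:.4f} s")
```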

http://lonesysadmin.net/2007/08/17/use-dir_index-for-your-new-ext3-filesystems/

Corbin March answered Sep 28 '22 14:09