 

What happens if there are too many files under a single directory in Linux?

Tags: file, linux, system

If there are around 1,000,000 individual files (mostly about 100 KB in size) in a single directory, laid out flat (no subdirectories underneath), are there going to be any compromises in efficiency, or disadvantages in any other way?

Asked Mar 18 '09 by datasn.io


People also ask

Is there a limit on number of files in Linux directory?

Maximum number of files: 2^32 - 1 (4,294,967,295). Maximum number of files per directory: unlimited. Maximum file size: 2^44 - 1 bytes (16 TiB - 1). Maximum volume size: 2^48 - 1 bytes (256 TiB - 1).
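In practice, the ceiling you hit first on a Linux filesystem is usually the inode count rather than a per-directory limit; a quick way to check it with df (the mount point here is hypothetical):

    # -i reports inode totals and usage instead of block usage
    df -i /data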

How many files in a directory is too many?

It is entirely based on context, activity, and your definition of "too". The answer is likely between 100 and 10 million.

Can you have too many files in a folder?

You can put 4,294,967,295 files into a single folder if the drive is formatted with NTFS (it would be unusual if it were not), as long as you do not exceed 256 terabytes (the single-file size and volume limit) or the disk space actually available, whichever is less.

What is the maximum number of files this file system can have?

NTFS limits: maximum disk size 256 terabytes; maximum file size 256 terabytes; maximum number of files on disk 4,294,967,295; maximum number of files in a single folder 4,294,967,295.


1 Answer

ARG_MAX is going to take issue with that... for instance, rm -rf * (run while in the directory) is going to fail with "Argument list too long". Utilities that do any kind of globbing (or the shell itself) will see some functionality break.
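A minimal sketch of what that looks like in practice; the directory path and file count are hypothetical, and the limit itself can be checked with getconf:

    # show the per-exec limit on argument (plus environment) size, in bytes
    getconf ARG_MAX

    # hypothetical directory holding ~1,000,000 files
    cd /data/manyfiles
    rm -rf *
    # -> bash: /bin/rm: Argument list too long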

If that directory is available to the public (let's say via FTP or a web server), you may encounter additional problems.

The effect on any given file system depends entirely on that file system. How frequently are these files accessed, and what is the file system? Remember, Linux (by default) prefers to keep recently accessed files in the page cache while pushing process memory into swap, depending on your settings. Is this directory served via HTTP? Is Google going to see and crawl it? If so, you might need to adjust VFS cache pressure and swappiness.
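Those two knobs are plain sysctls; a sketch of inspecting and adjusting them, with the values here being purely illustrative rather than recommendations:

    # lower values make the kernel hold on to dentry/inode caches longer
    sysctl vm.vfs_cache_pressure    # default 100
    sysctl vm.swappiness            # default 60

    # illustrative adjustments (run as root, or prefix with sudo)
    sysctl -w vm.vfs_cache_pressure=50
    sysctl -w vm.swappiness=10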

Edit:

ARG_MAX is a system-wide limit on how much argument data can be presented to a program's entry point. So, let's take 'rm' and the example "rm -rf *" - the shell is going to expand '*' into the list of matching file names, which in turn become the arguments to 'rm'.

The same thing is going to happen with ls, and several other tools. For instance, ls foo* might break if too many files start with 'foo'.
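A common workaround is to let find enumerate the names instead of the shell glob, so no single exec ever receives the full list; a sketch, assuming GNU findutils:

    # delete matching files without handing every name to one rm invocation
    find . -maxdepth 1 -type f -name 'foo*' -delete

    # or batch the names so each rm call stays under ARG_MAX
    find . -maxdepth 1 -type f -name 'foo*' -print0 | xargs -0 rm --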

I'd advise (no matter what file system is in use) breaking it up into smaller directory chunks, for that reason alone.
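One common way to do that is to bucket files by a prefix of their name; a sketch, assuming bash, that the files sit flat in the current directory, and that every name is at least two characters long:

    # move each file into a subdirectory named after its first two characters
    for f in *; do
        [ -f "$f" ] || continue      # skip anything that is not a regular file
        d=${f:0:2}
        mkdir -p "$d"
        mv -- "$f" "$d/"
    done

Note that the glob here is expanded inside the shell's own loop rather than passed to an external program, so it does not run into ARG_MAX the way rm -rf * does.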

Answered Sep 28 '22 by Tim Post