Maximum number of files/directories on Linux?

I'm developing a LAMP online store, which will allow admins to upload multiple images for each item.

My concern is - right off the bat there will be 20000 items meaning roughly 60000 images.

Questions:

  1. What is the maximum number of files and/or directories on Linux?

  2. What is the usual way of handling this situation (best practice)?

My idea was to make a directory for each item, based on its unique ID, but then I'd still have 20000 directories in a main uploads directory, and it will grow indefinitely as old items won't be removed.

Thanks for any help.

CodeVirtuoso asked Nov 23 '11 08:11



5 Answers

ext[234] filesystems have a fixed maximum number of inodes; every file or directory requires one inode. You can see the current count and limits with df -i. For example, on a 15GB ext3 filesystem, created with the default settings:

Filesystem           Inodes  IUsed   IFree IUse% Mounted on
/dev/xvda           1933312 134815 1798497    7% /

There's no limit on directories in particular beyond this; keep in mind that every file or directory requires at least one filesystem block (typically 4KB), though, even if it's a directory with only a single item in it.
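The same counters that df -i prints can also be read from a program, which is handy if you want to monitor inode usage as the uploads directory grows. A minimal sketch in Python, assuming a Unix-like system where os.statvfs is available; the function name inode_usage is only an illustration, not part of any tool mentioned above:

```python
import os


def inode_usage(path: str = "/") -> dict:
    """Report inode totals for the filesystem that contains `path`."""
    st = os.statvfs(path)
    total = st.f_files   # total inodes on the filesystem
    free = st.f_ffree    # free inodes
    used = total - free
    # Some filesystems report 0 total inodes (no fixed inode table),
    # so guard against division by zero.
    percent = (100.0 * used / total) if total else 0.0
    return {"total": total, "used": used, "free": free, "percent_used": percent}


if __name__ == "__main__":
    print(inode_usage("/"))
```

The values correspond to the Inodes, IUsed and IFree columns of df -i for the same mount point.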

As you can see, though, 80,000 inodes is unlikely to be a problem. And with the dir_index option (which can be enabled with tune2fs), lookups in large directories are not much of an issue. However, note that many administrative tools (such as ls or rm) can have a hard time dealing with directories that contain too many files. As such, it's recommended to split your files up so that no single directory holds more than a few hundred to a thousand items. An easy way to do this is to hash whatever ID you're using, and use the first few hex digits as intermediate directories.

For example, say you have item ID 12345, and it hashes to 'DEADBEEF02842.......'. You might store your files under /storage/root/d/e/12345. With two hex digits as intermediate directories, the items are spread over 256 buckets, so each directory holds roughly 1/256th of the items it otherwise would.
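The hashing scheme described above could be sketched roughly like this in Python. The choice of MD5, the function name sharded_dir and the levels parameter are assumptions for illustration; any stable hash would do, and /storage/root is just the example root used above:

```python
import hashlib
from pathlib import PurePosixPath


def sharded_dir(item_id: int, root: str = "/storage/root", levels: int = 2) -> PurePosixPath:
    """Build an item's directory from the leading hex digits of a hash of its ID.

    With levels=2 the items are spread over 16 * 16 = 256 buckets.
    """
    # MD5 is used only to spread IDs evenly, not for security (assumption).
    digest = hashlib.md5(str(item_id).encode("ascii")).hexdigest()
    # One intermediate directory per leading hex digit, e.g. "d" and "e".
    shards = list(digest[:levels])
    return PurePosixPath(root, *shards, str(item_id))


if __name__ == "__main__":
    print(sharded_dir(12345))
```

Because the path is computed from the ID alone, nothing extra has to be stored to find an item's images again later.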

bdonlan answered Oct 01 '22 22:10


If your server's filesystem has the dir_index feature turned on (see tune2fs(8) for details on checking and turning on the feature) then you can reasonably store upwards of 100,000 files in a directory before the performance degrades. (dir_index has been the default for new filesystems for most of the distributions for several years now, so it would only be an old filesystem that doesn't have the feature on by default.)

That said, adding another directory level to reduce the number of files in a directory by a factor of 16 or 256 would drastically improve the chances of things like ls * working without over-running the kernel's maximum argv size.

Typically, this is done by something like:

/a/a1111
/a/a1112
...
/b/b1111
...
/c/c6565
...

i.e., prepending a letter or digit to the path, based on some feature you can compute from the name. (Using the first two characters of the md5sum or sha1sum of the file name is one common approach, but if you have unique object IDs, then 'a' + id % 16 is an easy enough mechanism to determine which directory to use.)
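The 'a' + id % 16 idea might look like this in Python. This is only a sketch; the names bucket_letter and bucket_dir are made up for the example, and the bucket count is capped at 26 so the result stays within 'a' to 'z':

```python
def bucket_letter(object_id: int, buckets: int = 16) -> str:
    """Map a numeric ID to a single letter: 'a' + id % buckets."""
    if not 1 <= buckets <= 26:
        raise ValueError("buckets must be between 1 and 26")
    return chr(ord("a") + object_id % buckets)


def bucket_dir(object_id: int, buckets: int = 16) -> str:
    """Return the top-level directory that would hold this object."""
    return "/" + bucket_letter(object_id, buckets)


if __name__ == "__main__":
    for object_id in (0, 15, 16, 17):
        print(object_id, bucket_dir(object_id))
```

With 16 buckets the directories run from /a to /p, and consecutive IDs land in different directories, so the load is spread evenly as long as IDs are assigned sequentially.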

sarnold answered Oct 01 '22 20:10


60000 is nothing, and neither is 20000. But you should group these 20000 directories in some way in order to speed up access to them. Maybe in groups of 100 or 1000, by taking the number of the directory and dividing it (integer division) by 100, 500, 1000, whatever.

E.g., I have a project where the files have numbers. I group them in 1000s, so I have

id/1/1332
id/3/3256
id/12/12334
id/350/350934
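A short Python sketch of that grouping, assuming numeric IDs and integer division by the group size; the function name grouped_path is only for illustration:

```python
def grouped_path(file_id: int, group_size: int = 1000, root: str = "id") -> str:
    """Place a numbered file under a group directory of `group_size` IDs."""
    group = file_id // group_size  # e.g. 12334 // 1000 == 12
    return f"{root}/{group}/{file_id}"


if __name__ == "__main__":
    for file_id in (1332, 3256, 12334, 350934):
        print(grouped_path(file_id))
```

Run as a script, this prints the same four paths as listed above.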

You actually might have a hard limit: some systems have 32-bit inode numbers, so you are limited to 2^32 inodes per file system.

glglgl answered Oct 01 '22 21:10


In addition to the general answers (basically "don't bother that much", "tune your filesystem", and "organize your directory with subdirectories containing a few thousand files each"):

If the individual images are small (e.g. less than a few kilobytes), instead of putting them in a folder, you could also put them in a database (e.g. in MySQL as a BLOB) or perhaps inside a GDBM indexed file. Then each small item won't consume an inode (on many filesystems, each inode wants at least some kilobytes). You could also do that based on some threshold (e.g. put images bigger than 4 kbytes in individual files, and smaller ones in a database or GDBM file). Of course, don't forget to back up your data (and define a backup strategy).
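To make the threshold idea concrete, here is a rough Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL or GDBM, purely so the example is self-contained; the table name small_images, the function names and the 4 KB cutoff are all assumptions for illustration:

```python
import sqlite3
from pathlib import Path

THRESHOLD = 4 * 1024  # bytes; images larger than this go to individual files


def open_store(db_path: str) -> sqlite3.Connection:
    """Open (or create) the database that holds the small images."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS small_images "
        "(name TEXT PRIMARY KEY, data BLOB NOT NULL)"
    )
    return conn


def save_image(conn: sqlite3.Connection, files_dir: Path, name: str, data: bytes) -> str:
    """Store one image; return 'file' or 'db' depending on where it went."""
    if len(data) > THRESHOLD:
        files_dir.mkdir(parents=True, exist_ok=True)
        (files_dir / name).write_bytes(data)
        return "file"
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR REPLACE INTO small_images (name, data) VALUES (?, ?)",
            (name, data),
        )
    return "db"


def load_image(conn: sqlite3.Connection, files_dir: Path, name: str) -> bytes:
    """Look in the database first, then fall back to the filesystem."""
    row = conn.execute(
        "SELECT data FROM small_images WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return bytes(row[0])
    return (files_dir / name).read_bytes()
```

Only the larger images consume an inode here; the small ones all live inside the single database file, which then also needs to be part of the backup.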

Basile Starynkevitch answered Oct 01 '22 22:10


The year is 2014. I come back in time to add this answer. Lots of big/small files? You can use Amazon S3 and other alternatives based on Ceph like DreamObjects, where there are no directory limits to worry about.

I hope this helps someone decide from all the alternatives.

Abhishek Dujari answered Oct 01 '22 20:10