Many files in one directory?

I am developing a PHP project on Linux. Are there any disadvantages to putting several thousand images (files) in one directory? This is a closed set that won't grow. The alternative would be to separate these files using a directory structure based on some ID (so that there would be, say, only 100 files in one directory).

I ask because I often see such separation when I look at image URLs on different sites. The directory separation is done in such a way that no more than several hundred images end up in one directory.

What would I gain by splitting several thousand files (a set that will not grow) into groups of e.g. 100 instead of keeping them in one directory? Is it worth complicating things?

UPDATE:

  • There won't be any programmatic iteration over files in a directory (just direct access to an image by its filename)
  • I want to emphasize that the image set is closed. It's fewer than 5000 images, and that is it.
  • There is no logical categorization of these images
  • Human access/browse is not required
  • Images have unique filenames
  • OS: Debian/Linux 2.6.26-2-686, Filesystem: ext3

VALUABLE INFORMATION FROM THE ANSWERS:

Why separate many files to different directories:

  • "32k files limit per directory when using ext3 over nfs"
  • performance reasons (access speed) [but for several thousand files it is difficult to say whether it's worth it without measuring]
Dawid Ohia asked Feb 21 '10 16:02


3 Answers

Besides giving faster file access, separating images into subdirectories dramatically extends the number of files you can track before hitting the natural limits of the filesystem.

A simple approach is to md5() the file name, then use the first n characters of the hash as the directory name (e.g., substr(md5($filename), 0, n)). This ensures a reasonably even distribution (vs. taking the first n characters of the file name itself).
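The hash-prefix idea above can be sketched in PHP roughly as follows. This is only an illustration: the function name shardedPath(), its parameters, and the default prefix length of 2 are assumptions, not part of the original answer.

```php
<?php
declare(strict_types=1);

/**
 * Build a path of the form <baseDir>/<hash prefix>/<filename>.
 * shardedPath() and its parameter names are hypothetical, chosen for this sketch.
 */
function shardedPath(string $baseDir, string $filename, int $prefixLength = 2): string
{
    // md5() returns 32 hexadecimal characters, so the prefix cannot be longer.
    if ($prefixLength < 1 || $prefixLength > 32) {
        throw new InvalidArgumentException('prefixLength must be between 1 and 32');
    }

    // The first $prefixLength characters of the hash name the subdirectory.
    $shard = substr(md5($filename), 0, $prefixLength);

    // Normalise a trailing slash on the base directory before joining.
    return rtrim($baseDir, '/') . '/' . $shard . '/' . $filename;
}
```

For instance, shardedPath('/var/www/images', 'abc') returns '/var/www/images/90/abc' (the base directory here is made up). A two-character prefix yields at most 256 subdirectories, so 5000 files would average about 20 per directory.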

MightyE answered Oct 10 '22 01:10



Usually the reason for such splitting is file system performance. For a closed set of 5000 files I am not sure it's worth the hassle. I suggest that you try the simple approach of putting all the files in one directory, but keep an eye on the actual time it takes to access the files.

If you see that it's not fast enough for your needs, you can split it like you suggested.

I had to split files myself for performance reasons. In addition, I bumped into a 32k files limit per directory when using ext3 over NFS (not sure if it's a limit of NFS or ext3), so that's another reason to split into multiple directories. In any case, try with a single dir and only split if you see it's not fast enough.
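One rough way to "keep an eye on the actual time" is to time lookups by name, as in the PHP sketch below. The helper averageLookupMicroseconds() is a hypothetical name introduced here, and the sketch assumes PHP 7.3 or later for hrtime(). It measures only file_exists() calls, which is an assumption about what "access" means for this site.

```php
<?php
declare(strict_types=1);

/**
 * Average time, in microseconds, to look up each named file in $dir.
 * averageLookupMicroseconds() is a hypothetical helper written for this sketch.
 */
function averageLookupMicroseconds(string $dir, array $filenames): float
{
    if ($filenames === []) {
        throw new InvalidArgumentException('filenames must not be empty');
    }

    $dir = rtrim($dir, '/');

    // Drop any stat result PHP has cached from earlier calls.
    clearstatcache();

    $start = hrtime(true); // monotonic clock, in nanoseconds
    foreach ($filenames as $name) {
        // A lookup by exact name; the directory is never listed.
        file_exists($dir . '/' . $name);
    }
    $elapsedNs = hrtime(true) - $start;

    // Convert nanoseconds to microseconds and average over all lookups.
    return $elapsedNs / 1000.0 / count($filenames);
}
```

Running it with the same file names against the single-directory layout and against a split layout gives a direct comparison. The operating system caches directory entries, so the first run is likely to be slower than later ones; repeating the measurement several times gives a fairer picture.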

Omry Yadan answered Oct 10 '22 00:10



There is no reason to split those files into multiple directories if you don't expect any filename conflicts and don't need to iterate over those images at any point.

Still, if you can think of a meaningful categorization, it's not a bad idea to sort the images a bit, even if only for maintenance reasons.

poke answered Oct 10 '22 00:10
