
Handling lots of temporary small files

I have a web server which saves cache files and keeps them for 7 days. The file names are md5 hashes, i.e. exactly 32 hex characters long, and are being kept in a tree structure that looks like this:

00/
  00/
    00000ae9355e59a3d8a314a5470753d8
    .
    .
00/
  01/

You get the idea.
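
For illustration, the layout above can be sketched in a few lines of Python. This is only a sketch under assumptions: the names `path_for_digest` and `cache_path` are hypothetical, and what the server actually hashes (here, an arbitrary string key) is not stated above.

```python
import hashlib
import os


def path_for_digest(root, digest):
    """Map a 32-character hex digest to root/<chars 0-1>/<chars 2-3>/<digest>."""
    return os.path.join(root, digest[:2], digest[2:4], digest)


def cache_path(root, key):
    """Hash a cache key with md5 and return where its file would live.

    Assumption: the key is a string such as a request URL.
    """
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return path_for_digest(root, digest)


if __name__ == "__main__":
    # The file name shown in the tree above lands in cache/00/00/.
    print(path_for_digest("cache", "00000ae9355e59a3d8a314a5470753d8"))
```

With two levels of two hex characters each, there are 256 * 256 = 65536 leaf directories, so every one of them has to be visited by any full scan of the tree.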

My problem is that deleting old files is taking a really long time. I have a daily cron job that runs

find cache/ -mtime +7 -type f -delete

which takes more than half a day to complete. I worry about scalability and the effect this has on the performance of the server. Additionally, the cache directory is now a black hole in my system, trapping the occasional innocent du or find.

The standard solution for an LRU cache is some sort of heap. Is there a way to scale this to the filesystem level? Is there some other way to implement the cache that makes it easier to manage?

Here are ideas I considered:

  1. Create 7 top-level directories, one for each weekday, and empty one directory every day. This makes looking up a cache file up to 7 times slower, since each directory may have to be checked, makes it really complicated when a file is overwritten, and I'm not sure what it will do to the deletion time.
  2. Save the files as blobs in a MySQL table with indexes on name and date. This seemed promising, but in practice it's always been much slower than the filesystem. Maybe I'm not doing it right.

Any ideas?

itsadok asked Nov 03 '08 09:11




1 Answer

When you store a file, also create a symbolic link to it in a second directory structure that is organized by date, not by name.

Retrieve your files using the "name" structure, delete them using the "date" structure.
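
A minimal Python sketch of this idea follows. It is one possible reading, not a definitive implementation: the `by-name` and `by-date` directory names, the helper names `name_path`, `store` and `purge`, and the one-directory-per-calendar-day granularity are all assumptions. It also assumes a key may be rewritten on a later day, so the purge step checks the target's mtime before deleting it.

```python
import hashlib
import os
import time
from datetime import date

DAY = 86400  # seconds in a day


def name_path(root, key):
    """Location in the name-indexed tree: by-name/<aa>/<bb>/<md5 hex>."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return os.path.join(root, "by-name", digest[:2], digest[2:4], digest)


def store(root, key, data, now=None):
    """Write the cache file and link it from by-date/<YYYY-MM-DD>/."""
    now = time.time() if now is None else now
    target = os.path.abspath(name_path(root, key))
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "wb") as f:
        f.write(data)
    # Pin the mtime to the supplied clock so the sketch can be run with fake times.
    os.utime(target, (now, now))
    day_dir = os.path.join(root, "by-date", date.fromtimestamp(now).isoformat())
    os.makedirs(day_dir, exist_ok=True)
    link = os.path.join(day_dir, os.path.basename(target))
    if not os.path.lexists(link):
        os.symlink(target, link)
    return target


def purge(root, now=None, max_age_days=7):
    """Delete expired files by walking only the expired date directories."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * DAY
    cutoff_day = date.fromtimestamp(cutoff).isoformat()
    date_root = os.path.join(root, "by-date")
    removed = 0
    for day in sorted(os.listdir(date_root)):
        if day >= cutoff_day:
            break  # ISO dates sort chronologically, so the rest are newer
        day_dir = os.path.join(date_root, day)
        for entry in os.listdir(day_dir):
            link = os.path.join(day_dir, entry)
            target = os.readlink(link)
            try:
                # Skip files that were rewritten after this link was made.
                if os.stat(target).st_mtime < cutoff:
                    os.remove(target)
                    removed += 1
            except FileNotFoundError:
                pass  # target already gone
            os.remove(link)
        os.rmdir(day_dir)
    return removed
```

Lookups still go through `name_path`, so reads are unaffected. The daily job calls `purge` and touches only the date directories that have expired, instead of scanning the whole name tree. Empty `by-name` subdirectories are left in place in this sketch.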

Tomalak answered Sep 22 '22 08:09