I have a web server which saves cache files and keeps them for 7 days. The file names are md5 hashes, i.e. exactly 32 hex characters long, and are being kept in a tree structure that looks like this:
00/
  00/
    00000ae9355e59a3d8a314a5470753d8
    .
    .
00/
  01/
You get the idea.
My problem is that deleting old files is taking a really long time. I have a daily cron job that runs
find cache/ -mtime +7 -type f -delete
which takes more than half a day to complete. I worry about scalability and the effect this has on the performance of the server. Additionally, the cache directory is now a black hole in my system, trapping the occasional innocent du or find.
The standard solution for an LRU cache is some sort of heap. Is there a way to scale this to the filesystem level? Is there some other way to implement this that makes it easier to manage?
Here are ideas I considered:
Any ideas?
When you store a file, make a symbolic link to a second directory structure that is organized by date, not by name.
Retrieve your files using the "name" structure, delete them using the "date" structure.
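A minimal sketch of that idea in Python, under some assumptions of mine: the layout (cache/name/ for lookups, cache/date/YYYY-MM-DD/ for the links), the two-level split taken from the first four hex characters, and the function names store and expire are all hypothetical, not something from the question. It also assumes each hash is written only once; if an entry can be refreshed on a later day, the old link would have to be removed at that point, otherwise expiry would delete the fresh file.

```python
import os
from datetime import date, timedelta
from pathlib import Path


def name_path(root, digest):
    # Lookup location, e.g. <root>/name/00/00/00000ae9... (assumed layout).
    return Path(root) / "name" / digest[:2] / digest[2:4] / digest


def store(root, digest, data, today=None):
    """Write the cache file and link it from the directory for its creation day."""
    today = today or date.today()
    target = name_path(root, digest)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_bytes(data)

    # One directory per day, e.g. <root>/date/2024-01-01/<digest>.
    day_dir = Path(root) / "date" / today.isoformat()
    day_dir.mkdir(parents=True, exist_ok=True)
    link = day_dir / digest
    if not link.is_symlink():
        os.symlink(target.resolve(), link)
    return target


def expire(root, max_age_days=7, today=None):
    """Delete everything linked from day directories older than max_age_days."""
    today = today or date.today()
    cutoff = today - timedelta(days=max_age_days)
    date_root = Path(root) / "date"
    removed = 0
    if not date_root.is_dir():
        return removed
    for day_dir in date_root.iterdir():
        try:
            day = date.fromisoformat(day_dir.name)
        except ValueError:
            continue  # not one of our day directories
        if day >= cutoff:
            continue
        for link in day_dir.iterdir():
            # Remove the real file first, then the link that pointed to it.
            Path(os.readlink(link)).unlink(missing_ok=True)
            link.unlink()
            removed += 1
        day_dir.rmdir()
    return removed
```

The cron job then only has to call expire, which visits the handful of expired day directories instead of walking the whole name tree. Reads never touch the date tree at all.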