In my web application I render pages using a PHP script, and then generate static HTML files from them. The static HTML files are served to users to improve performance. The HTML files eventually become stale and need to be deleted.
I am debating between two ways to write the eviction script.
The first is using a single find command, like
find /var/www/cache -type f -mmin +10 -exec rm \{} \;
The second form is by piping through xargs, something like
find /var/www/cache -type f -mmin +10 -print0 | xargs -0 rm
The first form invokes rm for each file it finds, while the second form just sends all the file names to a single rm (but the file list might be very long).
Which form would be faster?
In my case, the cache directory is shared between a few web servers, so this is all done over NFS, if that matters for this issue.
The xargs version is dramatically faster with a lot of files than the -exec version as you posted it. With -exec ... \; rm is executed once for each file you want to remove, while xargs lumps as many files as possible together into a single rm command.
With tens or hundreds of thousands of files, it can be the difference between a minute or less versus the better part of an hour.
You can get the same behavior with -exec by finishing the command with "+" instead of "\;". This option is only available in newer versions of find.
The following two are roughly equivalent:
find . -print0 | xargs -0 rm
find . -exec rm \{} +
Note that the xargs version will still run slightly faster (by a few percent) on a multi-processor system, because some of the work can be parallelized. This is particularly true if a lot of computation is involved.
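To see the batching difference without deleting anything real, here is a small sketch that counts how many times the command gets started in each form. The scratch directory, the page*.html names, and the 'echo run' stand-in for rm are assumptions made for illustration; -print0 and -0 are GNU/BSD extensions rather than strict POSIX.

```shell
#!/bin/sh
# Build a scratch directory with 50 empty files (hypothetical names).
dir=$(mktemp -d)
i=1
while [ "$i" -le 50 ]; do
    : > "$dir/page$i.html"
    i=$((i + 1))
done

# "\;" starts the command once per file, so we get one output line per file.
per_file=$(find "$dir" -type f -exec sh -c 'echo run' sh {} \; | wc -l | tr -d ' ')

# "+" packs as many file names as fit into each invocation.
batched=$(find "$dir" -type f -exec sh -c 'echo run' sh {} + | wc -l | tr -d ' ')

# xargs batches in the same way.
via_xargs=$(find "$dir" -type f -print0 | xargs -0 sh -c 'echo run' sh | wc -l | tr -d ' ')

echo "per-file: $per_file, -exec +: $batched, xargs: $via_xargs"
rm -rf "$dir"
```

With only 50 short names everything fits in one batch, so the two batching forms each start the command once while the "\;" form starts it 50 times. That per-file process startup is the cost that adds up on large caches.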
I expect the xargs version to be slightly faster, as you aren't spawning a process for each filename. But I would be surprised if there was actually much difference in practice. If you're worried about the long list xargs sends to each invocation of rm, you can use -n with xargs to limit the number of arguments it passes per invocation. However, xargs knows the maximum command-line length and won't go beyond that.
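As a rough illustration of capping the batch size, the sketch below feeds seven placeholder names to xargs with -n 3 and counts the resulting invocations. The names and the use of echo in place of rm are assumptions for the demo.

```shell
#!/bin/sh
# -n 3 caps each invocation at three arguments, so seven names
# are split into batches of 3, 3 and 1.
batches=$(printf '%s\n' a b c d e f g | xargs -n 3 echo | wc -l | tr -d ' ')
echo "invocations: $batches"
```

Applied to the cache cleanup, the same idea would look like: find /var/www/cache -type f -mmin +10 -print0 | xargs -0 -n 1000 rm, where 1000 is an arbitrary cap, not a recommended value.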
The find command also has a -delete option built in; perhaps that could be useful as well? http://lists.freebsd.org/pipermail/freebsd-questions/2004-July/051768.html
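A minimal sketch of the -delete form, run against a scratch directory rather than the real cache. The file names and the backdated timestamp are made up for the demo, and -mmin and -delete are GNU/BSD find extensions, so it is worth checking that your find supports them.

```shell
#!/bin/sh
dir=$(mktemp -d)
: > "$dir/old.html"
: > "$dir/fresh.html"

# Backdate one file so it counts as older than 10 minutes.
touch -t 200001010000 "$dir/old.html"

# find removes matching files itself; no rm process is started at all.
find "$dir" -type f -mmin +10 -delete

remaining=$(ls "$dir")
echo "left in cache: $remaining"
rm -rf "$dir"
```

Only the backdated file is removed. Since find does the unlinking itself, this avoids both the per-file rm of the "\;" form and the pipe to xargs.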