I use os.walk to iterate over, say, 1000 files (just iteration; no processing is done on these files).
The first run is slow, but subsequent runs (on the same path) are about 20 times faster.
As far as I know, os.walk and os.listdir (which is used by os.walk) don't do any caching, and neither do FindFirstFile/FindNextFile (which are used by os.listdir on my Windows platform).
So is this due to page caching or something else?
FYI, I'm trying to write a backup application and need to process a huge number of files. If it's indeed due to page caching, then I'll need to write my own caching mechanism.
Your OS does the caching here; directory lookups require disk access which is slow, so such access is heavily cached.
For example, the ntfs.sys driver uses the Data Map service to cache filesystem metadata such as directory listings.
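You can observe the effect yourself with a rough timing sketch like the one below. The helper names `count_files` and `time_walks` are made up for illustration, and the script only shows a difference if the directory metadata is not already cached when the first run starts (for example, right after a reboot or on a tree you have not touched recently).

```python
import os
import time


def count_files(root):
    """Walk the tree under root and return how many files were seen."""
    total = 0
    for _dirpath, _dirnames, filenames in os.walk(root):
        total += len(filenames)
    return total


def time_walks(root, runs=3):
    """Return a list of (elapsed_seconds, file_count), one per walk."""
    results = []
    for _ in range(runs):
        start = time.perf_counter()
        n = count_files(root)
        results.append((time.perf_counter() - start, n))
    return results


if __name__ == "__main__":
    # "." is a placeholder; point this at the tree you want to measure.
    for i, (elapsed, n) in enumerate(time_walks("."), start=1):
        print(f"run {i}: {n} files in {elapsed:.4f} s")
```

If the later runs are much faster than the first, the speedup comes from the OS serving directory metadata from memory, not from anything Python does. A cache inside your own application would still have to read that metadata from disk the first time.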