I have a huge set of files that I want to traverse using Python. I am using os.walk(source) for this and it works, but because there are so many files it takes too much time and memory, since it seems to be getting the complete list all at once. How can I optimize this to use fewer resources, perhaps by walking through one directory at a time or in some other efficient manner, while still being able to iterate over the complete set of files? Thanks.
import os

# Collect the names of folders whose name contains any exception keyword.
for dirpath, dirnames, filenames in os.walk(START_FOLDER):
    for name in dirnames:
        #if PRIVATE_FOLDER not in name:
        for keyword in FOLDER_WITH_KEYWORDS_DELETION_EXCEPTION_LIST:
            if keyword in name.lower():
                ignoreList.append(name)
scandir() is a directory iteration function like os.listdir(), except that instead of returning a list of bare filenames, it yields DirEntry objects that include file type and stat information along with the name. Using scandir() increases the speed of os.
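To make the DirEntry idea concrete, here is a minimal sketch. It assumes Python 3.6 or later, where scandir() is available in the standard library as os.scandir() and can be used as a context manager. The helper name list_entries is made up for illustration.

```python
import os

def list_entries(path):
    """Yield (name, kind, size_in_bytes) for each entry directly inside path.

    list_entries is a hypothetical helper, not part of any library.
    """
    with os.scandir(path) as entries:
        for entry in entries:
            # is_dir()/is_file() can usually be answered from the directory
            # listing itself, without a separate stat call per entry.
            if entry.is_dir(follow_symlinks=False):
                yield entry.name, "dir", None
            elif entry.is_file(follow_symlinks=False):
                yield entry.name, "file", entry.stat(follow_symlinks=False).st_size
```

Because this is a generator, entries are produced one at a time as you loop over list_entries(some_folder), rather than being collected into a list first.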
If the issue is that the directory simply has too many files in it, this will hopefully be solved in Python 3.5.
Until then, you may want to check out scandir.
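For what it's worth, here is a rough sketch of walking one directory at a time with scandir. It assumes Python 3.6+ (os.scandir() in the standard library). It also assumes that the keyword check in the question is meant to leave matching folders out of the traversal, which may not be what you want. The names iter_files and skip_keywords are made up for illustration.

```python
import os

def iter_files(start, skip_keywords=()):
    """Lazily yield file paths under start, reading one directory at a time.

    Hypothetical helper: folders whose lowercased name contains any of
    skip_keywords are not descended into (an assumption about the intent
    of the keyword check in the question).
    """
    stack = [start]  # directories still waiting to be read
    while stack:
        current = stack.pop()
        try:
            with os.scandir(current) as entries:
                for entry in entries:
                    if entry.is_dir(follow_symlinks=False):
                        name = entry.name.lower()
                        if not any(keyword in name for keyword in skip_keywords):
                            stack.append(entry.path)
                    elif entry.is_file(follow_symlinks=False):
                        yield entry.path
        except OSError:
            # Unreadable or vanished directory: skip it and carry on.
            continue
```

Only the stack of pending directories is kept in memory, so you can loop over iter_files(START_FOLDER, skip_keywords) and handle each path as it arrives. Note that os.walk() is itself a generator and yields one directory per iteration, so the cost with it is usually the per-directory name lists rather than the whole tree.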