
Pythonic way to process multiple for loops with different filters against the same list?

Here's a bit of a program I'm writing that will create a csv categorizing a directory of files:

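import fnmatch
import os
import re

# Compile once, outside the loop; the raw string keeps \. a literal dot
test = re.compile(r".*\.(py|php)", re.IGNORECASE)

matches = []
for root, dirnames, filenames in os.walk(directory):
    # File names that contain at least one uppercase letter
    for filename in fnmatch.filter(filenames, '*[A-Z]*'):
        matches.append([os.path.join(root, filename), "No Capital Letters!"])

    # File names that contain .py or .php (case-insensitive)
    for filename in filter(test.search, filenames):
        matches.append([os.path.join(root, filename), "Invalid File type!"])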

Basically, the user picks a folder and the program flags problem files, which can be of several types (just two are listed here: file names must not contain uppercase letters, and PHP or Python files are not allowed). There will probably be five or six cases.

While this works, I want to refactor. Is it possible to do something like

for filename in itertools.izip(fnmatch.filter(filenames, '*[A-Z]*'), filter(test.search, filenames), ...):
    matches.append([os.path.join(root, filename), "Violation"])

while still being able to keep track of which of the original, unzipped lists caused the "violation"?

two7s_clash asked May 27 '15 19:05


1 Answer

A simpler solution would probably be to just iterate over the files first and then apply your checks one by one:

import fnmatch
import os
import re

# The raw string keeps \. a literal dot in the pattern
reTest = re.compile(r".*\.(py|php)", re.IGNORECASE)
matches = []
for root, dirnames, filenames in os.walk(directory):
    for filename in filenames:
        error = None  # stays None when the file passes every check
        if fnmatch.fnmatch(filename, '*[A-Z]*'):
            error = 'No capital letters!'
        elif reTest.search(filename):
            error = 'Invalid file type!'

        if error:
            matches.append([os.path.join(root, filename), error])

This not only simplifies the logic, since each check only ever has to handle a single file name (instead of you having to work out how to apply each check to a whole sequence of file names), it also iterates through the list of file names only once.

Furthermore, it avoids generating multiple matches for a single file name: at most one error (the first) is added. If you don’t want this, you could make error a list and append to it in each check; in that case, change the elif to if so that all the checks are evaluated.
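
With five or six cases, the chain of if statements could also be replaced by a small table of checks. The following is only a sketch of that idea, not part of the code above: the names CHECKS, INVALID_TYPE, find_violations and scan are made up for illustration. It also makes two assumptions that differ from the original snippets: it uses fnmatch.fnmatchcase so the uppercase test stays case-sensitive on Windows (where fnmatch.fnmatch normalizes case), and it anchors the pattern with $ so only names that end in .py or .php are reported.

```python
import fnmatch
import os
import re

# Assumption: only names that END in .py or .php count as invalid types.
INVALID_TYPE = re.compile(r"\.(py|php)$", re.IGNORECASE)

# Hypothetical rule table: each entry pairs a predicate on the bare
# file name with the message to report when the predicate is true.
CHECKS = [
    # fnmatchcase keeps the match case-sensitive on every platform
    (lambda name: fnmatch.fnmatchcase(name, "*[A-Z]*"), "No capital letters!"),
    (lambda name: INVALID_TYPE.search(name) is not None, "Invalid file type!"),
]


def find_violations(filename):
    """Return the message of every check that the file name violates."""
    return [message for check, message in CHECKS if check(filename)]


def scan(directory):
    """Walk directory and collect one [path, message] row per violation."""
    matches = []
    for root, dirnames, filenames in os.walk(directory):
        for filename in filenames:
            for message in find_violations(filename):
                matches.append([os.path.join(root, filename), message])
    return matches
```

Adding another case then means appending one (predicate, message) pair to CHECKS, and each row in the result still records which check produced it.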

poke answered Sep 28 '22 02:09