Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python 3 filtering directories by name that matches specific pattern

Currently I'm developing script that will perform cleanup of specific directories.

For example: Directory: /app/test/log contains many sub-directories with name pattern testYYYYMMDD and logYYYYMMDD

What I need, is to filter out only directories like testYYYYMMDD

To get all folders with absolute path that are in given directory I use:

folders_in_given_folder = [name for name in os.listdir(Directory) if os.path.isdir(os.path.join(Directory, name))]
folder_list = []
for folder in folders_in_given_folder:
    folder_list.append([os.path.join(Directory, folder)])
print(folder_list)

Gives output:

[['/app/test/log/test20150615'], ['/app/test/log/test20150616'], ['/app/test/log/b'], ['/app/test/log/a'], ['/app/test/log/New folder'], ['/app/test/log/rem'], ['/app/test/log/test']]

So now I need to filter out sub-directories that fits pattern, pattern can be something like: *test*, test*, test2015*

I've tried using glob.glob(), but this seems to work only with files not directories.

Could someone please be so kind and explain how I could achieve desired outcome?

like image 484
user2768632 Avatar asked Jun 17 '15 07:06

user2768632


People also ask

How do I filter a directory in Python?

To filter and list the files according to their names, we need to use “fnmatch. fnmatch()” and “os. listdir()” functions with name filtering regex patterns. You may find an example of filtering and listing files according to their names in Python.

How do I search for a specific directory in Python?

Python can search for file names in a specified path of the OS. This can be done using the module os with the walk() functions. This will take a specific path as input and generate a 3-tuple involving dirpath, dirnames, and filenames.


2 Answers

import os 
import re

result = []
reg_compile = re.compile("test\d{8}")
for dirpath, dirnames, filenames in os.walk(myrootdir):
    result = result + [dirname for dirname in dirnames if  reg_compile.match(dirname)]

As advised I will explain (thanks for the -1 btw :D)

the compile("test\d{8}) will prepare a regex that matches any folder named test followed by a date with a 8 numbers format.

Then I take advantage of the os.walk method to have every folder properly in the folders iterator (thus avoiding using the method is_dir)

With the line [dirname for dirname in dirnames if reg_compile.match(dirname)] I filter the folder whose name matches the regular expression explained above.

For a first answer (yes it was the first) that works (tested on my computer for python2 and python3) I find it harsh to be downvoted. Also the accepted answers contains the same kind regular expression I used. Now I also agree that I should have had explained earlier.

Would you be kind enough to remove that downvote ?

like image 186
Azurtree Avatar answered Sep 20 '22 01:09

Azurtree


You need to use re module. re module is regexp python module. re.compile creates re object and you can use match method to filter list.

    import re
    R = re.compile(pattern)
    filtered = [folder for folder in folder_list if R.match(folder)]

As a pattern you can use smth like this:

>>> R = re.compile(".*test.*")
>>>
>>> R.match("1test")
<_sre.SRE_Match object at 0x024ED800>
>>> R.match("1test")
<_sre.SRE_Match object at 0x024ED598>
>>> R.match("test2015")
<_sre.SRE_Match object at 0x024ED800>
>>> R.match("1test2")
<_sre.SRE_Match object at 0x024ED598>
like image 32
Dmitry Zagorulkin Avatar answered Sep 22 '22 01:09

Dmitry Zagorulkin