Here's the scenario:
I have a long list of time-stamped file names with characters before and after the time-stamp.
Something like this: prefix_20160817_suffix
What I want is a list (which will ultimately be a subset of the original list) that contains file names with specific prefixes, suffixes, and parts of the timestamp. These specific strings are already given in a list. Note: this "contains" list might vary in size.
For example: ['prefix1', '2016', 'suffix']
or ['201608', 'suffix']
How can I easily get a list of file names that contain every element in the "contains" array?
Here's some pseudo code to demonstrate what I want:
for each fileName in the master list:
if the fileName contains EVERY element in the "contains" array:
add fileName to filtered list of filenames
I'd compile the list into a fnmatch
pattern:
import fnmatch
pattern = '*'.join(contains)
filetered_filenames = fnmatch.filter(master_list, pattern)
This basically concatenates all strings in contains
into a glob pattern with *
wildcards in between. This assumes the order of contains
is significant. Given that you are looking for prefixes, suffixes and (parts of) dates in between, that's not that much of a stretch.
It is important to note that if you run this on an OS that has a case-insensitive filesystem, that fnmatch
matching is also case-insensitive. This is usually exactly what you'd want in that case.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With