I want to get two separate lists of file names using glob, with each list containing only one type of file. I have two types of data files. For example, 2018-01-02.dat and 2018-01-02_patients.dat.
The only difference is that the second file type has "_patients" after the date. The date can be anything, but the format is consistent. How can I accomplish this using glob?
To precisely match the digits, you can use the glob patterns:
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].dat # matches e.g. 2018-01-02.dat
[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_patients.dat # matches e.g. 2018-01-02_patients.dat
You can also use ? instead of [0-9]. Note that ? matches any single character, not just a digit, so only do this if you are sure no other files in the directory have similarly shaped names.
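To see the difference between the two styles without touching the file system, here is a small sketch using fnmatch, the module glob relies on for its pattern matching. The file names in the list are made up for illustration.

```python
import fnmatch

# Strict pattern: every date position must be a digit.
strict = '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].dat'
# Loose pattern: every date position may be any single character.
loose = '????-??-??.dat'

# Hypothetical directory listing, for illustration only.
names = ['2018-01-02.dat', 'abcd-ef-gh.dat', '2018-01-02_patients.dat']

# fnmatchcase matches the whole name, case-sensitively, on every platform.
strict_hits = [n for n in names if fnmatch.fnmatchcase(n, strict)]
loose_hits = [n for n in names if fnmatch.fnmatchcase(n, loose)]

print(strict_hits)  # ['2018-01-02.dat']
print(loose_hits)   # ['2018-01-02.dat', 'abcd-ef-gh.dat']
```

The loose pattern also picks up abcd-ef-gh.dat, which is why the [0-9] form is safer when the directory may hold other files.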
In [103]: glob.glob('[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].dat')
Out[103]: ['2018-01-02.dat', '2014-03-12.dat']
In [104]: glob.glob('[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]_patients.dat')
Out[104]: ['2018-01-02_patients.dat', '2014-03-12_patients.dat']
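If you need this for an arbitrary directory, the two calls above can be wrapped in a small helper. This is only a sketch: the function name split_data_files and the DATE constant are my own, and it assumes both file types live in the same directory.

```python
import glob
import os

# Glob pattern for a YYYY-MM-DD date made of digits only.
DATE = '[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]'


def split_data_files(directory='.'):
    """Return (plain_files, patient_files) as two sorted lists of paths."""
    # glob.escape keeps special characters in the directory name literal.
    base = glob.escape(directory)
    plain = sorted(glob.glob(os.path.join(base, DATE + '.dat')))
    patients = sorted(glob.glob(os.path.join(base, DATE + '_patients.dat')))
    return plain, patients
```

glob.glob returns names in arbitrary order, so the lists are sorted here; with this date format, sorted order is also chronological order, which makes it easy to pair each data file with its _patients counterpart.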
You can use re with glob:
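import glob
import re
# \.dat$ alone would also match *_patients.dat, so each list gets its own full-name pattern.
data_files = [f for f in glob.glob('*.dat') if re.fullmatch(r'\d{4}-\d{2}-\d{2}\.dat', f)]
patient_files = [f for f in glob.glob('*.dat') if re.fullmatch(r'\d{4}-\d{2}-\d{2}_patients\.dat', f)]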
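Since both file types share the same date prefix, a single regular expression with an optional _patients group can sort the names into the two lists in one pass. A minimal sketch, where PATTERN and partition_names are names I made up for illustration:

```python
import re

# Group 1 is the date; group 2 is present only for the "_patients" files.
PATTERN = re.compile(r'(\d{4}-\d{2}-\d{2})(_patients)?\.dat')


def partition_names(names):
    """Split file names into (plain, patients); ignore everything else."""
    plain, patients = [], []
    for name in names:
        match = PATTERN.fullmatch(name)
        if match is None:
            continue  # not one of the two expected file types
        if match.group(2):
            patients.append(name)
        else:
            plain.append(name)
    return plain, patients
```

You would call it as partition_names(glob.glob('*.dat')) or partition_names(os.listdir('.')). It expects bare file names, so pass os.path.basename results if your glob pattern includes a directory.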