I'd like to filter a list of strings in Python using a regex. In the following case, I want to keep only the files with a '.npy' extension.
The code that doesn't work:
import re

files = [
    '/a/b/c/la_seg_x005_y003.png',
    '/a/b/c/la_seg_x005_y003.npy',
    '/a/b/c/la_seg_x004_y003.png',
    '/a/b/c/la_seg_x004_y003.npy',
    '/a/b/c/la_seg_x003_y003.png',
    '/a/b/c/la_seg_x003_y003.npy',
]

regex = re.compile(r'_x\d+_y\d+\.npy')
selected_files = filter(regex.match, files)
print(selected_files)
The same regex works for me in Ruby:
selected = files.select { |f| f =~ /_x\d+_y\d+\.npy/ }
What's wrong with the Python code?
selected_files = filter(regex.match, files)
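re.match('regex') is roughly equivalent to re.search('^regex'). It is the regex counterpart of text.startswith('regex'): it only checks whether the string starts with the pattern. Every path in the list starts with '/a/b/c/', not with '_x', so regex.match never succeeds and nothing is selected.
So, use re.search() instead, which scans the whole string for the pattern: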
import re

files = [
    '/a/b/c/la_seg_x005_y003.png',
    '/a/b/c/la_seg_x005_y003.npy',
    '/a/b/c/la_seg_x004_y003.png',
    '/a/b/c/la_seg_x004_y003.npy',
    '/a/b/c/la_seg_x003_y003.png',
    '/a/b/c/la_seg_x003_y003.npy',
]

regex = re.compile(r'_x\d+_y\d+\.npy')
# The list() call is only required in Python 3, where filter() returns
# a lazy iterator instead of a list.
selected_files = list(filter(regex.search, files))
print(selected_files)
Output:
['/a/b/c/la_seg_x005_y003.npy', '/a/b/c/la_seg_x004_y003.npy', '/a/b/c/la_seg_x003_y003.npy']
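To see the difference on a single path, here is a small sketch (the variable names path, pattern and anchored are just for illustration). It also shows one optional tweak that is not part of the original pattern: appending '$' so that '.npy' has to be the very end of the string.

```python
import re

path = '/a/b/c/la_seg_x005_y003.npy'
pattern = re.compile(r'_x\d+_y\d+\.npy')

# match() only tries position 0, where the string has '/', so it fails.
print(pattern.match(path))            # None

# search() scans forward and finds the pattern near the end of the path.
print(pattern.search(path).group())   # _x005_y003.npy

# Assumption for illustration: anchoring with '$' rejects names that
# merely contain '.npy' somewhere before the real extension.
anchored = re.compile(r'_x\d+_y\d+\.npy$')
print(anchored.search(path + '.bak'))  # None
```

Whether you need the '$' anchor depends on what file names can show up in your list.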
And if you just want to get all of the .npy files, str.endswith() would be a better choice:
files = [
    '/a/b/c/la_seg_x005_y003.png',
    '/a/b/c/la_seg_x005_y003.npy',
    '/a/b/c/la_seg_x004_y003.png',
    '/a/b/c/la_seg_x004_y003.npy',
    '/a/b/c/la_seg_x003_y003.png',
    '/a/b/c/la_seg_x003_y003.npy',
]

selected_files = list(filter(lambda x: x.endswith('.npy'), files))
print(selected_files)
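As a side note, a list comprehension does the same job as filter() and returns a list in both Python 2 and Python 3, so no list() call is needed. This is only a sketch of that alternative; the names by_suffix and by_regex are made up for the example.

```python
import re

files = [
    '/a/b/c/la_seg_x005_y003.png',
    '/a/b/c/la_seg_x005_y003.npy',
    '/a/b/c/la_seg_x004_y003.png',
    '/a/b/c/la_seg_x004_y003.npy',
    '/a/b/c/la_seg_x003_y003.png',
    '/a/b/c/la_seg_x003_y003.npy',
]

# Plain suffix check, equivalent to the filter/lambda version.
by_suffix = [f for f in files if f.endswith('.npy')]

# Same idea with the regex, for when the _x..._y... part matters too.
regex = re.compile(r'_x\d+_y\d+\.npy')
by_regex = [f for f in files if regex.search(f)]

print(by_suffix)
print(by_regex)
```

For this particular list both comprehensions select the same three .npy paths.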