
Exclude ".txt" files

Tags:

python

regex

I want to exclude ".txt" files from a directory listing using a regex (and only a regex), but this code doesn't work and I don't understand why. I have this list:

['/var/tmp/COMMUN/4.1.0_41/Apache',
 '/var/tmp/COMMUN/4.1.0_41/META-INF', 
 '/var/tmp/COMMUN/4.1.0_41/RewriteRules',
 '/var/tmp/COMMUN/4.1.0_41/Robots', 
 '/var/tmp/COMMUN/4.1.0_41/smokeTest',
 '/var/tmp/COMMUN/4.1.0_41/tutu.txt']

And I'm trying this code:

# list_dir is a personal function
list_dir(toto, filter_function=lambda x: re.match("^.*(?!txt)$", x))

Can anyone see what is wrong?

elhostis asked Dec 16 '22 05:12



1 Answer

Usually .* is greedy: it matches as much as it can while still letting the rest of the pattern match. Here .* consumes the whole string, and at the end of the string the negative lookahead (?!txt) succeeds, because nothing follows that position (and in particular not "txt"). This means that this regular expression will match each and every string.
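To make the point above concrete, here is a small sketch that runs the pattern from the question against a few of the paths from the question's list. The names `paths`, `pattern` and `matched` are only illustrative.

```python
import re

# A few of the paths from the question.
paths = [
    "/var/tmp/COMMUN/4.1.0_41/Apache",
    "/var/tmp/COMMUN/4.1.0_41/smokeTest",
    "/var/tmp/COMMUN/4.1.0_41/tutu.txt",
]

# The pattern from the question: .* swallows the whole string, and the
# lookahead (?!txt) is then checked at the very end, where nothing follows.
pattern = "^.*(?!txt)$"

matched = [p for p in paths if re.match(pattern, p)]

# Every path matches, including the .txt one.
print(matched == paths)
```

The lookahead only looks forward from the current position, so it never inspects the ".txt" that .* has already consumed.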

Simply matching .*\.txt$ and negating the result of re.match would work.
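A minimal sketch of that suggestion, assuming the filter function receives one path string at a time (the helper name `is_not_txt` is made up for illustration). Since the question asks for a regex-only solution, the sketch also shows a negative lookbehind variant; that variant is not part of the original answer, just one possible alternative.

```python
import re

paths = [
    "/var/tmp/COMMUN/4.1.0_41/Apache",
    "/var/tmp/COMMUN/4.1.0_41/smokeTest",
    "/var/tmp/COMMUN/4.1.0_41/tutu.txt",
]


def is_not_txt(path):
    """Return True when the path does NOT end in .txt (negated match)."""
    return re.match(r".*\.txt$", path) is None


kept = [p for p in paths if is_not_txt(p)]

# Alternative (an assumption, not from the answer): a negative lookbehind
# checks the four characters *before* the end of the string.
lookbehind = r"^.*(?<!\.txt)$"
kept_lookbehind = [p for p in paths if re.match(lookbehind, p)]

print(kept)
print(kept_lookbehind)
```

Both lists should contain only the Apache and smokeTest paths. The lookbehind form can be passed directly where a single "keep if it matches" regex is expected, while the negated form needs the surrounding `is None` check.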

By the way, you should use a compiled regex instead of calling re.match with a pattern string; as written, the regex could be compiled again for each and every file in your directory. If you use a compiled regex, it is only compiled once. The re module may cache the compiled pattern, and in this case it likely will, as there are no other regex calls in between the re.match calls. However, it would in my opinion be more "correct" to compile the regex yourself, so that you are sure it is only compiled once. Thanks to EOL for the heads-up on the caching.
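As a rough sketch of the compiled-regex advice: the pattern is compiled once at module level and reused for every path. The names `TXT_RE` and `keep` are hypothetical. Because `list_dir` is the asker's own function and its signature is not shown, the built-in `filter` stands in for it here.

```python
import re

# Compiled once, reused for every path that is checked.
TXT_RE = re.compile(r".*\.txt$")


def keep(path):
    """Filter predicate: True for paths that do not end in .txt."""
    return TXT_RE.match(path) is None


paths = [
    "/var/tmp/COMMUN/4.1.0_41/Apache",
    "/var/tmp/COMMUN/4.1.0_41/Robots",
    "/var/tmp/COMMUN/4.1.0_41/tutu.txt",
]

result = list(filter(keep, paths))
print(result)
```

If `list_dir` accepts a one-argument predicate as in the question, `keep` could presumably be passed as its `filter_function` argument in place of the lambda.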

wich answered Dec 22 '22 00:12
