I am trying to find the extension of a file, given its name as a string. I know I can use the function os.path.splitext
but it does not work as expected in case my file extension is .tar.gz
or .tar.bz2
as it gives the extensions as gz
and bz2
instead of tar.gz
and tar.bz2
respectively.
So I decided to find the extension of files myself using pattern matching.
print re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz')group('ext')
>>> gz # I want this to come as 'tar.gz'
print re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.bz2')group('ext')
>>> bz2 # I want this to come 'tar.bz2'
I am using (?P<ext>...)
in my pattern matching as I also want to get the extension.
Please help.
fnmatch() compares a single file name against a pattern and returns TRUE if they match else returns FALSE. The comparison is case-sensitive when the operating system uses a case-sensitive file system. The special characters and their functions used in shell-style wildcards are : '*' – matches everything.
The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two different wild-cards, and character ranges are supported.
Pattern matching involves providing a pattern and an associated action to be taken if the data fits the pattern. At its simplest, pattern matching works like the switch statement in C/ C++/ JavaScript or Java. Matching a subject value against one or more cases.
root,ext = os.path.splitext('a.tar.gz')
if ext in ['.gz', '.bz2']:
ext = os.path.splitext(root)[1] + ext
Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.
>>> print re.compile(r'^.*[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext')
gz
>>> print re.compile(r'^.*?[.](?P<ext>tar\.gz|tar\.bz2|\w+)$').match('a.tar.gz').group('ext')
tar.gz
>>>
The ? operator tries to find the minimal match, so instead of .* eating ".tar" as well, .*? finds the minimal match that allows .tar.gz to be matched.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With