
Use glob to find arbitrary length numbers

Tags: python, glob

I'm looking for the glob pattern that finds all files whose names have the form myfile_[SomeNumber].txt

My naive attempt was

glob.glob("myfile_[0-9]*.txt")

but this also finds all files of the form myfile_[SomeNumber][AnyStuff].txt

This answer shows how to do it for a fixed length, but that is not what I want in this case: "use python glob to find a folder that is a 14 digit number"

Mikael Fremling asked Jun 16 '16 15:06



1 Answer

You are probably confusing regular expression syntax with glob constructs. In globbing, [0-9]* means "a single digit followed by zero or more of any character". Dropping the * removes the extra matches, but the pattern then matches exactly one digit.
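
To make the difference concrete, here is a small sketch that checks the question's pattern against a few made-up filenames. It uses fnmatchcase from the standard fnmatch module, which applies the same shell-style matching rules that glob relies on; the filenames are hypothetical and chosen only for illustration.

```python
from fnmatch import fnmatchcase

# The glob pattern from the question.
pattern = "myfile_[0-9]*.txt"

# Hypothetical filenames, for illustration only.
names = ["myfile_7.txt", "myfile_123.txt", "myfile_1abc.txt", "myfile_.txt"]

# [0-9] consumes exactly one digit; * then matches any run of characters,
# so "1abc" is accepted just like "123".
matches = [name for name in names if fnmatchcase(name, pattern)]
print(matches)
```

Only myfile_.txt is rejected, because [0-9] needs one digit; everything after that first digit is absorbed by the *.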

In extended globbing there is a qualifier for "one or more", but Python's glob module does not support it, so there is little choice but to use a regular expression, i.e. do your own filename pattern matching. There are several ways to do this; here is one:

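import os
import re

# "myfile_", one or more digits, then a literal ".txt" (note the escaped dot)
pattern = re.compile(r"myfile_[0-9]+\.txt")

files = []
for fname in os.listdir('.'):
    # fullmatch requires the whole filename to fit the pattern
    if pattern.fullmatch(fname):
        files.append(fname)

print(files)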

Note that the RE is not exactly the same as yours: I use +, which means "one or more of the preceding pattern". A * would mean "zero or more", so the digits would be optional, which could be what you want (I'm not sure).

The bulk of the code could be written as a list comprehension, but that would arguably lose some readability:

files = [fname for fname in os.listdir('.')
         if re.fullmatch(r"myfile_[0-9]+\.txt", fname)]
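
If you would still like glob to do the directory listing, one possible variation is to let glob narrow the candidates and then filter them with the regular expression. This is only a sketch: the function name find_numbered_files and its directory parameter are made up for this example and are not part of any library.

```python
import glob
import os
import re

# "myfile_", one or more digits, then a literal ".txt"
NUMBERED = re.compile(r"myfile_[0-9]+\.txt")

def find_numbered_files(directory="."):
    """Return sorted paths in `directory` named myfile_<digits>.txt."""
    # glob.escape protects any special characters in the directory name.
    loose = os.path.join(glob.escape(directory), "myfile_[0-9]*.txt")
    # glob guarantees at least one digit; the regex rejects anything
    # other than digits between that first digit and ".txt".
    return sorted(
        path for path in glob.glob(loose)
        if NUMBERED.fullmatch(os.path.basename(path))
    )
```

Calling find_numbered_files() with no argument searches the current directory, like the os.listdir('.') version, but returns paths rather than bare filenames.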
cdarke answered Oct 12 '22 08:10