glob exclude pattern

People also ask

How do glob patterns work?

Globs, also known as glob patterns, are wildcard patterns that expand into a list of pathnames matching the pattern. In early versions of Unix, the command interpreter relied on a separate program, /etc/glob, to expand wildcard characters in unquoted arguments to a command.

How does glob work in Python?

glob (short for global) returns all file paths that match a specific pattern. We can use it to look for an exact filename or, perhaps more usefully, for files whose names match a pattern containing wildcard characters.

What is the main purpose of glob characters?

The main purpose of using glob characters is to be able to provide a list of filenames to a command.

The pattern rules for glob are not regular expressions. Instead, they follow standard Unix path expansion rules. There are only a few special characters: two different wild-cards, and character ranges are supported [from pymotw: glob – Filename pattern matching].
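As a minimal sketch of those rules, the snippet below uses the standard-library fnmatch module, which applies the same wildcard syntax to plain strings, so no files on disk are needed. The filenames and the helper name matching are made up for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical filenames, used only for illustration.
names = ["a.txt", "ab.txt", "b1.txt", "_manifest.txt", "notes.md"]


def matching(pattern):
    """Return the names that match a glob-style pattern."""
    return [n for n in names if fnmatchcase(n, pattern)]


star = matching("*.txt")             # * matches any run of characters
question = matching("?.txt")         # ? matches exactly one character
char_range = matching("[a-b]*.txt")  # [a-b] matches one character in the range
```

Unlike glob.glob, fnmatch does not treat names with a leading dot specially, so this only illustrates the wildcard syntax itself.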

So you can exclude some files with patterns.
For example, to exclude manifest files (files starting with _) with glob, you can use:

files = glob.glob('files_path/[!_]*')

You can subtract sets (here glob is the function, imported with from glob import glob):

set(glob("*")) - set(glob("eph*"))

You can't exclude patterns with the glob function; globs only allow inclusion patterns. Globbing syntax is very limited (even a [!..] character class must match a character, so it is an inclusion pattern for every character that is not in the class).
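To see why a negated character class is not a real exclusion, here is a small sketch using fnmatch on made-up filenames. Trying to express "not eph*" as [!e][!p][!h]* also drops names that merely share one character position with "eph".

```python
from fnmatch import fnmatchcase

# Hypothetical filenames, used only for illustration.
names = ["eph1.txt", "echo.txt", "apple.txt", "data.txt"]

# Attempt to "exclude eph*" using only character classes.
# Each class tests a single position independently, so echo.txt
# (starts with e) and apple.txt (second letter p) are dropped too.
attempt = [n for n in names if fnmatchcase(n, "[!e][!p][!h]*")]

# Filtering after matching excludes exactly the eph* names.
wanted = [n for n in names if not n.startswith("eph")]
```

Here attempt keeps only data.txt, while wanted keeps everything except eph1.txt.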

You'll have to do your own filtering; a list comprehension usually works nicely here:

import os
from glob import glob
files = [fn for fn in glob('somepath/*.txt')
         if not os.path.basename(fn).startswith('eph')]

Late to the game, but you could alternatively apply a Python filter to the result of a glob:

files = glob.iglob('your_path_here')
files_i_care_about = filter(lambda x: not x.startswith("eph"), files)

or replace the lambda with an appropriate regex search, etc.

EDIT: I just realized that if you're using full paths, startswith won't work, so you'd need a regex:

In [10]: a
Out[10]: ['/some/path/foo', 'some/path/bar', 'some/path/eph_thing']

In [11]: list(filter(lambda x: not re.search('/eph', x), a))
Out[11]: ['/some/path/foo', 'some/path/bar']
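One caveat with the regex above: '/eph' also matches a directory whose name starts with eph, so files inside it are excluded as well. A possible alternative, sketched here on made-up paths, is to test only the final path component with os.path.basename.

```python
import os
import re

# Hypothetical paths; the last one sits inside a directory named "ephemeral".
paths = [
    "/some/path/foo",
    "some/path/bar",
    "some/path/eph_thing",
    "/data/ephemeral/keep.txt",
]

# The regex looks anywhere in the path, so it also drops keep.txt.
by_regex = [p for p in paths if not re.search("/eph", p)]

# Checking the basename only looks at the file name itself.
by_basename = [p for p in paths
               if not os.path.basename(p).startswith("eph")]
```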

I recommend pathlib over glob. Filtering out one pattern is very simple.

from pathlib import Path

p = Path(YOUR_PATH)
filtered = [x for x in p.glob("**/*") if not x.name.startswith("eph")]

And if you want to filter on a more complex pattern, you can define a function to do that, like this:

def not_in_pattern(x):
    # str.startswith accepts a tuple of prefixes
    return not x.name.startswith(("eph", "epi"))


filtered = [x for x in p.glob("**/*") if not_in_pattern(x)]

Using that code, you can filter out all files that start with eph or epi.
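If the list of exclusions grows, one option is to express them as glob-style patterns too. The helper below is only a sketch: glob_excluding and its parameters are hypothetical names, and it combines pathlib for the search with fnmatch for the exclusion test.

```python
from fnmatch import fnmatch
from pathlib import Path


def glob_excluding(root, include, excludes):
    """Yield paths under root that match include and whose file name
    matches none of the glob-style patterns in excludes."""
    for path in Path(root).glob(include):
        if not any(fnmatch(path.name, pattern) for pattern in excludes):
            yield path
```

For example, list(glob_excluding("somepath", "**/*.txt", ["eph*", "epi*"])) would collect every .txt file under somepath except those whose names start with eph or epi.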

How about skipping particular files while iterating over all the files in the folder? The code below skips all Excel files whose names start with 'eph':

import glob
import re

for file in glob.glob('*.xlsx'):
    # Skip files whose names start with 'eph'.
    if re.match(r'eph.*\.xlsx', file):
        continue
    # do your stuff here
    print(file)

This way you can use more complex regex patterns to include/exclude a particular set of files in a folder.
