Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Ignore case in glob() on Linux

Tags:

python

linux

I'm writing a script which will have to work on directories which are modified by hand by Windows and Linux users alike. The Windows users tend to not care at all about case in assigning filenames.

Is there a way to handle this on the Linux side in Python, i.e. can I get a case-insensitive, glob-like behaviour?

like image 468
andreas-h Avatar asked Nov 16 '11 11:11

andreas-h


People also ask

How do I ignore a case in Linux?

The key to that case-insensitive search is the use of the -iname option, which is only one character different from the -name option. The -iname option is what makes the search case-insensitive.

Is glob glob case insensitive?

glob() is case sensitive in Linux, but case insensitive in Windows.

How do you grep a case insensitive?

Case Insensitive Search By default, grep is case sensitive. This means that the uppercase and lowercase characters are treated as distinct. To ignore case when searching, invoke grep with the -i option (or --ignore-case ).

How do you ignore case sensitive in find command?

If you want the search for a word or phrase to be case insensitive, use the -iname option with the find command. It is the case insensitive version of the -name command.


6 Answers

You can replace each alphabetic character c with [cC], via

import glob
def insensitive_glob(pattern):
    def either(c):
        return '[%s%s]' % (c.lower(), c.upper()) if c.isalpha() else c
    return glob.glob(''.join(map(either, pattern)))
like image 168
Geoffrey Irving Avatar answered Oct 02 '22 11:10

Geoffrey Irving


Use case-insensitive regexes instead of glob patterns. fnmatch.translate generates a regex from a glob pattern, so

re.compile(fnmatch.translate(pattern), re.IGNORECASE)

gives you a case-insensitive version of a glob pattern as a compiled RE.

Keep in mind that, if the filesystem is hosted by a Linux box on a Unix-like filesystem, users will be able to create files foo, Foo and FOO in the same directory.

like image 21
Fred Foo Avatar answered Oct 02 '22 12:10

Fred Foo


Non recursively

In order to retrieve the files (and files only) of a directory "path", with "globexpression":

list_path = [i for i in os.listdir(path) if os.path.isfile(os.path.join(path, i))]
result = [os.path.join(path, j) for j in list_path if re.match(fnmatch.translate(globexpression), j, re.IGNORECASE)]

Recursively

with walk:

result = []
for root, dirs, files in os.walk(path, topdown=True):
    result += [os.path.join(root, j) for j in files \
        if re.match(fnmatch.translate(globexpression), j, re.IGNORECASE)]

Better also compile the regular expression, so instead of

re.match(fnmatch.translate(globexpression)

do (before the loop):

reg_expr = re.compile(fnmatch.translate(globexpression), re.IGNORECASE)

and then replace in the loop:

result += [os.path.join(root, j) for j in files if re.match(reg_expr, j)]
like image 37
Raffi Avatar answered Oct 02 '22 11:10

Raffi


Here is my non-recursive file search for Python with glob like behavior for Python 3.5+

# Eg: find_files('~/Downloads', '*.Xls', ignore_case=True)
def find_files(path: str, glob_pat: str, ignore_case: bool = False):
    rule = re.compile(fnmatch.translate(glob_pat), re.IGNORECASE) if ignore_case \
            else re.compile(fnmatch.translate(glob_pat))
    return [n for n in os.listdir(os.path.expanduser(path)) if rule.match(n)]

Note: This version handles home directory expansion

like image 38
Timothy C. Quinn Avatar answered Oct 02 '22 12:10

Timothy C. Quinn


Depending on your case, you might use .lower() on both file pattern and results from folder listing and only then compare the pattern with the filename

like image 26
HCLivess Avatar answered Oct 02 '22 11:10

HCLivess


Riffing off of @Timothy C. Quinn's answer, this modification allows the use of wildcards anywhere in the path. This is admittedly only case insensitive for the glob_pat argument.

import re
import os
import fnmatch
import glob

def find_files(path: str, glob_pat: str, ignore_case: bool = False):
    rule = re.compile(fnmatch.translate(glob_pat), re.IGNORECASE) if ignore_case \
            else re.compile(fnmatch.translate(glob_pat))
    return [n for n in glob.glob(os.path.join(path, '*')) if rule.match(n)]
like image 30
Matthew Snyder Avatar answered Oct 02 '22 12:10

Matthew Snyder