
How to glob two patterns with pathlib?

Tags:

python

pathlib

I want to find files with two different extensions: .jl and .jsonlines. I use

from pathlib import Path
p1 = Path("/path/to/dir").joinpath().glob("*.jl")
p2 = Path("/path/to/dir").joinpath().glob("*.jsonlines")

but I want p1 and p2 as one variable, not two. Should I merge p1 and p2 in the first place? Are there other ways to concatenate glob patterns?

Gmosy Gnaq asked Jan 10 '18 05:01



3 Answers

from pathlib import Path

exts = [".jl", ".jsonlines"]
mainpath = "/path/to/dir"

# Same directory

files = [p for p in Path(mainpath).iterdir() if p.suffix in exts]

# Recursive

files = [p for p in Path(mainpath).rglob('*') if p.suffix in exts]

# 'files' is a list of Path objects; to convert them to strings:

[str(p) for p in files]
lesleslie answered Sep 21 '22 18:09
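
A minimal, self-contained sketch of the suffix-filter approach above, run against a temporary directory. The helper name find_by_suffix and the sample file names are made up for illustration. Keep in mind that Path.suffix only returns the last extension, so a file named data.tar.gz has the suffix .gz.

```python
import tempfile
from pathlib import Path


def find_by_suffix(directory, suffixes, recursive=False):
    """Return sorted file paths under `directory` whose final suffix is in `suffixes`."""
    root = Path(directory)
    # rglob("*") walks subdirectories; iterdir() stays in the top-level directory
    candidates = root.rglob("*") if recursive else root.iterdir()
    wanted = set(suffixes)
    return sorted(p for p in candidates if p.is_file() and p.suffix in wanted)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files, created only for this demonstration
    for name in ("a.jl", "b.jsonlines", "c.txt"):
        (Path(tmp) / name).touch()
    sub = Path(tmp) / "sub"
    sub.mkdir()
    (sub / "d.jl").touch()

    flat_names = [p.name for p in find_by_suffix(tmp, [".jl", ".jsonlines"])]
    deep_names = [
        p.name for p in find_by_suffix(tmp, [".jl", ".jsonlines"], recursive=True)
    ]
```

Filtering on the suffix walks the directory once, regardless of how many extensions are requested, which may matter for large trees.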


If you're OK with installing a package, check out wcmatch. Its pathlib module provides a drop-in replacement for the standard Path class whose glob accepts a list of patterns, so you can run multiple matches in one go:

from wcmatch.pathlib import Path
paths = Path('path/to/dir').glob(['*.jl', '*.jsonlines'])
Ciprian Tomoiagă answered Sep 21 '22 18:09


Inspired by @aditi's answer, I came up with this:

from pathlib import Path
from itertools import chain

exts = ["*.jl", "*.jsonlines"]
mainpath = "/path/to/dir"

# Start from an empty iterable and chain each pattern's glob results onto it
P = []
for i in exts:
    p = Path(mainpath).glob(i)
    P = chain(P, p)
print(list(P))
Gmosy Gnaq answered Sep 19 '22 18:09
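
The chaining loop above can probably be collapsed with itertools.chain.from_iterable, which keeps everything lazy. This is only a sketch: the helper name glob_many and the sample files are hypothetical, and a file that matches more than one pattern would be yielded more than once (not an issue for the two patterns in the question).

```python
import tempfile
from itertools import chain
from pathlib import Path


def glob_many(directory, patterns):
    """Lazily yield paths in `directory` that match any of `patterns`."""
    root = Path(directory)
    # One glob per pattern, flattened into a single iterator
    return chain.from_iterable(root.glob(pattern) for pattern in patterns)


with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical sample files, created only for this demonstration
    for name in ("a.jl", "b.jsonlines", "c.txt"):
        (Path(tmp) / name).touch()

    matched = sorted(p.name for p in glob_many(tmp, ["*.jl", "*.jsonlines"]))
    print(matched)
```

Unlike the suffix filter, this runs one directory scan per pattern, but it accepts any glob pattern rather than just extensions.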