Search .txt files in which the string 20210624 is present inside the file and the string 20210625 is not:
import os
match_str = ['20210624']
not_match_str = ['20210625']
for root, dirs, files in os.walk(path):
    for name in files:
        if name.endswith((".txt")):
            ## search files with match_str `20210624` and not_match_str `20210625`
Can I do this using os.walk()?
You can set the recursive keyword argument of glob.glob() to True so that the program searches recursively through the files of the folders, subfolders, etc.
from glob import glob
path = 'C:\\Users\\User\\Desktop'
for file in glob(path + '\\**\\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
    if '20210624' in text and '20210625' not in text:
        print(file)
If you don't want the entire path of each file to be printed, only the file name, then:
from glob import glob
path = 'C:\\Users\\User\\Desktop'
for file in glob(path + '\\**\\*.txt', recursive=True):
    with open(file) as f:
        text = f.read()
    if '20210624' in text and '20210625' not in text:
        print(file.split('\\')[-1])
In order to use the os.walk() method, you can use the str.endswith() method (as you have done in your post) like so:
import os
for path, _, files in os.walk('C:\\Users\\User\\Desktop'):
    for file in files:
        if file.endswith('.txt'):
            with open(os.path.join(path, file)) as f:
                text = f.read()
            if '20210624' in text and '20210625' not in text:
                print(file)
And to search within a maximum level of subdirectories:
import os
levels = 2
root = 'C:\\Users\\User\\Desktop'
total = root.count('\\') + levels
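for path, dirs, files in os.walk(root):
    if path.count('\\') >= total:
        # Maximum depth reached: prune the walk here instead of stopping it,
        # so that sibling directories are still visited.
        dirs.clear()
    for file in files:
        if file.endswith('.txt'):
            print(os.path.join(path, file))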
You can achieve this with pathlib and glob.
import pathlib
path = pathlib.Path(path)
maybe_valids = list(path.glob("*20210624*.txt"))
valids = [elem for elem in maybe_valids if "20210625" not in elem.name]
print(valids)
The maybe_valids list is created from every file whose name contains "20210624" and ends with .txt, while valids holds the ones whose names don't contain "20210625".
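Note that the pathlib snippet above filters on file names, not on file contents. If the strings should be looked up inside the files, a minimal sketch along the same lines could use Path.rglob() and read each file. The function name find_matching_files and the errors="ignore" setting are assumptions made for illustration, not part of the answer.

```python
from pathlib import Path


def find_matching_files(root, match, not_match):
    """Return .txt files under root whose text contains match but not not_match."""
    found = []
    # rglob("*.txt") walks root and all of its subdirectories.
    for file in Path(root).rglob("*.txt"):
        if not file.is_file():
            continue
        # Assumption: undecodable bytes are skipped rather than raising an error.
        text = file.read_text(errors="ignore")
        if match in text and not_match not in text:
            found.append(file)
    return sorted(found)
```

Calling find_matching_files('C:\\Users\\User\\Desktop', '20210624', '20210625') would then return a list of Path objects; use elem.name on each one if only the file names are wanted.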
Continue from here:
        if name.endswith(".txt"):
            # Open the file by its full path and read it only once.
            with open(os.path.join(root, name), mode='r') as f:
                a = f.read()
            if match_str[0] in a:
                # Number is present
                print(name)
You can use for loops for reading too if you have more than one match_str. Similarly, you can use the not in operator to check for not_match_str.
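As a rough sketch of that idea, the loops over several strings can be written with all() and any(). The helper names text_matches and search are hypothetical, and the sketch assumes that every entry of match_str must be present and that no entry of not_match_str may appear; adjust if "any one of them" is what you need.

```python
import os


def text_matches(text, match_strs, not_match_strs):
    """True if text contains every match string and none of the excluded ones."""
    return (all(s in text for s in match_strs)
            and not any(s in text for s in not_match_strs))


def search(path, match_strs, not_match_strs):
    """Yield full paths of .txt files under path whose contents match."""
    for root, _, files in os.walk(path):
        for name in files:
            if name.endswith(".txt"):
                full = os.path.join(root, name)
                # Assumption: undecodable bytes are ignored instead of failing.
                with open(full, errors="ignore") as f:
                    if text_matches(f.read(), match_strs, not_match_strs):
                        yield full
```

With the lists from the question this would be used as: for found in search(path, match_str, not_match_str): print(found)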
You can get the file names with several simple shell commands:
find . -name "*.txt" | xargs grep -l "20210624" | xargs grep -L "20210625"