
Searching multiple text files for two strings?

I have a folder with many text files (EPA10.txt, EPA55.txt, EPA120.txt, ..., EPA150.txt). I need to search each file for two strings and write the result of the search to a text file, result.txt. So far I have it working for a single file. Here is the working code:

if 'LZY_201_335_R10A01' and 'LZY_201_186_R5U01' in open('C:\\Temp\\lamip\\EPA150.txt').read():
    with open("C:\\Temp\\lamip\\result.txt", "w") as f:
        f.write('Current MW in node is EPA150')
else:
    with open("C:\\Temp\\lamip\\result.txt", "w") as f:
        f.write('NOT EPA150')

Now I want this to be repeated for all the text files in the folder. Please help.

slyclam asked Dec 25 '22 00:12


2 Answers

Since your files are named from EPA1.txt to EPA150.txt but you don't know every name in advance, keep them together in one folder and use the os.listdir() method to get a list of the file names in it, for example listdir("C:/Temp/lamip").

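Also, your if statement is wrong: 'a' and 'b' in text only tests whether 'b' is in text, because a non-empty string such as 'a' always counts as true. You should do this instead:

text = file.read()
if "string1" in text and "string2" in text:
    ...  # both strings were found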
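
The difference between the two conditions is easy to check in a few lines. This is only an illustration; the content of `text` is made up so that just one of the two strings is present:

```python
# Made-up file content that contains only the second search string.
text = "only LZY_201_186_R5U01 appears here"

# Parsed as: 'LZY_201_335_R10A01' and ('LZY_201_186_R5U01' in text)
# The first operand is a non-empty string, so it is always truthy and
# only the second string is actually tested.
wrong = bool('LZY_201_335_R10A01' and 'LZY_201_186_R5U01' in text)

# Each string gets its own membership test.
right = 'LZY_201_335_R10A01' in text and 'LZY_201_186_R5U01' in text

print(wrong)  # True, although the first string is missing
print(right)  # False
```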

Here's the code:

from os import listdir

folder = 'C:/Temp/lamip/'

with open(folder + 'result.txt', 'w') as f:
    for filename in listdir(folder):
        # Only search the EPA*.txt data files; this also skips result.txt itself.
        if not (filename.startswith('EPA') and filename.endswith('.txt')):
            continue
        with open(folder + filename) as currentFile:
            text = currentFile.read()
        if ('LZY_201_335_R10A01' in text) and ('LZY_201_186_R5U01' in text):
            f.write('Current MW in node is ' + filename[:-4] + '\n')
        else:
            f.write('NOT ' + filename[:-4] + '\n')

PS: You can use / instead of \\ in your paths. Windows accepts forward slashes as path separators, so Python can open such paths as they are.
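
If you would rather not build paths by string concatenation at all, the standard library's pathlib can do the joining and the filtering. This is a minimal sketch, not part of the code above; the helper names `list_txt_files` and `read_text_file` are made up for illustration:

```python
from pathlib import Path

def list_txt_files(folder):
    """Return the sorted names of the .txt files directly inside folder."""
    return sorted(path.name for path in Path(folder).glob("*.txt"))

def read_text_file(folder, name):
    """Read one file; the / operator on Path objects joins path components."""
    return (Path(folder) / name).read_text()
```

With these, `list_txt_files("C:/Temp/lamip")` would give the names to loop over, and `read_text_file("C:/Temp/lamip", name)` the content to search.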

Marco Bonelli answered Dec 27 '22 13:12


Modularise! Modularise!

Well, not in the sense of writing distinct Python modules, but in the sense of isolating the different tasks at hand.

  1. Find the files you wish to search.
  2. Read the file and locate the text.
  3. Write the result into a separate file.

Each of these tasks can be solved independently. For example, to list the files you have os.listdir, whose output you might want to filter.

For step 2, it does not matter whether you have 1 or 1,000 files to search. The routine is the same; you merely have to iterate over each file found in step 1. This indicates that step 2 could be implemented as a function that takes the filename (and possibly the search strings) as arguments and returns True or False.

Step 3 is the combination of each element from step 1 and the result of step 2.

The result:

import os

files = [fn for fn in os.listdir('C:/Temp/lamip') if fn.endswith('.txt')]
# perhaps filter `files` further, e.g. to skip an old result.txt

def does_fn_contain_string(filename):
  with open('C:/Temp/lamip/' + filename) as blargh:
    content = blargh.read()
    # use `and` to require both strings, `or` to accept either one
    return 'string1' in content and 'string2' in content

with open('results.txt', 'w') as output:
  for fn in files:
    if does_fn_contain_string(fn):
      output.write('Current MW in node is {0}\n'.format(fn[:-4]))
    else:
      output.write('NOT {0}\n'.format(fn[:-4]))
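
Step 2 mentions passing the search strings as an argument. One possible way to do that is sketched below; the names `file_contains_all` and `write_report` are hypothetical, and the report file is skipped explicitly in case it lives in the searched folder:

```python
import os

def file_contains_all(path, needles):
    """Return True if the file at path contains every string in needles."""
    with open(path) as handle:
        content = handle.read()
    return all(needle in content for needle in needles)

def write_report(folder, needles, report_path):
    """Write one result line per .txt file in folder to report_path."""
    report_abs = os.path.abspath(report_path)
    names = sorted(
        fn for fn in os.listdir(folder)
        if fn.endswith('.txt')
        # Do not search the report itself if it lives in the same folder.
        and os.path.abspath(os.path.join(folder, fn)) != report_abs
    )
    with open(report_path, 'w') as output:
        for fn in names:
            node = os.path.splitext(fn)[0]  # file name without '.txt'
            if file_contains_all(os.path.join(folder, fn), needles):
                output.write('Current MW in node is {}\n'.format(node))
            else:
                output.write('NOT {}\n'.format(node))
```

For the question's data this could be called as `write_report('C:/Temp/lamip', ['LZY_201_335_R10A01', 'LZY_201_186_R5U01'], 'C:/Temp/lamip/result.txt')`.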

MrGumble answered Dec 27 '22 13:12