 

Read the next word in a file in python

Tags:

python

I am searching for certain words in a file in Python. After I find each word, I need to read the next two words from the file. I have looked for a solution, but I could not find one that reads just the next words.

# offsetFile - file pointer
# searchTerms - list of words

for line in offsetFile:
    for word in searchTerms:
        if word in line:
            # here get the next two terms after the word

Thank you for your time.

Update: Only the first appearance is needed. In fact, each word can appear only once in this case.

file:

accept 42 2820 access 183 3145 accid 1 4589 algebra 153 16272 algem 4 17439 algol 202 6530

word: ['access', 'algebra']

When I encounter 'access' and 'algebra' while searching the file, I need the values 183 3145 and 153 16272, respectively.

Quazi Farhan asked Apr 22 '12 01:04


2 Answers

An easy way to deal with this is to read the file using a generator that yields one word at a time from the file.

def words(fileobj):
    for line in fileobj:
        for word in line.split():
            yield word

Then to find the word you're interested in and read the next two words:

searchterms = {"access", "algebra"}   # example terms from the question

with open("offsetfile.txt") as wordfile:
    wordgen = words(wordfile)
    for word in wordgen:
        if word in searchterms:   # searchterms should be a set() to make this fast
            break
    else:
        word = None               # makes sure word is None if the word wasn't found

    foundwords = [word, next(wordgen, None), next(wordgen, None)]

Now foundwords[0] is the word you found, foundwords[1] is the word after that, and foundwords[2] is the second word after it. If there aren't enough words, then one or more elements of the list will be None.
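
The snippet above stops at the first search term it meets. Since the question asks for the values after both 'access' and 'algebra', here is a minimal sketch of the same generator idea extended to collect every term. The helper name find_following and its count parameter are made up for illustration, and io.StringIO stands in for the real file so the example runs on its own.

```python
import io


def words(fileobj):
    """Yield the whitespace-separated words of a file, one at a time."""
    for line in fileobj:
        for word in line.split():
            yield word


def find_following(fileobj, searchterms, count=2):
    """Map each search term found to the `count` words that follow it.

    Hypothetical helper, not part of the original answer. Only the first
    appearance of each term is recorded.
    """
    remaining = set(searchterms)
    found = {}
    wordgen = words(fileobj)
    for word in wordgen:
        if word in remaining:
            # Pull the following words from the same generator; None pads
            # the list if the file ends early.
            found[word] = [next(wordgen, None) for _ in range(count)]
            remaining.discard(word)
            if not remaining:
                break  # every term has been seen, stop reading
    return found


# Sample data from the question, wrapped in StringIO to stand in for a file.
sample = io.StringIO(
    "accept 42 2820 access 183 3145 accid 1 4589 "
    "algebra 153 16272 algem 4 17439 algol 202 6530\n"
)
result = find_following(sample, ["access", "algebra"])
print(result)
```

The values come back as strings, so convert them with int() if you need numbers. Note that the words consumed after a match are not themselves checked against the search terms, which is fine for data laid out like the sample.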

It is a little more complex if you want to force this to match only within one line, but usually you can get away with considering the file as just a sequence of words.
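
For the case where a match must stay within one line, one possible approach is to split each line into a list and slice after the matching index. This is only a sketch: find_following_in_line is a hypothetical name, and the two-line sample is invented to show what happens when a line ends before both values appear.

```python
import io


def find_following_in_line(fileobj, searchterms, count=2):
    """Map each search term to up to `count` words that follow it on the same line.

    Hypothetical helper for illustration. Only the first appearance of
    each term is kept.
    """
    terms = set(searchterms)
    found = {}
    for line in fileobj:
        tokens = line.split()
        for i, token in enumerate(tokens):
            if token in terms and token not in found:
                # Slicing never runs past the end of the line, so the
                # result is simply shorter when the line ends early.
                found[token] = tokens[i + 1:i + 1 + count]
    return found


# Invented sample: 'access' sits near the end of the first line.
sample_lines = io.StringIO(
    "accept 42 2820 access 183\n"
    "3145 accid 1 4589 algebra 153 16272\n"
)
line_result = find_following_in_line(sample_lines, ["access", "algebra"])
print(line_result)
```

Here a short list, rather than None padding, signals that the line ran out of words.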

kindall answered Oct 05 '22 12:10


If you only need the first two words of a line, just do this:

offsetFile.readline().split()[:2]
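
To make clear what that one-liner returns, here is a small sketch run against an in-memory file. The sample text is invented, and offset_file is just a stand-in for the question's offsetFile.

```python
import io

# Invented two-line sample standing in for the real file.
offset_file = io.StringIO("accept 42 2820 access 183 3145\nalgebra 153 16272\n")

# readline() reads one line; split()[:2] keeps its first two words.
first_two = offset_file.readline().split()[:2]
print(first_two)
```

This gives the first two words of the line that was read, regardless of any search term, so it only helps if the word you want always starts the line.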
Stan answered Oct 05 '22 12:10