 

Extract Words from a file

Tags:

python

I open a file in Python to check whether the words in a predefined set are present in it. I put the predefined words in a list and opened the file to be tested. Is there a way in Python to extract words from the file rather than lines? That would make my work a lot easier.

nikhil asked Jan 20 '23 12:01



1 Answer

import re

def get_words_from_string(s):
    # Lowercase the text and return its distinct words (runs of \w characters).
    return set(re.findall(r'\w+', s.lower()))

def get_words_from_file(fname):
    # Open in text mode so the str pattern above matches the file contents.
    with open(fname) as inf:
        return get_words_from_string(inf.read())

def all_words(needle, haystack):
    # True if every word in needle also appears in haystack.
    return set(needle).issubset(haystack)

def any_words(needle, haystack):
    # The set of words common to both; empty (falsy) if none match.
    return set(needle).intersection(haystack)

search_words = get_words_from_string("This is my test")
find_in = get_words_from_string("If this were my test, I is passing")

print(sorted(any_words(search_words, find_in)))

print(all_words(search_words, find_in))

returns

['is', 'my', 'test', 'this']
True
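
For a large file, reading everything with a single read() call may not be ideal. Below is a minimal sketch of the same idea that pulls words out one line at a time. The names iter_words and found_words, the UTF-8 encoding, and the temporary demo file are assumptions made for illustration; they are not part of the code above.

```python
import os
import re
import tempfile

WORD_RE = re.compile(r'\w+')

def iter_words(fname):
    # Hypothetical helper: yield lowercased words from the file, line by line.
    # Assumes the file is UTF-8 text.
    with open(fname, encoding='utf-8') as inf:
        for line in inf:
            for word in WORD_RE.findall(line.lower()):
                yield word

def found_words(search_words, fname):
    # Hypothetical helper: return the subset of search_words present in the file.
    wanted = {w.lower() for w in search_words}
    return wanted & set(iter_words(fname))

# Demo on a throwaway file so the sketch runs on its own.
with tempfile.NamedTemporaryFile('w', suffix='.txt', delete=False,
                                 encoding='utf-8') as tmp:
    tmp.write("If this were my test,\nI is passing\n")
    demo_path = tmp.name

result = found_words(["This", "is", "missing"], demo_path)
os.remove(demo_path)

print(sorted(result))  # ['is', 'this']
```

Because iter_words is a generator, only one line of the file is held in memory at a time; the set of distinct words is still built in full, which is usually much smaller than the file itself.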
Hugh Bothwell answered Feb 02 '23 15:02
