Python: extracting a sentence with a particular word

Tags: python, regex, nltk

I have a JSON file containing text like:

dr. goldberg offers everything.parking is good.he's nice and easy to talk

How can I extract the sentence with the keyword "parking"? I don't need the other two sentences.

I tried this:

with open("test_data.json") as f:
    for line in f:
        if "parking" in line:
            print line

It prints the whole text, not just that particular sentence.

I also tried using a regex:

import re

with open("test_data.json") as f:
    for line in f:
        line = line.rstrip()
        if re.search('parking', line):
            print line

This gives the same result.

asked May 07 '26 07:05 by dipit malhotra


1 Answer

You can use nltk.tokenize:

from nltk.tokenize import sent_tokenize, word_tokenize

# Read the whole file as one string
with open("test_data.json") as f:
    text = f.read()
sentences = sent_tokenize(text)
# Keep every sentence that contains the keyword as a token
my_sentence = [sent for sent in sentences if 'parking' in word_tokenize(sent)]

For a reusable version, you can wrap this in a function:

>>> def sentence_finder(text,word):
...    sentences=sent_tokenize(text)
...    return [sent for sent in sentences if word in word_tokenize(sent)]

>>> s="dr. goldberg offers everything. parking is good. he's nice and easy to talk"
>>> sentence_finder(s,'parking')
['parking is good.']
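
Note that the sample string above has a space after each period, while the text in the question does not ("everything.parking"). sent_tokenize may not split at a period that has no whitespace after it, so here is a minimal standard-library sketch for that case. The names split_sentences, sentence_finder_nospace and the ABBREVIATIONS set are hypothetical, made up for illustration, and the abbreviation list is an assumption you would need to extend for your own data. It will also wrongly split decimals such as "3.5".

```python
import re

# Hypothetical abbreviation list (an assumption); extend it for your data.
ABBREVIATIONS = {"dr", "mr", "mrs", "ms", "prof"}

def split_sentences(text):
    """Split on '.', '!' or '?' even when no space follows the punctuation."""
    sentences = []
    start = 0
    for match in re.finditer(r"[.!?]+", text):
        # Look at the word right before the punctuation; skip abbreviations.
        preceding = re.search(r"(\w+)$", text[start:match.start()])
        if preceding and preceding.group(1).lower() in ABBREVIATIONS:
            continue
        sentence = text[start:match.end()].strip()
        if sentence:
            sentences.append(sentence)
        start = match.end()
    # Whatever follows the last terminator is the final sentence.
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

def sentence_finder_nospace(text, word):
    """Return the sentences that contain `word` as a whole word."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return [s for s in split_sentences(text) if pattern.search(s)]

s = "dr. goldberg offers everything.parking is good.he's nice and easy to talk"
print(sentence_finder_nospace(s, "parking"))
```

The word-boundary pattern keeps "parking" from matching inside longer words such as "parkinglot". If your text does have whitespace after each period, the nltk version above is likely the better choice.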
answered May 08 '26 20:05 by Mazdak


