I have a JSON file containing text like:
dr. goldberg offers everything.parking is good.he's nice and easy to talk
How can I extract the sentence with the keyword "parking"? I don't need the other two sentences.
I tried this:
with open("test_data.json") as f:
    for line in f:
        if "parking" in line:
            print(line)
It prints all the text and not that particular sentence.
I even tried using regex:
import re

with open("test_data.json") as f:
    for line in f:
        line = line.rstrip()
        if re.search('parking', line):
            print(line)
Even this shows the same result.
You can use nltk.tokenize to split the text into sentences and then keep only the ones that contain your keyword:
from nltk.tokenize import sent_tokenize, word_tokenize

# Read the whole file as one string
with open("test_data.json") as f:
    text = f.read()

sentences = sent_tokenize(text)
# Keep every sentence whose word tokens include the keyword
my_sentence = [sent for sent in sentences if 'parking' in word_tokenize(sent)]
For a reusable version, you can wrap this in a function:
>>> def sentence_finder(text,word):
...     sentences=sent_tokenize(text)
...     return [sent for sent in sentences if word in word_tokenize(sent)]
>>> s="dr. goldberg offers everything. parking is good. he's nice and easy to talk"
>>> sentence_finder(s,'parking')
['parking is good.']
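Note that the example string above has a space after each period, while the text in the question does not ("everything.parking"), and a sentence tokenizer may not split at a period that is glued to the next word. As a fallback, here is a minimal standard-library sketch that splits on sentence punctuation directly. The function name `sentences_with_keyword` is made up for illustration, and the split is naive: it also breaks after abbreviations such as "dr.", which is acceptable only if you just want the fragment around the keyword.

```python
import re

def sentences_with_keyword(text, keyword):
    """Return the punctuation-delimited pieces of text containing keyword as a whole word."""
    # Split on runs of '.', '!' or '?' plus any whitespace that follows.
    # This is a naive split: abbreviations like "dr." are cut as well.
    pieces = [p.strip() for p in re.split(r'[.!?]+\s*', text) if p.strip()]
    # \b makes the match whole-word, so "park" does not match "parking".
    pattern = re.compile(r'\b' + re.escape(keyword) + r'\b', re.IGNORECASE)
    return [p for p in pieces if pattern.search(p)]

sample = "dr. goldberg offers everything.parking is good.he's nice and easy to talk"
print(sentences_with_keyword(sample, "parking"))  # ['parking is good']
```

The terminating punctuation is dropped by the split, so append it again if you need it in the output.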