
Limit the number of sentences in a string

Tags:

python

A beginner's Python question:

I have a string containing some number of sentences. How do I extract the first 2 sentences? (Each may end with ., ? or !)

anroots asked Dec 23 '22 00:12


1 Answer

Ignoring considerations such as when a . actually constitutes the end of a sentence:

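import re
# Split after sentence-ending punctuation at most twice, then drop the trailing remainder.
' '.join(re.split(r'(?<=[.?!])\s+', phrase, maxsplit=2)[:-1])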
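
As a rough sketch of how the split-based approach could be wrapped up for reuse: the function name first_sentences and the parameter n are hypothetical, not part of the original answer. One assumption worth flagging is that this version slices with [:n] instead of [:-1], so that a string containing n sentences or fewer is returned whole rather than losing its last sentence.

```python
import re

# Whitespace that directly follows '.', '?' or '!' is treated as a sentence break.
_SENTENCE_BREAK = re.compile(r'(?<=[.?!])\s+')

def first_sentences(phrase, n=2):
    """Return at most the first n sentences of phrase, joined by single spaces."""
    if n <= 0:
        return ''
    # maxsplit=n yields up to n sentences plus one trailing remainder.
    parts = _SENTENCE_BREAK.split(phrase.strip(), maxsplit=n)
    # Keep the first n pieces; shorter inputs are returned unchanged.
    return ' '.join(parts[:n])

print(first_sentences("Is it done? Yes. Good! Next."))
```

Like the one-liner, this still ignores cases such as abbreviations, where a . does not end a sentence.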

EDIT: Another approach that just occurred to me is this:

re.match(r'(.*?[.?!](?:\s+.*?[.?!]){0,1})', phrase).group(1)

Notes:

  1. In the first solution, you replace the 2 with another number to extract a different number of sentences. In the second solution, you change the 1 in {0,1} to one less than the number of sentences you want to extract.
  2. The second solution is less robust: for an empty string, or a string with no sentence-ending punctuation, re.match returns None and the .group(1) call raises AttributeError. It could be made robust, but the regex would be even more complex than it is already, and I would favour the slightly less efficient first solution over an unreadable mess.
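
To illustrate the two notes, here is a minimal sketch of the regex-match variant with the repetition count derived from the number of sentences and a guard for the no-match case. The name first_sentences_match and the parameter n are hypothetical, and returning an empty string when nothing matches is an assumption about the desired behaviour, not something the answer specifies.

```python
import re

def first_sentences_match(phrase, n=2):
    """Regex-match variant: return the first n sentences, or '' if none is found."""
    if n <= 0:
        return ''
    # One sentence is matched up front, then up to n - 1 further
    # sentences, each preceded by whitespace.
    pattern = r'(.*?[.?!](?:\s+.*?[.?!]){0,%d})' % (n - 1)
    match = re.match(pattern, phrase)
    # re.match returns None for empty or unpunctuated input.
    return match.group(1) if match else ''

print(first_sentences_match("One. Two? Three!"))
```

Guarding the match keeps the regex itself unchanged, at the cost of silently returning nothing for text that has no ., ? or !.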
Marcelo Cantos answered Dec 26 '22 10:12