I have a list of strings s as follows:
s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
I want to split this list into sublists. Whenever one of the tokens ?, !, . or \n occurs, the current sublist ends and a new one begins, as follows:
final = [['Hello', 'world', '!'],
['How', 'are', 'you', '?'],
['Have', 'a', 'good', 'day', '.']]
I tried this:
x = 0
for i in range(len(s)):
    if s[i] in ('!','?','.','\n'):
        final = s[x: x+i]
        x = i+1
final should hold my output, but I am not getting the result I expect. Any suggestions?
You were not far off:
x=0
final=[]
for i in range(len(s)):
    if s[i] in ('!','?','.','\n'):
        final.append(s[x:i+1])
        x=i+1
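There were only two problems: the end of the slice was off (it should be i+1, not x+i), and final has to be a list that collects each partial list with append instead of being overwritten on every match.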
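One case the loop above does not cover is a list that ends without a terminator: the trailing words are never appended. Here is a minimal sketch that keeps them, wrapped in a hypothetical helper called split_sentences (the function name and the terminators parameter are my own assumptions, not part of the question):

```python
def split_sentences(tokens, terminators=('!', '?', '.', '\n')):
    """Split tokens into sublists, closing each sublist at a terminator token."""
    result = []
    current = []
    for token in tokens:
        current.append(token)
        if token in terminators:
            # A terminator closes the current sublist.
            result.append(current)
            current = []
    if current:
        # Keep any trailing tokens that were not followed by a terminator.
        result.append(current)
    return result


s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
final = split_sentences(s)
print(final)
```

Building each sublist as you go avoids index arithmetic altogether, which is where the original attempt went wrong.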
You could use the following:
s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
letters = ['!', '?', '.']
idxes = [idx for idx, val in enumerate(s) if val in letters]
idxes = [-1] + idxes
answer = [s[idxes[i]+1:idxes[i+1]+1] for i in range(len(idxes[:-1]))]
print(answer)
Output
[['Hello', 'world', '!'], ['How', 'are', 'you', '?'], ['Have', 'a', 'good', 'day', '.']]
This uses a list comprehension with the built-in enumerate function to extract the indices (idxes) of s where a punctuation mark occurs. It then uses another list comprehension to construct a list of sublists by slicing s using the values of idxes.
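The same slicing idea can be written without the -1 sentinel by pairing start and end positions with zip. This is only a sketch of a possible variant; the names terminators, ends and starts are assumptions introduced for illustration:

```python
s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
terminators = {'!', '?', '.', '\n'}

# Index one past each terminator, i.e. where each sublist ends (exclusive).
ends = [i + 1 for i, tok in enumerate(s) if tok in terminators]
# Each sublist starts where the previous one ended; the first starts at 0.
starts = [0] + ends[:-1]

answer = [s[a:b] for a, b in zip(starts, ends)]
print(answer)
```

Like the version with idxes, this drops any tokens that come after the last punctuation mark.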