
Python: List containing sublist of strings

I have a list of strings s as follows:

s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']

I want to split this list into sublists, starting a new sublist after each occurrence of '?', '!', '.' or '\n', as follows:

final = [['Hello', 'world', '!'],
         ['How', 'are', 'you', '?'],
         ['Have', 'a', 'good', 'day', '.']]

I tried this:

x = 0
for i in range(len(s)):
    if s[i] in ('!','?','.','\n'):
         final = s[x: x+i]
    x = i+1

final is supposed to hold my output, but it does not come out the way it should. Any suggestions?

ATS sharon asked Feb 28 '16 16:02


2 Answers

You were not far off:

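x = 0       # start index of the current sublist
final = []  # collects every finished sublist
for i in range(len(s)):
    if s[i] in ('!', '?', '.', '\n'):
        final.append(s[x:i+1])  # slice up to and including the terminator
        x = i + 1               # the next sublist starts after the terminator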

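There was only a small indexing problem: the slice should be s[x:i+1], x should advance only when a terminator is found, and final must be a list that collects every sublist with append instead of being overwritten.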
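
The same loop can be wrapped in a reusable function. This is a minimal sketch, not part of the original answer: the name split_on_terminators is hypothetical, and it assumes that tokens left over after the last terminator should be kept as a final sublist, which the question does not specify.

```python
def split_on_terminators(tokens, terminators=('!', '?', '.', '\n')):
    """Split tokens into sublists, each ending with a terminator token."""
    result = []
    start = 0  # index where the current sublist begins
    for i, token in enumerate(tokens):
        if token in terminators:
            result.append(tokens[start:i + 1])  # include the terminator itself
            start = i + 1
    if start < len(tokens):
        # Assumption: keep trailing tokens that have no closing terminator.
        result.append(tokens[start:])
    return result


s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
final = split_on_terminators(s)
print(final)
```

If the trailing tokens should be dropped instead, removing the final if block gives the same behaviour as the loop above.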

mkiever answered Sep 28 '22 06:09


You could use the following:

s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
letters = ['!', '?', '.']

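# positions in s where a punctuation mark occurs
idxes = [idx for idx, val in enumerate(s) if val in letters]
# prepend -1 so that the first slice starts at index 0
idxes = [-1] + idxes
# slice s between each pair of consecutive positions
answer = [s[idxes[i]+1:idxes[i+1]+1] for i in range(len(idxes) - 1)]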
print(answer)

Output

[['Hello', 'world', '!'], ['How', 'are', 'you', '?'], ['Have', 'a', 'good', 'day', '.']]

This uses a list comprehension with the built-in enumerate function to collect the indices (idxes) of s at which a punctuation mark occurs. After prepending -1 so that the first slice starts at index 0, a second list comprehension builds the list of sublists by slicing s between consecutive values of idxes.
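
A variation on the same idea pairs the slice boundaries with zip instead of indexing into idxes. This is only a sketch: the names terminators, ends and starts are introduced here for illustration, and, like the code above, it drops any tokens that follow the last punctuation mark.

```python
s = ['Hello', 'world', '!', 'How', 'are', 'you', '?', 'Have', 'a', 'good', 'day', '.']
terminators = {'!', '?', '.', '\n'}

# Exclusive end index of each sublist: one past every terminator.
ends = [i + 1 for i, tok in enumerate(s) if tok in terminators]
# Each sublist starts where the previous one ended; the first starts at 0.
starts = [0] + ends[:-1]
# Pair every start with its end and slice.
answer = [s[a:b] for a, b in zip(starts, ends)]
print(answer)
```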

gtlambert answered Sep 28 '22 05:09