Most efficient way to do multiple list comprehensions in Python

Question

Given these three list comprehensions, is there a more efficient way to do this than three separate passes? I suspect explicit for loops would be bad form here, but if rowsaslist holds a large number of lines, what I have below does not seem very efficient.

cachedStopWords = stopwords.words('english')

rowsaslist = [x.lower() for x in rowsaslist]
rowsaslist = [''.join(c for c in s if c not in string.punctuation) for s in rowsaslist]
rowsaslist = [' '.join([word for word in p.split() if word not in cachedStopWords]) for p in rowsaslist]

Would combining them all into a single comprehension be more efficient? I know that from a readability standpoint it would probably be a mess.

Eric Duminil · Accepted Answer

Instead of iterating over the same list three times, you could define two functions and use them in a single list comprehension:

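import string
from nltk.corpus import stopwords

cachedStopWords = stopwords.words('english')


def remove_punctuation(text):
    # Lowercase the text and drop every punctuation character.
    return ''.join(c for c in text.lower() if c not in string.punctuation)

def remove_stop_words(text):
    # Keep only the words that are not stop words.
    return ' '.join([word for word in text.split() if word not in cachedStopWords])

# One pass over the list: each row goes through both functions.
rowsaslist = [remove_stop_words(remove_punctuation(text)) for text in rowsaslist]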
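
To check that the single-pass version really does the same work as the three separate comprehensions, here is a small self-contained sketch. The names STOP_WORDS, sample_rows, clean_three_passes and clean_one_pass are made up for illustration, and the tiny stop-word set is only a stand-in for the NLTK list so that the snippet runs without any download.

```python
import string

# Hypothetical stand-in for stopwords.words('english'); the real list is much longer.
STOP_WORDS = {"the", "is", "a", "of", "and"}

def remove_punctuation(text):
    # Lowercase the text, then drop every punctuation character.
    return ''.join(c for c in text.lower() if c not in string.punctuation)

def remove_stop_words(text):
    # Keep only the words that are not stop words.
    return ' '.join(word for word in text.split() if word not in STOP_WORDS)

def clean_three_passes(rows):
    # The approach from the question: three comprehensions, three intermediate lists.
    rows = [x.lower() for x in rows]
    rows = [''.join(c for c in s if c not in string.punctuation) for s in rows]
    return [' '.join(w for w in p.split() if w not in STOP_WORDS) for p in rows]

def clean_one_pass(rows):
    # The approach from the answer: one comprehension, each row handled once.
    return [remove_stop_words(remove_punctuation(text)) for text in rows]

sample_rows = ["The cat is on a mat.", "Hello, World! The end of the story."]
print(clean_one_pass(sample_rows))  # ['cat on mat', 'hello world end story']
```

Both functions perform the same per-character and per-word work. What the single pass saves is the two intermediate lists, so the gain is mostly in memory and readability rather than in raw speed.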

I've never used stopwords. If stopwords.words() returns a list, you should convert it to a set first: the word not in cachedStopWords test then becomes a hash lookup (constant time on average) instead of a linear scan of the whole list.
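
A rough way to see the difference is to time the same filter against a list and against a set. This is only a sketch: stop_list, stop_set, words and filter_with are hypothetical names, the stop words are synthetic placeholders rather than the NLTK list, and the timings will vary from machine to machine.

```python
import timeit

# Hypothetical stop-word list standing in for stopwords.words('english').
stop_list = ["word%d" % i for i in range(200)]
stop_set = set(stop_list)

# 900 words, none of which is a stop word, so every list lookup scans all 200 entries.
words = ("some ordinary sentence with no stop words in it " * 100).split()

def filter_with(container):
    # Same filter as in remove_stop_words; only the container type changes.
    return [w for w in words if w not in container]

list_time = timeit.timeit(lambda: filter_with(stop_list), number=20)
set_time = timeit.timeit(lambda: filter_with(stop_set), number=20)
print("list: %.4fs  set: %.4fs" % (list_time, set_time))
```

In the original code the change would be a single line, cachedStopWords = set(stopwords.words('english')), with the rest left untouched.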

Finally, the NLTK package might help you process text. See @alvas' answer.

Tags:

python

list-comprehension

nltk

Sean
