
List n-grams with zip

I can make a quick and dirty bigram sequence like so:

>>> w = ['a', 'b', 'c', 'd']
>>> zip(w, w[1:])
[('a', 'b'), ('b', 'c'), ('c', 'd')]

I want to write a function that takes a numeric argument n, the size of the n-gram. How do I use that argument to fill in the zip arguments shown above automatically? In other words, my function:

>>> make_ngrams(w, 3)

will create

>>> zip(w, w[1:], w[2:])

on the fly, and return:

[('a', 'b', 'c'), ('b', 'c', 'd')]

Can the star operator help me here? Thanks for any insight!

verbsintransit asked Dec 15 '22 15:12


1 Answer

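def make_ngrams(lst, n):
    # Zip the list against its own suffixes lst[0:], lst[1:], ..., lst[n-1:].
    # range() and list() keep this working on Python 3 as well as Python 2.
    return list(zip(*(lst[i:] for i in range(n))))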

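The * operator unpacks an iterable and passes each of its elements to the function as a separate positional argument. So for n = 3 the call becomes zip(lst[0:], lst[1:], lst[2:]). Because zip stops at the shortest input, only complete n-grams are returned.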
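
To see the unpacking step on its own, here is a minimal sketch. It assumes Python 3, so it uses range rather than xrange and wraps zip in list() because zip is lazy there. The intermediate name shifted is only for illustration and is not part of the one-liner above.

```python
def make_ngrams(lst, n):
    # Build the n shifted views lst[0:], lst[1:], ..., lst[n-1:].
    shifted = [lst[i:] for i in range(n)]
    # *shifted passes each view to zip as its own positional argument.
    # list() makes the result concrete on Python 3, where zip is lazy.
    return list(zip(*shifted))


w = ['a', 'b', 'c', 'd']
bigrams = make_ngrams(w, 2)
trigrams = make_ngrams(w, 3)
too_long = make_ngrams(w, 5)  # n is larger than the list, so nothing fits

print(bigrams)
print(trigrams)
print(too_long)
```

When n is larger than the list, the last slice is empty and zip returns nothing, so you get an empty list rather than an error. Note that each slice copies the tail of the list, which is fine for short inputs but worth keeping in mind for very long ones.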

Volatility answered Dec 21 '22 09:12