I can make a quick and dirty bigram sequence like so:
>>> w = ['a', 'b', 'c', 'd']
>>> zip(w, w[1:])
[('a', 'b'), ('b', 'c'), ('c', 'd')]
I want to write a function that takes a number, n, and builds n-grams from a list. How do I use that argument to fill in the zip arguments automatically, as shown above? In other words, my function call:
>>> make_ngrams(w, 3)
will create
>>> zip(w, w[1:], w[2:])
on the fly, and return:
[('a', 'b', 'c'), ('b', 'c', 'd')]
Can the star operator help me here? Thanks for any insight!
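def make_ngrams(lst, n):
    # Zip n shifted copies of lst: lst[0:], lst[1:], ..., lst[n-1:].
    # range and list() keep this working on both Python 2 and Python 3.
    return list(zip(*(lst[i:] for i in range(n))))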
The * operator unpacks an iterable and passes each of its elements to the function as a separate positional argument.
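To make the unpacking step easier to see, here is the same idea with the intermediate list named. This is only a sketch: the names seq and shifted are illustrative, and it is written to run on Python 3, where zip returns a lazy iterator, so the result is wrapped in list().

```python
def make_ngrams(seq, n):
    """Return a list of n-tuples of consecutive items from seq."""
    # Build n shifted copies of the sequence: seq[0:], seq[1:], ..., seq[n-1:]
    shifted = [seq[i:] for i in range(n)]
    # The star unpacks the list, so this call is zip(seq[0:], seq[1:], ...)
    return list(zip(*shifted))


w = ['a', 'b', 'c', 'd']
bigrams = make_ngrams(w, 2)
trigrams = make_ngrams(w, 3)
print(bigrams)   # [('a', 'b'), ('b', 'c'), ('c', 'd')]
print(trigrams)  # [('a', 'b', 'c'), ('b', 'c', 'd')]
```

Because zip stops at the shortest argument, asking for an n larger than the list length simply gives an empty list rather than an error.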