Best way to determine if a sequence is in another sequence?

This is a generalization of the "string contains substring" problem to (more) arbitrary types.

Given a sequence (such as a list or tuple), what's the best way of determining whether another sequence is inside it? As a bonus, it should return the index of the element where the subsequence starts:

Example usage (Sequence in Sequence):

    >>> seq_in_seq([5,6],  [4,'a',3,5,6])
    3
    >>> seq_in_seq([5,7],  [4,'a',3,5,6])
    -1 # or None, or whatever

So far, I just rely on brute force and it seems slow, ugly, and clumsy.
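For reference, a brute-force version might look like the sketch below. This is an assumption about what "brute force" means here (the question does not show its code): slide a window over the larger sequence and compare slices. It reuses the `seq_in_seq` name from the example and returns -1 when nothing matches.

```python
def seq_in_seq(subseq, seq):
    """Return the index where subseq starts in seq, or -1 if it is absent."""
    # Copy to lists so that slicing and == behave the same for lists and tuples.
    subseq, seq = list(subseq), list(seq)
    m = len(subseq)
    # Try every start position that leaves room for m items: O(n*m) overall.
    for i in range(len(seq) - m + 1):
        if seq[i:i + m] == subseq:
            return i
    return -1
```

Note that in this sketch an empty `subseq` is reported at index 0, mirroring what `str.find('')` does for strings.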


asked Jan 08 '09 19:01

Gregg Lind

2 Answers

I second the Knuth-Morris-Pratt algorithm. By the way, your problem (and the KMP solution) is exactly recipe 5.13 in Python Cookbook 2nd edition. You can find the related code at http://code.activestate.com/recipes/117214/

It finds all the correct subsequences in a given sequence, and should be used as an iterator:

    >>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,6]): print(s)
    3
    >>> for s in KnuthMorrisPratt([4,'a',3,5,6], [5,7]): print(s)
    (nothing)
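In case the link is unavailable, here is a minimal sketch of the same idea. It is not the recipe's code: the name `kmp_search` is made up for illustration, and the choice to yield nothing for an empty pattern is an assumption. It takes the arguments in the same order as the calls above (sequence first, then pattern) and is a generator.

```python
def kmp_search(seq, subseq):
    """Yield every index at which subseq starts in seq (KMP, O(n + m))."""
    pattern = list(subseq)
    m = len(pattern)
    if m == 0:
        return  # assumption: an empty pattern yields no matches

    # failure[i] = length of the longest proper prefix of pattern[:i + 1]
    # that is also a suffix of it.
    failure = [0] * m
    k = 0
    for i in range(1, m):
        while k and pattern[i] != pattern[k]:
            k = failure[k - 1]
        if pattern[i] == pattern[k]:
            k += 1
        failure[i] = k

    # Scan seq once; k counts how many pattern items currently match.
    k = 0
    for i, item in enumerate(seq):
        while k and item != pattern[k]:
            k = failure[k - 1]
        if item == pattern[k]:
            k += 1
        if k == m:
            yield i - m + 1
            k = failure[k - 1]  # fall back so overlapping matches are found
```

Because it only iterates over `seq` once and never indexes into it, this sketch should also work when `seq` is an arbitrary iterable rather than a list.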

answered Sep 24 '22 15:09

Federico A. Ramponi

Here's a brute-force approach, O(n*m) (similar to @mcella's answer). For small input sequences it might be faster than a pure-Python implementation of the Knuth-Morris-Pratt algorithm, which is O(n+m) (see @Gregg Lind's answer).

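    #!/usr/bin/env python
    def index(subseq, seq):
        """Return an index of `subseq`uence in the `seq`uence.

        Or `-1` if `subseq` is not a subsequence of the `seq`.

        The time complexity of the algorithm is O(n*m), where

            n, m = len(seq), len(subseq)

        Both arguments must be non-empty-capable sequences of the same
        type (e.g. two lists), since slices are compared with `==`.

        >>> index([1,2], list(range(5)))
        1
        >>> index(list(range(1, 6)), list(range(5)))
        -1
        >>> index(list(range(5)), list(range(5)))
        0
        >>> index([1,2], [0, 1, 0, 1, 2])
        3
        """
        i, n, m = -1, len(seq), len(subseq)
        try:
            while True:
                # Jump to the next occurrence of the first item; stop early
                # once fewer than m items remain.
                i = seq.index(subseq[0], i + 1, n - m + 1)
                if subseq == seq[i:i + m]:
                    return i
        except ValueError:
            # seq.index() found no further candidate start position.
            return -1

    if __name__ == '__main__':
        import doctest
        doctest.testmod()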

I wonder how large "small" is in this case.
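One way to find out would be to time both approaches on inputs of growing size. The harness below is only a sketch: `brute_index` repeats the function above, while `kmp_index` and `compare` are hypothetical helpers written for this comparison (not the Cookbook recipe), and the test input is a deliberately unfriendly one chosen as an assumption. Actual timings depend on the machine and the data, so none are quoted here.

```python
import timeit


def brute_index(subseq, seq):
    """Brute-force search, same idea as index() above; O(n*m)."""
    i, n, m = -1, len(seq), len(subseq)
    try:
        while True:
            i = seq.index(subseq[0], i + 1, n - m + 1)
            if subseq == seq[i:i + m]:
                return i
    except ValueError:
        return -1


def kmp_index(subseq, seq):
    """First match via a Knuth-Morris-Pratt sketch; O(n+m). Needs m >= 1."""
    m = len(subseq)
    failure, k = [0] * m, 0
    for i in range(1, m):  # build the failure table for the pattern
        while k and subseq[i] != subseq[k]:
            k = failure[k - 1]
        if subseq[i] == subseq[k]:
            k += 1
        failure[i] = k
    k = 0
    for i, item in enumerate(seq):  # single pass over seq
        while k and item != subseq[k]:
            k = failure[k - 1]
        if item == subseq[k]:
            k += 1
        if k == m:
            return i - m + 1
    return -1


def compare(n, m, number=200):
    """Return {function name: seconds} for sequences of sizes n and m."""
    seq = [0] * n                  # long run of identical items
    subseq = [0] * (m - 1) + [1]   # nearly matches everywhere, never does
    return {
        f.__name__: timeit.timeit(lambda: f(subseq, seq), number=number)
        for f in (brute_index, kmp_index)
    }


if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, compare(n, m=5))
```

The input used by `compare` is close to a worst case for the brute-force search; on more typical data, where the first item of `subseq` is rare, `list.index` does most of the work in C and the crossover point may be much larger.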

answered Sep 21 '22 15:09

jfs

Tags: python, algorithm, sequence
