Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find subsequences of strings within strings

Tags:

python

I want to make a function which checks a string for occurrences of other strings within them.
However, the sub-strings which are being checked may be interrupted within the main string by other letters.

For instance:

a = 'abcde'
b = 'ace'
c = 'acb'

The function in question should return as b being in a, but not c.

I've tried set(a). intersection(set(b)) already, and my problem with that is that it returns c as being in a.

like image 366
Anon Not4Chan Avatar asked Sep 09 '10 02:09

Anon Not4Chan


People also ask

How do you find the number of subsequences in a string?

Given a string, find the count of distinct subsequences of it. The problem of counting distinct subsequences is easy if all characters of input string are distinct. The count is equal to nC0 + nC1 + nC2 + … nCn = 2n.

Can a substring be a subsequence?

A subsequence is a sequence that can be derived from another sequence of elements without changing the order of the remaining elements. In other words, it's a generalised subarray/substring where the rule of contiguity does not apply.

What is string subsequence?

A subsequence of a string is a sequence that can be derived from the given string by deleting zero or more elements without changing the order of the remaining elements.


2 Answers

You can turn your expected sequence into a regex:

import re

def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    pat = ".*".join(s1)
    if re.search(pat, s2):
        return True
    return False

# or, more compactly:
def sequence_in(s1, s2):
    """Does `s1` appear in sequence in `s2`?"""
    return bool(re.search(".*".join(s1), s2))

a = 'abcde' 
b = 'ace' 
c = 'acb'

assert sequence_in(b, a)
assert not sequence_in(c, a)

"ace" gets turned into the regex "a.*c.*e", which finds those three characters in sequence, with possible intervening characters.

like image 134
Ned Batchelder Avatar answered Oct 31 '22 05:10

Ned Batchelder


how about something like this...

def issubstr(substr, mystr, start_index=0):
    try:
        for letter in substr:
            start_index = mystr.index(letter, start_index) + 1
        return True
    except: return False

or...

def issubstr(substr, mystr, start_index=0):
    for letter in substr:
        start_index = mystr.find(letter, start_index) + 1
        if start_index == 0: return False
    return True
like image 24
Lord British Avatar answered Oct 31 '22 04:10

Lord British