Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Substrings of a string using Python

How many substrings can you make out of a string like abcd?

How can I get all of its substrings:

['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']
like image 363
user1754666 Avatar asked Oct 17 '12 23:10

user1754666


2 Answers

Try this:

def consecutive_groups(iterable):
    s = tuple(iterable)
    for size in range(1, len(s)+1):
        for index in range(len(s)+1-size):
            yield iterable[index:index+size]

>>> print list(consecutive_groups('abcd'))
['a', 'b', 'c', 'd', 'ab', 'bc', 'cd', 'abc', 'bcd', 'abcd']

And the number of combinations is simply equal to the sum from 1 to the length of the string, which is equivalent to n * (n + 1) / 2.

By the way, if you want to avoid duplicates, you can simply use a locally-defined set in the generator function, like so:

def consecutive_groups(iterable):
    s = tuple(iterable)
    seen = set()
    for size in range(1, len(s)+1):
        for index in range(len(s)+1-size):
            slc = iterable[index:index+size]
            if slc not in seen:
                seen.add(slc)
                yield slc

That code is a little more unwieldy and could probably be optimized for indentation, but it will do for a proof of concept.

like image 187
Platinum Azure Avatar answered Oct 03 '22 19:10

Platinum Azure


Would this do?

import itertools
def substrings(x):
    for i, j in itertools.combinations(xrange(len(x)+1), 2):
        yield x[i:j]

or as generator expression:

(x[i:j] for i, j in itertools.combinations(xrange(len(x)+1), 2))

The expanded result for your example looks like this:

['a', 'ab', 'abc', 'abcd', 'b', 'bc', 'bcd', 'c', 'cd', 'd']

To order by length, use sort key=len.

like image 29
hochl Avatar answered Oct 03 '22 17:10

hochl