Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Find all list permutations of splitting a string in Python

I have a string of letters that I'd like to split into all possible combinations (the order of letters must remain fixed), so that:

s = 'monkey'

becomes:

combinations = [['m', 'onkey'], ['mo', 'nkey'], ['m', 'o', 'nkey'] ... etc]

Any ideas?

like image 992
cyrus Avatar asked Feb 05 '11 01:02

cyrus


People also ask

Can you split a string in a list Python?

The split() method splits a string into a list. You can specify the separator, default separator is any whitespace. Note: When maxsplit is specified, the list will contain the specified number of elements plus one.

How many ways can a string be split?

In javascript, we can split a string in 3 ways.


5 Answers

def splitter(str):
    for i in range(1, len(str)):
        start = str[0:i]
        end = str[i:]
        yield (start, end)
        for split in splitter(end):
            result = [start]
            result.extend(split)
            yield result

combinations = list(splitter(str))

Note that I defaulted to a generator to save you from running out of memory with long strings.

like image 152
btilly Avatar answered Oct 10 '22 00:10

btilly


http://wordaligned.org/articles/partitioning-with-python contains an interesting post about sequence partitioning, here is the implementation they use:

#!/usr/bin/env python

# From http://wordaligned.org/articles/partitioning-with-python

from itertools import chain, combinations

def sliceable(xs):
    '''Return a sliceable version of the iterable xs.'''
    try:
        xs[:0]
        return xs
    except TypeError:
        return tuple(xs)

def partition(iterable):
    s = sliceable(iterable)
    n = len(s)
    b, mid, e = [0], list(range(1, n)), [n]
    getslice = s.__getitem__
    splits = (d for i in range(n) for d in combinations(mid, i))
    return [[s[sl] for sl in map(slice, chain(b, d), chain(d, e))]
            for d in splits]

if __name__ == '__main__':
    s = "monkey"
    for i in partition(s):
        print i

Which would print:

['monkey']
['m', 'onkey']
['mo', 'nkey']
['mon', 'key']
['monk', 'ey']
['monke', 'y']
['m', 'o', 'nkey']
['m', 'on', 'key']
['m', 'onk', 'ey']
['m', 'onke', 'y']
['mo', 'n', 'key']
['mo', 'nk', 'ey']
['mo', 'nke', 'y']
['mon', 'k', 'ey']
['mon', 'ke', 'y']
['monk', 'e', 'y']
...
['mo', 'n', 'k', 'e', 'y']
['m', 'o', 'n', 'k', 'e', 'y']
like image 32
miku Avatar answered Oct 10 '22 00:10

miku


The idea is to realize that the permutation of a string s is equal to a set containing s itself, and a set union of each substring X of s with the permutation of s\X. For example, permute('key'):

  1. {'key'} # 'key' itself
  2. {'k', 'ey'} # substring 'k' union 1st permutation of 'ey' = {'e, 'y'}
  3. {'k', 'e', 'y'} # substring 'k' union 2nd permutation of 'ey' = {'ey'}
  4. {'ke', 'y'} # substring 'ke' union 1st and only permutation of 'y' = {'y'}
  5. Union of 1, 2, 3, and 4, yield all permutations of the string key.

With this in mind, a simple algorithm can be implemented:

>>> def permute(s):
    result = [[s]]
    for i in range(1, len(s)):
        first = [s[:i]]
        rest = s[i:]
        for p in permute(rest):
            result.append(first + p)
    return result

>>> for p in permute('monkey'):
        print(p)    

['monkey']
['m', 'onkey']
['m', 'o', 'nkey']
['m', 'o', 'n', 'key']
['m', 'o', 'n', 'k', 'ey']
['m', 'o', 'n', 'k', 'e', 'y']
['m', 'o', 'n', 'ke', 'y']
['m', 'o', 'nk', 'ey']
['m', 'o', 'nk', 'e', 'y']
['m', 'o', 'nke', 'y']
['m', 'on', 'key']
['m', 'on', 'k', 'ey']
['m', 'on', 'k', 'e', 'y']
['m', 'on', 'ke', 'y']
['m', 'onk', 'ey']
['m', 'onk', 'e', 'y']
['m', 'onke', 'y']
['mo', 'nkey']
['mo', 'n', 'key']
['mo', 'n', 'k', 'ey']
['mo', 'n', 'k', 'e', 'y']
['mo', 'n', 'ke', 'y']
['mo', 'nk', 'ey']
['mo', 'nk', 'e', 'y']
['mo', 'nke', 'y']
['mon', 'key']
['mon', 'k', 'ey']
['mon', 'k', 'e', 'y']
['mon', 'ke', 'y']
['monk', 'ey']
['monk', 'e', 'y']
['monke', 'y']
like image 20
João Silva Avatar answered Oct 10 '22 00:10

João Silva


Consider more_itertools.partitions:

Given

import more_itertools as mit


s = "monkey"

Demo

As-is:

list(mit.partitions(s))
#[[['m', 'o', 'n', 'k', 'e', 'y']],
# [['m'], ['o', 'n', 'k', 'e', 'y']],
# [['m', 'o'], ['n', 'k', 'e', 'y']],
# [['m', 'o', 'n'], ['k', 'e', 'y']],
# [['m', 'o', 'n', 'k'], ['e', 'y']],
# [['m', 'o', 'n', 'k', 'e'], ['y']],
# ...]

After some string joining:

[list(map("".join, x)) for x in mit.partitions(s)]

Output

[['monkey'],
 ['m', 'onkey'],
 ['mo', 'nkey'],
 ['mon', 'key'],
 ['monk', 'ey'],
 ['monke', 'y'],
 ['m', 'o', 'nkey'],
 ['m', 'on', 'key'],
 ['m', 'onk', 'ey'],
 ['m', 'onke', 'y'],
 ['mo', 'n', 'key'],
 ['mo', 'nk', 'ey'],
 ['mo', 'nke', 'y'],
 ['mon', 'k', 'ey'],
 ['mon', 'ke', 'y'],
 ['monk', 'e', 'y'],
 ['m', 'o', 'n', 'key'],
 ['m', 'o', 'nk', 'ey'],
 ['m', 'o', 'nke', 'y'],
 ['m', 'on', 'k', 'ey'],
 ['m', 'on', 'ke', 'y'],
 ['m', 'onk', 'e', 'y'],
 ['mo', 'n', 'k', 'ey'],
 ['mo', 'n', 'ke', 'y'],
 ['mo', 'nk', 'e', 'y'],
 ['mon', 'k', 'e', 'y'],
 ['m', 'o', 'n', 'k', 'ey'],
 ['m', 'o', 'n', 'ke', 'y'],
 ['m', 'o', 'nk', 'e', 'y'],
 ['m', 'on', 'k', 'e', 'y'],
 ['mo', 'n', 'k', 'e', 'y'],
 ['m', 'o', 'n', 'k', 'e', 'y']]

Install via > pip install more_itertools.

like image 21
pylang Avatar answered Oct 10 '22 01:10

pylang


A string (as opposed to list) oriented approach is to think of the each adjacent pair of characters being separated by either a space or empty string. That can be mapped to 1 and 0, and the number of possible splits are a power of 2:

2 ^ (len(s)-1)

for example, "key" can have '' or ' ' separating 'ke' and a '' or ' ' separating 'ey' which leads to 4 possibilities:

  • key ('' between 'k' and 'e', '' between 'e' and 'y')
  • k ey (' ' between 'k' and 'e', '' between 'e' and 'y')
  • k e y (' ' between 'k' and 'e', ' ' between 'e' and 'y')
  • ke y ('' between 'k' and 'e', ' ' between 'e' and 'y')

An unreadable python one liner that gives you a generator in string form:

operator_positions = (''.join([str(a >> i & 1).replace('0', '').replace('1', ' ') + s[len(s)-1-i] for i in range(len(s)-1, -1, -1)]) for a in range(pow(2, len(s)-1)))

A readable version of this generator with comments and sample:

s = 'monkey'
s_length = len(s)-1  # represents the number of ' ' or '' that can split digits

operator_positions = (
    ''.join(
        [str(a >> i & 1).replace('0', '').replace('1', ' ') + s[s_length-i]
         for i in range(s_length, -1, -1)])   # extra digit is for blank string to always precede first digit
    for a in range(pow(2, s_length))   # binary number loop
)
for i in operator_positions:
    print i

str(a >> i & 1) converts a into a binary string, which then has it's 0's and 1's replaced by '' and ' ', respectively. The binary string is an extra digit long so that the first digit is always ''. That way, as the digit splitter is combined with the first character, it always results in just the first character.

like image 37
Joe Golton Avatar answered Oct 10 '22 00:10

Joe Golton