I am trying to generate all possible ways to interleave any two arbitrary strings in Python.
For example: If the two strings are 'ab'
and 'cd'
, the output I wish to get is:
['abcd', 'acbd', 'acdb', 'cabd', 'cadb', 'cdab']
See a
is always before b
(and c
before d
). I am struggling to find a solution to this. I have tried itertools as shown below:
import itertools
def shuffle(s,t):
string = s+t
for i in itertools.permutations(string):
print(''.join(i))
shuffle('ab','cd')
But as expected, this returns all possible permutations disregarding order of a
and b
(and c
and d
).
An interleaving of two strings s and t is a configuration where they are divided into non-empty substrings such that: s = s1 + s2 + ...
Interference happens when two operations, running in different threads, but acting on the same data, interleave. This means that the two operations consist of multiple steps, and the sequences of steps overlap.
If all characters of C match either with a character of A or a character of B and length of C is sum of lengths of A and B, then C is an interleaving A and B.
Let the two strings you want to interleave be s
and t
. We will use recursion to generate all the possible ways to interleave these two strings.
If at any point of time we have interleaved the first i
characters of s
and the first j
characters of t
to create some string res
, then we have two ways to interleave them for the next step-
i+1
th character of s
to res
j+1
th character of t
to res
We continue this recursion till all characters of both the strings have been used and then we store this result in a list of strings lis
as in the code below.
def interleave(s, t, res, i, j, lis):
if i == len(s) and j == len(t):
lis.append(res)
return
if i < len(s):
interleave(s, t, res + s[i], i + 1, j, lis)
if j < len(t):
interleave(s, t, res + t[j], i, j + 1, lis)
l = []
s = "ab"
t = "cd"
interleave(s, t, "", 0, 0, l)
print l
Output
['abcd', 'acbd', 'acdb', 'cabd', 'cadb', 'cdab']
This implementation is as efficient as we can get (at least asymptotically) since we never generate the same string twice.
Several other solutions have already been posted, but most of them generate the full list of interleaved strings (or something equivalent to it) in memory, making their memory usage grow exponentially as a function of the input length. Surely there must be a better way.
Enumerating all ways to interleave two sequences, of length a and b respectively, is basically the same as enumerating all a+b bit integers with exably b bits set. Each such integer corresponds to a distinct way to interleave the sequences, obtained by replacing every 0 bit with an element of the first sequence, and every 1 bit with an element of the second sequence.
Conveniently, there's a clever and efficient way to calculate the next integer with the same number of bits set, which we can use to generate all such integers. So let's do that first:
def bit_patterns(m, n):
"""Generate all m-bit numbers with exactly n bits set, in ascending order.
See http://www.geeksforgeeks.org/next-higher-number-with-same-number-of-set-bits/
"""
patt = (1 << int(n)) - 1
if patt == 0: yield 0; return # loop below assumes patt has at least one bit set!
while (patt >> m) == 0:
yield patt
lowb = patt & -patt # extract the lowest bit of the pattern
incr = patt + lowb # increment the lowest bit
diff = patt ^ incr # extract the bits flipped by the increment
patt = incr + ((diff // lowb) >> 2) # restore bit count after increment
Now we can use this generator to generate all ways to interleave any two sequences:
def interleave(a, b):
"""Generate all possible ways to interleave two sequences."""
m = len(a) + len(b)
n = len(a)
for pattern in bit_patterns(m, n):
seq = []
i = j = 0
for k in range(m):
bit = pattern & 1
pattern >>= 1
seq.append(a[i] if bit else b[j])
i += bit
j += 1-bit
yield seq
Note that, in order to try to be as generic as possible, this code takes arbitrary sequence types and returns lists. Strings are sequences in Python, so you can pass them in just fine; to convert the generated lists back into strings, you can concatenate their elements e.g. with "".join()
, like this:
foo = "ABCD"
bar = "1234"
for seq in interleave(foo, bar):
print("".join(seq))
There we go: a fully non-recursive efficient generator-based solution that uses very little memory even for long inputs, and only generates each output once (thus requiring no inefficient duplicate elimination step). And it even works in both Python 2 and 3.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With