
python limit repeating letters

What would be the best way to generate every variant of a string in which each run of repeated letters is cut down to 1 or 2 letters, such as:
appppppppple => aple and apple
bbbbbeeeeeer => ber, beer, bber, bbeer

Right now, I have this:

import re

a = "hellllllllllooooooooooooo"
match = re.search(r'(.)\1+', a)

if match:
    print('found')
    print(re.sub(r'(.)\1+', r'\1', a))
    print(re.sub(r'(.)\1+', r'\1\1', a))
else:
    print('not found')

But it only returns:

helo
helloo

How can I make it work the way I want to?

Raptrex asked Jun 21 '26 20:06


2 Answers

Don't use REs for this. REs are good for searching, matching, and transforming, but not for generating strings.

We can consider a string as a vector: each run of identical letters is a dimension, and the number of repetitions in that run is the length of the component along that dimension. Given a vector V, you want all possible vectors of the same dimension as V such that each component is 1 if the corresponding component of V is 1, and either 1 or 2 otherwise. Based on that, here's a function that does what you want.

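import itertools

def doppelstring(s):
    # Group s into runs of identical letters and cap each run length at 2.
    letter_groups = ((val, list(group)) for val, group in itertools.groupby(s))
    max_vector = ((val, min(len(group), 2)) for val, group in letter_groups)
    # For each run, list the allowed spellings, e.g. a run of p's -> ['p', 'pp'].
    vector_components = ([dim * (l + 1) for l in range(maxlen)] for dim, maxlen in max_vector)
    # Join every combination of one spelling per run.
    return [''.join(letters) for letters in itertools.product(*vector_components)]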

Here's a more compact version that uses slicing. It may be a bit less readable, but at least it keeps within the 78-char limit:

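import itertools

def doppelstring(s):
    # Keep at most the first two letters of each run.
    max_vs = (''.join(itertools.islice(g, 2)) for k, g in itertools.groupby(s))
    # Every non-empty prefix of a capped run is an allowed spelling.
    components = ([run[:l + 1] for l in range(len(run))] for run in max_vs)
    return [''.join(letters) for letters in itertools.product(*components)]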
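
The vector description can be traced on one of the question's inputs. This is only a minimal sketch, assuming Python 3; the helper name `run_components` is hypothetical and is not part of the functions in this answer. It builds the per-run list of allowed spellings as a plain list so the intermediate value can be printed before the product is taken.

```python
import itertools

def run_components(s):
    """For each run of identical letters in s, list the allowed spellings."""
    components = []
    for letter, group in itertools.groupby(s):
        run_length = len(list(group))
        max_len = min(run_length, 2)  # a run may be kept once or twice at most
        components.append([letter * n for n in range(1, max_len + 1)])
    return components

components = run_components("bbbbbeeeeeer")
print(components)  # [['b', 'bb'], ['e', 'ee'], ['r']]

# One spelling per run, in every combination.
variants = [''.join(parts) for parts in itertools.product(*components)]
print(variants)    # ['ber', 'beer', 'bber', 'bbeer']
```

A run that appears only once in the input, such as the final `r`, contributes a single choice, so it never multiplies the number of results.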
senderle answered Jun 24 '26 10:06


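import re

def permute(seq):
    # seq alternates between plain text and a collapsed repeated letter.
    if len(seq) < 2:
        yield seq
    else:
        for tail in permute(seq[2:]):
            yield seq[:2] + tail              # keep the repeated letter once
            yield seq[:2] + seq[1:2] + tail   # keep the repeated letter twice

text = "hellllllllllooooooooooooo"
# The capturing group makes re.split keep each repeated letter in the result.
seq = re.split(r'(.)\1+', text)

for result in permute(seq):
    print(''.join(result))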
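
To see why `permute` can walk the list two items at a time, it helps to look at what `re.split` returns when the pattern contains a capturing group: the captured letter is kept in the output, so plain text and collapsed letters alternate. A minimal sketch, assuming Python 3; the helper name `split_runs` is hypothetical.

```python
import re

def split_runs(text):
    # With a capturing group in the pattern, re.split includes the captured
    # letter in the result, so each repeated run appears as one character.
    return re.split(r'(.)\1+', text)

print(split_runs("hellllllllllooooooooooooo"))  # ['he', 'l', '', 'o', '']
print(split_runs("appppppppple"))               # ['a', 'p', 'le']
```

Even positions hold the text between runs (possibly empty) and odd positions hold the repeated letter, which is why each recursion step consumes `seq[:2]` and optionally repeats `seq[1:2]`.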
pyroscope answered Jun 24 '26 10:06
