Looking for a fast way to limit duplicates to a max of 2 when they occur next to each other.
For example: jeeeeeeeep
=> ['jep','jeep']
Looking for suggestions in Python, but happy to see an example in any language; it's not difficult to switch.
Thanks for any assistance!
EDIT: English doesn't have any (or many) doubled consonants (the same letter twice in a row), right? Let's limit this to no duplicate consonants in a row and up to two vowels in a row.
EDIT2: I'm silly (hey, that word has two consonants in a row). Just check all letters, limiting duplicate letters that are next to each other to two.
Here's a recursive solution using groupby. I've left it up to you which characters you want to be able to repeat (rec_dubz defaults to vowels only, though find_dub_strs below passes 'aeioupt'):
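from itertools import groupby

def find_dub_strs(mystring):
    grp = groupby(mystring)
    # One (letter, was_repeated) pair per run of identical adjacent letters.
    seq = [(k, len(list(g)) >= 2) for k, g in grp]
    allowed = 'aeioupt'  # letters that may stay doubled
    return rec_dubz('', seq, allowed=allowed)

def rec_dubz(prev, seq, allowed='aeiou'):
    if not seq:
        return [prev]
    # Branch 1: keep a single copy of the current letter.
    solutions = rec_dubz(prev + seq[0][0], seq[1:], allowed=allowed)
    # Branch 2: keep two copies, only if the letter was repeated and is allowed.
    if seq[0][0] in allowed and seq[0][1]:
        solutions += rec_dubz(prev + seq[0][0] * 2, seq[1:], allowed=allowed)
    return solutions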
This is really just a heuristically pruned depth-first search into your "solution space" of possible words. The heuristic is that each run of a letter is kept either once or twice, and twice only if it is a valid repeatable letter that actually occurred at least twice in a row. You should end up with 2**n words at the end, where n is the number of runs in your string in which an "allowed" character was repeated.
>>> find_dub_strs('jeeeeeep')
['jep', 'jeep']
>>> find_dub_strs('jeeeeeeppp')
['jep', 'jepp', 'jeep', 'jeepp']
>>> find_dub_strs('jeeeeeeppphhhht')
['jepht', 'jeppht', 'jeepht', 'jeeppht']
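Since every repeated run contributes an independent "one copy or two" choice, the same result can probably be built without recursion by taking a Cartesian product of those choices. The following is only a sketch of that idea, not part of the answer above: the name collapse_runs and the convention that allowed=None means "any letter may stay doubled" (as in EDIT2 of the question) are assumptions made for illustration.

```python
from itertools import groupby, product


def collapse_runs(text, allowed=None):
    """Return every variant of text with each run of a letter cut to 1 or 2 copies.

    allowed: characters that may stay doubled; None (an assumed convention)
    means every character may.
    """
    choices = []
    for char, run in groupby(text):
        run_length = len(list(run))
        can_double = run_length >= 2 and (allowed is None or char in allowed)
        # Each run offers one option (single) or two (single, doubled).
        choices.append((char, char * 2) if can_double else (char,))
    # product() picks one option per run, in the same order as the recursion.
    return [''.join(parts) for parts in product(*choices)]


print(collapse_runs('jeeeeeeeep'))             # ['jep', 'jeep']
print(collapse_runs('jeeeeeeppp', 'aeioupt'))  # ['jep', 'jepp', 'jeep', 'jeepp']
```

With n repeated allowed runs this still yields 2**n words; the product form mainly avoids deep recursion on long inputs.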