I need to decode a string 'a3b2' into 'aaabb'. The problem is when the numbers have two or three digits, e.g. for 'a10b3' the code should detect that the number is not 1 but 10.
I need to start accumulating digits.
a = "a12345t5i6o2r43e2"
for i in range(0, len(a)-1):
    if a[i].isdigit() is False:
        # once I see a letter, I launch a while loop to check how long a digit streak
        # after it can be - it's a 2, 3, 4, 5 digit number etc.
        print(a[i])
        current_digit_streak = ''
        counter = i+1
        while a[counter].isdigit():  # this gives index out of range error!
            current_digit_streak += a[counter]
            counter += 1
If I change the while loop to this:
while a[counter].isdigit() and counter < (len(a)-1):
it does work but omits the last character. I should not use regex, only loops.
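One way to accumulate the digits with plain loops is sketched below. It is a minimal sketch, not the only approach: the function name `decode` is hypothetical, and treating a letter with no digits after it as a count of 1 is an assumption the question does not state. The key point is that the bounds check comes before the index access, so the inner loop can read the final character without running off the end.

```python
def decode(encoded):
    """Expand run-length pairs such as 'a3b2' into 'aaabb' using only loops."""
    result = []
    i = 0
    while i < len(encoded):
        letter = encoded[i]
        i += 1
        # Accumulate every digit that follows the letter. Testing the bound
        # first means encoded[i] is never evaluated past the end of the string.
        digits = ''
        while i < len(encoded) and encoded[i].isdigit():
            digits += encoded[i]
            i += 1
        # Assumption: a letter with no digits after it is emitted once.
        count = int(digits) if digits else 1
        result.append(letter * count)
    return ''.join(result)


print(decode('a3b2'))   # aaabb
print(decode('a10b3'))  # ten a's followed by bbb
```

Because the outer loop advances `i` past the digits it has already consumed, no character is visited twice and the last digit is not dropped.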
Regex is a good fit here.
import re
pat = re.compile(r"""
(\w) # a word character, followed by...
(\d+) # one or more digits""", flags=re.X)
s = "a12345t5i6o2r43e2"
groups = pat.findall(s)
# [('a', '12345'), ('t', '5'), ('i', '6'), ('o', '2'), ('r', '43'), ('e', '2')]
result = ''.join([lett*int(count) for lett, count in groups])
Since you can't use regex for some unknown reason, I recommend a recursive function to split the string into parts.
import itertools

def split_into_groups(s):
    if not s:
        return []
    lett, *rest = s
    # takewhile/dropwhile return iterators, so join them back into strings
    # before converting the digits or recursing on the remainder
    count = int(''.join(itertools.takewhile(str.isdigit, rest)))
    rest = ''.join(itertools.dropwhile(str.isdigit, rest))
    return [(lett, count)] + split_into_groups(rest)

s = "a12345t5i6o2r43e2"
groups = split_into_groups(s)
result = ''.join([lett*count for lett, count in groups])
Or, using a more generic pattern borrowed from functional programming:
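def unfold(f, x):
    while True:
        step = f(x)
        if step is None:  # f returns None when there is nothing left
            return
        v, x = step
        yield v

def get_group(s):
    if not s:
        # raising StopIteration here would become a RuntimeError inside
        # the generator (PEP 479), so signal the end with None instead
        return None
    lett, *rest = s
    count = int(''.join(itertools.takewhile(str.isdigit, rest)))
    rest = ''.join(itertools.dropwhile(str.isdigit, rest))
    return lett*count, rest

s = "a12345t5i6o2r43e2"
result = ''.join(unfold(get_group, s))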
You could use groupby:
from itertools import groupby
text = 'a12345t5i6o2r43e2'
groups = [''.join(group) for _, group in groupby(text, key=str.isdigit)]
result = list(zip(groups[::2], groups[1::2]))
print(result)
Output
[('a', '12345'), ('t', '5'), ('i', '6'), ('o', '2'), ('r', '43'), ('e', '2')]
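The pairs above still hold the counts as strings, so one more step is needed to get the decoded text. The sketch below finishes the job on the question's shorter example; the names `pairs` and `decoded` are only illustrative. It assumes the input starts with a letter and that every letter is followed by a count, since two adjacent letters would be merged into a single group by `groupby`.

```python
from itertools import groupby

text = 'a3b2'
# Alternating runs of non-digits and digits: ['a', '3', 'b', '2']
groups = [''.join(group) for _, group in groupby(text, key=str.isdigit)]
# Pair each letter run with the digit run that follows it
pairs = list(zip(groups[::2], groups[1::2]))
# Convert each count to int and repeat the letter that many times
decoded = ''.join(letter * int(count) for letter, count in pairs)
print(decoded)  # aaabb
```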