I have a task requiring an operation on every element of a list, with the outcome of the operation depending on other elements in the list.
For example, I might want to concatenate the strings in a list, starting a new group whenever a string begins with a particular character.
This code solves the problem:
x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
concat = []
for element in x:
    if element.startswith('*'):
        concat.append(element)
    else:
        concat[len(concat) - 1] += element
resulting in:
concat
Out[16]: ['*abc', '*de', '*f', '*g']
But this seems horribly un-Pythonic. How should one operate on the elements of a list
when the outcome of the operation depends on previous outcomes?
A few relevant excerpts from import this (the arbiter of what is Pythonic):
I would just use code like this, and not worry about replacing the for loop with something "flatter".
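x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
partials = []
for element in x:
    # A leading '*' starts a new group; everything else joins the current one.
    if element.startswith('*'):
        partials.append([])
    partials[-1].append(element)
# list() is needed on Python 3, where map() returns a lazy iterator.
concat = list(map("".join, partials))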
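The loop above assumes the first element starts with '*'; otherwise partials[-1] fails on an empty list. Below is a minimal sketch of the same idea wrapped in a function. The name concat_groups, the marker parameter, and the choice to let unmarked leading elements form their own first group are all assumptions of mine, not part of the answer.

```python
def concat_groups(items, marker='*'):
    """Join each marked item with the unmarked items that follow it."""
    partials = []
    for element in items:
        # Start a new group at every marked element, and also at the very
        # first element so that partials[-1] always exists (an assumption:
        # leading unmarked items become a group of their own).
        if element.startswith(marker) or not partials:
            partials.append([])
        partials[-1].append(element)
    return ["".join(group) for group in partials]


x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
print(concat_groups(x))                  # ['*abc', '*de', '*f', '*g']
print(concat_groups(['a', 'b', '*c']))   # ['ab', '*c']
```

Whether leading unmarked elements should be grouped, dropped, or treated as an error depends on the data, so adjust that branch as needed.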
You could use a regex to accomplish this succinctly. This does, however, sort of circumvent your question about how to operate on dependent list elements. Credit to mbomb007 for improving the handling of allowed characters.
import re
z = re.findall(r'\*[^*]+', "".join(x))
Outputs:
['*abc', '*de', '*f', '*g']
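One caveat worth illustrating: once the list is joined, the regex treats every '*' as the start of a new group, while the loop only looks at the first character of each element. The input below is a made-up example chosen to show where the two disagree.

```python
import re

# Hypothetical input with a '*' in the middle of an element.
x = ['*a', 'b*c', '*d']

# Loop-style grouping: only a leading '*' starts a new group.
groups = []
for element in x:
    if element.startswith('*'):
        groups.append(element)
    else:
        groups[-1] += element

# Regex on the joined string '*ab*c*d': every '*' starts a new match.
matches = re.findall(r'\*[^*]+', ''.join(x))

print(groups)   # ['*ab*c', '*d']
print(matches)  # ['*ab', '*c', '*d']
```

If the marker character can never appear inside an element, the two approaches give the same result.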
A small benchmark:
Donkey Kong's answer:
import timeit
setup = '''
import re
x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
y = ['*a', 'b', 'c', '*d', 'e', '*f', '*g'] * 100
'''
# Raw strings keep the regex backslash intact in both the outer and inner literal.
print(min(timeit.Timer(r're.findall(r"\*[^*]+", "".join(x))', setup=setup).repeat(7, 1000)))
print(min(timeit.Timer(r're.findall(r"\*[^*]+", "".join(y))', setup=setup).repeat(7, 1000)))
Returns 0.00226416693456 and 0.06827958075, respectively.
Chepner's answer:
setup = '''
x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
y = ['*a', 'b', 'c', '*d', 'e', '*f', '*g'] * 100
def chepner(x):
    partials = []
    for element in x:
        if element.startswith('*'):
            partials.append([])
        partials[-1].append(element)
    # list() so the joins are actually performed on Python 3 as well.
    concat = list(map("".join, partials))
    return concat
'''
print(min(timeit.Timer('chepner(x)', setup=setup).repeat(7, 1000)))
print(min(timeit.Timer('chepner(y)', setup=setup).repeat(7, 1000)))
Returns 0.00456210269896 and 0.364635824689, respectively.
Saksham's answer:
setup = '''
x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
y = ['*a', 'b', 'c', '*d', 'e', '*f', '*g'] * 100
'''
print(min(timeit.Timer("['*'+item for item in ''.join(x).split('*') if item]", setup=setup).repeat(7, 1000)))
print(min(timeit.Timer("['*'+item for item in ''.join(y).split('*') if item]", setup=setup).repeat(7, 1000)))
Returns 0.00104848906006 and 0.0556093171512, respectively.
tl;dr: Saksham's is somewhat faster than mine, and Chepner's is slower than both.
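For anyone who wants to rerun the comparison, here is a rough sketch that puts the three approaches in one script and checks that they agree before timing them. The function names regex_version, loop_version and split_version and the helper best_time are labels I made up for this sketch, and the numbers will differ by machine and Python version.

```python
import re
import timeit


def regex_version(items):
    return re.findall(r'\*[^*]+', ''.join(items))


def loop_version(items):
    partials = []
    for element in items:
        if element.startswith('*'):
            partials.append([])
        partials[-1].append(element)
    return list(map(''.join, partials))


def split_version(items):
    return ['*' + item for item in ''.join(items).split('*') if item]


x = ['*a', 'b', 'c', '*d', 'e', '*f', '*g']
y = x * 100
candidates = [regex_version, loop_version, split_version]

# Sanity check: all three must produce the same list before timing them.
for data in (x, y):
    expected = loop_version(data)
    assert all(func(data) == expected for func in candidates)


def best_time(func, data, repeat=7, number=1000):
    # Best of `repeat` runs, each calling func(data) `number` times.
    return min(timeit.repeat(lambda: func(data), repeat=repeat, number=number))


for func in candidates:
    print(func.__name__, best_time(func, x), best_time(func, y))
```

Passing a callable to timeit adds a small, constant function-call overhead to every measurement, so the absolute numbers are not directly comparable with the string-based timings above, only with each other.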