What is the preferred way to concatenate sequences in Python 3?
Right now, I'm doing:
import functools
import operator
def concatenate(sequences):
    return functools.reduce(operator.add, sequences)
print(concatenate([['spam', 'eggs'], ['ham']]))
# ['spam', 'eggs', 'ham']
Needing to import two separate modules to do this seems clunky.
An alternative could be:
def concatenate(sequences):
    concatenated_sequence = []
    for sequence in sequences:
        concatenated_sequence += sequence
    return concatenated_sequence
However, this is incorrect because you don't know that the sequences are lists.
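To make the problem concrete, here is a small sketch of what that list-based loop does when handed tuples or strings. The name concatenate_as_list is only for illustration; the body is the same loop as above.

```python
def concatenate_as_list(sequences):
    # Same loop as above: always starts from an empty list.
    result = []
    for sequence in sequences:
        result += sequence  # list += accepts any iterable, so nothing fails
    return result

tuples_result = concatenate_as_list([(1, 2), (3,)])
strings_result = concatenate_as_list(['ab', 'cd'])
print(tuples_result)   # a list, not a tuple
print(strings_result)  # a list of single characters, not a string
```

Nothing raises, but the result type no longer matches the input type.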
You could do:
import copy
def concatenate(sequences):
    head, *tail = sequences
    concatenated_sequence = copy.copy(head)
    # Iterate over tail, not sequences, so head is not added twice.
    for sequence in tail:
        concatenated_sequence += sequence
    return concatenated_sequence
But that seems horribly bug prone -- a direct call to copy? (I know head.copy() works for lists, but copy() isn't part of the Sequence ABC, so you can't rely on it... what if you get handed tuples or strings?) You have to copy to prevent mutation in case you get handed a MutableSequence. Moreover, this solution forces you to unpack the entire set of sequences first. Trying again:
import copy
def concatenate(sequences):
    iterable = iter(sequences)
    head = next(iterable)
    concatenated_sequence = copy.copy(head)
    for sequence in iterable:
        concatenated_sequence += sequence
    return concatenated_sequence
But come on... this is Python! So... what is the preferred way to do this?
I'd use itertools.chain.from_iterable() instead:
import itertools
def chained(sequences):
    return itertools.chain.from_iterable(sequences)
or, since you tagged this with python-3.3, you could use the new yield from syntax (look ma, no imports!):
def chained(sequences):
    for seq in sequences:
        yield from seq
Both return iterators (call list() on them if you must materialize the full list). Most of the time you do not need to construct a whole new sequence from the concatenated sequences; you just want to loop over the items to process them or search for something.
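As a rough sketch of that point, the snippet below searches across several sequences without building a combined list, and materializes one only at the end. The sample data and the variable names found and as_list are made up for illustration.

```python
import itertools

def chained(sequences):
    for seq in sequences:
        yield from seq

data = [['spam', 'eggs'], ('ham',), 'ab']

# Search lazily: any() stops consuming as soon as a match is found,
# and no combined list is ever built.
found = any(item == 'ham' for item in chained(data))

# Materialize only when an actual list is required.
as_list = list(itertools.chain.from_iterable(data))
```

Note that the inner sequences can be of mixed types here, because the items are only iterated over, never added together.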
Note that for strings, you should use str.join() instead of any of the techniques described either in my answer or your question:
concatenated = ''.join(sequence_of_strings)
Combined, to handle sequences quickly and correctly, I'd use:
def chained(sequences):
    for seq in sequences:
        yield from seq
def concatenate(sequences):
    sequences = iter(sequences)
    first = next(sequences)
    if hasattr(first, 'join'):
        return first + ''.join(sequences)
    return first + type(first)(chained(sequences))
This works for tuples, lists and strings:
>>> concatenate(['abcd', 'efgh', 'ijkl'])
'abcdefghijkl'
>>> concatenate([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
[1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> concatenate([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
(1, 2, 3, 4, 5, 6, 7, 8, 9)
and uses the faster ''.join() for a sequence of strings.
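One edge case worth keeping in mind: with an empty input, next() has nothing to return and raises StopIteration. The following is only a sketch of one way to handle that, assuming the caller can supply a fallback value. The name concatenate_or_default, the default parameter, and the _MISSING sentinel are hypothetical, and the string check is swapped to isinstance(first, str) here so that other types with a join method are not treated as strings.

```python
def chained(sequences):
    for seq in sequences:
        yield from seq

# Hypothetical sentinel so that None can still be passed as a real default.
_MISSING = object()

def concatenate_or_default(sequences, default=_MISSING):
    sequences = iter(sequences)
    first = next(sequences, _MISSING)
    if first is _MISSING:
        # Empty input: return the caller's fallback, or fail with a clear error.
        if default is _MISSING:
            raise ValueError('concatenate_or_default() got no sequences')
        return default
    if isinstance(first, str):
        return first + ''.join(sequences)
    return first + type(first)(chained(sequences))
```

For example, concatenate_or_default([], default=[]) returns the empty list, while non-empty inputs behave like concatenate() for lists, tuples and strings.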
What is wrong with:
from itertools import chain
def chain_sequences(*sequences):
    return chain(*sequences)
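One difference to be aware of: unpacking with * consumes the outer iterable in full at call time, whereas chain.from_iterable pulls sequences from it one at a time. A small sketch, using a hypothetical generator named produce that records when each sequence is requested:

```python
from itertools import chain

requested = []

def produce():
    # Hypothetical source of sequences; records each request as it happens.
    for i in range(3):
        requested.append(i)
        yield [i, i * 10]

lazy = chain.from_iterable(produce())
first = next(lazy)
requested_lazy = list(requested)   # only the first sequence has been requested

requested.clear()
eager = chain(*produce())          # argument unpacking runs the generator to completion
requested_eager = list(requested)  # all three sequences were requested up front
```

For a small list of sequences this makes no practical difference; it matters when the outer iterable is large, expensive, or unbounded.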
Use itertools.chain.from_iterable.
import itertools
def concatenate(sequences):
    return list(itertools.chain.from_iterable(sequences))
The call to list is needed only if you need an actual new list, so skip it if you just iterate over this new sequence once.