 

Concatenate elements of a tuple in a list in Python


I have a list of tuples that contain strings. For instance:

[('this', 'is', 'a', 'foo', 'bar', 'sentences'),
 ('is', 'a', 'foo', 'bar', 'sentences', 'and'),
 ('a', 'foo', 'bar', 'sentences', 'and', 'i'),
 ('foo', 'bar', 'sentences', 'and', 'i', 'want'),
 ('bar', 'sentences', 'and', 'i', 'want', 'to'),
 ('sentences', 'and', 'i', 'want', 'to', 'ngramize'),
 ('and', 'i', 'want', 'to', 'ngramize', 'it')]

Now I wish to concatenate the strings in each tuple to create a list of space-separated strings. I used the following method:

NewData = []
for grams in sixgrams:
    NewData.append((''.join([w + ' ' for w in grams])).strip())

which is working perfectly fine.

However, my list has over a million tuples. Is this method efficient enough, or is there a better way to do it? Thanks.

alphacentauri asked Dec 23 '13 04:12





1 Answer

For a lot of data, you should consider whether you need to keep it all in a list. If you are processing the strings one at a time, you can create a generator that yields each joined string without keeping them all in memory:

new_data = (' '.join(w) for w in sixgrams) 
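As a rough sketch of how this compares with the loop in the question: joining with a space separator should give the same strings as adding a trailing space to each word and stripping, and the generator only joins a tuple when the loop asks for it. The two-tuple `sixgrams` sample and the `lazy` list are illustrative stand-ins, not part of the original code.

```python
# Shortened sample data for illustration; the real list has over a million tuples.
sixgrams = [
    ('this', 'is', 'a', 'foo', 'bar', 'sentences'),
    ('is', 'a', 'foo', 'bar', 'sentences', 'and'),
]

# The approach from the question: add a trailing space to each word, join, then strip.
original = [(''.join([w + ' ' for w in grams])).strip() for grams in sixgrams]

# Joining with a space separator produces the same strings in one step.
joined = [' '.join(grams) for grams in sixgrams]

# Generator version: nothing is joined until the loop requests the next item.
new_data = (' '.join(grams) for grams in sixgrams)
lazy = []
for sentence in new_data:
    lazy.append(sentence)  # stand-in for whatever per-string processing is needed
```

If the full list of strings is needed after all, the list comprehension form (`joined` above) is the direct replacement for the original loop.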

If you can also get the original tuples from a generator, then you can avoid having the sixgrams list in memory as well.
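One possible way to produce the tuples lazily, assuming the tokens come from some iterable such as a split string. The `ngrams` helper below is hypothetical (it is not from the question) and is only a sketch of the idea using a sliding window:

```python
from collections import deque
from itertools import islice

def ngrams(tokens, n=6):
    """Lazily yield each run of n consecutive tokens as a tuple (hypothetical helper)."""
    it = iter(tokens)
    window = deque(islice(it, n), maxlen=n)  # fill the first window
    if len(window) == n:
        yield tuple(window)
    for token in it:
        window.append(token)  # the oldest token drops off automatically
        yield tuple(window)

text = 'this is a foo bar sentences and i want to ngramize it'

# Neither the tuples nor the joined strings are ever held in a full list.
new_data = (' '.join(gram) for gram in ngrams(text.split()))
first = next(new_data)
rest = list(new_data)  # materialized here only to inspect the result
```

With this arrangement only one window of `n` tokens is kept at a time, at the cost of being able to iterate over the results just once.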

lvc answered Sep 22 '22 13:09
