I have a list of tuples that has strings in it For instance:
[('this', 'is', 'a', 'foo', 'bar', 'sentences') ('is', 'a', 'foo', 'bar', 'sentences', 'and') ('a', 'foo', 'bar', 'sentences', 'and', 'i') ('foo', 'bar', 'sentences', 'and', 'i', 'want') ('bar', 'sentences', 'and', 'i', 'want', 'to') ('sentences', 'and', 'i', 'want', 'to', 'ngramize') ('and', 'i', 'want', 'to', 'ngramize', 'it')]
Now I wish to concatenate each string in a tuple to create a list of space separated strings. I used the following method:
NewData=[] for grams in sixgrams: NewData.append( (''.join([w+' ' for w in grams])).strip())
which is working perfectly fine.
However, the list that I have has over a million tuples. So my question is that is this method efficient enough or is there some better way to do it. Thanks.
The primary way in which tuples are different from lists is that they cannot be modified. This means that items cannot be added to or removed from tuples, and items cannot be replaced within tuples. You can, however, concatenate 2 or more tuples to form a new tuple. This is because tuples cannot be modified.
It's actually as simple as the error message states: You are not allowed to concatenate lists and tuples.
When it is required to concatenate multiple tuples, the '+' operator can be used. A tuple is an immutable data type. It means, values once defined can't be changed by accessing their index elements. If we try to change the elements, it results in an error.
For a lot of data, you should consider whether you need to keep it all in a list. If you are processing each one at a time, you can create a generator that will yield each joined string, but won't keep them all around taking up memory:
new_data = (' '.join(w) for w in sixgrams)
if you can get the original tuples also from a generator, then you can avoid having the sixgrams
list in memory as well.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With