
Sum of Nested List with Empty List Explanation

Tags: python

I am working through a gensim tutorial and have come across something I don't understand. texts is a nested list of strings:

In [37]: texts
Out[37]:
[['human', 'machine', 'interface', 'lab', 'abc', 'computer', 'applications'],
 ['survey', 'user', 'opinion', 'computer', 'system', 'response', 'time'],
 ['eps', 'user', 'interface', 'management', 'system'],
 ['system', 'human', 'system', 'engineering', 'testing', 'eps'],
 ['relation', 'user', 'perceived', 'response', 'time', 'error', 'measurement'],
 ['generation', 'random', 'binary', 'unordered', 'trees'],
 ['intersection', 'graph', 'paths', 'trees'],
 ['graph', 'minors', 'iv', 'widths', 'trees', 'well', 'quasi', 'ordering'],
 ['graph', 'minors', 'survey']]

and sum(texts,[]) gives:

Out[38]:
['human',
 'machine',
 'interface',
 'lab',
 'abc',
 'computer',
 'applications',
 'survey',
 'user',
 'opinion',
 'computer',

The list goes on for a few more lines but I omitted the rest to save space. I have two questions:

1) Why does sum(texts,[]) produce that outcome (i.e. flatten the nested list)?

2) Why is the output displayed strangely, with one element per line? Is there something special about this output (or, as I suspect, is my IPython behaving strangely)? Please confirm if you see this as well.

mchangun asked Dec 25 '22 19:12


1 Answer

It's because adding lists together concatenates them.

sum([a, b, c, d, ..., z], start)

is equivalent to

start + a + b + c + d + ... + z

So

sum([['one', 'two'], ['three', 'four']], [])

is equivalent to

[] + ['one', 'two'] + ['three', 'four']

Which gives you

['one', 'two', 'three', 'four']
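
As a minimal sketch of the equivalence described above, the snippet below compares sum() with an explicit loop that does the same additions one at a time. The variable names (nested, flat_sum, flat_loop) are made up for illustration.

```python
# Flattening a nested list with sum() and an explicit start value.
nested = [['one', 'two'], ['three', 'four']]

# sum() begins with the start value ([] here) and adds each item in turn.
flat_sum = sum(nested, [])

# The same additions written out as an explicit loop.
flat_loop = []
for inner in nested:
    flat_loop = flat_loop + inner  # list + list concatenates

print(flat_sum)   # ['one', 'two', 'three', 'four']
print(flat_loop)  # ['one', 'two', 'three', 'four']
```

Both results are the same flat list, which is why sum(texts, []) flattens texts by one level.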

Note that start defaults to 0, because sum is designed primarily for numbers, so if you were to try

sum([['one', 'two'], ['three', 'four']])

then it would try the equivalent of

0 + ['one', 'two'] + ['three', 'four']

and it would fail with a TypeError, because you can't add a list to an integer.
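
A quick way to see the failure for yourself is to catch the exception and print its message. This is only an illustrative sketch; the names nested and error_message are not from the question.

```python
nested = [['one', 'two'], ['three', 'four']]

error_message = None
try:
    # No start value is given, so sum() begins with 0
    # and attempts 0 + ['one', 'two'].
    sum(nested)
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)
```

Passing [] as the second argument avoids the error, because the first addition is then list + list.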


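The one-per-line display is nothing special about the list itself. IPython pretty-prints results, and when a list is too long to fit on one line it puts each element on its own line. The list is an ordinary flat list of strings.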
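
The following sketch does not use IPython's own pretty-printer. It uses the standard-library pprint module, which wraps long lists in a similar way, to show that the line breaks belong to the display and not to the data. The names flat, plain, and pretty are chosen for illustration, and texts is shortened to the first two rows from the question.

```python
from pprint import pformat

texts = [['human', 'machine', 'interface', 'lab', 'abc', 'computer', 'applications'],
         ['survey', 'user', 'opinion', 'computer', 'system', 'response', 'time']]
flat = sum(texts, [])

plain = repr(flat)      # what print(flat) shows: one long line
pretty = pformat(flat)  # wrapped, because the repr is wider than 80 characters

print(plain)
print(pretty)
```

Calling print(flat) in IPython should likewise show the list on a single line, since print uses the plain repr.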

Claudiu answered Dec 28 '22 09:12