
Create stacked pandas series from series with list elements

Tags: python, pandas

I have a pandas Series whose elements are lists:

import pandas as pd
s = pd.Series([ ['United States of America'],['China', 'Hong Kong'], []])
print(s)

0    [United States of America]
1            [China, Hong Kong]
2                            []

How can I get a Series like the following?

0 United States of America
1 China
1 Hong Kong

I am not sure what should happen to index 2 (the empty list).

BhishanPoudel asked Jan 26 '23 17:01



1 Answer

All of the options below return a Series. The first turns the lists into the rows of a new DataFrame and then stacks its columns:

pd.DataFrame(s.tolist()).stack()

0  0    United States of America
1  0                       China
   1                   Hong Kong
dtype: object
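
If the goal is the index from the question (0, 1, 1), one option is to drop the inner level of the MultiIndex that stack() creates. This is a minimal sketch, not the only way to do it. The explicit dropna() is a precaution added here, on the assumption that some pandas versions keep the missing values used to pad the shorter rows:

```python
import pandas as pd

s = pd.Series([['United States of America'], ['China', 'Hong Kong'], []])

# Rows shorter than the longest list are padded with missing values.
wide = pd.DataFrame(s.tolist())

# stack() moves the columns into an inner index level; dropna() removes any
# padding that survives, and droplevel(1) keeps only the original row labels.
stacked = wide.stack().dropna().droplevel(1)
print(stacked)
```

Row 2 disappears entirely because its list was empty, so nothing is left to stack for it.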

To reset the index, use

pd.DataFrame(s.tolist()).stack().reset_index(drop=True)

0    United States of America
1                       China
2                   Hong Kong
dtype: object

To convert the result to a DataFrame, call to_frame():

pd.DataFrame(s.tolist()).stack().reset_index(drop=True).to_frame('countries')

                  countries
0  United States of America
1                     China
2                 Hong Kong

If you're trying to code golf, use

sum(s, [])
# ['United States of America', 'China', 'Hong Kong']

pd.Series(sum(s, []))

0    United States of America
1                       China
2                   Hong Kong
dtype: object

Or even,

import numpy as np
pd.Series(np.sum(s))

0    United States of America
1                       China
2                   Hong Kong
dtype: object

However, like most operations that sum lists, this performs poorly: each concatenation builds a new list and copies everything accumulated so far.


A faster approach is to chain the lists together with itertools.chain:

from itertools import chain
pd.Series(list(chain.from_iterable(s)))

0    United States of America
1                       China
2                   Hong Kong
dtype: object
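
To check the performance difference on your own data, a rough timeit comparison along these lines may help. The input size and repeat count are arbitrary assumptions, and no particular timings are claimed here:

```python
import timeit
from itertools import chain

import pandas as pd

# Hypothetical test data: 2,000 short lists (size chosen arbitrarily).
s = pd.Series([['a', 'b', 'c']] * 2000)

flat_sum = sum(s, [])                      # repeated list concatenation
flat_chain = list(chain.from_iterable(s))  # single pass over all elements

# Time each approach a few times; the absolute numbers depend on the machine.
t_sum = timeit.timeit(lambda: sum(s, []), number=5)
t_chain = timeit.timeit(lambda: list(chain.from_iterable(s)), number=5)
print(f"sum: {t_sum:.4f}s  chain: {t_chain:.4f}s")
```

Both approaches produce the same flat list; only the amount of copying differs.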

pd.DataFrame(list(chain.from_iterable(s)), columns=['countries'])

                  countries
0  United States of America
1                     China
2                 Hong Kong
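
In pandas 0.25 and later, Series.explode is another option. It keeps the original index, which also shows what happens to row 2: an empty list becomes a single NaN entry. A minimal sketch; dropping that NaN row is an assumption about the desired result:

```python
import pandas as pd

s = pd.Series([['United States of America'], ['China', 'Hong Kong'], []])

exploded = s.explode()      # index 0, 1, 1, 2; the empty list becomes NaN
result = exploded.dropna()  # index 0, 1, 1, as in the question
print(result)
```

Skip the dropna() call if the empty row should stay visible as a missing value.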
cs95 answered Feb 14 '23 02:02
