
Split Column containing lists into different rows in pandas [duplicate]

I have a dataframe in pandas like this:

id     info
1      [1,2]
2      [3]
3      []

And I want to split it into different rows like this:

id     info
1      1 
1      2 
2      3 
3      NaN

How can I do this?

Alexandra Espichán asked Jun 06 '18 21:06



2 Answers

You can try this out:

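>>> import pandas as pd
>>> df = pd.DataFrame({'id': [1,2,3], 'info': [[1,2],[3],[]]})
>>> # Expand each list into its own columns, then stack them into one Series;
>>> # stack() drops the NaN padding, including the whole row for the empty list
>>> s = df.apply(lambda x: pd.Series(x['info']), axis=1).stack().reset_index(level=1, drop=True)
>>> s.name = 'info'
>>> # Left-join on the index so that id 3 comes back with NaN
>>> df2 = df.drop('info', axis=1).join(s)
>>> # Cast the column to object dtype
>>> df2['info'] = pd.Series(df2['info'], dtype=object)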
>>> df2
   id info
0   1    1
0   1    2
1   2    3
2   3  NaN

A similar question has been posted here.
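
On pandas 0.25 or newer, `DataFrame.explode` may be a shorter route to the same result. This is a minimal sketch that assumes such a version is installed; the variable name `exploded` is only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'info': [[1, 2], [3], []]})

# explode() emits one row per list element; an empty list becomes a single NaN row.
# The original index labels are repeated, as in the join-based version above.
exploded = df.explode('info')
print(exploded)
```

If a fresh 0..n-1 index is preferred, `exploded.reset_index(drop=True)` should renumber the rows.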

pgngp answered Oct 05 '22 22:10



This is a rather convoluted way, and it drops the rows whose lists are empty:

import pandas as pd

df = pd.DataFrame({'id': [1,2,3],
                   'info': [[1,2], [3], [ ]]})

unstack_df = df.set_index(['id'])['info'].apply(pd.Series)\
                                         .stack()\
                                         .reset_index(level=1, drop=True)

unstack_df = unstack_df.reset_index()
unstack_df.columns = ['id', 'info']

unstack_df

>>
       id   info
    0   1   1.0
    1   1   2.0
    2   2   3.0
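
If the rows with empty lists need to be kept, one alternative is to build the long table with a plain comprehension. This is only a sketch, not the method used above: it assumes every cell in `info` is an ordinary Python list, and the names `rows` and `long_df` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3], 'info': [[1, 2], [3], []]})

# One (id, value) pair per list element; an empty list contributes a single NaN
# so that its id still appears in the result.
rows = [(i, v)
        for i, vals in zip(df['id'], df['info'])
        for v in (vals or [float('nan')])]

long_df = pd.DataFrame(rows, columns=['id', 'info'])
print(long_df)
```

Because one value is NaN, the `info` column ends up as float64, and the index is a fresh 0..n-1 range.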
An economist answered Oct 05 '22 22:10
