Pandas not reindexing properly with NaN

Question

I am having trouble reindexing a pandas dataframe after dropping NaN values.

I am trying to extract the dicts stored in one DataFrame column into a separate DataFrame, then join those values back onto the original DataFrame in the corresponding rows.

import numpy as np
import pandas as pd

df = pd.DataFrame({'col1': [1, 2, 3, 4, 5],
                   'col2': [np.nan, np.nan, {'aa': 11, 'bb': 22}, {'aa': 33, 'bb': 44}, {'aa': 55, 'bb': 66}]})
df

    col1 col2
0   1    NaN
1   2    NaN
2   3    {'aa': 11, 'bb': 22}
3   4    {'aa': 33, 'bb': 44}
4   5    {'aa': 55, 'bb': 66}

The desired end result is:

    col1    aa      bb
0   1       NaN     NaN
1   2       NaN     NaN
2   3       11      22
3   4       33      44
4   5       55      66

If I convert col2 with .tolist() and build a DataFrame from the result, the dicts are not unpacked:

pd.DataFrame(df['col2'].tolist())

0   NaN
1   NaN
2   {'aa': 11, 'bb': 22}
3   {'aa': 33, 'bb': 44}
4   {'aa': 55, 'bb': 66}

If I use dropna() first, the dicts are unpacked, but the index is reset:

pd.DataFrame(df['col2'].dropna().tolist())

    aa  bb
0   11  22
1   33  44
2   55  66

If I try to reindex the result to the index of the original df, the rows end up at the wrong index positions:

pd.DataFrame(df['col2'].dropna().tolist()).reindex(df.index)

    aa      bb
0   11.0    22.0
1   33.0    44.0
2   55.0    66.0
3   NaN     NaN
4   NaN     NaN

The data varies, and there is no way to know in advance how many NaN values there will be or where in the column they will appear.

Any help is very much appreciated.

ansev · Accepted Answer

Use Series.to_dict, which keys each value by its index label, so the original index is preserved:

df.join(pd.DataFrame(df['col2'].to_dict()).T).drop(columns='col2')

   col1    aa    bb
0     1   NaN   NaN
1     2   NaN   NaN
2     3  11.0  22.0
3     4  33.0  44.0
4     5  55.0  66.0
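
For anyone who wants to see what the one-liner does step by step, here is a minimal sketch using the sample data from the question. The intermediate names (by_label, wide, expanded) are mine and only there for illustration. It assumes at least one cell in col2 is actually a dict.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5],
    'col2': [np.nan, np.nan,
             {'aa': 11, 'bb': 22}, {'aa': 33, 'bb': 44}, {'aa': 55, 'bb': 66}],
})

# to_dict() maps each index label to its cell: {0: nan, 1: nan, 2: {...}, ...}
by_label = df['col2'].to_dict()

# The outer keys (index labels) become columns and the inner dict keys
# become rows; a NaN cell is broadcast into an all-NaN column.
wide = pd.DataFrame(by_label)

# Transpose so the original index labels are the rows again.
expanded = wide.T

# join() aligns on the index, so every row lands where it started.
result = df.join(expanded).drop(columns='col2')
print(result)
```

The values come back as floats because a column that contains NaN cannot keep a plain integer dtype.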

BENY · Answer

If I understand correctly, you can fix your code by passing the index of the dropna() result to the DataFrame constructor:

s = df['col2'].dropna()
df = df.join(pd.DataFrame(s.tolist(), index=s.index))
df

   col1                  col2    aa    bb
0     1                   NaN   NaN   NaN
1     2                   NaN   NaN   NaN
2     3  {'aa': 11, 'bb': 22}  11.0  22.0
3     4  {'aa': 33, 'bb': 44}  33.0  44.0
4     5  {'aa': 55, 'bb': 66}  55.0  66.0
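
To spell out why the original reindex attempt shifted the rows, and to get all the way to the desired output without col2, here is a rough sketch on the question's sample data. The names relabelled, misaligned and expanded are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'col1': [1, 2, 3, 4, 5],
    'col2': [np.nan, np.nan,
             {'aa': 11, 'bb': 22}, {'aa': 33, 'bb': 44}, {'aa': 55, 'bb': 66}],
})

s = df['col2'].dropna()          # keeps the original labels 2, 3, 4

# tolist() throws the labels away, so the new frame is labelled 0, 1, 2.
relabelled = pd.DataFrame(s.tolist())

# reindex() aligns on labels, not on position: labels 0..2 keep their data
# and labels 3..4 do not exist, so they are filled with NaN.
misaligned = relabelled.reindex(df.index)

# Passing index=s.index restores the labels 2, 3, 4 before joining.
expanded = pd.DataFrame(s.tolist(), index=s.index)

# Drop the dict column to match the layout asked for in the question.
result = df.drop(columns='col2').join(expanded)
print(result)
```

Since join() is a left join on the index by default, rows 0 and 1 have no match in expanded and get NaN for aa and bb.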

Tags: python, indexing, pandas

Latecomer
