
Pandas data frame fill null values with index

Tags:

python

pandas

I have a dataframe where for one column I want to fill null values with the index value. What is the best way of doing this?

Say my dataframe looks like this:

>>> import numpy as np
>>> import pandas as pd
>>> d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
>>> print(d)

   Num    Name
A    1  Andrew
B    2     NaN
C    3   Chris

I can use the following line of code to get what I'm looking for:

d['Name'][d['Name'].isnull()]=d.index

However, I get the following warning: "A value is trying to be set on a copy of a slice from a DataFrame"

I imagine it'd be better to do this either using fillna or loc, but I can't figure out how to do this with either. I have tried the following:

>>> d['Name']=d['Name'].fillna(d.index)

>>> d.loc[d['Name'].isnull()]=d.index

Any suggestions on which is the best option?

AJG519 asked Aug 10 '15 22:08



1 Answer

IMO you should use fillna. An Index is not an acceptable data type for the fill value, so you need to pass a Series instead. Index has a to_series method for this:

In [13]:
d=pd.DataFrame(index=['A','B','C'], columns=['Num','Name'], data=[[1,'Andrew'], [2, np.nan], [3, 'Chris']])
d['Name']=d['Name'].fillna(d.index.to_series())
d

Out[13]:
   Num    Name
A    1  Andrew
B    2       B
C    3   Chris
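
Since the question also asks about loc, here is a minimal sketch of that route, assuming the same example frame as above. The variable name mask is introduced here for illustration only. The idea is to pass the row mask and the column label to a single .loc call, so the assignment targets d directly instead of an intermediate object, which should avoid the "copy of a slice" warning:

```python
import numpy as np
import pandas as pd

d = pd.DataFrame(
    index=['A', 'B', 'C'],
    columns=['Num', 'Name'],
    data=[[1, 'Andrew'], [2, np.nan], [3, 'Chris']],
)

# Boolean mask of the rows whose 'Name' is missing (hypothetical helper name).
mask = d['Name'].isnull()

# Select rows and column in one .loc call; the right-hand side is the
# index converted to a Series and filtered by the same mask, so values
# are aligned to the rows being filled.
d.loc[mask, 'Name'] = d.index.to_series()[mask]

print(d)
```

Both approaches should give the same result on this frame. fillna is the shorter of the two, while the loc form may be handy if the condition is something other than a null check.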
EdChum answered Oct 15 '22 22:10