
Pandas fill missing values in dataframe from another dataframe

Tags: python, pandas

I can't find the pandas function (I have seen it before) that replaces the NaNs in one DataFrame with values from another DataFrame, assuming the two share a common index that can be specified. Any help?

user308827 asked Mar 30 '15 22:03




2 Answers

If you have two DataFrames of the same shape, then:

df[df.isnull()] = d2 

Will do the trick.

[Image: visual representation of the masked assignment]

Only locations where df.isnull() evaluates to True (highlighted in green) will be eligible for assignment.
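To make the masked assignment concrete, here is a minimal sketch. The sample values and the name `mask` are made up for illustration; both frames are assumed to share the same index and columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data: two frames with identical index and columns.
df = pd.DataFrame({"a": [1.0, np.nan, 3.0], "b": [np.nan, 5.0, 6.0]})
d2 = pd.DataFrame({"a": [10.0, 20.0, 30.0], "b": [40.0, 50.0, 60.0]})

mask = df.isnull()  # boolean frame, True exactly where df holds NaN
df[mask] = d2       # only the masked positions take values from d2

print(df)
```

Cells of df that already held a value are left untouched; only the two NaN cells pick up the corresponding entries of d2.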

In practice, the DataFrames aren't always the same size / shape, and transforming methods (especially .shift()) are useful.
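When the shapes differ, one option (not part of the approach above, just a related sketch) is to pass the other frame to fillna(), which accepts a DataFrame and matches values by index and column labels. The sample data and the name `filled` are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with different shapes and only partly overlapping labels.
df = pd.DataFrame({"a": [1.0, np.nan, np.nan]}, index=["x", "y", "z"])
d2 = pd.DataFrame({"a": [20.0, 99.0], "b": [7.0, 8.0]}, index=["y", "w"])

# fillna aligns d2 to df by label; rows and columns that exist only in d2 are ignored.
filled = df.fillna(d2)

print(filled)
```

Here row "y" is filled from d2, row "z" stays NaN because d2 has no matching label, and the result keeps the shape of df.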

Data coming in is invariably dirty, incomplete, or inconsistent. Par for the course. There's a pretty extensive pandas tutorial and associated cookbook for dealing with these situations.

Jonathan Eunice answered Sep 19 '22 18:09


As I just learned, there is a DataFrame.combine_first() method that does precisely this. It has the additional property that if your updating DataFrame d2 is bigger than your original df, the additional rows and columns are added as well.

df = df.combine_first(d2) 
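
A short sketch of how this behaves when the common index has to be specified first. The column name "key" and the sample values are hypothetical; set_index() is used here to turn that column into the shared index before combining.

```python
import numpy as np
import pandas as pd

# Hypothetical data: "key" is the column both frames should be matched on.
df = pd.DataFrame({"key": ["x", "y"], "a": [1.0, np.nan]}).set_index("key")
d2 = pd.DataFrame(
    {"key": ["y", "z"], "a": [20.0, 30.0], "b": [7.0, 8.0]}
).set_index("key")

# Values from df win; NaNs in df are filled from d2, and rows/columns
# found only in d2 are appended to the result.
result = df.combine_first(d2)

print(result)
```

The NaN at ("y", "a") is filled from d2, and the result also gains row "z" and column "b", which exist only in d2.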
Anaphory answered Sep 18 '22 18:09