
Python Pandas replace NaN in one column with value from corresponding row of second column

People also ask

How do you fill NaN values with values from another column in Pandas?

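Use fillna() and pass the other column as the fill value: each NaN is replaced by the value at the same index in that column. fillna() also accepts inplace=True to modify the object it is called on, but when filling a single column it is safer to assign the result back to that column, because an in-place fill on a selected column may not update the parent DataFrame in recent pandas versions.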

How replace column values in Pandas based on multiple conditions?

You can replace values in all or selected columns of a pandas DataFrame based on a condition by using the DataFrame.loc[] property. loc[] accesses a group of rows and columns by label(s) or by a boolean array, and it can be used both to read and to assign values.
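As a rough sketch of the loc[] approach with more than one condition, the snippet below fills Temp from Farheit only where Temp is missing and Farheit is above 100. The data and the threshold are made up for illustration and are not from the question.

```python
import pandas as pd

# Hypothetical data, not taken from the question.
df = pd.DataFrame({
    "File": [1, 1, 1, 2],
    "Farheit": [75, 115, 63, 43],
    "Temp": [float("nan"), float("nan"), 13.0, 71.0],
})

# Combine conditions with & (and) or | (or); wrap comparisons in parentheses.
mask = df["Temp"].isna() & (df["Farheit"] > 100)

# Overwrite Temp only in the rows where both conditions hold.
df.loc[mask, "Temp"] = df.loc[mask, "Farheit"]
print(df)
```

Only row 1 satisfies both conditions, so row 0 keeps its NaN.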


Assuming your DataFrame is in df:

df['Temp_Rating'] = df['Temp_Rating'].fillna(df['Farheit'])
del df['Farheit']
df.columns = 'File heat Observations'.split()

First, replace any NaN values in Temp_Rating with the corresponding value of df.Farheit. Then delete the 'Farheit' column, and finally rename the remaining columns.
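For anyone who wants to try the three steps end to end, here is a small self-contained sketch. The column names follow the answer, but the values are invented for illustration since the original DataFrame is not shown.

```python
import numpy as np
import pandas as pd

# Toy data using the answer's column names; the values are illustrative only.
df = pd.DataFrame({
    "File": [1, 1, 1, 2],
    "heat": ["YesQ", "NoR", "NoT", "YesT"],
    "Farheit": [75, 115, 63, 43],
    "Temp_Rating": [np.nan, np.nan, 13.0, 71.0],
})

# 1. Fill gaps in Temp_Rating from the same row of Farheit.
df["Temp_Rating"] = df["Temp_Rating"].fillna(df["Farheit"])
# 2. Drop the column that is no longer needed.
df = df.drop(columns="Farheit")
# 3. Rename the remaining columns.
df.columns = ["File", "heat", "Observations"]
print(df)
```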


The above mentioned solutions did not work for me. The method I used was:

df.loc[df['foo'].isnull(),'foo'] = df['bar']

Another way to solve this problem:

import pandas as pd
import numpy as np

# Rows without a Temp reading get an explicit NaN.
ts_df = pd.DataFrame(
    [[1, "YesQ", 75, np.nan], [1, "NoR", 115, np.nan], [1, "NoT", 63, 13], [2, "YesT", 43, 71]],
    columns=['File', 'heat', 'Farheit', 'Temp'])


def fx(x):
    # Fall back to Farheit when Temp is missing in this row.
    if np.isnan(x['Temp']):
        return x['Farheit']
    else:
        return x['Temp']

print("1:", ts_df, sep="\n")
# Apply fx row by row (axis=1).
ts_df['Temp'] = ts_df.apply(fx, axis=1)

print("2:", ts_df, sep="\n")

returns:

1:
   File  heat  Farheit  Temp
0     1  YesQ       75   NaN
1     1   NoR      115   NaN
2     1   NoT       63  13.0
3     2  YesT       43  71.0
2:
   File  heat  Farheit   Temp
0     1  YesQ       75   75.0
1     1   NoR      115  115.0
2     1   NoT       63   13.0
3     2  YesT       43   71.0

The accepted answer uses fillna(), which fills in missing values only where the two objects share index labels. As explained nicely here, you can use combine_first to fill in missing values, rows, and index values in situations where the indices of the two objects don't match.

df.Col1 = df.Col1.fillna(df.Col2) #fill in missing values if indices match

#or 
df.Col1 = df.Col1.combine_first(df.Col2) #fill in values, rows, and indices
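To make the difference concrete, here is a minimal sketch with two hypothetical Series, a and b, whose indices only partly overlap. Two columns of the same DataFrame always share an index, so the distinction only matters when the fill values come from another object.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with partly overlapping indices.
a = pd.Series([1.0, np.nan, np.nan], index=[0, 1, 2])
b = pd.Series([20.0, 30.0, 40.0], index=[1, 2, 3])

# fillna keeps a's index; label 3 from b is ignored.
filled = a.fillna(b)
# combine_first uses the union of both indices, so label 3 is added.
combined = a.combine_first(b)

print(filled)
print(combined)
```

Here filled has three rows (labels 0 to 2), while combined has four (labels 0 to 3).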

@Jonathan's answer is good, but it is overkill; just use pop:

df['Temp_Rating'] = df['Temp_Rating'].fillna(df.pop('Farheit'))
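A quick way to check what pop does here, using a tiny made-up DataFrame: pop removes the column from df and returns it as a Series, so the fill and the delete happen in one statement.

```python
import numpy as np
import pandas as pd

# Illustrative two-row frame; values are not from the question.
df = pd.DataFrame({
    "Temp_Rating": [np.nan, 13.0],
    "Farheit": [75, 63],
})

# pop() drops 'Farheit' from df and hands it to fillna() as the fill values.
df["Temp_Rating"] = df["Temp_Rating"].fillna(df.pop("Farheit"))
print(df)
```

Note that the column still needs to be renamed afterwards if you want the 'Observations' header.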