
How do I combine two columns within a dataframe in Pandas?

Tags: python, pandas

Say I have two columns, A and B, in my dataframe:

A  B
1  NaN
2  5
3  NaN
4  6

I want to get a new column, C, which fills in NaN cells in column B using values from column A:

A  B   C
1  NaN 1
2  5   5
3  NaN 3
4  6   6

How do I do this?

I'm sure this is a very basic question, but as I am new to Pandas, any help will be appreciated!

runawaykid asked Nov 26 '15 07:11



3 Answers

You can use combine_first:

df['C'] = df['B'].combine_first(df['A'])

Docs: http://pandas.pydata.org/pandas-docs/version/0.17.0/generated/pandas.Series.combine_first.html
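For anyone who wants to try this end to end, here is a minimal sketch that rebuilds the frame from the question and applies combine_first. The values are taken from the question; the construction of the frame itself is an assumption about how the data was loaded.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question (assumed construction).
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [np.nan, 5, np.nan, 6]})

# Take B where it is present, otherwise fall back to A.
df["C"] = df["B"].combine_first(df["A"])

print(df)
```

Note that C comes back as a float column, because B has to be float to hold NaN.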

eumiro answered Oct 21 '22 06:10



You can use where, which is a vectorized if/else:

df['C'] = df['A'].where(df['B'].isnull(), df['B'])

   A   B  C
0  1 NaN  1
1  2   5  5
2  3 NaN  3
3  4   6  6
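
To make the "vectorized if/else" reading concrete, the sketch below compares where with the same decision written out one row at a time. The names vectorized and row_by_row are only for illustration, and the loop version is there to show the logic, not as something to use on real data.

```python
import numpy as np
import pandas as pd

# Example frame from the question (assumed construction).
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [np.nan, 5, np.nan, 6]})

# Vectorized form: keep A where B is missing, otherwise take B.
vectorized = df["A"].where(df["B"].isnull(), df["B"])

# The same decision spelled out per row (slow, for illustration only).
row_by_row = [a if pd.isnull(b) else b for a, b in zip(df["A"], df["B"])]

# Both approaches pick the same value in every row.
print(vectorized.tolist() == row_by_row)
```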

Colonel Beauvel answered Oct 21 '22 06:10



df['C'] = df['B'].fillna(df['A'])

.fillna replaces the NaN values in a Series (or DataFrame) with whatever you pass to it. Here we pass the Series df['A'], so each NaN in 'B' is filled with the value of 'A' from the same row, and the result is stored in 'C'.
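
One detail worth knowing: when the fill value is a Series, the filling is matched by index label, not by position. Within a single dataframe the two columns share an index, so this makes no difference, but it matters if the fill values come from elsewhere. A small sketch with made-up Series a and b (not from the question):

```python
import numpy as np
import pandas as pd

# Hypothetical Series with gaps at labels "x" and "z".
b = pd.Series([np.nan, 5.0, np.nan], index=["x", "y", "z"])

# Same labels as b, but stored in a different order.
a = pd.Series([30, 10, 20], index=["z", "x", "y"])

# Each NaN in b is filled from the entry of a with the same label.
filled = b.fillna(a)

print(filled)
```

Here the gap at "x" is filled with 10 and the gap at "z" with 30, even though those values sit at other positions in a.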

user517696 answered Oct 21 '22 08:10
