
pandas merge two columns with customized text

I have the following DataFrame:

import pandas as pd

df1 = pd.DataFrame({'Name': ['A0', 'A1', 'A2', 'A3', 'A4'],
                    'Buy': [True, True, False, False, False],
                    'Sell': [False, False, True, False, True]
                   },
                   index=[0, 1, 2, 3, 4])
df1

    Name    Buy Sell
0   A0  True    False
1   A1  True    False
2   A2  False   True
3   A3  False   False
4   A4  False   True

I want to merge the Buy and Sell columns into a single column: if "Buy" is True the value should be "Buyer", if "Sell" is True it should be "Seller", and if both "Buy" and "Sell" are False it should be "NA".

Sample of the required output:

    Name    Type 
0   A0      Buyer
1   A1      Buyer
2   A2      Seller
3   A3      NA
4   A4      Seller
Parag asked Jan 13 '20 19:01





2 Answers

np.select

import numpy as np

# The first condition that is True wins; rows matching neither get 'NA'
a = np.select([df1.Buy, df1.Sell], ['Buyer', 'Seller'], 'NA')
pd.DataFrame({'Name': df1.Name, 'Type': a})

  Name    Type
0   A0   Buyer
1   A1   Buyer
2   A2  Seller
3   A3      NA
4   A4  Seller

df1.assign(Type=np.select([df1.Buy, df1.Sell], ['Buyer', 'Seller'], 'NA'))

  Name    Buy   Sell    Type
0   A0   True  False   Buyer
1   A1   True  False   Buyer
2   A2  False   True  Seller
3   A3  False  False      NA
4   A4  False   True  Seller
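For reference, here is a self-contained sketch of the same np.select idea with its imports. np.select returns the choice for the first condition that is True, so a row where both Buy and Sell are True would come out as "Buyer". The extra row A5 is a hypothetical addition to show that case and is not part of the question's data.

```python
import numpy as np
import pandas as pd

# Question's data plus one hypothetical row (A5) where both flags are True
df1 = pd.DataFrame({
    'Name': ['A0', 'A1', 'A2', 'A3', 'A4', 'A5'],
    'Buy':  [True, True, False, False, False, True],
    'Sell': [False, False, True, False, True, True],
})

# Conditions are checked in order; the first True one picks the label
conditions = [df1['Buy'], df1['Sell']]
choices = ['Buyer', 'Seller']

# Rows matching no condition fall back to the default string 'NA'
df1['Type'] = np.select(conditions, choices, default='NA')

result = df1[['Name', 'Type']]
print(result)
```

If "Seller" should take priority when both flags are set, swapping the order of the two conditions (and of the two choices) should be enough.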
piRSquared answered Oct 03 '22 01:10



You can do:

s = df1[['Buy', 'Sell']]
# s @ s.columns joins the names of the True columns in each row ('' if none are True)
df1['Type'] = (s @ s.columns).add('er').replace('er', np.nan)

# or
# df1['Type'] = np.where(s.any(axis=1), s.idxmax(axis=1).add('er'), np.nan)

Output:

  Name    Buy   Sell    Type
0   A0   True  False   Buyer
1   A1   True  False   Buyer
2   A2  False   True  Seller
3   A3  False  False     NaN
4   A4  False   True  Seller
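Note that this output has a missing value (NaN) in row 3, while the question asks for the string "NA". A minimal sketch of one way to get the string instead, assuming the question's df1 and following the idxmax variant above, is to mask the rows that have no True flag and then fill them:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['A0', 'A1', 'A2', 'A3', 'A4'],
    'Buy':  [True, True, False, False, False],
    'Sell': [False, False, True, False, True],
})

s = df1[['Buy', 'Sell']]

# idxmax(axis=1) gives the label of the first True column in each row
# (for an all-False row it still returns the first column, 'Buy')
labels = s.idxmax(axis=1).add('er')

# Keep the label only where at least one flag is True, then fill the rest
df1['Type'] = labels.where(s.any(axis=1)).fillna('NA')

print(df1)
```

Whether NaN or the string "NA" is preferable depends on what is done with the column afterwards: NaN works with isna() and dropna(), while "NA" is just another label.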
Quang Hoang answered Oct 03 '22 00:10
