I have a table with close to 31 million records. It has around 10 columns, two of which are card number and transaction_status. Each card can have multiple rows, so there could be 2,000 rows for the same card, each row being a transaction with a corresponding status.
transaction_status has a value of 'Y'/'N'.
I want to add another two columns to this table, 'count_of_approved' and 'count_of_rejected', using a pandas DataFrame.
How do I do that? So far, I have been using get_dummies() and merge(), but it takes a lot of time and, worse, causes an out-of-memory error.
So, say my input is as below:
trn_id | card_id | status
1 | c1 | Y
2 | c2 | Y
3 | c2 | N
4 | c3 | Y
5 | c3 | Y
I would like my output to be:
trn_id | card_id | status | num_approved | num_of_denied
1 | c1 | Y | 1 | 0
2 | c2 | Y | 1 | 1
3 | c2 | N | 1 | 1
4 | c3 | Y | 2 | 0
5 | c3 | Y | 2 | 0
My code so far is given below:
import pandas as pd

a = pd.DataFrame({'id': [1, 2, 3], 'c_id': [22, 22, 33], 'status': ['Y', 'Y', 'N']})
temp = a.status.str.get_dummies()       # one-hot encode status into N/Y columns
a[['N', 'Y']] = temp
tt = a.groupby(['c_id'])['Y'].sum()     # sum of the Y dummy = approved count per card (count() would count rows)
tt = tt.reset_index()
yes_count_added = a.merge(tt, how='right', on='c_id')
# rename() returns a copy, so the result must be assigned back
yes_count_added = yes_count_added.rename(columns={'Y_y': 'num_of_approved'})
You could groupby card_id and use transform with a lambda expression to sum the number of times status equals either 'Y' (for num_approved) or 'N' (for num_of_denied), using eq:
df['num_approved'] = df.groupby('card_id').status.transform(
lambda x: x.eq('Y').sum())
df['num_of_denied'] = df.groupby('card_id').status.transform(
lambda x: x.eq('N').sum())
trn_id card_id status num_approved num_of_denied
0 1 c1 Y 1 0
1 2 c2 Y 1 1
2 3 c2 N 1 1
3 4 c3 Y 2 0
4 5 c3 Y 2 0
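On tens of millions of rows, the lambda above is called once per group in Python. A sketch of the same logic without the lambda (sample data assumed from the question) computes the boolean masks once and lets pandas run the grouped sum natively:

```python
import pandas as pd

df = pd.DataFrame({'trn_id': [1, 2, 3, 4, 5],
                   'card_id': ['c1', 'c2', 'c2', 'c3', 'c3'],
                   'status': ['Y', 'Y', 'N', 'Y', 'Y']})

# Boolean masks are built once; the grouped sum runs in C, no per-group lambda.
df['num_approved'] = df.status.eq('Y').astype(int).groupby(df.card_id).transform('sum')
df['num_of_denied'] = df.status.eq('N').astype(int).groupby(df.card_id).transform('sum')
```

The result matches the table above; only the per-group Python call is avoided.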
Use str.get_dummies + a single groupby call for performance:
df.status.str.get_dummies().groupby(df.card_id).transform('sum')
N Y
0 0 1
1 1 1
2 1 1
3 0 2
4 0 2
v = (df.status
.str.get_dummies()
.groupby(df.card_id)
.transform('sum')
.rename({'Y': 'num_approved', 'N': 'num_denied'}, axis=1))
pd.concat([df, v], axis=1)
trn_id card_id status num_denied num_approved
0 1 c1 Y 0 1
1 2 c2 Y 1 1
2 3 c2 N 1 1
3 4 c3 Y 0 2
4 5 c3 Y 0 2
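For the full 31-million-row table, another memory-friendly sketch (sample data assumed from the question) builds the per-card counts once with a grouped size/unstack and maps them back, so the counts table only has one row per card rather than per transaction:

```python
import pandas as pd

df = pd.DataFrame({'trn_id': [1, 2, 3, 4, 5],
                   'card_id': ['c1', 'c2', 'c2', 'c3', 'c3'],
                   'status': ['Y', 'Y', 'N', 'Y', 'Y']})

# Count Y/N per card once: a small card_id x status table of counts.
counts = (df.groupby(['card_id', 'status'])
            .size()
            .unstack(fill_value=0))

# Map each row's card_id to its per-card counts (index-aligned lookup).
df['num_approved'] = df.card_id.map(counts['Y'])
df['num_of_denied'] = df.card_id.map(counts['N'])
```

This assumes both 'Y' and 'N' occur somewhere in the data; if one is entirely absent, the corresponding column must be added to counts first.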