I have this dataframe:

I want to add every pair of columns together (for example, duration + credit_amount), so I wrote the following function:
def automate_add(add):
    for i, column in enumerate(df):
        for j, operando in enumerate(df):
            if column != operando:
                columnName = column + '_sum_' + operando
                add[columnName] = df[column] + df[operando]
with the output:

However, since duration + credit_amount = credit_amount + duration, I don't want the repeated columns.
Expecting this result from the function:
How can I do it?
I tried using hash sets, but that seems to work only on pandas Series [1].
EDIT: Dataframe: https://www.openml.org/d/31
Use the code below; it should also run faster:
import itertools
import pandas as pd

# Build one Series per unordered pair of columns, named "<a>_sum_<b>"
my_list = [
    pd.Series(df.loc[:, list(i)].sum(axis=1), name='_sum_'.join(i))
    for i in itertools.combinations(df.columns, 2)
]
final_df = pd.concat(my_list, axis=1)
print(final_df)
   duration_sum_credit_amount  duration_sum_installment_commitment  \
0                        1175                                   10
1                        5999                                   50
2                        2108                                   14
3                        7924                                   44
4                        4894                                   27

   credit_amount_sum_installment_commitment
0                                      1173
1                                      5953
2                                      2098
3                                      7884
4                                      4873
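For reference, here is a minimal, self-contained sketch of the same idea wrapped in a function. The sample frame is an assumption: its values were picked so that the pairwise sums match the output above. The name pairwise_sums is hypothetical and does not come from the question. It adds the two columns directly instead of slicing with .loc and calling .sum(axis=1), which should give the same result for numeric columns without missing values.

```python
import itertools

import pandas as pd

# Illustrative sample only: values chosen so the sums match the output above.
df = pd.DataFrame({
    "duration": [6, 48, 12, 42, 24],
    "credit_amount": [1169, 5951, 2096, 7882, 4870],
    "installment_commitment": [4, 2, 2, 2, 3],
})


def pairwise_sums(frame):
    """Return one '<a>_sum_<b>' column per unordered pair of columns."""
    sums = {
        f"{a}_sum_{b}": frame[a] + frame[b]
        for a, b in itertools.combinations(frame.columns, 2)
    }
    return pd.DataFrame(sums, index=frame.index)


final_df = pairwise_sums(df)
print(final_df)
```

Returning a new frame rather than mutating an argument keeps the original df untouched, which is a design choice of this sketch and not a requirement.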
Explanation:
print(list(itertools.combinations(df.columns,2))) gives:
[('duration', 'credit_amount'),
('duration', 'installment_commitment'),
('credit_amount', 'installment_commitment')]
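To see why this removes the duplicates, it may help to compare it with what the nested loop in the question effectively iterates over, namely every ordered pair of distinct columns. The short sketch below uses a plain list of the three column names (assumed here for illustration) and only the standard library.

```python
import itertools

columns = ["duration", "credit_amount", "installment_commitment"]

# Ordered pairs: ('a', 'b') and ('b', 'a') both appear, like the nested loop.
ordered = list(itertools.permutations(columns, 2))

# Unordered pairs: each pair of columns appears exactly once.
unordered = list(itertools.combinations(columns, 2))

print(len(ordered))    # 6
print(len(unordered))  # 3
```

With n columns the nested loop produces n * (n - 1) sums, while combinations produces half as many.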
After that, run:
for i in list(itertools.combinations(df.columns,2)):
    print(df.loc[:,list(i)])
    print("---------------------------")
This prints each pair of columns together. I then sum each pair along axis=1, wrap the result in pd.Series, and name it by joining the two column names with '_sum_'.
Finally, collect those Series in a list and concatenate them along axis=1 to get the final result. :)
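As a side note, the hash-set idea mentioned in the question can also be made to work on column names instead of on the data. The sketch below is only one possible way to do it, under the assumption that column names are hashable (strings here): each pair is stored as a frozenset, so ('a', 'b') and ('b', 'a') count as the same entry. The two-row sample frame is illustrative.

```python
import pandas as pd

# Illustrative two-row sample, not the full dataset.
df = pd.DataFrame({
    "duration": [6, 48],
    "credit_amount": [1169, 5951],
    "installment_commitment": [4, 2],
})

seen = set()  # pairs already summed, stored without regard to order
result = pd.DataFrame(index=df.index)

for column in df.columns:
    for operando in df.columns:
        pair = frozenset((column, operando))
        # Skip a column paired with itself and any pair handled before.
        if column == operando or pair in seen:
            continue
        seen.add(pair)
        result[column + "_sum_" + operando] = df[column] + df[operando]

print(list(result.columns))
```

This keeps the structure of the original loop, but itertools.combinations reaches the same set of pairs with less bookkeeping.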