I have a dataframe named pricecomp_df. I want to compare the "market price" column with each of the other columns ("apple price", "mangoes price", "watermelon price"), but take the difference according to a priority: watermelon price first, mangoes price second, apple price third. The input dataframe is given below:
   code  apple price  mangoes price  watermelon price  market price
0   101          101            NaN               NaN           122
1   102          123            123               NaN           124
2   103          NaN            NaN               NaN           123
3   105          123            167               NaN           154
4   107          165            NaN               177           176
5   110          123            NaN               NaN           123
In the first row only the apple price is present, so take the difference between the market price and the apple price. In the second row both the apple and mangoes prices are present, so I need only the difference between the market price and the mangoes price. Likewise, take the difference according to the priority condition for every row. Also, skip the rows where all three prices are NaN. Can anyone help with this?
The diff() method of a pandas DataFrame returns a DataFrame holding the difference between rows or between columns. The axis parameter decides whether the difference is calculated between rows or between columns. By default each row is compared with the previous row; the periods parameter specifies which row to compare with, and for positive values the earlier row is subtracted from the current one.
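For reference, here is a minimal sketch of what diff() does on a small frame. The frame `demo` and its values are made up for illustration. Note that diff() only compares adjacent rows or columns, so on its own it does not implement the priority rule asked about in the question.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question
demo = pd.DataFrame({"a": [1, 4, 9], "b": [10, 20, 40]})

# Default: each row minus the previous row (first row becomes NaN)
row_diff = demo.diff()

# axis=1: each column minus the column to its left (first column becomes NaN)
col_diff = demo.diff(axis=1)

# Negative periods: each row minus the next row (last row becomes NaN)
back_diff = demo.diff(periods=-1)

print(row_diff)
print(col_diff)
print(back_diff)
```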
Hope I'm not too late. The idea is to calculate the differences and overwrite them according to your priority list.
import numpy as np
import pandas as pd
df = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
'apple price': [101, 123, np.nan, 123, 165, 123],
'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
'market price': [122, 124, 123, 154, 176, 123]})
# Calculate difference to apple price
df['diff'] = df['market price'] - df['apple price']
# Overwrite with difference to mangoes price
df['diff'] = df.apply(lambda x: x['market price'] - x['mangoes price'] if not np.isnan(x['mangoes price']) else x['diff'], axis=1)
# Overwrite with difference to watermelon price
df['diff'] = df.apply(lambda x: x['market price'] - x['watermelon price'] if not np.isnan(x['watermelon price']) else x['diff'], axis=1)
print(df)
   apple price  code  mangoes price  market price  watermelon price  diff
0          101   101            NaN           122               NaN    21
1          123   102            123           124               NaN     1
2          NaN   103            NaN           123               NaN   NaN
3          123   105            167           154               NaN   -13
4          165   107            NaN           176               177    -1
5          123   110            NaN           123               NaN     0
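The same priority rule can probably also be written without apply. The sketch below is one possible alternative, not the approach used in the answer above: it orders the price columns by priority, back-fills along the columns so the first non-NaN price lands in the first column, and then drops the rows where all three prices are missing, as the question asks. The names `prices`, `priority`, `chosen` and `result` are my own.

```python
import numpy as np
import pandas as pd

# Same data as in the question
prices = pd.DataFrame({'code': [101, 102, 103, 105, 107, 110],
                       'apple price': [101, 123, np.nan, 123, 165, 123],
                       'mangoes price': [np.nan, 123, np.nan, 167, np.nan, np.nan],
                       'watermelon price': [np.nan, np.nan, np.nan, np.nan, 177, np.nan],
                       'market price': [122, 124, 123, 154, 176, 123]})

# Highest priority first: watermelon, then mangoes, then apple
priority = ['watermelon price', 'mangoes price', 'apple price']

# Back-fill across columns: the first column then holds the
# highest-priority price that is not NaN (or NaN if all are missing)
chosen = prices[priority].bfill(axis=1).iloc[:, 0]

prices['diff'] = prices['market price'] - chosen

# Skip rows where all three prices are NaN
result = prices.dropna(subset=priority, how='all')
print(result)
```

With the data from the question this keeps every row except the one with code 103, where no price is available.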