I have a dataframe with groups. To normalize the values for each group I'd like to divide all values of each group by the value of a certain element within that group.
df = pd.DataFrame([['a','2018-02-03',42],
['a','2018-02-04',22],
['a','2018-02-05',10],
['a','2018-02-06',32],
['b','2018-02-03',10],
['b','2018-02-04',8],
['b','2018-02-05',2],
['b','2018-02-06',12],
['c','2018-02-03',20],
['c','2018-02-04',30],
['c','2018-02-05',5],
['c','2018-02-06',15]])
df.columns = ['product','day','value']
I want to normalize column 'value' for each 'product' by 'value' of 'day' == '2018-02-05'
Expected Result:
product day value
0 a 2018-02-03 4.2
1 a 2018-02-04 2.2
2 a 2018-02-05 1
3 a 2018-02-06 3.2
5 b 2018-02-03 5
6 b 2018-02-04 4
7 b 2018-02-05 1
8 b 2018-02-06 6
10 c 2018-02-03 4
11 c 2018-02-04 6
12 c 2018-02-05 1
13 c 2018-02-06 3
I tried df.groupby('product').transform().
To access the first value .transform('first') is possible.
But I cannot find a way to access a certain value.
Annotation:
Maybe this one can be solved without using .groupby() ?
Do like this:
df = pd.DataFrame([['a','2018-02-03',42],
['a','2018-02-04',22],
['a','2018-02-05',10],
['a','2018-02-06',32],
['b','2018-02-03',10],
['b','2018-02-04',8],
['b','2018-02-05',2],
['b','2018-02-06',12],
['c','2018-02-03',20],
['c','2018-02-04',30],
['c','2018-02-05',5],
['c','2018-02-06',15]])
df.columns = ['product','day','value']
date = '2018-02-05'
# Set the index to ['product', 'day']
df.set_index(['product', 'day'], inplace=True)
# Helper Series - Values of date at index 'day'
s = df.xs(date, level=1)
# Divide df by helper Series and reset index
df = df.div(s, level=0).reset_index()
print(df)
product day value
0 a 2018-02-03 4.2
1 a 2018-02-04 2.2
2 a 2018-02-05 1.0
3 a 2018-02-06 3.2
4 b 2018-02-03 5.0
5 b 2018-02-04 4.0
6 b 2018-02-05 1.0
7 b 2018-02-06 6.0
8 c 2018-02-03 4.0
9 c 2018-02-04 6.0
10 c 2018-02-05 1.0
11 c 2018-02-06 3.0
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With