This is my original data:
fi_diameter ever_percent
0 -1.000000 0.00
1 -0.694212 0.00
2 -0.499782 0.00
3 -0.249749 0.00
4 -0.000000 0.00
5 0.249822 0.00
6 0.500218 0.00
7 0.749038 0.00
8 0.985645 0.00
9 1.251539 0.00
10 1.498179 0.00
11 2.122177 0.78
12 2.000000 3.70
13 2.251539 6.23
14 2.498179 8.66
15 2.746616 10.79
16 3.000000 12.13
17 3.251539 11.93
18 3.506353 10.78
19 3.756331 8.55
20 3.988504 5.90
21 4.237864 4.34
22 4.506353 2.89
23 5.011588 2.84
24 5.506353 1.88
25 5.965784 1.65
26 6.965784 2.49
27 7.965784 1.33
28 7.965784 3.13
I want to change my data to this type:
1-2 4.48
2-3 37.81
3-4 37.16
4-5 7.23
5-6 6.37
6-7 2.49
7-8 4.46
I have try to use pandas of "groupby" ,but it not transform data to the like what i want it to be.
How do I get this output?
pd.cut and bin fi_diameterever_percentdf = df.query('fi_diameter >= 1')
grouper = pd.cut(
df['fi_diameter'],
bins=[1, 2, 3, 4, 5, 6, 7, 8],
labels=['1-2', '2-3', '3-4', '4-5', '5-6', '6-7', '7-8']
)
df.groupby(grouper).ever_percent.sum().reset_index()
fi_diameter ever_percent
0 1-2 3.70
1 2-3 38.59
2 3-4 37.16
3 4-5 7.23
4 5-6 6.37
5 6-7 2.49
6 7-8 4.46
Here's how you generalise this for any upper bound.
mx = int(np.ceil(df['fi_diameter'].max()))
rn = np.arange(1, mx + 1)
df = df.query('fi_diameter >= 1')
grouper = pd.cut(
df['fi_diameter'],
bins=rn,
labels=[
str(i) + '-' + str(j) for i, j in zip(rn, rn[1:])
]
)
df.groupby(grouper).ever_percent.sum().reset_index()
fi_diameter ever_percent
0 1-2 3.70
1 2-3 38.59
2 3-4 37.16
3 4-5 7.23
4 5-6 6.37
5 6-7 2.49
6 7-8 4.46
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With