I have a dataframe with a depth column with a 0.1 m grid.
import pandas as pd
df1 = pd.DataFrame({'depth': [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0, 2.1 ],
'350': [7.898167, 6.912074, 6.049002, 5.000357, 4.072320, 3.070662, 2.560458, 2.218879, 1.892131, 1.588389, 1.573693],
'351': [8.094912, 7.090584, 6.221289, 5.154516, 4.211746, 3.217615, 2.670147, 2.305846, 1.952723, 1.641423, 1.622722],
'352': [8.291657, 7.269095, 6.393576, 5.308674, 4.351173, 3.364569, 2.779837, 2.392813, 2.013316, 1.694456, 1.671752],
'353': [8.421007, 7.374317, 6.496641, 5.403691, 4.439815, 3.412494, 2.840625, 2.443868, 2.069017, 1.748445, 1.718081 ],
'354': [8.535562, 7.463452, 6.584512, 5.485725, 4.517310, 3.438680, 2.890678, 2.487039, 2.123644, 1.802643, 1.763818 ],
'355': [8.650118, 7.552586, 6.672383, 4.517310, 4.594806, 3.464867, 2.940732, 2.530211, 2.178271, 1.856841, 1.809555 ]},
index=[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
)
My question is: how do I bin the data to get a new dataframe on a 0.5 m depth frequency?
Or rather, how do I average the column values from df1 (which has one row per 0.1 m) over dz = 0.5 m bins?
The point is to keep the same df structure and the same columns (350-355), but with the rows averaged per column over a certain dz interval (a fixed number of rows), let's say 0.5 m.
So in this case my new dataframe would have only two rows, with depth values of 1.35 and 1.85 m, keeping each column as in df1. The first row would hold the averaged values for the 1.1-1.6 m interval, the second for the 1.6-2.1 m interval.
Use a combination of df.groupby and pd.cut:
import pandas as pd
import numpy as np
# Specify your desired dz step size
step = 0.5
# Bin edges: 1.0, 1.5, 2.0, 2.5
dz = np.arange(1, 3, step)
# rebin dataframe
df2 = df1.groupby(pd.cut(df1.depth, dz, labels=False), as_index=False).mean()
# refill 'depth' column
df2.depth = dz[:-1]
gives
   depth       350       351       352       353       354       355
0    1.0  5.986384  6.154609  6.322835  6.427094  6.517312  6.397441
1    1.5  2.266104  2.357551  2.448998  2.502890  2.548537  2.594184
2    2.0  1.573693  1.622722  1.671752  1.718081  1.763818  1.809555
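where each row holds the mean of the 35x columns over the depth intervals 1 < depth <= 1.5, 1.5 < depth <= 2, etc. Note that the depth column is refilled with the left edge of each bin (dz[:-1]), not with the bin center.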
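To see why the intervals are open on the left and closed on the right, it can help to look at what pd.cut returns on its own. The following is a minimal sketch using a few hand-picked depths (the `sample_depths` and `edges` names are made up for illustration); with labels=False, pd.cut returns the integer position of the bin each value falls into, and by default the right edge is included.

```python
import pandas as pd

# A few illustrative depths (not the full df1 column)
sample_depths = pd.Series([1.1, 1.5, 1.6, 2.0, 2.1])

# Same bin edges as in the answer: (1.0, 1.5], (1.5, 2.0], (2.0, 2.5]
edges = [1.0, 1.5, 2.0, 2.5]

# labels=False returns the integer bin index instead of an Interval label
codes = pd.cut(sample_depths, edges, labels=False)

print(codes.tolist())
```

Here 1.5 lands in the first bin and 2.0 in the second, which is why the 2.1 m row ends up alone in a third group.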
You can easily change the rebinning by selecting a desired value for the step variable.
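Since the question also describes the bins as a fixed number of rows, a different approach (not the pd.cut method above) is to group by row position. This is only a sketch, and it assumes the rows are evenly spaced and sorted by depth; the frame `df`, its values, and the names `rows_per_bin` and `bin_id` are stand-ins invented for the example. Averaging the depth column along with the others gives the mean depth of each group as its label.

```python
import numpy as np
import pandas as pd

# Small stand-in frame (hypothetical values) on a regular 0.1 m depth grid
df = pd.DataFrame({
    "depth": [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0],
    "350": [10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0],
})

# 0.5 m bins on a 0.1 m grid means 5 rows per bin (assumes even spacing)
rows_per_bin = 5

# Integer bin id per row: 0 for the first 5 rows, 1 for the next 5, ...
bin_id = np.arange(len(df)) // rows_per_bin

# Average every column, including depth, within each group of rows
binned = df.groupby(bin_id).mean()

print(binned)
```

If the number of rows is not a multiple of `rows_per_bin`, the last group is simply averaged over fewer rows.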