So in R, I'd use an optimized apply function for this, but I've read now that the Panda's apply function is an abstracted loop and is perhaps even slower than one, and it shows in the performance. On my machine, it took 30 mins to process 60k rows.
So essentially I'm looking to calculate a moving average based on a dataset with different groups on which I need to calculate a moving average. There are a lot of these groups. So I essentially first have to subset the dataset on a row/cell-wise basis, and only then calculate the moving average.
So I'm trying to come up with a vectorised solution to this but can't seem to figure how you'd go about subsetting the dataframe within a vectorised approach.
My current solution makes use of an apply function which is really easy to understand and maintain:
df['SMA'] = df.apply(SMA, axis=1)
def SMA(row):
Subset = df[(df['group']==row['group'])&(df['t']<=row['t'])].reset_index()
Subset2 = Subset[len(Subset.index)-(2):len(Subset.index)]
return df['val'].mean()
This is my expected output (which I am currently getting but just very very slowly):
This is the dataframe and this example I want the moving average over two timepoints, "t" in this example:
t group val moving average
1 A 1 NA
2 A 2 1.5
3 A 3 2.5
1 B 4 NA
2 B 5 4.5
3 B 6 5.5
1 C 7 NA
2 C 8 7.5
3 C 9 8.5
This kind of operation (splitting into groups) is handled by the .groupby
method in pandas. If we take care to set the index to time, it also handles giving us the right output with a time index back.
Here's an example which does basically the same as your code:
df = pandas.DataFrame(
[[1, 'A', 1],
[2, 'A', 2],
[3, 'A', 3],
[1, 'B', 4],
[2, 'B', 5],
[3, 'B', 6],
[1, 'C', 7],
[2, 'C', 8],
[3, 'C', 9]],
columns=['t', 'group', 'val'])
df = df.set_index('t')
moving_avg = df.groupby('group').rolling(2).mean()
moving_avg
is now a new dataframe. Note that because I set the index to be t
in the first part, it's handled correctly in the grouping and rolling averages:
val
group t
A 1 NaN
2 1.5
3 2.5
B 1 NaN
2 4.5
3 5.5
C 1 NaN
2 7.5
3 8.5
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With