
Calculating age of peak performance in R [duplicate]

I have a data set of racing performance in horses over several years, and I want to calculate the age at which the horses reach their peak performance. Here is a made-up example of my data:

data <- data.frame(
Name=c(rep("Ari",3),rep("Aegir",3),rep("Lixhof",3)),
Competition.year = c("2015", "2013", "2012", "2008", "2009", "2010", "2015", "2016", "2017"), 
P2=c(7.97, 8.40, 8.51, 9.49, 8.70, 8.40, 8.82, 9.07, 8.59),
Competition.age=c(16,14,13,8,9,10,12,13,14))

Here, P2 is the variable for the time records. The smaller the value, the better the performance (I'm looking for the fastest times to calculate peak performance). Competition.age shows how old (in years) each horse was in each year it competed.

My real data has around 2000 observations for 127 horses. What I want is to calculate the mean age for when they reach their peak performance (as in, at what age are horses, in general, fastest). I've seen some posts that use aggregate to calculate means by groups, but I don't think that's exactly what I need, since it has to first look at the times, then make a mean of the ages from the fastest one.

I'd appreciate any help with this! Thank you!

Laura Bas asked Nov 30 '25 15:11


2 Answers

Given your example you can use something like this.

library(dplyr)

df_min <- data %>% # `data` is the data frame from the question
  group_by(Name) %>% 
  filter(P2 == min(P2)) # filter records on fastest race time per horse

mean(df_min$Competition.age)
#[1] 13.33333

As @MKR pointed out, you can also do it in one statement. It is slightly more typing and you do not have the intermediate result of df_min. It all depends on what else you want to do with the data you have.

result <- data %>% 
  group_by(Name) %>% 
  filter(P2 == min(P2)) %>% 
  ungroup() %>% 
  summarise(best_age = mean(Competition.age)) 
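
One thing to watch for: filter(P2 == min(P2)) keeps every row tied for a horse's fastest time, so a horse with two equal best times would be counted twice in the mean. Below is a minimal sketch of one way around that, assuming dplyr 1.0.0 or later (for slice_min()) and the column names from the question. The names peak and peak_age are made up for this example.

```r
library(dplyr)

# Example data from the question (only the columns needed here)
data <- data.frame(
  Name = c(rep("Ari", 3), rep("Aegir", 3), rep("Lixhof", 3)),
  P2 = c(7.97, 8.40, 8.51, 9.49, 8.70, 8.40, 8.82, 9.07, 8.59),
  Competition.age = c(16, 14, 13, 8, 9, 10, 12, 13, 14)
)

# One row per horse: the race with the smallest (fastest) P2.
# with_ties = FALSE keeps a single row even if a horse has equal best times.
peak <- data %>%
  group_by(Name) %>%
  slice_min(P2, n = 1, with_ties = FALSE) %>%
  ungroup()

# Mean age across horses at their fastest race
peak_age <- mean(peak$Competition.age)
peak_age
```

If a horse ties with itself at two different ages, this keeps only one of those races, so you may want to decide explicitly which age should count in that case.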
phiver answered Dec 02 '25 06:12


We can calculate the average using data.table by first keeping, for each horse, the row with its best performance (min(P2)) and then taking the mean of Competition.age:

library(data.table)
setDT(data)

data[,.SD[P2 == min(P2)], by=.(Name)][,mean(Competition.age)]
#[1] 13.33333
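
If you would rather not load a package, the same two steps (pick each horse's fastest race, then average the ages) can probably be sketched in base R as well. This assumes the column names from the question; fastest and peak_age are names made up for this example.

```r
# Example data from the question (only the columns needed here)
data <- data.frame(
  Name = c(rep("Ari", 3), rep("Aegir", 3), rep("Lixhof", 3)),
  P2 = c(7.97, 8.40, 8.51, 9.49, 8.70, 8.40, 8.82, 9.07, 8.59),
  Competition.age = c(16, 14, 13, 8, 9, 10, 12, 13, 14)
)

# Split into one data frame per horse and keep the row with the smallest P2.
# which.min() returns the first minimum, so each horse contributes one row.
fastest <- do.call(
  rbind,
  lapply(split(data, data$Name), function(d) d[which.min(d$P2), ])
)

# Mean age across horses at their fastest race
peak_age <- mean(fastest$Competition.age)
peak_age
```

Unlike the P2 == min(P2) filter, which.min() never returns more than one row per horse, so tied best times are not double-counted.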
MKR answered Dec 02 '25 06:12


