Creating 'Top 10' lists in R

Question

I have a data frame where each row represents a recorded event. As an example, let's say I measured the speed of passing cars, and some cars passed me more than once.

cardata <- data.frame(
  car.ID = c(3,4,1,2,5,4,5),
  speed = c(100,121,56,73,87,111,107)
  )

I can sort the list and pull out the three fastest events...

top3 <- head(cardata[order(cardata$speed, decreasing = TRUE), ], n = 3)
> top3
  car.ID speed
2      4   121
6      4   111
7      5   107

... but you'll notice that car 4 recorded the two fastest times. How do I find the three fastest events without any duplicate car IDs? I realize that my 'Top 3' list will not include the three fastest events in this instance.

flodel · Accepted Answer

You can use aggregate to first find the top speed per car.ID:

cartop <- aggregate(speed ~ car.ID, data = cardata, FUN = max)
top3 <- head(cartop[order(cartop$speed, decreasing = TRUE), ], n = 3)

 #   car.ID speed
 # 4      4   121
 # 5      5   107
 # 3      3   100
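
Since the title asks about 'Top 10' lists in general, here is a minimal base R sketch of the same idea wrapped in a function. The helper name top_n_unique and its arguments are hypothetical, not from the original answers. Instead of aggregate, it sorts by speed and then drops repeated IDs with duplicated, so the original row names survive and you can still see which event produced each maximum.

```r
# Hypothetical helper (not part of the original answers): return the top n
# rows by value_col, keeping at most one row per id_col. Base R only.
top_n_unique <- function(df, id_col, value_col, n = 3) {
  # Sort rows from highest to lowest value
  sorted <- df[order(df[[value_col]], decreasing = TRUE), ]
  # After sorting, the first row seen for each id is that id's maximum
  best <- sorted[!duplicated(sorted[[id_col]]), ]
  head(best, n)
}

# Same example data as in the question
cardata <- data.frame(
  car.ID = c(3, 4, 1, 2, 5, 4, 5),
  speed  = c(100, 121, 56, 73, 87, 111, 107)
)

top3 <- top_n_unique(cardata, "car.ID", "speed", n = 3)
print(top3)
```

On the example data this should pick cars 4, 5 and 3, the same as the aggregate approach. Pass n = 10 for a 'Top 10' list; if there are fewer than n distinct IDs, head simply returns all of them.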

eddi · Answer

Using data.table instead of data.frame:

library(data.table)
dt = data.table(cardata)

# the easier to read way
dt[order(-speed), speed[1], by = car.ID][1:3]
#   car.ID  V1
#1:      4 121
#2:      5 107
#3:      3 100

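# (probably) a faster way
setkey(dt, speed) # sorts ascending by speed, so the last row per car is its maximum
# rank the cars by their maximum; group order alone follows each car's slowest pass
head(dt[, speed[.N], by = car.ID][order(-V1)], 3)
#   car.ID  V1
#1:      4 121
#2:      5 107
#3:      3 100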

# and another way for fun (not sure how fast it is)
setkey(dt, car.ID, speed)
# the last row per car.ID is its fastest pass; then rank the cars by that speed
head(dt[J(unique(car.ID)), mult = 'last'][order(-speed)], 3)

