
R - getting the highest value for each ID

Tags: r, analytics

I have the following data frame:

>> animals_df:

animal_name    age
cat             1
cat             1
cat             2
cat             3
cat             3
dog             1
dog             1
dog             3
dog             4
dog             4
dog             4
horse           1
horse           3
horse           5
horse           5
horse           5

I want to keep only the rows with the highest age for each species, including ties. The expected output is:

animal_name    age
    cat         3
    cat         3
    dog         4
    dog         4
    dog         4
    horse       5
    horse       5
    horse       5

I've tried using:

animals_df = do.call(rbind,lapply(split(animals_df, animals_df$animal_name), function(x) tail(x, 1) ) )

But this returns only one row per animal, because tail(x, 1) keeps just the last row of each group:

animal_name    age
    cat          3
    dog          4
    horse        5
asked Jun 05 '26 15:06 by ibrr1


1 Answer

This is easy with dplyr/tidyverse:

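library(tidyverse)  # loading dplyr alone is enough for group_by(), filter() and %>%

# How I read your data in; skip this since you already have your data available
df <- read.table(file = "clipboard", header = TRUE)

# Within each animal_name group, keep every row whose age equals the group maximum
df %>%
    group_by(animal_name) %>%
    filter(age == max(age))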

# Output:
Source: local data frame [8 x 2]
Groups: animal_name [3]

  animal_name   age
       <fctr> <int>
1         cat     3
2         cat     3
3         dog     4
4         dog     4
5         dog     4
6       horse     5
7       horse     5
8       horse     5
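
If you would rather not depend on dplyr, the same "keep rows equal to the group maximum" idea can be sketched in base R. This is only an illustration: it rebuilds the example data from the question, and the names group_max, oldest and oldest_split are made up here. The second half shows how the split/lapply attempt from the question could be adapted by filtering inside each group instead of taking tail(x, 1).

```r
# Rebuild the example data from the question
animals_df <- data.frame(
  animal_name = rep(c("cat", "dog", "horse"), times = c(5, 6, 5)),
  age = c(1, 1, 2, 3, 3,  1, 1, 3, 4, 4, 4,  1, 3, 5, 5, 5)
)

# ave() returns, for every row, the maximum age within that row's animal_name group
group_max <- ave(animals_df$age, animals_df$animal_name, FUN = max)

# Keep every row whose age equals its group maximum (ties are kept)
oldest <- animals_df[animals_df$age == group_max, ]
print(oldest)

# Same idea applied to the split/lapply approach from the question:
# filter within each group rather than keeping only the last row
oldest_split <- do.call(
  rbind,
  lapply(split(animals_df, animals_df$animal_name),
         function(x) x[x$age == max(x$age), ])
)
print(oldest_split)
```

Both results should contain the same eight rows as the dplyr output above. The ave() version keeps the original row order and row names, while the split version prefixes row names with the group name.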
answered Jun 10 '26 20:06 by Marius


