I have the following data frame:
>> animals_df:
animal_name age
cat 1
cat 1
cat 2
cat 3
cat 3
dog 1
dog 1
dog 3
dog 4
dog 4
dog 4
horse 1
horse 3
horse 5
horse 5
horse 5
I want to extract only the animals with the highest age within each species, keeping ties. So I want to get the following output:
animal_name age
cat 3
cat 3
dog 4
dog 4
dog 4
horse 5
horse 5
horse 5
I've tried using:
animals_df = do.call(rbind, lapply(split(animals_df, animals_df$animal_name), function(x) tail(x, 1)))
But this returns only one row per animal, because tail(x, 1) keeps just the last row of each group:
animal_name age
cat 3
dog 4
horse 5
This is easy with dplyr/tidyverse:
library(tidyverse)
# How I read your data in, ignore since you already have your data available
df = read.table(file="clipboard", header=TRUE)
df %>%
  group_by(animal_name) %>%
  filter(age == max(age))
# Output:
Source: local data frame [8 x 2]
Groups: animal_name [3]
animal_name age
<fctr> <int>
1 cat 3
2 cat 3
3 dog 4
4 dog 4
5 dog 4
6 horse 5
7 horse 5
8 horse 5
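If you would rather stay in base R, as in the original attempt, the same idea (compare each row's age with its group's maximum) can be sketched with ave(). This is only a minimal sketch: the data frame is rebuilt by hand from the question's example, and the names group_max and oldest are made up for illustration.

```r
# Rebuild the example data from the question.
animals_df <- data.frame(
  animal_name = c(rep("cat", 5), rep("dog", 6), rep("horse", 5)),
  age = c(1, 1, 2, 3, 3,  1, 1, 3, 4, 4, 4,  1, 3, 5, 5, 5)
)

# ave() returns one value per row: the maximum age within that row's
# animal_name group.
group_max <- ave(animals_df$age, animals_df$animal_name, FUN = max)

# Keep every row whose age equals its group's maximum, so ties are retained.
oldest <- animals_df[animals_df$age == group_max, ]
rownames(oldest) <- NULL
print(oldest)
```

The split/lapply approach from the question can also be kept by replacing tail(x, 1) with x[x$age == max(x$age), ], which filters each group instead of taking only its last row.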