I have the following data:
    Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew",
              "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
    Age <- c(22, 12, 31, 35, 58, 82, 17, 34, 12, 24, 44, 67, 43)
    Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
    data <- data.frame(Name, Age, Group)
And I'd like to use dplyr to
(1) group the data by "Group"
(2) show the min and max Age within each Group
(3) show the Name of the person with the min and max ages
The following code does this:
    data %>%
      group_by(Group) %>%
      summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
                maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
Which works well:
      Group minAge minAgeName maxAge maxAgeName
    1     A     22        Sam     22        Sam
    2     B     12      Sarah     58      James
    3     C     17     Andrew     82      Sally
    4     D     12     Mairin     67        Ray
However, I have a problem if there are multiple min or max values:
    Name <- c("Sam", "Sarah", "Jim", "Fred", "James", "Sally", "Andrew",
              "John", "Mairin", "Kate", "Sasha", "Ray", "Ed")
    Age <- c(22, 31, 31, 35, 58, 82, 17, 34, 12, 24, 44, 67, 43)
    Group <- c("A", "B", "B", "B", "B", "C", "C", "D", "D", "D", "D", "D", "D")
    data <- data.frame(Name, Age, Group)

    > data %>% group_by(Group) %>%
    +   summarize(minAge = min(Age), minAgeName = Name[which(Age == min(Age))],
    +             maxAge = max(Age), maxAgeName = Name[which(Age == max(Age))])
    Error: expecting a single value
I'm looking for two solutions:
(1) where it doesn't matter which min or max name is shown, just that one is shown (i.e., the first value found)
(2) where if there are "ties" all minimum values and maximum values are shown
Please let me know if this isn't clear and thanks in advance!
You can use which.min and which.max to get the first value.
    data %>%
      group_by(Group) %>%
      summarize(minAge = min(Age), minAgeName = Name[which.min(Age)],
                maxAge = max(Age), maxAgeName = Name[which.max(Age)])
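To see why this avoids the error, here is a minimal base R sketch using only the group B rows from the second data set. The variable names (age, name, ties, first) are made up for illustration. The comparison with min() returns every tied position, while which.min() returns only the first one, so the subset has length one.

```r
# Group B from the second data set, where the minimum age 31 occurs twice
age  <- c(31, 31, 35, 58)
name <- c("Sarah", "Jim", "Fred", "James")

ties  <- which(age == min(age))  # every position holding the minimum
first <- which.min(age)          # only the first such position

name[ties]   # two names, which summarize() cannot fit into a single cell
name[first]  # exactly one name
```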
To get all values, use e.g. paste with an appropriate collapse argument.
    data %>%
      group_by(Group) %>%
      summarize(minAge = min(Age),
                minAgeName = paste(Name[which(Age == min(Age))], collapse = ", "),
                maxAge = max(Age),
                maxAgeName = paste(Name[which(Age == max(Age))], collapse = ", "))
I would actually recommend keeping your data in a "long" format. Here's how I would approach this:
library(dplyr)
Keeping all values when there are ties:
    data %>%
      group_by(Group) %>%
      arrange(Age) %>%  ## optional
      filter(Age %in% range(Age))
    # Source: local data frame [8 x 3]
    # Groups: Group
    #
    #     Name Age Group
    # 1    Sam  22     A
    # 2  Sarah  31     B
    # 3    Jim  31     B
    # 4  James  58     B
    # 5 Andrew  17     C
    # 6  Sally  82     C
    # 7 Mairin  12     D
    # 8    Ray  67     D
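The filter step relies on range() returning the minimum and maximum as a length-two vector, so %in% marks every row that matches either extreme, ties included. A small base R sketch on the group B ages (the names age and keep are illustrative only):

```r
# Group B ages from the second data set; 31 is a tied minimum
age <- c(31, 31, 35, 58)

# range(age) is c(min(age), max(age)); %in% flags every element equal to either
keep <- age %in% range(age)

age[keep]  # both tied minima and the single maximum are kept
```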
Keeping only one value when there are ties:
    data %>%
      group_by(Group) %>%
      arrange(Age) %>%
      slice(if (length(Age) == 1) 1 else c(1, n()))  ## maybe overkill?
    # Source: local data frame [7 x 3]
    # Groups: Group
    #
    #     Name Age Group
    # 1    Sam  22     A
    # 2  Sarah  31     B
    # 3  James  58     B
    # 4 Andrew  17     C
    # 5  Sally  82     C
    # 6 Mairin  12     D
    # 7    Ray  67     D
If you really want a "wide" dataset, the basic concept would be to gather and spread the data, using "tidyr":
    library(dplyr)
    library(tidyr)

    data %>%
      group_by(Group) %>%
      arrange(Age) %>%
      slice(c(1, n())) %>%
      mutate(minmax = c("min", "max")) %>%
      gather(var, val, Name:Age) %>%
      unite(key, minmax, var) %>%
      spread(key, val)
    # Source: local data frame [4 x 5]
    #
    #   Group max_Age max_Name min_Age min_Name
    # 1     A      22      Sam      22      Sam
    # 2     B      58    James      31    Sarah
    # 3     C      82    Sally      17   Andrew
    # 4     D      67      Ray      12   Mairin
Though what wide form you would want with ties is unclear.