Can Summarise in dplyr not drop other columns in my data frame?

Tags:

r

I have a data frame with three columns in it and I am attempting a simple summary to find the maximum temperature for each city in the data frame, but also keep the date listed for each max temperature.

Here is the data frame, which we'll call maxT:

  new.ID       Date   Max_TemperatureF
1     TUS 1960-04-05               87
2     TUS 1984-04-24               86
3     TUS 1972-04-01               75
4     TUS 2006-04-14               91
5     TUS 2000-05-03               96
6     PHX 1960-04-05               93
7     PHX 1984-04-24               93
8     PHX 1972-04-01               84
9     PHX 2006-04-14               91
10    PHX 2000-05-03               99
11    LAS 1960-04-05               91
12    LAS 1984-04-24               86
13    LAS 1972-04-01               81
14    LAS 2006-04-14               81
15    LAS 2000-05-03               98
16    LAX 1960-04-05               72
17    LAX 1984-04-24               69
18    LAX 1972-04-01               73
19    LAX 2006-04-14               63
20    LAX 2000-05-03               69
21    SAC 1960-04-05               82
22    SAC 1984-04-24               75
23    SAC 1972-04-01               64
24    SAC 2006-04-14               71
25    SAC 2000-05-03               81
26    PSP 1960-04-05               98
27    PSP 1984-04-24               91
28    PSP 1972-04-01               91
29    PSP 2006-04-14               81
30    PSP 2000-05-03               99

Each city has 5 temperatures listed, and I would like to find the maximum for each city and also list the date on which it occurred. I am using dplyr and have tried quite a few variations of this code, but Date is always dropped from the result. Is there a way to add a condition like drop=FALSE or something similar?

maxT <- tbl_df(maxT) %>%
  select(new.ID, Date, Max_TemperatureF) %>%
  group_by(new.ID) %>%
  summarise(max_temp = max(Max_TemperatureF))

This is the output I keep getting:

  new.ID max_temp
1    LAS       98
2    LAX       73
3    PHX       99
4    PSP       99
5    SAC       82
6    TUS       96

Thanks.


asked Apr 15 '15 18:04

user3720887


1 Answer

We could use either filter or slice. Both keep whole rows, so Date is not dropped. If there are ties for the maximum 'Max_TemperatureF' within a city and you want to keep all of those rows, use filter:

tbl_df(maxT) %>%
  group_by(new.ID) %>%
  filter(Max_TemperatureF == max(Max_TemperatureF))
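To see how filter behaves with ties, here is a minimal sketch on a small made-up data frame. The name `toy`, the city codes and the values are assumptions for illustration only, not part of the original data. City "AAA" reaches its maximum on two dates, and both rows are kept.

```r
library(dplyr)

# Hypothetical toy data: city "AAA" hits its maximum (90) on two dates.
toy <- data.frame(
  new.ID = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  Date = as.Date(c("1960-04-05", "1984-04-24", "1972-04-01",
                   "1960-04-05", "1984-04-24")),
  Max_TemperatureF = c(90, 90, 75, 80, 85)
)

# filter() keeps entire rows, so Date survives.
# Every row equal to the group maximum is returned, including ties.
ties_kept <- toy %>%
  group_by(new.ID) %>%
  filter(Max_TemperatureF == max(Max_TemperatureF)) %>%
  ungroup()

print(ties_kept)
```

The result here has three rows: two for "AAA" and one for "BBB".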

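Or we can get the index of the maximum row in each group with which.max and subset with slice. which.max returns only the first maximum, so this gives exactly one row per city even when there are ties: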

tbl_df(maxT) %>%
  group_by(new.ID) %>%
  slice(which.max(Max_TemperatureF))
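As a rough illustration of the one-row-per-group behaviour, the sketch below uses a small made-up data frame (`toy`, its city codes and its values are assumptions, not the original data). It also shows `slice_max()`, which is available in dplyr 1.0.0 and later and might be a convenient alternative if your dplyr version has it.

```r
library(dplyr)

# Hypothetical toy data: city "AAA" hits its maximum (90) on two dates.
toy <- data.frame(
  new.ID = c("AAA", "AAA", "AAA", "BBB", "BBB"),
  Date = as.Date(c("1960-04-05", "1984-04-24", "1972-04-01",
                   "1960-04-05", "1984-04-24")),
  Max_TemperatureF = c(90, 90, 75, 80, 85)
)

# which.max() returns the position of the first maximum in each group,
# so slice() keeps exactly one row per city.
first_max <- toy %>%
  group_by(new.ID) %>%
  slice(which.max(Max_TemperatureF)) %>%
  ungroup()

# Assumes dplyr >= 1.0.0: slice_max() with with_ties = FALSE also
# returns a single row per group.
one_per_group <- toy %>%
  group_by(new.ID) %>%
  slice_max(Max_TemperatureF, n = 1, with_ties = FALSE) %>%
  ungroup()

print(first_max)
print(one_per_group)
```

Setting `with_ties = TRUE` (the default) in `slice_max()` keeps tied rows instead, much like the filter approach.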


answered Oct 11 '22 21:10

akrun
