Why are my dplyr group_by & summarize not working properly? (name-collision with plyr)

Tags: r, dplyr, shadowing, plyr, name-collision

I have a data frame that looks like this:

    #df
    ID  DRUG FED  AUC0t  Tmax   Cmax
    1    1     0   100     5      20
    2    1     1   200     6      25
    3    0     1   NA      2      30
    4    0     0   150     6      65

And so on. I want to summarize some statistics on AUC, Tmax and Cmax by drug (DRUG) and fed status (FED). I use dplyr. For example, for the AUC:

    CI90lo <- function(x) quantile(x, probs=0.05, na.rm=TRUE)
    CI90hi <- function(x) quantile(x, probs=0.95, na.rm=TRUE)

    summary <- df %>%
      group_by(DRUG, FED) %>%
      summarize(mean = mean(AUC0t, na.rm=TRUE),
                low  = CI90lo(AUC0t),
                high = CI90hi(AUC0t),
                min  = min(AUC0t, na.rm=TRUE),
                max  = max(AUC0t, na.rm=TRUE),
                sd   = sd(AUC0t, na.rm=TRUE))

However, the output is not grouped by DRUG and FED. It gives only one row containing the statistics for the whole data set, not split by DRUG and FED.

Any idea why, and how can I make it do the right thing?

asked Nov 14 '14 06:11

Amer

2 Answers

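I believe you've loaded plyr after dplyr, which is why you are getting an overall summary instead of a grouped summary. Both packages export summarize(), so the one attached last (plyr's) masks dplyr's, and plyr's version ignores the grouping set by group_by().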

This is what happens with plyr loaded last.

    library(dplyr)
    library(plyr)
    df %>%
      group_by(DRUG, FED) %>%
      summarize(mean = mean(AUC0t, na.rm=TRUE),
                low  = CI90lo(AUC0t),
                high = CI90hi(AUC0t),
                min  = min(AUC0t, na.rm=TRUE),
                max  = max(AUC0t, na.rm=TRUE),
                sd   = sd(AUC0t, na.rm=TRUE))

      mean low high min max sd
    1  150 105  195 100 200 50
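To check which package a bare summarize() call resolves to, base R's find() and environment() can help. This is a minimal sketch, assuming both dplyr and plyr are installed and attached in the order shown; it is a diagnostic aid, not part of the original answer.

```r
library(dplyr)
library(plyr)   # attached last, so its summarize() masks dplyr's

# Search-path entries that define summarize(), in lookup order.
# The first entry is the one a bare summarize() call uses.
hits <- find("summarize")
print(hits)

# Name of the namespace that owns the function R would actually call.
owner <- environmentName(environment(summarize))
print(owner)
```

If the owner turns out to be plyr rather than dplyr, the grouped pipeline is going through plyr's summarize().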

Now remove plyr and try again and you get the grouped summary.

    detach(package:plyr)
    df %>%
      group_by(DRUG, FED) %>%
      summarize(mean = mean(AUC0t, na.rm=TRUE),
                low  = CI90lo(AUC0t),
                high = CI90hi(AUC0t),
                min  = min(AUC0t, na.rm=TRUE),
                max  = max(AUC0t, na.rm=TRUE),
                sd   = sd(AUC0t, na.rm=TRUE))

    Source: local data frame [4 x 8]
    Groups: DRUG

      DRUG FED mean low high min max  sd
    1    0   0  150 150  150 150 150 NaN
    2    0   1  NaN  NA   NA  NA  NA NaN
    3    1   0  100 100  100 100 100 NaN
    4    1   1  200 200  200 200 200 NaN
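If both packages are needed in the same script, another option is to attach plyr first and dplyr second, so that dplyr's summarize() is the one found first. The sketch below assumes a fresh R session; the data frame is a cut-down stand-in for the one in the question and only the mean is computed.

```r
# In a fresh session: plyr first, dplyr second, so dplyr masks plyr.
library(plyr)
library(dplyr)

# Stand-in for the question's data (only the columns used here).
df <- data.frame(
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150)
)

res <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE))

nrow(res)  # one row per DRUG/FED combination (4 here)
```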

answered Oct 12 '22 17:10

aosmith

A variant of aosmith's answer that might help some folks out: tell R explicitly to use dplyr's functions by prefixing them with dplyr::. This is a good trick whenever one package masks functions from another.

    df %>%
      dplyr::group_by(DRUG, FED) %>%
      dplyr::summarize(mean = mean(AUC0t, na.rm=TRUE),
                       low  = CI90lo(AUC0t),
                       high = CI90hi(AUC0t),
                       min  = min(AUC0t, na.rm=TRUE),
                       max  = max(AUC0t, na.rm=TRUE),
                       sd   = sd(AUC0t, na.rm=TRUE))
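A self-contained comparison of the masked and the namespaced call might look like the following. It is only a sketch: the data frame is rebuilt from the rows shown in the question, plyr is deliberately attached last to reproduce the collision, and the names masked and explicit are illustrative.

```r
library(dplyr)
library(plyr)  # attached last on purpose to reproduce the masking

# Rows taken from the question (Tmax and Cmax omitted for brevity).
df <- data.frame(
  ID    = 1:4,
  DRUG  = c(1, 1, 0, 0),
  FED   = c(0, 1, 1, 0),
  AUC0t = c(100, 200, NA, 150)
)

# Bare summarize() resolves to plyr's version, which ignores groups.
masked <- df %>%
  group_by(DRUG, FED) %>%
  summarize(mean = mean(AUC0t, na.rm = TRUE))

# Namespaced calls always use dplyr, whatever the attach order.
explicit <- df %>%
  dplyr::group_by(DRUG, FED) %>%
  dplyr::summarize(mean = mean(AUC0t, na.rm = TRUE))

nrow(masked)    # 1: a single overall summary
nrow(explicit)  # 4: one row per DRUG/FED combination
```

The namespaced form is more verbose, but it keeps working even if another package that exports summarize() is attached later in the script.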

answered Oct 12 '22 18:10

mmann1123
