dplyr::mutate to add multiple values

Tags: r, dplyr

There are a couple of issues about this on the dplyr GitHub repo already, and at least one related SO question, but none of them quite covers my question -- I think.

- "Adding multiple columns in a dplyr mutate call" is more or less what I want, but there's a special-case answer for that case (tidyr::separate) that doesn't (I think) work for me.
- This issue ("summarise or mutate with functions returning multiple values/columns") says "use do()".

Here's my use case: I want to compute exact binomial confidence intervals:

    dd <- data.frame(x=c(3,4),n=c(10,11))
    get_binCI <- function(x,n) {
        rbind(setNames(c(binom.test(x,n)$conf.int),c("lwr","upr")))
    }
    with(dd[1,],get_binCI(x,n))
    ##             lwr       upr
    ## [1,] 0.06673951 0.6524529

I can get this done with do() but I wonder if there's a more expressive way to do this (it feels like mutate() could have a .n argument as is being discussed for summarise() ...)

    library("dplyr")
    dd %>% group_by(x,n) %>%
        do(cbind(.,get_binCI(.$x,.$n)))

    ## Source: local data frame [2 x 4]
    ## Groups: x, n
    ##
    ##   x  n        lwr       upr
    ## 1 3 10 0.06673951 0.6524529
    ## 2 4 11 0.10926344 0.6920953

556

asked Apr 13 '15 20:04

Ben Bolker

2 Answers

Yet another variant, although I think we're all splitting hairs here.

    > dd <- data.frame(x=c(3,4),n=c(10,11))
    > get_binCI <- function(x,n) {
    +   as_data_frame(setNames(as.list(binom.test(x,n)$conf.int),c("lwr","upr")))
    + }
    >
    > dd %>%
    +   group_by(x,n) %>%
    +   do(get_binCI(.$x,.$n))
    Source: local data frame [2 x 4]
    Groups: x, n

      x  n        lwr       upr
    1 3 10 0.06673951 0.6524529
    2 4 11 0.10926344 0.6920953

Personally, if we're just going by readability, I find this preferable:

    foo  <- function(x,n){
        bi <- binom.test(x,n)$conf.int
        data_frame(lwr = bi[1],
                   upr = bi[2])
    }

    dd %>%
        group_by(x,n) %>%
        do(foo(.$x,.$n))

...but now we're really splitting hairs.
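A note for readers on newer package versions: data_frame() and as_data_frame() have since been deprecated in favour of tibble() and as_tibble(), and from dplyr 1.0.0 onwards mutate() accepts an unnamed expression that returns a data frame and turns its columns into new columns. The following is only a sketch under that assumption (dplyr >= 1.0.0 installed); the helper name get_binCI_tbl is made up for illustration and does not appear in the original posts.

```r
library(dplyr)  # assumption: dplyr >= 1.0.0

dd <- data.frame(x = c(3, 4), n = c(10, 11))

# Hypothetical helper: return the interval as a one-row tibble with named columns.
get_binCI_tbl <- function(x, n) {
  ci <- binom.test(x, n)$conf.int
  tibble(lwr = ci[1], upr = ci[2])
}

res <- dd %>%
  rowwise() %>%                    # binom.test() takes one (x, n) pair at a time
  mutate(get_binCI_tbl(x, n)) %>%  # unnamed data-frame result becomes columns lwr and upr
  ungroup()

res
```

Compared with do(), this keeps the call inside mutate() and avoids the .$x / .$n pronoun, at the cost of requiring a reasonably recent dplyr.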

164

answered Oct 14 '22 16:10

joran

Yet another option could be to use the purrr::map family of functions.

If you replace rbind with dplyr::bind_rows in the get_binCI function:

    library(tidyverse)

    dd <- data.frame(x = c(3, 4), n = c(10, 11))
    get_binCI <- function(x, n) {
      bind_rows(setNames(c(binom.test(x, n)$conf.int), c("lwr", "upr")))
    }

You can use purrr::map2 with tidyr::unnest:

    dd %>%
      mutate(result = map2(x, n, get_binCI)) %>%
      unnest(result)  # name the list-column; a bare unnest() warns in tidyr >= 1.0.0

    #>   x  n        lwr       upr
    #> 1 3 10 0.06673951 0.6524529
    #> 2 4 11 0.10926344 0.6920953

Or purrr::map2_dfr with dplyr::bind_cols:

    dd %>%
      bind_cols(map2_dfr(.$x, .$n, get_binCI))

    #>   x  n        lwr       upr
    #> 1 3 10 0.06673951 0.6524529
    #> 2 4 11 0.10926344 0.6920953
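For comparison, the same "map over pairs, then bind the results as columns" idea can be sketched in base R with no packages at all. This is not part of the purrr approach above, just an illustration of what map2_dfr() plus bind_cols() is doing; here get_binCI is redefined to return a plain named vector rather than a data frame.

```r
dd <- data.frame(x = c(3, 4), n = c(10, 11))

# Variant of get_binCI that returns a named numeric vector.
get_binCI <- function(x, n) {
  setNames(c(binom.test(x, n)$conf.int), c("lwr", "upr"))
}

# mapply() calls get_binCI() once per (x, n) pair and simplifies the results
# to a matrix with rows "lwr" and "upr"; t() gives one row per input row.
ci <- t(mapply(get_binCI, dd$x, dd$n))

# cbind() on a data frame and a matrix with column names yields a data frame.
res <- cbind(dd, ci)
res
```

The result has the same four columns (x, n, lwr, upr) as the purrr versions.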

answered Oct 14 '22 17:10

markdly
