
How do I see the help for the `dplyr::collect` method?

Tags: r, dplyr

I am trying to find out what additional arguments can be passed to dplyr::collect through the ellipsis (...). I want to do this because I believe that the behaviour of collect changed between dplyr versions 0.4.3 and 0.5. It seems that in the new version collect() only downloads the first 100k rows unless the new argument n = Inf is passed.

I have retrieved the methods associated with collect using:

> methods('collect')
[1] collect.data.frame* collect.tbl_sql*   
see '?methods' for accessing help and source code

I have looked at the help file for S3 methods but cannot work out how to get help on collect.tbl_sql, as ?"dplyr::collect.tbl_sql" does not work.

Alex asked Oct 30 '22 19:10


1 Answer

As noted by Chrisss and Zheyuan Li:

  1. The asterisk (*) next to a method name in the output of methods() indicates that the method is not exported from the dplyr namespace.
  2. To access the help file, one then needs to use three colons, i.e., ?dplyr:::collect.tbl_sql
  3. However, there is no help file for these methods, so we need to examine the source code to see how each function behaves in the respective versions.
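The same steps can be tried without dplyr installed. The sketch below uses stats::t.test.default purely as a stand-in (my choice, not part of the original question), since it is likewise an S3 method that is registered but not exported. It shows two ways to get hold of such a method and how to list the arguments it accepts.

```r
# t.test.default is a non-exported S3 method in the stats package,
# used here only as a stand-in for dplyr:::collect.tbl_sql.
m <- getS3method("t.test", "default")

# The triple-colon operator reaches the same function object.
same_fn <- identical(m, stats:::t.test.default)

# formals() exposes the argument list, including any defaults.
arg_names <- names(formals(m))
print(arg_names)

# Printing the function shows its source, as done for collect.tbl_sql.
print(m)
```

Applied to the question, names(formals(dplyr:::collect.tbl_sql)) should list the arguments of the installed dplyr version, assuming the method is still defined in that version.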

In 0.4.3, examining the tbl-sql.r file in the source code shows:

collect.tbl_sql <- function(x, ...) {
  grouped_df(x$query$fetch(), groups(x))
}

and in 0.5:

> dplyr:::collect.tbl_sql

function (x, ..., n = 1e+05, warn_incomplete = TRUE) 
{
    assert_that(length(n) == 1, n > 0L)
    if (n == Inf) {
        n <- -1
    }
    sql <- sql_render(x)
    res <- dbSendQuery(x$src$con, sql)
    on.exit(dbClearResult(res))
    out <- dbFetch(res, n)
    if (warn_incomplete) {
        res_warn_incomplete(res, "n = Inf")
    }
    grouped_df(out, groups(x))
}

Thus, we can conclude that the behaviour of collect has indeed changed in the manner described in my question: 0.4.3 fetches every row of the query, whereas 0.5 fetches at most n = 1e5 rows unless n = Inf is passed.
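The effect of the n argument can be mimicked in base R without a database. This is only a toy sketch: toy_collect is a hypothetical function that copies the n handling from the 0.5 source above and uses a plain data frame in place of the query result. It is not dplyr's implementation.

```r
# Hypothetical stand-in for the n handling in the 0.5 collect.tbl_sql.
toy_collect <- function(rows, n = 1e5) {
  stopifnot(length(n) == 1, n > 0)
  # As in the 0.5 source, Inf is translated to -1, meaning "fetch everything".
  if (n == Inf) {
    n <- -1
  }
  if (n < 0) rows else head(rows, n)
}

# A table larger than the default cap of 100k rows.
rows <- data.frame(id = seq_len(250000))

capped <- toy_collect(rows)           # uses the default n = 1e5
full   <- toy_collect(rows, n = Inf)  # asks for all rows

print(nrow(capped))
print(nrow(full))
```

With a real tbl_sql under dplyr 0.5, the corresponding call would be collect(x, n = Inf), as described in the question.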

Alex answered Nov 14 '22 03:11