
Finding What You Need in R: focused searching within R and all (3,500+) CRAN Packages

Often in R, there are a dozen functions scattered across as many packages--all of which have the same purpose but of course differ in accuracy, performance, documentation, theoretical rigor, and so on.

How do you locate these--from within R and even from among the CRAN Packages which you have not installed?

So for instance: the generic plot function. Setting secondary ticks is much easier using a function outside of the base package:

minor.tick(nx=n, ny=n, tick.ratio=n)

Of course, plot is in R core, but minor.tick is not; it is in the Hmisc package.

Of course, that doesn't show up in the documentation for plot, nor should you expect it to.

Another example: the data-input arguments to plot can be supplied by an object returned from the function hexbin. Again, this function comes from a package outside of R core.

What would be great, obviously, is a programmatic way to gather these function arguments from the various packages and put them in a single namespace.

*edit: (trying to restate my example just above more clearly:) the arguments to plot supplied in R core for setting the axis tick frequency are xaxp/yaxp; however, one can also set the tick frequency via a function outside of the base package, as with the minor.tick function from the Hmisc package--but you wouldn't know that just from looking at the plot method signature. Is there a meta function in R for this?*
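To make the example concrete, here is a minimal sketch of the two approaches side by side. It assumes the Hmisc package is installed; the helper name draw_with_minor_ticks and the particular values passed to xaxp and tick.ratio are only illustrative.

```r
# Sketch: base plot() plus minor ticks from Hmisc.
# Assumption: Hmisc is installed; if it is not, only the base plot is drawn.
draw_with_minor_ticks <- function(x, y, n_minor = 4) {
  # Major x-axis ticks set through the base graphical parameter xaxp.
  plot(x, y, xaxp = c(0, 10, 5))
  if (requireNamespace("Hmisc", quietly = TRUE)) {
    # Minor ticks added by a function that lives outside R core.
    Hmisc::minor.tick(nx = n_minor, ny = n_minor, tick.ratio = 0.5)
    return(TRUE)
  }
  FALSE
}

x <- seq(0, 10, by = 0.5)
ok <- draw_with_minor_ticks(x, sin(x))
```

Nothing in the signature or help page of plot points to minor.tick, which is the discovery problem described above.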

So far, as I come across them, I've been gathering them manually, each set in a single TextMate snippet (along with the attendant library imports). This isn't that difficult or time consuming, but I can only update my snippet as I find out about these additional arguments/parameters. Is there a canonical R way to do this, or at least an easier way?

Just in case that wasn't clear, I am not talking about the case where multiple packages provide functions directed to the same statistic or view (e.g., 'boxplot' in the base package; 'boxplot.matrix' in gplots; and 'bplots' in Rlab). What I am talking about is the case in which the function name is the same across two or more packages.

doug asked Nov 28 '09 14:11



1 Answer

The "sos" package is an excellent resource. Its primary interface is the "findFn" command, which accepts a string (your search term), scans the "function" entries in Jonathan Baron's site search database, and returns the entries that contain the search term in a data frame (of class "findFn").

The columns of this data frame are: Count, MaxScore, TotalScore, Package, Function, Date, Score, Description, and Link. Clicking on "Link" in any entry's row will immediately pull up the help page.

An example: suppose you wanted to find all convolution filters across all 1800+ R packages.

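library(sos)                # install.packages("sos") first if it is not installed
cf <- findFn("convolve")    # search the "function" entries for the term "convolve"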

This query looks for the term "convolve" in those entries; in other words, the term doesn't have to be the function name.
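For contrast, base R can do purely name-based lookups, but only among packages that are already installed and attached. A minimal sketch using apropos and find (both in the utils package); help.search, or its shorthand ??, plays a similar role for the help pages of installed packages:

```r
# Names of objects on the search path that match a pattern
# (the pattern is treated as a regular expression).
matches <- apropos("convolve")

# Which attached package(s) hold an object with exactly this name.
where <- find("convolve")

print(matches)
print(where)
```

Neither call sees packages that are not installed, which is the gap findFn fills.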

Keying in "cf" returns an HTML table of all matches found (23 in this case). This table is an HTML rendering of the data frame I mentioned just above. What is particularly convenient is that each column ("Count", "MaxScore", etc.) is sortable by clicking on the column header, so you can view the results by "Score", by "Package", etc.
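Since the result is a data frame, the same ordering can also be done in code. The sketch below uses a small made-up stand-in (cf_mock, with hypothetical package and function names and invented scores) so that it runs without a network connection; only the column names are taken from the description above, and it assumes that ordinary data-frame subsetting applies to a real findFn result in the same way.

```r
# Hypothetical stand-in for a findFn() result. The rows are invented for
# illustration; only the column names come from the description above.
cf_mock <- data.frame(
  Package  = c("pkgA", "pkgB", "pkgC"),
  Function = c("fnA", "fnB", "fnC"),
  Score    = c(12, 40, 7),
  stringsAsFactors = FALSE
)

# Sort by descending Score, as clicking the "Score" header would.
by_score <- cf_mock[order(-cf_mock$Score), ]

# Keep only the columns needed to decide which package to look at.
top <- by_score[, c("Package", "Function", "Score")]
print(top)
```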

(As an aside: when running that exact query, one of the results was the function "panel.tskernel" in a package called "latticeExtra". I was not aware this package had any time series filters in it, and I doubt I would have discovered it otherwise.)

doug answered Oct 13 '22 00:10
