One nice feature of R, related to its inherently vectorized nature, is the recycling rule described in Section 2.2 of An Introduction to R:
Vectors occurring in the same expression need not all be of the same length. If they are not, the value of the expression is a vector with the same length as the longest vector which occurs in the expression. Shorter vectors in the expression are recycled as often as need be (perhaps fractionally) until they match the length of the longest vector. In particular a constant is simply repeated.
Most standard functions use this, but the code that does so is buried in the underlying C code.
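As a quick illustration of the rule quoted above, here is a minimal sketch using plain arithmetic. The variable names are only for illustration; the behavior shown is that of base R's arithmetic operators, which also emit a warning when the longer length is not a multiple of the shorter one.

```r
# Whole-number recycling: 1:2 is reused three times to match 1:6.
# 1+1, 2+2, 3+1, 4+2, 5+1, 6+2
x <- 1:6 + 1:2
print(x)  # 2 4 4 6 6 8

# A length-one vector (a "constant") is simply repeated.
y <- 1:3 * 10
print(y)  # 10 20 30

# Fractional recycling: 1:2 is reused two and a half times.
# Arithmetic operators still return a result but warn about the mismatch,
# so the warning is suppressed here to keep the output clean.
# 1+1, 2+2, 3+1, 4+2, 5+1
z <- suppressWarnings(1:5 + 1:2)
print(z)  # 2 4 4 6 6
```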
Is there a canonical way to implement the standard recycling rules for a function entirely in R code? That is, given a function like
    mock <- function(a, b, c) {
      # turn a, b, and c into appropriate recycled versions
      # do something with recycled a, b, and c in some appropriately vectorized way
    }
where a, b, and c are vectors, possibly of different lengths and unknown types/classes, is there a canonical way to get a new set of vectors which are recycled according to the standard recycling rules? In particular, I can't assume that the "do something" step will do the proper recycling itself, so I need to do it myself beforehand.
I've used this in the past:

    expand_args <- function(...) {
      dots <- list(...)
      max_length <- max(sapply(dots, length))
      lapply(dots, rep, length.out = max_length)
    }
I'd likely use the length.out argument of rep() to do most of the real work.
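To make that concrete, here is a small sketch of how rep() behaves with length.out. The vector `short` and the result names are made up for illustration. Note that, unlike the arithmetic operators, rep() cuts the last pass short silently when the target length is not a whole multiple.

```r
short <- c("x", "y", "z")

# Whole-number recycling: 3 elements stretched to 6.
whole <- rep(short, length.out = 6)
print(whole)       # "x" "y" "z" "x" "y" "z"

# Fractional recycling: the last pass stops partway, with no warning.
fractional <- rep(short, length.out = 5)
print(fractional)  # "x" "y" "z" "x" "y"

# length.out shorter than the input truncates instead of recycling.
truncated <- rep(short, length.out = 2)
print(truncated)   # "x" "y"
```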
Here's an example that creates a better.data.frame() function (it should really be called "better".data.frame()), which places no restrictions on the lengths of the vectors it's handed as arguments. In this case, I recycle all of the vectors to the length of the longest one, but you can obviously adapt this to serve your own recycling needs!
    better.data.frame <- function(...) {
      cols <- list(...)
      names(cols) <- sapply(as.list(match.call()), deparse)[-1]

      # Find the length of the longest vector
      # and then recycle all columns to that length.
      n <- max(lengths(cols))
      cols <- lapply(cols, rep, length.out = n)

      as.data.frame(cols)
    }

    # Try it out
    a <- Sys.Date() + 0:9
    b <- 1:3
    c <- letters[1:4]

    data.frame(a,b,c)
    # Error in data.frame(a, b, c) :
    #   arguments imply differing number of rows: 10, 3, 4

    better.data.frame(a,b,c)
    #             a b c
    # 1  2012-02-17 1 a
    # 2  2012-02-18 2 b
    # 3  2012-02-19 3 c
    # 4  2012-02-20 1 d
    # 5  2012-02-21 2 a
    # 6  2012-02-22 3 b
    # 7  2012-02-23 1 c
    # 8  2012-02-24 2 d
    # 9  2012-02-25 3 a
    # 10 2012-02-26 1 b
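If you want behavior closer to what the arithmetic operators do, one possible adaptation is sketched below. The helper name `recycle_args` is hypothetical, and the two extra rules it applies are my assumptions about what "standard" should mean here: any zero-length argument makes every result zero-length (as in `1:3 + numeric(0)`), and fractional recycling triggers a warning instead of passing silently.

```r
# Hypothetical helper: recycle all arguments to a common length,
# mimicking the conventions of R's arithmetic operators.
recycle_args <- function(...) {
  args <- list(...)
  lens <- lengths(args)

  # If there are no arguments, or any argument is empty, the common length is 0.
  n <- if (length(args) == 0L || any(lens == 0L)) 0L else max(lens)

  # Warn when the longest length is not a whole multiple of a shorter one.
  # The n > 0L guard ensures we never take a remainder by a zero length.
  if (n > 0L && any(n %% lens != 0L)) {
    warning("longer argument length is not a multiple of shorter argument length")
  }

  # rep() does the actual recycling; argument names are preserved by lapply().
  lapply(args, rep, length.out = n)
}

# Whole-multiple case: everything is stretched to length 6.
res <- recycle_args(a = 1:6, b = c("u", "v"), c = TRUE)
str(res)

# A zero-length argument collapses every result to length 0.
empty <- recycle_args(1:3, numeric(0))
```

Whether the zero-length rule and the warning are desirable depends on what the "do something" step expects, so treat both as options rather than requirements.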