I came across this post: http://r.789695.n4.nabble.com/speeding-up-perception-tp3640920p3646694.html from Matt Dowle, discussing some early(?) implementation ideas for the data.table package.
He uses the following code:
x = list(a = 1:10000, b = 1:10000)
class(x) = "newclass"
"[<-.newclass" = function(x,i,j,value) x # i.e. do nothing
tracemem(x)
x[1, 2] = 42L
Specifically I am looking at:
"[<-.newclass" = function(x,i,j,value) x
I am trying to understand what is done there and how I could use this notation.
It looks to me like:
My best guess would therefore be that I define a custom function for in-place modification (for a given class):
[<-.newclass is in-place modification for class newclass.
Understanding what happens: Usually the following code should return an error:
x = list(a = 1:10000, b = 1:10000)
x[1, 2] = 42L
so I guess the sample code does not have any practical use.
Attempt to use the logic:
A simple nonsense attempt would be to square the value to be inserted:
x[i, j] <- value^2
Full try:
> x = matrix(1:9, 3, 3)
> class(x) = "newclass"
> "[<-.newclass" = function(x, i, j, value) x[i, j] <- value^2 # i.e. do something
> x[1, 2] = 9
Error: C stack usage 19923536 is too close to the limit
This doesn't seem to work.
My question(s):
"[<-.newclass" = function(x,i,j,value) x
How exactly does this notation work and how would I use it?
(I added the data.table tag since the linked discussion is about "by-reference" in-place modification in data.table, I think.)
The `[<-`() function is (traditionally) used for subassignment and is, more broadly, a type of replacement function. It is also generic (more specifically, an internal generic), which allows you to write custom methods for it, as you correctly surmised.
In general, when you call a replacement function, such as ...
foo(x) <- bar(y)
... the expression on the right-hand side of <- (so here bar(y)) gets passed as a named value argument to `foo<-`() with x as the first argument, and the object x is reassigned with the result. That is, the call is equivalent to writing:
x <- `foo<-`(x, value = bar(y))
So in order to work at all, every replacement function must take at least two arguments, one of which must be named value.
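To make the equivalence concrete, here is a minimal sketch. The function `second<-` is a hypothetical name made up for illustration (it is not part of base R); it simply replaces the second element of a vector.

```r
# `second<-` is a hypothetical replacement function, used only for illustration.
`second<-` <- function(x, value) {
  x[2] <- value  # modify the local copy of x
  x              # a replacement function must return the modified object
}

v <- 1:5
second(v) <- 10L                  # the usual "sugar" form

w <- 1:5
w <- `second<-`(w, value = 10L)   # the explicit functional form

identical(v, w)
#> [1] TRUE
```

Both forms leave the vector as 1, 10, 3, 4, 5; the first is just shorthand for the second.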
Most replacement functions have only these two arguments, but there are exceptions, such as `attr<-` and, typically, subassignment.
When you have a subassignment call like x[i, j] <- y, i and j get passed as additional arguments to the `[<-`() function, with x and y as the first and value arguments, respectively:
x <- `[<-`(x, i, j, value = y) # x[i, j] <- y
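As a quick check of that expansion on a plain matrix (no custom class involved, so the built-in default method is used), something like the following should hold. The names m1 and m2 are just illustrative:

```r
m1 <- matrix(1:9, 3, 3)
m2 <- matrix(1:9, 3, 3)

m1[1, 2] <- 42L                      # ordinary subassignment syntax
m2 <- `[<-`(m2, 1, 2, value = 42L)   # the same operation in functional form

identical(m1, m2)
#> [1] TRUE
```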
In the case of a matrix or a data.frame, i and j would be used for selecting rows and columns; but in general, this does not need to be the case. A method for a custom class could do anything with the arguments. Consider this example:
x <- matrix(1:9, 3, 3)
class(x) <- "newclass"
`[<-.newclass` <- function(x, y, z, value) {
  x + (y - z) * value # absolute nonsense
}
x[1, 2] <- 9
x
#>      [,1] [,2] [,3]
#> [1,]   -8   -5   -2
#> [2,]   -7   -4   -1
#> [3,]   -6   -3    0
#> attr(,"class")
#> [1] "newclass"
Is this useful or reasonable? Probably not. But is it valid R code? Absolutely!
It's less common to see custom subassignment methods in real applications, as `[<-`() usually "just works" as you might expect it to, based on the underlying object of your class. A notable exception is `[<-.data.frame`, where the underlying object is a list, but subassignment behaves matrix-like. (On the other hand, many classes do need a custom subsetting method, as the default `[`() method drops most attributes, including the class attribute; see ?`[` for details.)
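To illustrate the parenthetical remark about subsetting, here is a small sketch. The `[.newclass` method below is an assumption about how one might restore the class, not something taken from the linked discussion:

```r
x <- matrix(1:9, 3, 3)
class(x) <- "newclass"

# With no `[.newclass` method defined, the default method is used:
# it keeps dim (and names/dimnames) but drops the class attribute.
inherits(x[1:2, 1:2], "newclass")
#> [1] FALSE

# A minimal subsetting method that delegates to the default
# and then puts the class back on the result.
`[.newclass` <- function(x, ...) {
  out <- NextMethod()
  class(out) <- oldClass(x)
  out
}

inherits(x[1:2, 1:2], "newclass")
#> [1] TRUE
```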
As to why your example doesn't work: remember that you are writing a method for a generic function, and all the regular rules apply. If we use the functional form of `[<-`() and expand the method dispatch in your example, we can see immediately why it fails:
`[<-.newclass` <- function(x, i, j, value) {
  x <- `[<-.newclass`(x, i, j, value = value^2) # x[i, j] <- value^2
}
That is, the function was defined recursively, without a base case, resulting in infinite recursion (hence the C stack error). One way to get around this would be to unclass(x) before calling the next method:
`[<-.newclass` <- function(x, i, j, value) {
  x <- unclass(x)
  x[i, j] <- value^2
  x # typically you would also add the class back here
}
(Or, using a somewhat more advanced technique, the body could also be replaced with an explicit next method like this: NextMethod(value = value^2). This plays nicer with inheritance and superclasses.)
And just to verify that it works:
x <- matrix(1:9, 3, 3)
class(x) <- "newclass"
x[1, 2] <- 9
x
#>      [,1] [,2] [,3]
#> [1,]    1   81    7
#> [2,]    2    5    8
#> [3,]    3    6    9
Perfectly confusing!
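For completeness, the NextMethod() variant mentioned in the parenthetical above could be sketched roughly as follows. Unlike the unclass() version, the object is never unclassed here, so the result should keep its "newclass" class:

```r
# Delegate to the next method (here the built-in default),
# replacing the value argument with its square.
`[<-.newclass` <- function(x, i, j, value) {
  NextMethod(value = value^2)
}

x <- matrix(1:9, 3, 3)
class(x) <- "newclass"
x[1, 2] <- 9

unclass(x)[1, 2]        # the squared value, 81, is what gets stored
inherits(x, "newclass") # the class attribute is still there
```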
As for the context of Dowle's "do nothing" subassignment example, I believe this was to illustrate that back in R 2.13.0, a custom subassignment method would always cause a deep copy of the object to be made, even if the method itself did nothing at all. (This is no longer the case, since R 3.1.0 I believe.)
Created on 2018-08-15 by the reprex package (v0.2.0).