Suppose I have an S4 class with two slots. I then create a method that sets one of the slots to something and returns the result. Will the other slot also be copied on assignment?
For example,
setClass('foo', representation(first.slot = 'numeric', second.slot = 'numeric'))
setGeneric('setFirstSlot', function(object, value) {standardGeneric('setFirstSlot')})
setMethod('setFirstSlot', signature('foo', 'numeric'), function(object, value) {
object@first.slot = value
return(object)
})
f <- new('foo')
f@second.slot <- 2
f <- setFirstSlot(f, 1)
On the last line, will the values of both the first and second slot be copied, or will there be some sort of optimization? I have a class with one field holding a gigabyte of data and a few fields holding small numeric vectors. I'd like a setter function for the numeric fields that doesn't waste time needlessly copying the large field every time it's used.
Thanks :)
R has two main systems of OOP. S3 classes let you overload functions through generic dispatch. S4 classes let you constrain the data stored in each slot, which makes errors easier to catch when debugging a program.
The S4 system in R is a system for object-oriented programming. Confusingly, R has support for at least 3 different systems for object-oriented programming: S3, S4 and reference classes (informally known as R5).
Description. The S3 and S4 software in R are two generations implementing functional object-oriented programming. S3 is the original, simpler for initial programming but less general, less formal and less open to validation. The S4 formal methods and classes provide these features but require more programming.
If you are copying large amounts of data in a field, one solution is to use a reference class. Let's compare the reference classes to S4.
## Store timing output
m = matrix(0, ncol=4, nrow=6)
Create a reference class definition:
foo_ref = setRefClass("test", fields = list(x = "numeric", y = "numeric"))
Then time data assignment:
## Reference class
g = function(x) {x$x[1] = 1; return(x)}
for(i in 6:8){
f = foo_ref$new(x = 1, y = 1)
y = runif(10^i)
t1 = system.time({f$y <- y})[3]
t2 = system.time({f$y[1] = 1})[3]
t3 = system.time({f$x = 1})[3]
t4 = system.time({g(f)})[3]
m[i-5, ] = c(t1, t2, t3, t4)
}
We can repeat for a similar S4 structure:
g = function(x) {x@y[1] = 1; return(x)}
setClass('foo_s4', representation(x = 'numeric', y = 'numeric'))
for(i in 6:8){
f = new('foo_s4'); f@x = 1; f@y = 1
y = runif(10^i)
t1 = system.time({f@y <- y})[3]
t2 = system.time({f@y[1] <- 1})[3]
t3 = system.time({f@x = 1})[3]
t4 = system.time({g(f)})[3]
m[i-2, ] = c(t1, t2, t3, t4)
}
Assignment using a reference class structure for large data sets is much more efficient when dealing with functions.
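The difference comes from reference semantics: a reference class object is modified in place, while a function that changes an S4 object works on a local copy that has to be returned and reassigned. A minimal sketch of that behaviour, using the hypothetical names acc_ref, acc_s4, set_x_ref and set_x_s4, which are not part of the benchmark above:

```r
library(methods)

# Hypothetical classes, for illustration only
acc_ref <- setRefClass("acc_ref", fields = list(x = "numeric", y = "numeric"))
setClass("acc_s4", representation(x = "numeric", y = "numeric"))

# The reference class setter modifies the caller's object directly
set_x_ref <- function(obj, value) { obj$x <- value; invisible(obj) }
# The S4 setter modifies a local copy and must return it
set_x_s4 <- function(obj, value) { obj@x <- value; obj }

r <- acc_ref$new(x = 0, y = runif(10))
s <- new("acc_s4", x = 0, y = runif(10))

set_x_ref(r, 1)             # r itself is changed
invisible(set_x_s4(s, 1))   # result discarded, so s is unchanged
x_after_discard <- s@x      # still 0

s <- set_x_s4(s, 1)         # S4 needs the result to be reassigned
```

After running this, r$x and s@x are both 1, but only the S4 version needed the object to be handed back to the caller.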
The t3 timings (assigning the small slot x) for S4 objects were also higher than for the reference class.
When the class is used by the developer (who knows the design of the class), using the assignment operator @<- instead of a setter method such as setFirstSlot defined in the question may be better. The reason is that the former avoids returning the whole object.
However, setter methods are desirable to prevent users from trying assignments that do not match the definition of the slot in the class. If we use @<- to assign a character to the slot x (which was defined as numeric), an error is returned.
setClass('foo', representation(x = 'numeric', y = 'numeric'))
f <- new('foo')
f@x <- 1 # this is ok
f@y <- 2 # this is ok
f@x <- "a"
#Error in checkAtAssignment("foo", "x", "character") :
# assignment of an object of class “character” is not valid for @‘x’ in an object of class “foo”; is(value, "numeric") is not TRUE
But imagine a situation where the slot should contain only one element. This requirement on the length of the slot is not caught by @<-:
# this assignment is allowed
f@x <- c(1, 2, 3, 4)
f@x
#[1] 1 2 3 4
In this situation we would like to define a setter method in order to inform the user about further restrictions in the definition of the slot. But then, we have to return the entire object and this may be an extra burden if the object is big.
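A setter of that kind could look like the sketch below. The class name foo_checked and the generic setX are hypothetical, chosen so that they do not clash with the foo class used in this answer. The length check runs before the slot is touched, and the modified object is returned:

```r
library(methods)

# Hypothetical class and generic names, for illustration only
setClass("foo_checked", representation(x = "numeric", y = "numeric"))

setGeneric("setX", function(object, value) standardGeneric("setX"))
setMethod("setX", signature("foo_checked", "numeric"), function(object, value) {
  # Extra restriction that the slot definition itself cannot express
  if (length(value) != 1)
    stop("slot ", sQuote("x"), " must be of length 1")
  object@x <- value
  object  # the whole object is returned to the caller
})

h <- new("foo_checked", x = 1, y = 2)
h <- setX(h, 5)

# A vector of length 3 is rejected; the error message is captured here
res <- tryCatch(setX(h, c(1, 2, 3)), error = function(e) conditionMessage(e))
```

The caller has to write h <- setX(h, 5), which is the return-the-whole-object pattern discussed above.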
As far as I know there is no way to define the length of a slot in its definition. A validity method can be registered with setValidity in order to check this or other requirements in the slots, but it seems that @<- does not rely on validObject, so the assignment f@x <- c(1, 2, 3, 4) would be allowed even if we define setValidity:
valid.foo <- function(object)
{
if (length(object@x) > 1)
stop("slot ", sQuote("x"), " must be of length 1")
}
setValidity("foo", valid.foo)
# no error is detected and the assignment is allowed
f@x <- c(1, 4, 6)
f@x
#[1] 1 4 6
# we need to call "validObject" to check if everything is correct
validObject(f)
#Error in validityMethod(object) : slot ‘x’ must be of length 1
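One way to make the validity check run on every assignment is to wrap the slot assignment in a replacement method that calls validObject itself. The following is only a sketch under that assumption; the class name foo_valid and the replacement generic x<- are hypothetical. The validity function here returns TRUE or a message string, which is the form validObject expects:

```r
library(methods)

# Hypothetical class with the validity function supplied at definition time
setClass("foo_valid", representation(x = "numeric", y = "numeric"),
         validity = function(object) {
           if (length(object@x) > 1) "slot 'x' must be of length 1" else TRUE
         })

# Replacement generic: allows the syntax x(obj) <- value
setGeneric("x<-", function(object, value) standardGeneric("x<-"))
setReplaceMethod("x", "foo_valid", function(object, value) {
  object@x <- value
  validObject(object)  # raises an error if the validity function objects
  object
})

v <- new("foo_valid", x = 1, y = 2)
x(v) <- 6

# The invalid assignment fails and v keeps its previous value
msg <- tryCatch({ x(v) <- c(1, 4, 6); "no error" },
                error = function(e) conditionMessage(e))
```

Direct use of @<- still bypasses the check, so this only helps when users go through x(v) <- value.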
A possible solution is to modify the object in-place. The method set.x.inplace below is based on this approach.
setGeneric("set.x.inplace", function(object, val){ standardGeneric("set.x.inplace") })
setMethod("set.x.inplace", "foo", function(object, val)
{
if (length(val) == 1) {
eval(eval(substitute(expression(object@x <<- val))))
} else
stop("slot ", sQuote("x"), " must be of length 1")
#return(object) # not necessary
})
set.x.inplace(f, 6)
f
#An object of class "foo"
#Slot "x":
#[1] 6
#Slot "y":
#[1] 2
# the assignment is not allowed
set.x.inplace(f, c(1,2,3))
#Error in set.x.inplace(f, c(1, 2, 3)) : slot ‘x’ must be of length 1
As this method does not perform a return operation, it can be a good alternative with objects of large size.
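The eval/substitute approach appears to depend on the object being visible from the method's enclosing environment. A reference class gives the same "validate, then modify in place" behaviour without that trick. A minimal sketch, assuming a hypothetical class foo_rc with a hypothetical method set_x:

```r
library(methods)

# Hypothetical reference class; set_x validates and then updates the field
foo_rc <- setRefClass("foo_rc",
  fields = list(x = "numeric", y = "numeric"),
  methods = list(
    set_x = function(val) {
      if (length(val) != 1)
        stop("field ", sQuote("x"), " must be of length 1")
      x <<- val          # modifies the field of this object in place
      invisible(.self)
    }
  ))

k <- foo_rc$new(x = 1, y = 2)
k$set_x(6)               # no reassignment of k is needed

# The invalid value is rejected and the field keeps its previous value
err <- tryCatch(k$set_x(c(1, 2, 3)), error = function(e) conditionMessage(e))
```

Here k$x is 6 after the first call, and the second call fails with the length message.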