I am having trouble figuring out the proper way to define the [, $, and [[ subset operators for an S4 class.
Can anyone provide me with a basic example of defining these three for an S4 class?
Discover the generic so that we know what we are aiming for
> getGeneric("[")
standardGeneric for "[" defined from package "base"
function (x, i, j, ..., drop = TRUE)
standardGeneric("[", .Primitive("["))
<bytecode: 0x32e25c8>
<environment: 0x32d7a50>
Methods may be defined for arguments: x, i, j, drop
Use showMethods("[") for currently available ones.
Define a simple class
setClass("A", representation=representation(slt="numeric"))
and implement a method
setMethod("[", c("A", "integer", "missing", "ANY"),
    ## we won't support subsetting on j; dispatching on 'drop' doesn't
    ## make sense (to me), so in rebellion we'll quietly ignore it.
    function(x, i, j, ..., drop=TRUE)
{
    ## less clever: update slot, return instance
    ## x@slt = x@slt[i]
    ## x
    ## clever: by default initialize is a copy constructor, too
    initialize(x, slt=x@slt[i])
})
In action:
> a = new("A", slt=1:5)
> a[3:1]
An object of class "A"
Slot "slt":
[1] 3 2 1
There are different strategies for supporting the (implicitly) many signatures; for instance, you'd likely also want to support logical and character index values, possibly for both i and j. The most straightforward is a "facade" pattern where each method does some preliminary coercion to a common type of subset index, e.g., integer to allow for re-ordering and repetition of index entries, and then uses callGeneric to invoke a single method that does the work of subsetting the class.
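A minimal sketch of that facade idea, assuming the same toy class A. The numeric, logical, and character methods below are illustrative additions, not part of the answer above, and they re-dispatch by calling x[i] on the coerced index instead of using callGeneric; both routes go back through the same generic, and x[i] just keeps the sketch short.

```r
library(methods)

## Same toy class as in the answer.
setClass("A", representation = representation(slt = "numeric"))

## Worker: the only method that actually subsets the slot.
setMethod("[", c("A", "integer", "missing", "ANY"),
    function(x, i, j, ..., drop = TRUE) {
        initialize(x, slt = x@slt[i])
    })

## Facade for double indices such as c(3, 1): coerce, then re-dispatch.
setMethod("[", c("A", "numeric", "missing", "ANY"),
    function(x, i, j, ..., drop = TRUE) {
        i <- as.integer(i)
        x[i]
    })

## Facade for logical indices: turn them into integer positions
## (logical subsetting of seq_along() handles recycling).
setMethod("[", c("A", "logical", "missing", "ANY"),
    function(x, i, j, ..., drop = TRUE) {
        i <- seq_along(x@slt)[i]
        x[i]
    })

## Facade for character indices: look positions up in the slot's names.
setMethod("[", c("A", "character", "missing", "ANY"),
    function(x, i, j, ..., drop = TRUE) {
        i <- match(i, names(x@slt))
        x[i]
    })

a <- new("A", slt = c(a = 1, b = 2, c = 3))
a[c(3, 1)]
a[c(TRUE, FALSE, TRUE)]
a["b"]
```

An integer index still hits the worker directly, because an exact match on "integer" is preferred over the inherited match on "numeric".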
There are no conceptual differences for [[, other than wanting to respect the semantics of returning the content rather than another instance of the object as implied by [. For $ we have
> getGeneric("$")
standardGeneric for "$" defined from package "base"
function (x, name)
standardGeneric("$", .Primitive("$"))
<bytecode: 0x31fce40>
<environment: 0x31f12b8>
Methods may be defined for arguments: x
Use showMethods("$") for currently available ones.
and
setMethod("$", "A",
function(x, name)
{
## 'name' is a character(1)
slot(x, name)
})
with
> a$slt
[1] 1 2 3 4 5
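For completeness, the [[ case mentioned above might look like the following. This is a sketch under the assumption that [[ on class A should return a single element of the slot itself rather than a new A instance; the choice of "numeric" for the index signature is mine.

```r
library(methods)

## Same toy class as in the answer.
setClass("A", representation = representation(slt = "numeric"))

## '[[' returns the content (one element of the slot), not an instance of A.
## The "numeric" signature also covers integer indices through inheritance.
setMethod("[[", c("A", "numeric", "missing"),
    function(x, i, j, ...) {
        x@slt[[i]]
    })

a <- new("A", slt = c(10, 20, 30))
a[[2]]
```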
I would do as @Martin_Morgan suggested for the operators you mentioned. I would add a couple of points though:
1) I would be careful about defining a $ operator to access an S4 slot (unless you intend to access a column from a data frame which is stored in a specific slot?). The general suggestion is to write accessor functions like getMySlot() and setMySlot() to get the information you need. You can use the @ operator to access data from those slots, although get and set functions are a better user interface. Using $ could be confusing for the user, who would probably expect a data.frame. See this S4 tutorial by Christophe Genolini for an in-depth discussion of these issues. If this is not how you intended to use $, disregard my suggestion (but the tutorial is still a great resource!).
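As a rough illustration of the accessor approach, here is a sketch using the toy class A from the other answer. The generic names getSlt and setSlt are hypothetical, chosen to mirror getMySlot() and setMySlot(); the setter returns a modified copy, since R objects are not changed in place.

```r
library(methods)

## Toy class from the other answer.
setClass("A", representation = representation(slt = "numeric"))

## Hypothetical getter: hides the @ access behind a function.
setGeneric("getSlt", function(object) standardGeneric("getSlt"))
setMethod("getSlt", "A", function(object) object@slt)

## Hypothetical setter: returns an updated copy after checking validity.
setGeneric("setSlt", function(object, value) standardGeneric("setSlt"))
setMethod("setSlt", "A", function(object, value) {
    object@slt <- value
    validObject(object)
    object
})

a <- new("A", slt = c(1, 2, 3))
b <- setSlt(a, c(9, 8))
getSlt(a)
getSlt(b)
```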
2) If you are defining [ and [[ to inherit from another class, like vector, you will also want to define el() (equivalent to [][[1L]], or the first element from a subset []) and length(). I am currently writing a class to inherit from numeric, and numeric methods will automatically try to use these functions from your class. If the class is only for limited or personal use, this may not be a problem.
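A length() method is the simpler of the two to add. The sketch below assumes the toy class A from the other answer, where the data lives in the slt slot; el() is left out here.

```r
library(methods)

## Toy class from the other answer.
setClass("A", representation = representation(slt = "numeric"))

## Report the length of the underlying data rather than of the S4 object.
setMethod("length", "A", function(x) length(x@slt))

a <- new("A", slt = c(1, 2, 3))
length(a)
```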
I apologize, I would have left this as a comment, but I'm new to SO and I don't have the rep yet!