
What's happening when changing "class" of an S4 object using the class function?

Tags: oop, r, s4

If I have an S4 class such as:

setClass("MyClass",
     representation(
       data="data.frame",
       name="character"))

and instantiate it (say to obj),

obj <- new('MyClass', data=data.frame(1:3), name='An S4 class')

I will have the following representation:

An object of class "MyClass"
Slot "data":
  X1.3
1    1
2    2
3    3

Slot "name":
[1] "An S4 class"

So far so good.

However, if I try to change the "class" using:

class(obj) <- "animal"

I now get

An object of class "animal"
<S4 Type Object>
attr(,"data")
  X1.3
1    1
2    2
3    3
attr(,"name")
[1] "An S4 class"

And if I try to check whether it is still an S4 class, it will return true:

> isS4(obj)
[1] TRUE

What exactly is happening? Why did the "slots" change to attributes? Is this really still an S4 object?

UPDATE:

Thank you for the comprehensive answers. Just to clarify, I wasn't expecting this to work or to be used in any real scenario. I just wanted to better understand the mechanism behind this behaviour. Also, it's hard to pick a "best" answer (they're all excellent) but, in the spirit of SO, I must pick one.

Rui Vieira asked May 08 '14 13:05


2 Answers

S4 implements slots as attributes. This is usually hidden from the user, but is easy to see:

> attributes(setClass("MyClass", representation(x="integer"))())
$x
integer(0)

$class
[1] "MyClass"
attr(,"package")
[1] ".GlobalEnv"

In a little more gory detail, we have

> .Internal(inspect(setClass("MyClass", representation(x="integer"))()))
@1fe4dfd8 25 S4SXP g0c0 [OBJ,NAM(2),S4,gp=0x10,ATT] 
ATTRIB:
  @1fe4dfa0 02 LISTSXP g0c0 [] 
    TAG: @23c8978 01 SYMSXP g0c0 [MARK,NAM(2)] "x"
    @1fe4df68 13 INTSXP g0c0 [] (len=0, tl=0)
    TAG: @2363208 01 SYMSXP g0c0 [MARK,NAM(2),LCK,gp=0x4000] "class" (has value)
    @1fd9f1b8 16 STRSXP g0c1 [NAM(2),ATT] (len=1, tl=0)
      @2e09e138 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] "MyClass"
    ATTRIB:
      @1fd9fb20 02 LISTSXP g0c0 [] 
    TAG: @236cc00 01 SYMSXP g0c0 [MARK,NAM(2)] "package"
    @1fd9f278 16 STRSXP g0c1 [NAM(2)] (len=1, tl=0)
      @23cc938 09 CHARSXP g0c2 [MARK,gp=0x61] [ASCII] [cached] ".GlobalEnv"

This shows that the underlying S-expression representing the object is an S4SXP with a list of attributes attached: one per slot, plus the class attribute.
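To connect this back to the class in the question, here is a minimal sketch (the helper variable names are mine, not from the original post) comparing slot access via @ with a plain attribute lookup on the same object:

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

setClass("MyClass",
         representation(data = "data.frame", name = "character"))
obj <- new("MyClass", data = data.frame(1:3), name = "An S4 class")

# Slot access with @ and a plain attribute lookup return the same values.
same_name <- identical(obj@name, attr(obj, "name", exact = TRUE))
same_data <- identical(obj@data, attr(obj, "data", exact = TRUE))

# Every declared slot appears among the attribute names, next to "class".
slots_are_attrs <- all(slotNames(obj) %in% names(attributes(obj)))

print(c(same_name, same_data, slots_are_attrs))
```

All three checks should come back TRUE, which is the "slots are attributes" point seen from the user's side rather than through .Internal(inspect()).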

By using the S3-ism class<- you've created, as @hadley points out, a hybrid monster. class<- merely updates the class attribute, without altering the underlying S4SXP. When you print the object, it prints using the print method for objects of class "animal", probably print.default. On the other hand, isS4 tests whether the S4 bit is set on the S-expression (the S4 flag in the inspect output above), which it still is. So you've got some of each...
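The "some of each" state can be checked directly. The following is only a sketch, reusing the class from the question; the variable names are illustrative, and the isClass() result assumes a fresh session in which nobody has defined a class called "animal":

```r
library(methods)

setClass("MyClass",
         representation(data = "data.frame", name = "character"))
obj <- new("MyClass", data = data.frame(1:3), name = "An S4 class")

obj2 <- obj
class(obj2) <- "animal"   # S3-style replacement applied to an S4 object

still_s4   <- isS4(obj2)                        # the S4 bit is untouched
new_label  <- as.character(class(obj2))         # the class attribute is now "animal"
old_slot   <- attr(obj2, "name", exact = TRUE)  # former slot value survives as an attribute
has_def    <- isClass("animal")                 # FALSE in a fresh session: no formal definition

print(list(still_s4 = still_s4, new_label = new_label,
           old_slot = old_slot, has_def = has_def))
```

So the object keeps its S4 flag and its data, but its class label points at a name that the S4 machinery knows nothing about.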

To change the class properly, coerce instead: use as(obj, "animal"), perhaps after implementing the relevant setAs method.
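A minimal sketch of that route follows. The target class "animal" and its single name slot are assumptions made up for illustration; the question never defines such a class, and a real conversion would decide for itself which slots to carry over:

```r
library(methods)

setClass("MyClass",
         representation(data = "data.frame", name = "character"))

# Hypothetical target class, defined only for this example.
setClass("animal", representation(name = "character"))

# Register how a MyClass object is converted to an animal.
setAs("MyClass", "animal", function(from) new("animal", name = from@name))

obj <- new("MyClass", data = data.frame(1:3), name = "An S4 class")
pet <- as(obj, "animal")

is_animal <- is(pet, "animal")   # a genuine instance of a defined class
validObject(pet)                 # checks pet against the animal definition
print(pet)
```

Unlike class<-, this produces an object built by new() against a registered definition, so dispatch, validity checking and printing all behave as for any other S4 instance.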

Martin Morgan answered Oct 18 '22 00:10


What counts as an S4 object is a slightly tricky question. If we go by the definition in R Internals, then yes, it is still an S4 object, because the S4 bit is still set.

obj <- new('MyClass', data=data.frame(1:3), name='An S4 class')
attr(obj, 'class')
## [1] "MyClass"
## attr(,"package")
## [1] ".GlobalEnv"

obj2 <- obj
class(obj2) <- 'animal'
attr(obj2, 'class')
## [1] "animal"

Note that, apart from the class name itself, the only difference (as far as memory representation is concerned) between obj and obj2 is the missing package attribute on the class attribute. We can "fix" this by calling:

attr(class(obj2), "package") <- ".GlobalEnv"

But in such a case we also get the same "strange" result:

print(obj2)
## An object of class "animal"
## <S4 Type Object>
## attr(,"data")
##   X1.3
## 1    1
## 2    2
## 3    3
## attr(,"name")
## [1] "An S4 class"

So let's look for the method responsible for printing obj and obj2. In both cases this is done via show with signature ANY. Printing getMethod("show", "ANY") shows that it hands off to the showDefault function.

And the first thing that showDefault does is:

...
clDef <- getClass(cl <- class(object), .Force = TRUE)
...

You see, getClass cannot find a formal class definition for animal in the global environment. This is why showDefault falls back to show(unclass(object)) and we see everything as attributes (cf. print(unclass(obj))). As for why they are attributes, that is explained in @MartinMorgan's answer.
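One way to probe this explanation is to supply the missing definition and print again. This is a sketch under stated assumptions, not something from the original answer: it registers a formal class named "animal" with the same slots as MyClass purely for illustration, and it assumes a fresh session where no such class exists beforehand.

```r
library(methods)

setClass("MyClass",
         representation(data = "data.frame", name = "character"))
obj2 <- new("MyClass", data = data.frame(1:3), name = "An S4 class")
class(obj2) <- "animal"

# Before any definition exists, the lookup comes back empty
# (getClassDef returns NULL when it finds nothing).
def_before <- getClassDef("animal")

# Assumption for illustration only: register a formal class under the
# same name, with the same slots as MyClass.
setClass("animal",
         representation(data = "data.frame", name = "character"))

# With a definition available, the default show method can list the slots.
printed <- capture.output(show(obj2))
print(printed)
```

If the reasoning above holds, the captured output switches back to the Slot "data": / Slot "name": layout, because the lookup of the class definition now succeeds.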

gagolews answered Oct 18 '22 00:10