Why is the class of a vector the class of the elements of the vector and not vector itself?

Tags: r

I don't understand why the class of a vector is the class of the elements of the vector and not vector itself.

vector <- c("la", "la", "la")
class(vector) 
## [1] "character"

matrix <- matrix(1:6, ncol=3, nrow=2)
class(matrix) 
## [1] "matrix"
megashigger asked Jun 05 '14 05:06



2 Answers

This is my understanding. class is mainly meant for object-oriented programming, and there are other functions in R which will give you the storage mode of an object (see ?typeof or ?mode).
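As a quick illustration of how these functions can disagree, the sketch below compares class, mode, typeof and storage.mode on an integer vector and a double vector. The variable names are only illustrative.

```r
# Compare the object-oriented view (class) with the storage-level views.
int_vec <- 1:3          # integer vector
dbl_vec <- c(1.5, 2.5)  # double vector

class(int_vec)          # "integer"
mode(int_vec)           # "numeric"
typeof(int_vec)         # "integer"
storage.mode(int_vec)   # "integer"

class(dbl_vec)          # "numeric"
mode(dbl_vec)           # "numeric"
typeof(dbl_vec)         # "double"
storage.mode(dbl_vec)   # "double"
```

Note that none of the four reports "vector": they all describe what the vector holds.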

When looking at ?class

Many R objects have a class attribute, a character vector giving the names of the classes from which the object inherits. If the object does not have a class attribute, it has an implicit class, "matrix", "array" or the result of mode(x)

It seems like class works as follows

  1. It first looks for a $class attribute

  2. If there isn't any, it checks if the object has a matrix or an array structure by checking the $dim attribute (which is not present in a vector)

    2.1. if $dim contains two entries, it will call it a matrix

    2.2. if $dim contains one entry or more than two entries, it will call it an array

    2.3. if $dim is of length 0, it goes to the next step (mode)

  3. if $dim is of length 0 and there is no $class attribute, it falls back to mode (with one exception: an integer vector reports "integer" rather than its mode, "numeric")
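The lookup order described in the steps above can be sketched as a small R function. This is only an approximation of the description, not how class is implemented internally. The name implicit_class is hypothetical, and the sketch ignores special cases such as integer vectors (where class gives "integer" but mode gives "numeric") and newer R versions that report both "matrix" and "array" for a matrix.

```r
# Hypothetical helper mimicking the lookup order described above.
implicit_class <- function(x) {
  cls <- attr(x, "class", exact = TRUE)
  if (!is.null(cls)) return(cls)            # step 1: explicit class attribute
  d <- attr(x, "dim", exact = TRUE)
  if (length(d) == 2L) return("matrix")     # step 2.1: two dimensions
  if (length(d) > 0L) return("array")       # step 2.2: one or 3+ dimensions
  mode(x)                                   # step 3: fall back to mode
}

implicit_class(c("la", "la", "la"))            # "character"
implicit_class(matrix(1:6, nrow = 2))          # "matrix"
implicit_class(array(1:8, dim = c(2, 2, 2)))   # "array"
implicit_class(data.frame(A = 1:3))            # "data.frame"
```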

So per your example

mat <- matrix(rep("la", 3), ncol=1)
vec <- rep("la", 3)
attributes(vec)
# NULL
attributes(mat)
## $dim
## [1] 3 1

So you can see that vec doesn't contain any attributes whatsoever (see ?c or ?as.vector for explanation)

So in the first case, class performs

attributes(vec)$class
# NULL
length(attributes(vec)$dim)
# 0
mode(vec)
## [1] "character"

In the second case it checks

attributes(mat)$class
# NULL
length(attributes(mat)$dim)
##[1] 2

It sees that the object has two dimensions and therefore calls it a matrix

In order to illustrate that both vec and mat have the same storage mode, you can do

mode(vec)
## [1] "character"
mode(mat)
## [1] "character"

You can also see, for example, the same behavior with an array

ar <- array(rep("la", 3), c(3, 1)) # two dimensional array
class(ar)
##[1] "matrix"
ar <- array(rep("la", 3), c(3, 1, 1)) # three dimensional array
class(ar)
##[1] "array"
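The exact strings printed by class for matrices depend on the R version: from R 4.0.0 onward a matrix reports both "matrix" and "array", so the output above may look different on a recent installation. A sketch of checks that do not depend on the exact class vector (variable names are illustrative):

```r
ar2 <- array(rep("la", 3), c(3, 1))      # two dimensions
ar3 <- array(rep("la", 3), c(3, 1, 1))   # three dimensions

is.matrix(ar2)             # TRUE
is.matrix(ar3)             # FALSE
is.array(ar3)              # TRUE
"matrix" %in% class(ar2)   # TRUE whether class() returns one string or two
"array" %in% class(ar3)    # TRUE
```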

So neither array nor matrix sets a class attribute. Let's check, for example, what data.frame does.

df <- data.frame(A = rep("la", 3))
class(df)
## [1] "data.frame"

Where did class take it from?

attributes(df)    
# $names
# [1] "A"
# 
# $row.names
# [1] 1 2 3
# 
# $class
# [1] "data.frame"

As you can see, data.frame sets a $class attribute, but this can be changed

attributes(df)$class <- NULL
class(df)
## [1] "list"

Why list? Because a data.frame doesn't have a $dim attribute (nor a $class one, because we just deleted it), so class falls back to mode(df)

mode(df)
## [1] "list"

Lastly, in order to illustrate how class works, we can manually set the class to whatever we want and see what it will give us back

mat <- structure(mat, class = "vector")
vec <- structure(vec, class = "vector")
class(mat)
## [1] "vector"
class(vec)
## [1] "vector"
David Arenburg answered Nov 15 '22 12:11


R needs to know the class of the object you are operating on in order to dispatch the appropriate method for that object. The basic data type in R is the vector; there is no such thing as a scalar. R treats a single integer as a vector of length one, so is.vector( 1L ) returns TRUE.

In order to dispatch the correct method, R must know the data type. It's not much use knowing that something is a vector when the language is implicitly vectorised and everything is designed to operate on vectors.

is.vector( list( NULL , NULL ) )   # TRUE
is.vector( NA )                    # TRUE
is.vector( "a" )                   # TRUE
is.vector( c( 1.0556 , 2L ) )      # TRUE

So you can take the return value of class( 1L ), which is [1] "integer", to mean: I am an atomic vector consisting of type integer.
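To make the dispatch argument concrete, here is a minimal S3 sketch. The generic describe and its methods are hypothetical names invented for illustration; the point is only that R picks a method from what class reports, even when no class attribute is set.

```r
# Hypothetical S3 generic: dispatches on the (implicit) class of x.
describe <- function(x) UseMethod("describe")

describe.character <- function(x) "a character vector"
describe.integer   <- function(x) "an integer vector"
describe.matrix    <- function(x) "a matrix"
describe.default   <- function(x) "something else"

describe(c("la", "la", "la"))     # "a character vector"
describe(1L)                      # "an integer vector"
describe(matrix(1:6, nrow = 2))   # "a matrix"
describe(list(1, 2))              # "something else"
```

If class returned "vector" for all of these, the first, second and fourth calls could not be told apart.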

Despite the fact that, under the hood, a matrix is actually just a vector with a dimension attribute of length two, R must know it is a matrix so that it can operate row-wise or column-wise on its elements (or individually on any single subscripted element). After subsetting, you get back a vector of the data type of the elements in your matrix, which allows R to dispatch the appropriate method for your data (e.g. performing sort on a character vector or a numeric vector):

/* from the underlying C code in /src/main/subset.c....*/
result = allocVector(TYPEOF(x), (R_xlen_t) nrs * (R_xlen_t) ncs)

You should also note that, before determining the class of an object, R will always check that it is first a vector. For example, when running is.matrix(x) on some matrix x, R checks that x is a vector and then looks at its dimension attribute. If the dimension attribute is a vector of INTEGER type and LENGTH 2, the object satisfies the conditions for being a matrix (the following code snippet is from Rinlinedfuns.h in /src/include/)

INLINE_FUN Rboolean isMatrix(SEXP s)
{
    SEXP t;
    if (isVector(s)) {
        t = getAttrib(s, R_DimSymbol);
        /* You are not supposed to be able to assign a non-integer dim,
           although this might be possible by misuse of ATTRIB. */
        if (TYPEOF(t) == INTSXP && LENGTH(t) == 2)
            return TRUE;
    }
    return FALSE;
}

#  e.g. create an array with height and width....  
a <- array( 1:4 , dim=c(2,2) )

#  this is a matrix!
class(a)
#[1] "matrix"

# And the class of the first column is an atomic vector of type integer....
class(a[,1])
#[1] "integer"
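The role of the dimension attribute can also be observed directly from R. The following sketch (variable names are illustrative) turns a plain vector into a matrix and back simply by setting and removing dim, without touching the data:

```r
x <- 1:6
is.matrix(x)                   # FALSE: no dim attribute yet

dim(x) <- c(2L, 3L)            # add a dim attribute of length two
is.matrix(x)                   # TRUE
typeof(x)                      # "integer": the storage type is unchanged

col1 <- x[, 1]                 # subsetting drops the dimensions by default
is.matrix(col1)                # FALSE
class(col1)                    # "integer"

kept <- x[, 1, drop = FALSE]   # drop = FALSE keeps the dim attribute
is.matrix(kept)                # TRUE

dim(x) <- NULL                 # remove the attribute again
is.matrix(x)                   # FALSE
```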
Simon O'Hanlon answered Nov 15 '22 10:11