
Why is names(x) better than attr(x, "names")?

Tags: r

I am reading an Advanced R topic on data structures and attributes. It says:

You should always get and set these attributes with their accessor functions: use names(x), class(x) and dim(x), not attr(x, "names"), attr(x, "class"), and attr(x, "dim").

What is the justification for that? Is there an example of an unexpected behaviour? Or is this simply a recommendation? On a trivial level I don't see any difference:

v <- 1:2
names(v) <- 3:4
all(attr(v, "names") == names(v))
#[1] TRUE
attr(v, "names") <- 5:6
all(attr(v, "names") == names(v))
#[1] TRUE

I also tried browsing the C source, namely do_names and do_attributes. The two implementations differ substantially, so names(x) is not simply an alias for attr(x, "names"). I would guess that the former is faster, but that is only a guess.

As an additional question, is there any difference between names(), class() and dim() from that perspective?

tonytonov asked Jan 10 '14 14:01



1 Answer

You shouldn't access attributes directly, because how an object stores its attributes is an implementation detail. The author of the code should provide an API (accessor functions) for you to use instead. That gives them the flexibility to change the underlying storage without changing the API.
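As a minimal sketch of that idea, consider a made-up S3 class. The names new_stamped and stamp_index and the "index" attribute are hypothetical, invented only for illustration. The constructor stores times as plain seconds, and the accessor is the only place that knows about it:

```r
# Hypothetical constructor: stores the time index as seconds since the epoch.
new_stamped <- function(values, times) {
  structure(values, index = as.numeric(times), class = "stamped")
}

# Hypothetical accessor: the supported way to read the index.
# It hides the storage format and always hands back POSIXct.
stamp_index <- function(x) {
  as.POSIXct(attr(x, "index"), origin = "1970-01-01", tz = "UTC")
}

times <- as.POSIXct("2014-01-01", tz = "UTC") + 0:2
s <- new_stamped(1:3, times)

stamp_index(s)     # POSIXct values, independent of how they are stored
attr(s, "index")   # raw numeric seconds: exposes the internal representation
```

Code that calls stamp_index() keeps working if the author later changes how the index is stored. Code that reads attr(s, "index") directly depends on the current representation.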

The xts package provides a good example of this:

> library(xts)
> x <- xts(1:3, as.POSIXct("2014-01-01")+0:2)
> index(x)
[1] "2014-01-01 00:00:00 CST" "2014-01-01 00:00:01 CST" "2014-01-01 00:00:02 CST"
> attr(x, "index")
[1] 1388556000 1388556001 1388556002
attr(,"tzone")
[1] ""
attr(,"tclass")
[1] "POSIXct" "POSIXt"

At one point in the past, the internal index was stored as POSIXct, but we changed the underlying structure for performance reasons. You can see that the public API didn't change, however.
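The same gap between accessor and stored attribute can be seen in base R, which bears on the names(), class() and dim() part of the question. The snippet below is only an illustration with made-up objects (m, df, e), not a complete list of the differences:

```r
# class(): a matrix has an implicit class derived from its dim attribute,
# but no "class" attribute is actually stored on the object.
m <- matrix(1:4, nrow = 2)
class(m)                 # implicit class
attr(m, "class")         # NULL

# row.names(): the accessor returns character row names,
# while the attribute for automatic row names is an integer vector.
df <- data.frame(a = 1:3)
row.names(df)            # character
attr(df, "row.names")    # integer

# names(): works on environments (R >= 3.2.0), which have no "names" attribute.
e <- new.env()
assign("a", 1, envir = e)
names(e)                 # "a"
attr(e, "names")         # NULL
```

In each case the accessor answers the question a user actually means to ask, while attr() reports only what happens to be stored.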

Joshua Ulrich answered Oct 13 '22 04:10
