
Why is `row.names` preferred over `rownames`?

Tags:

r

There are two functions in the R core library.

  • row.names Get and Set Row Names for Data Frames
  • rownames Retrieve or set the row names of a matrix-like object.

However, the documentation for row.names says: "For a data frame, ‘rownames’ and ‘colnames’ eventually call ‘row.names’ and ‘names’ respectively, but the latter are preferred." Why is row.names preferred? Wouldn't it be easier to just ignore row.names and always call rownames?

asked Jul 19 '16 18:07 by NO WAR WITH RUSSIA



1 Answer

row.names() is an S3 generic function whereas rownames() is a lower level non-generic function. rownames() is in effect the default method for row.names() that is applied to any object in the absence of a more specific method.
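The difference between the two functions can be checked from the console. The following is a minimal sketch, not part of the original answer; `is_generic` is a hypothetical helper written for this illustration, and it only looks for a `UseMethod()` call in the function body.

```r
# Hypothetical helper: TRUE if the function body dispatches via UseMethod()
is_generic <- function(f) {
  any(grepl("UseMethod", deparse(body(f)), fixed = TRUE))
}

is_generic(row.names)  # TRUE: row.names() is an S3 generic
is_generic(rownames)   # FALSE: rownames() is an ordinary function

# A specific method exists for data frames
has_df_method <- is.function(row.names.data.frame)

# For a matrix there is no specific method, so both calls give the same result
m <- matrix(1:4, nrow = 2, dimnames = list(c("a", "b"), NULL))
same_for_matrix <- identical(row.names(m), rownames(m))
```

Calling `methods("row.names")` lists the methods available in the current session.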

If you are operating on a data frame x, then it is more efficient to use row.names(x) because there is a specific row.names() method for data frames. That method simply extracts the "row.names" attribute that is already stored in x. By contrast, because of how rownames() is defined and how the functions depend on one another, rownames(x) has to call dimnames(x), which for a data frame builds a list combining row.names(x) with names(x), and then discard the column names again. So rownames(x) even involves a call to row.names(x) as an intermediate step. This all usually happens so quickly that you don't notice it, but just extracting the attribute is obviously more efficient.
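The two routes can be traced by hand on a small data frame. This is an illustrative sketch only; the data frame `df` and its row names are made up for the example.

```r
df <- data.frame(x = 1:3, row.names = c("a", "b", "c"))

# Direct route: the data frame method reads the stored attribute
stored <- attr(df, "row.names")
direct <- row.names(df)

# Indirect route: rownames() asks for all dimension names,
# then keeps only the first component (the row names)
dn <- dimnames(df)          # list of row names and column names
indirect <- dn[[1L]]

identical(direct, rownames(df))  # TRUE: same answer, more work
```

Here `dn` holds both the row names and the column names, even though only the row names were wanted.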

It would be logical to just use the generic version row.names() all the time, since it always dispatches to the appropriate method. There is no practical advantage in using rownames(x) over row.names(x). For object classes that have a defined row.names method, rownames(x) is wrong because it bypasses that method. For object classes with no defined row.names method, the two functions are equivalent because row.names(x) simply calls rownames(x).
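A small sketch of the bypassing problem, under stated assumptions: the class `labelled_rows` and its `labels` attribute are hypothetical and exist only for this illustration. The class defines a row.names method but has no dimnames.

```r
# Hypothetical class that keeps its row labels in a custom attribute
obj <- structure(
  list(values = c(10, 20)),
  labels = c("first", "second"),
  class = "labelled_rows"
)

# S3 method for the row.names() generic
row.names.labelled_rows <- function(x) attr(x, "labels")

via_generic <- row.names(obj)  # dispatches to the method above
via_rownames <- rownames(obj)  # looks only at dimnames, which obj lacks

via_generic            # "first" "second"
is.null(via_rownames)  # TRUE
```

In this sketch only the generic finds the labels, because rownames() never consults the class-specific method.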

The reason why both functions exist is historical. rownames() is the older function and was part of the R language before generic functions and methods were introduced. It was intended only for use on matrices, but it will work fine on any data object that has a dimnames attribute. I personally use rownames(x) when x is a matrix and row.names(x) otherwise but, as I have said, one could just as well use row.names(x) all the time.

answered Sep 19 '22 06:09 by Gordon Smyth