
Why does class change from integer to character when indexing a data frame with a numeric matrix?

Tags:

r

If I index a data.frame of all integers with a matrix, I get the expected result.

df1 <- data.frame(c1=1:4, c2=5:8)
df1
#  c1 c2
#1  1  5
#2  2  6
#3  3  7
#4  4  8

df1[matrix(c(1:4,1,2,1,2), nrow=4)]
# [1] 1 6 3 8

If the data.frame has a column of characters, the result is all characters, even though I'm only indexing the integer columns.

df2 <- data.frame(c0=letters[1:4], c1=1:4, c2=5:8)
df2
#  c0 c1 c2
#1  a  1  5
#2  b  2  6
#3  c  3  7
#4  d  4  8

df2[matrix(c(1:4,2,3,2,3), nrow=4)]
# [1] "1" "6" "3" "8"

class(df2[matrix(c(1:4,2,3,2,3), nrow=4)])
# [1] "character"

df2[1,2]
# [1] 1

My best guess is that R does not check whether all the extracted elements come from columns of the same class. Can anyone please explain why this is happening?

N8TRO asked Sep 27 '22 02:09


1 Answer

In ?Extract it is described that indexing via a numeric matrix is intended for matrices and arrays. So it might be surprising that such indexing worked for a data frame in the first place.

However, if we look at the code for [.data.frame (getAnywhere(`[.data.frame`)), we see that when extracting elements from a data.frame using a matrix in i, the data.frame is first coerced to a matrix with as.matrix:

function (x, i, j, drop = if (missing(i)) TRUE else length(cols) == 
            1) 
{
# snip
  if (Narg < 3L) {
# snip
    if (is.matrix(i)) 
      return(as.matrix(x)[i])
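
As a quick check of that code path, here is a minimal sketch using df2 from the question. The names idx, via_df and via_mat are only illustrative. If the excerpt above is what runs, indexing the data frame with a matrix should give exactly the same result as converting it with as.matrix first:

```r
df2 <- data.frame(c0 = letters[1:4], c1 = 1:4, c2 = 5:8)
idx <- matrix(c(1:4, 2, 3, 2, 3), nrow = 4)  # (row, column) pairs from the question

via_df  <- df2[idx]             # matrix index applied to the data frame
via_mat <- as.matrix(df2)[idx]  # what `[.data.frame` does internally

identical(via_df, via_mat)      # both paths return the same vector
is.character(via_df)            # and that vector is character
```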

Then look at ?as.matrix:

"The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column".

Thus, because the first column in "df2" is of class character, as.matrix will coerce the entire data frame to a character matrix before the extraction takes place.
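
The sketch below first shows the coercion on the two data frames from the question, then two possible ways to keep the numeric type. Neither workaround is part of the original answer, and the names num_cols, num_idx, res1 and res2 are illustrative. Option 1 assumes every column referenced in the index matrix is numeric.

```r
df1 <- data.frame(c1 = 1:4, c2 = 5:8)
df2 <- data.frame(c0 = letters[1:4], c1 = 1:4, c2 = 5:8)
idx <- matrix(c(1:4, 2, 3, 2, 3), nrow = 4)  # (row, column) pairs from the question

# The coercion happens in as.matrix(), before any element is extracted.
is.numeric(as.matrix(df1))    # only numeric columns
is.character(as.matrix(df2))  # c0 forces a character matrix

# Option 1: restrict to the numeric columns and remap the column indices.
num_cols <- which(vapply(df2, is.numeric, logical(1)))
num_idx  <- cbind(idx[, 1], match(idx[, 2], num_cols))
res1 <- as.matrix(df2[num_cols])[num_idx]

# Option 2: extract cell by cell, so no whole-frame coercion takes place.
res2 <- mapply(function(i, j) df2[i, j], idx[, 1], idx[, 2])
```

Option 1 stays vectorised and is probably the better fit for large index matrices. Option 2 is slower but needs no index remapping.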

Henrik answered Oct 19 '22 22:10