If I index a data.frame of all integers with a matrix, I get the expected result.
df1 <- data.frame(c1=1:4, c2=5:8)
df1
# c1 c2
#1 1 5
#2 2 6
#3 3 7
#4 4 8
df1[matrix(c(1:4,1,2,1,2), nrow=4)]
# [1] 1 6 3 8
If the data.frame has a column of characters, the result is all characters, even though I'm only indexing the integer columns.
df2 <- data.frame(c0=letters[1:4], c1=1:4, c2=5:8)
df2
# c0 c1 c2
#1 a 1 5
#2 b 2 6
#3 c 3 7
#4 d 4 8
df2[matrix(c(1:4,2,3,2,3), nrow=4)]
# [1] "1" "6" "3" "8"
class(df2[matrix(c(1:4,2,3,2,3), nrow=4)])
# [1] "character"
df2[1,2]
# [1] 1
My best guess is that R doesn't check whether the extracted values all came from columns of the same class. Can anyone please explain why this is happening?
In ?Extract it is described that indexing via a numeric matrix is intended for matrices and arrays. So it might be surprising that such indexing worked for a data frame in the first place.
However, if we look at the code for [.data.frame (getAnywhere(`[.data.frame`)), we see that when extracting elements from a data.frame using a matrix in i, the data.frame is first coerced to a matrix with as.matrix:
function (x, i, j, drop = if (missing(i)) TRUE else length(cols) ==
1)
{
# snip
if (Narg < 3L) {
# snip
if (is.matrix(i))
return(as.matrix(x)[i])
Then look at ?as.matrix:
"The method for data frames will return a character matrix if there is only atomic columns and any non-(numeric/logical/complex) column".
Thus, because the first column in "df2" is of class character, as.matrix will coerce the entire data frame to a character matrix before the extraction takes place.
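To make the coercion concrete, here is a minimal sketch that first reproduces the character matrix and then shows one possible workaround: convert only the numeric columns to a matrix and remap the column indices accordingly. The names idx, num_cols, m_num and idx_num are made up for this illustration, and the approach assumes every column referenced in the index matrix is numeric.

```r
df2 <- data.frame(c0 = letters[1:4], c1 = 1:4, c2 = 5:8)
idx <- matrix(c(1:4, 2, 3, 2, 3), nrow = 4)   # (row, column) pairs

# as.matrix() on the mixed data frame yields a character matrix,
# which is what `[.data.frame` indexes into when i is a matrix
m_all <- as.matrix(df2)
typeof(m_all)
# [1] "character"

# Workaround sketch (not part of base R's method): restrict to numeric columns
num_cols <- which(vapply(df2, is.numeric, logical(1)))   # positions 2 and 3
m_num <- as.matrix(df2[num_cols])                        # integer matrix

# Translate the original column positions into positions within m_num
idx_num <- cbind(idx[, 1], match(idx[, 2], num_cols))

res <- m_num[idx_num]
res
# [1] 1 6 3 8
typeof(res)
# [1] "integer"
```

If the index matrix ever points at a non-numeric column, match() returns NA for that row and the corresponding result is NA, so this sketch only suits the case in the question, where just the integer columns are indexed.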