I have run into some strange behavior in R's all.equal function. Basically, I create two identical data.frames in different ways and then call all.equal (which checks the data and the attributes).
The code to reproduce the behavior is as follows:
var.a <- data.frame(cbind(as.integer(c(1,5,9)), as.integer(c(1,5,9))))
colnames(var.a) <- c("C1", "C2")
rownames(var.a) <- c("1","5","9")
var.b <- data.frame(matrix(NA, nrow = 10, ncol = 2))
var.b[, 1] <- 1:10
var.b[, 2] <- 1:10
colnames(var.b) <- c("C1", "C2")
var.b <- var.b[seq(1, nrow(var.b), 4), ]
all.equal(var.a, var.b)
Is this a bug, or am I just missing something? I did quite a bit of debugging of the all.equal function, and it appears the problem is the row names of the data.frames (a character vector in one case, a numeric vector in the other). The response of the all.equal function:
[1] "Attributes: < Component 2: Modes: character, numeric >"
[2] "Attributes: < Component 2: target is character, current is numeric >"
However,
typeof(rownames(var.a)) == typeof(rownames(var.b))
returns TRUE, which confuses me.
P.S.: The structure of the objects seems the same:
> str(var.a)
'data.frame': 3 obs. of 2 variables:
$ C1: int 1 5 9
$ C2: int 1 5 9
> str(var.b)
'data.frame': 3 obs. of 2 variables:
$ C1: int 1 5 9
$ C2: int 1 5 9
I would appreciate it if someone could shed some light on this.
(I'm not exactly clear what bug you think you have found. The data frames were not created the same way.) There are two differences in the structures of var.a and var.b: the type of the elements in the columns, numeric (double) in 'var.a' and integer in 'var.b'; and the type of the row names, character for 'var.a' and integer for 'var.b':
> dput(var.b)
structure(list(C1 = c(1L, 5L, 9L), C2 = c(1L, 5L, 9L)), .Names = c("C1",
"C2"), row.names = c(1L, 5L, 9L), class = "data.frame")
> dput(var.a)
structure(list(C1 = c(1, 5, 9), C2 = c(1, 5, 9)), .Names = c("C1",
"C2"), row.names = c("1", "5", "9"), class = "data.frame")
> mode(attr(var.b, "row.names"))
[1] "numeric"
> storage.mode(attr(var.b, "row.names"))
[1] "integer"
> mode(attr(var.a, "row.names"))
[1] "character"
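As for why typeof(rownames(...)) is the same for both: rownames() returns a character vector for a data frame no matter how the row names are stored, so it hides the difference that all.equal reports. A minimal sketch, using hypothetical stand-ins df_chr and df_int (not the original var.a and var.b) and reading the raw attribute with attr():

```r
# Two small data frames with the same visible row names, stored differently.
# df_chr and df_int are illustrative names, not objects from the question.
df_chr <- data.frame(C1 = c(1L, 5L, 9L), C2 = c(1L, 5L, 9L))
rownames(df_chr) <- c("1", "5", "9")                         # stored as character

df_int <- data.frame(C1 = 1:10, C2 = 1:10)[seq(1, 10, 4), ]  # stored as integer

# rownames() converts to character on the way out, so both look alike.
print(typeof(rownames(df_chr)))
print(typeof(rownames(df_int)))

# The raw "row.names" attribute is what all.equal() compares.
print(typeof(attr(df_chr, "row.names")))
print(typeof(attr(df_int, "row.names")))
```

So comparing typeof(rownames(x)) cannot reveal the mismatch; inspecting attr(x, "row.names") or the dput output can.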
Added note: If you want to check only for numerical equality, you should use the 'check.attributes' argument:
> all.equal(var.a, var.b, check.attributes=FALSE)
[1] TRUE
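If you would rather keep the attribute check, one option is to store the row names the same way in both data frames before comparing. A rough sketch, again with hypothetical stand-ins df_chr and df_int, and assuming that character row names are acceptable for your data:

```r
# Hypothetical stand-ins for var.a and var.b (names are illustrative).
df_chr <- data.frame(C1 = c(1L, 5L, 9L), C2 = c(1L, 5L, 9L))
rownames(df_chr) <- c("1", "5", "9")                         # character row names
df_int <- data.frame(C1 = 1:10, C2 = 1:10)[seq(1, 10, 4), ]  # integer row names

# Assigning a character vector stores the row names as character,
# so both data frames now carry the same kind of "row.names" attribute.
rownames(df_int) <- as.character(rownames(df_int))

# Attributes are still checked here; only their storage was aligned.
same <- isTRUE(all.equal(df_chr, df_int))
print(same)
```

Wrapping the call in isTRUE() is the usual way to use all.equal in a condition, since it returns a character vector describing the differences rather than FALSE.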
If you look at var.b with dput, you can see that the row names are numeric:
> dput(var.b)
structure(list(C1 = c(1L, 5L, 9L), C2 = c(1L, 5L, 9L)), .Names = c("C1",
"C2"), row.names = c(1L, 5L, 9L), class = "data.frame")