
combn unclasses factor variables

Tags: syntax, r

UPDATE: FIXED

This is fixed in the upcoming release of R 3.1.0. From the NEWS file:

combn(x, simplify = TRUE) now gives a factor result for factor input x (previously user error).
Related to PR#15442
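
A quick way to check the fixed behaviour is sketched below. It assumes R 3.1.0 or later, and the variable names are illustrative only:

```r
# Assumes R >= 3.1.0, where combn() keeps the factor class for factor input.
x <- factor(letters[1:3])
res <- combn(x, 2)

is.factor(res)     # should be TRUE on a fixed version
levels(res)        # the levels of x should be carried over
as.character(res)  # labels in column order, rather than integer codes
```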


I just noticed a curious thing. Why does combn appear to unclass factor variables to their underlying numeric values for all except the first combination?

x <- as.factor( letters[1:3] )

combn( x , 2 )
#     [,1] [,2] [,3]
#[1,] "a"  "1"  "2" 
#[2,] "b"  "3"  "3" 

This doesn't occur when x is a character:

x <- as.character( letters[1:3] )

combn( x , 2 )
#     [,1] [,2] [,3]
#[1,] "a"  "a"  "b" 
#[2,] "b"  "c"  "c"

Reproducible on R64 on OS X 10.7.5 and Windows 7.

Simon O'Hanlon asked Sep 04 '13 14:09


2 Answers

I think it is due to the conversion to a matrix that happens when simplify = TRUE (the default). With simplify = FALSE you get:

combn( x , 2 , simplify=FALSE)
[[1]]
[1] a b
Levels: a b c

[[2]]
[1] a c
Levels: a b c

[[3]]
[1] b c
Levels: a b c

The first column is OK because of the way combn works: the matrix is created from the first combination, and the other columns are then overwritten in the existing matrix using [<-. Consider:

m <- matrix(x,3,3)
m[,2] <- sample(x)
m
     [,1] [,2] [,3]
[1,] "a"  "1"  "a" 
[2,] "b"  "3"  "b" 
[3,] "c"  "2"  "c" 

I think the offending function is therefore [<-.
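
The same effect can be reproduced deterministically. This is a minimal sketch rather than the actual combn source: it assigns x itself instead of sample(x), and the names are illustrative.

```r
x <- factor(letters[1:3])

# matrix() goes through as.vector(), which turns a factor into its labels,
# so the freshly created matrix holds "a", "b", "c" in every column.
m <- matrix(x, nrow = 3, ncol = 2)
typeof(m)

# The default `[<-` ignores the factor class and uses the underlying
# integer codes, which are then coerced to character.
m[, 2] <- x
m
```

Column 1 keeps the labels while column 2 holds the codes as strings, which matches the pattern in the question.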

James answered Sep 29 '22 09:09


As Konrad said, the treatment of factors is often odd, or at least inconsistent. In this case I think the behaviour is weird enough to constitute a bug. Try submitting it, and see what the response is.

Since the result is a matrix, and there is no factor matrix type, I think that the correct behaviour would be to convert factor inputs to character somewhere near the start of the function.
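
Until that happens inside the function, the same conversion can be done by the caller. This is a sketch of a workaround, not a change to combn itself:

```r
x <- factor(letters[1:3])

# Convert to character up front so that combn() only ever sees the labels.
res <- combn(as.character(x), 2)
res
```

The result is a plain character matrix, so the factor levels are lost. If they are needed afterwards, combn(x, 2, simplify = FALSE) keeps each combination as a factor.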

Richie Cotton answered Sep 29 '22 09:09