
Consistently subset matrix to a vector and avoid colnames?

I would like to know if there is R syntax to extract a column from a matrix and always have no name attribute on the returned vector (I wish to rely on this behaviour).

My problem is the following inconsistency:

  • when a matrix has more than one row and I do myMatrix[, 1] I will get the first column of myMatrix with no name attribute. This is what I want.
  • when a matrix has exactly one row and I do myMatrix[, 1], I will get the first column of myMatrix but it has the first colname as its name.

I would like to be able to do myMatrix[, 1] and consistently get something with no name.

An example to demonstrate this:

# make a matrix with more than one row,
x <- matrix(1:2, nrow=2)
colnames(x) <- 'foo'
#      foo
# [1,]   1
# [2,]   2

# extract first column. Note no 'foo' name is attached.
x[, 1]
# [1] 1 2

# now suppose x has just one row (and is a matrix)
x <- x[1, , drop = FALSE]
# extract first column
x[, 1]
# foo    # <-- we keep the name!!
#   1

Now, the documentation for [ (?'[') mentions this behaviour, so it's not a bug or anything (although, why?! why this inconsistency?!):

A vector obtained by matrix indexing will be unnamed unless ‘x’ is one-dimensional when the row names (if any) will be indexed to provide names for the result.

My question is, is there a way to do x[, 1] such that the result is always unnamed, where x is a matrix?

Is my only hope unname(x[, 1]), or is there something analogous to ['s drop argument? Or is there an option I can set to say "always unname"? Is there some trick I can use, such as overriding ['s behaviour when the extracted result is a vector?

mathematical.coffee asked Nov 13 '22 10:11


1 Answer

Update on why the code below works (as far as I can tell)

Subsetting with [ is handled using functions contained in the R source file subset.c in ~/src/main. When using matrix indexing to subset a matrix (a single index that is itself a matrix, as in x[cbind(i, j)]), the function VectorSubset is called. When there is more than one index used (i.e., one each for rows and columns as in x[, 1]), then MatrixSubset is called.

The function VectorSubset only assigns names to 1-dimensional arrays being subsetted. Since a matrix is a 2-D array, no names are assigned to the result when using matrix indexing. The function MatrixSubset, however, does attempt to pass on dimnames under certain circumstances.
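As a small illustration of the difference between the two paths, here is a sketch using a matrix that has row names as well as column names. The matrix m and its dimnames are made up for this example and do not come from the question. With row and column indices, the row names are carried over to the extracted column, whereas matrix indexing returns a bare vector.

```r
# Hypothetical example matrix (not from the original post)
m <- matrix(1:4, nrow = 2,
            dimnames = list(c("a", "b"), c("foo", "bar")))

# Row/column indexing: the row names are passed on as names
by_position <- m[, 1]
names(by_position)

# Matrix indexing: one (row, col) pair per element, result has no names
by_matrix <- m[cbind(seq_len(nrow(m)), 1)]
names(by_matrix)
```

So relying on x[, 1] being unnamed is only safe when the matrix has no row names and more than one row.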


Therefore, the matrix indexing you refer to in the quote from the help page seems to be the key:

x <- matrix(1)
colnames(x) <- "foo"
x[, 1]  ## 'Normal' indexing
# foo 
#   1 
x[matrix(c(1, 1), ncol = 2)]  ## Matrix indexing
# [1] 1

And with a wider 1-row matrix:

xx <- matrix(1:10, nrow = 1)
colnames(xx) <- sprintf('foo%i', seq_len(ncol(xx)))
xx[, 6]  ## 'Normal' indexing
# foo6 
#    6 
xx[matrix(c(1, 6), ncol = 2)]  ## Matrix indexing
# [1] 6

With a matrix with both dimensions > 1:

yy <- matrix(1:10, nrow = 2, dimnames = list(NULL,
  sprintf('foo%i', 1:5)))

yy[cbind(seq_len(nrow(yy)), 3)]  ## Matrix indexing
# [1] 5 6
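
If you want this behaviour behind a single call, the matrix-indexing trick can be wrapped in a small function. This is only a sketch: col_unnamed is a hypothetical name (it is not part of base R), and it assumes m is a matrix with at least one row and j is a single numeric column index.

```r
# col_unnamed() is a hypothetical helper, not a base R function.
# Assumes: m is a matrix with >= 1 row, j is one numeric column index.
col_unnamed <- function(m, j) {
  stopifnot(is.matrix(m), nrow(m) >= 1L,
            is.numeric(j), length(j) == 1L)
  # Build one (row, column) pair per row and use matrix indexing
  m[cbind(seq_len(nrow(m)), j)]
}

one_row <- matrix(1, dimnames = list(NULL, "foo"))
two_row <- matrix(1:2, nrow = 2, dimnames = list(NULL, "foo"))

res_one <- col_unnamed(one_row, 1)
res_two <- col_unnamed(two_row, 1)
is.null(names(res_one))
is.null(names(res_two))
```

Compared with the unname(x[, 1]) mentioned in the question, this never creates the names in the first place, but unname() is shorter and probably easier to read.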
BenBarnes answered Nov 15 '22 07:11