
Indexing the elements of a matrix in R

Tags: r, matrix

The problem is pretty silly, but I am wondering if I am missing something. Let's say that there is a vector k that contains some numbers, say

> k
 [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15

I want to transform this to a matrix

> m
     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    0    6    7    8    9
[3,]    0    0   10   11   12
[4,]    0    0    0   13   14
[5,]    0    0    0    0   15

My first idea was to use something with upper.tri(), for example like m[upper.tri(m, diag = TRUE)] <- k, but that will not give the matrix above.
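The upper.tri() attempt fails because R stores a matrix column by column, and assignment through a logical mask fills the selected cells in that same column-major order. A minimal sketch of the effect (the name m_upper is used only for this illustration):

```r
k <- 1:15
m_upper <- matrix(0, 5, 5)

# Assignment through a logical mask walks the matrix column by column,
# so k runs down each column of the upper triangle instead of along each row.
m_upper[upper.tri(m_upper, diag = TRUE)] <- k

m_upper[1, ]  # first row:   1  2  4  7 11  (wanted: 1 2 3 4 5)
m_upper[, 5]  # last column: 11 12 13 14 15
```

So the values are all in the upper triangle, just in the wrong order for a row-wise fill.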

Is there a more intelligent solution to this? Below there's my solution but let's just say I am not too proud of it.


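m <- matrix(0, 5, 5)        # target matrix; it must exist before the loop assigns into it

rows <- rep(1:5, 5:1)       # row index for each element of k
cols1 <- rle(rows)$lengths  # number of filled cells in each row
cols <- do.call(c, lapply(seq_along(cols1), function(x) x:5))  # matching column indices

for (i in seq_along(k)) {
  m[rows[i], cols[i]] <- k[i]
}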
asked Jun 03 '16 13:06 by Theodor


3 Answers

Here's an option using lower.tri and t to transpose the result:

k <- 1:15
m <- matrix(0, 5,5)
m[lower.tri(m, diag = TRUE)] <- k
m <- t(m)
m 
#     [,1] [,2] [,3] [,4] [,5]
#[1,]    1    2    3    4    5
#[2,]    0    6    7    8    9
#[3,]    0    0   10   11   12
#[4,]    0    0    0   13   14
#[5,]    0    0    0    0   15

Microbenchmark

Since there was some confusion with Joseph's benchmark, here's another one. I tested the three solutions for matrices of size 10*10; 100*100; 1000*1000; 10000*10000.

Results:

[Figure: violin plots of the timings for Joseph, Frank and docendo, one panel per matrix size n]

Apparently, the performance depends heavily on the size of the matrix. For large matrices, Joseph's answer performs fastest, while for smaller matrices, mine was the fastest approach. Note that this doesn't take memory efficiency into account.

Reproducible benchmark:

Joseph <- function(k, n) {
  y <- 1L
  t <- rep(0L,n)
  j <- c(y, sapply(1:(n-1L), function(x) y <<- y+(n+1L)-x))
  t(vapply(1:n, function(x) c(rep(0L,x-1L),k[j[x]:(j[x]+n-x)]), t, USE.NAMES = FALSE))
}

Frank <- function(k, n) {
  m = matrix(0L, n, n)
  m[ which(lower.tri(m, diag=TRUE), arr.ind=TRUE)[, 2:1] ] = k
  m
}

docendo <- function(k,n) {
  m <- matrix(0L, n, n)
  m[lower.tri(m, diag = TRUE)] <- k
  t(m)
}

library(microbenchmark)
library(data.table)
library(ggplot2)
n <- c(10L, 100L, 1000L, 10000L)
k <- lapply(n, function(x) seq.int((x^2 + x)/2))

b <- lapply(seq_along(n), function(i) {
  bm <- microbenchmark(Joseph(k[[i]], n[i]), Frank(k[[i]], n[i]), docendo(k[[i]], n[i]), times = 10L)
  bm$n <- n[i]
  bm
})

b1 <- rbindlist(b)

ggplot(b1, aes(expr, time)) +
  geom_violin() +
  facet_wrap(~ n, scales = "free_y") +
  ggtitle("Benchmark for n = c(10L, 100L, 1000L, 10000L)")

Check equality of results:

all.equal(Joseph(k[[1]], n[1]), Frank(k[[1]], n[1]))
#[1] TRUE
all.equal(Joseph(k[[1]], n[1]), docendo(k[[1]], n[1]))
#[1] TRUE

Note: I didn't include George's approach in the comparison since, judging by Joseph's results, it seems to be a lot slower. So all approaches compared in my benchmark are written only in base R.

answered Oct 23 '22 14:10 by talat


A variation on @docendodiscimus' answer: Instead of transposing, you can swap the row and column indices, which you get by wrapping lower.tri in which(..., arr.ind = TRUE):

n = 5
m = matrix(0, n, n)

m[ which(lower.tri(m, diag=TRUE), arr.ind=TRUE)[, 2:1] ] = seq(sum(seq(n)))


     [,1] [,2] [,3] [,4] [,5]
[1,]    1    2    3    4    5
[2,]    0    6    7    8    9
[3,]    0    0   10   11   12
[4,]    0    0    0   13   14
[5,]    0    0    0    0   15

To understand how it works, look at the left-hand side in steps:

  • lower.tri(m, diag=TRUE)
  • which(lower.tri(m, diag=TRUE), arr.ind=TRUE)
  • which(lower.tri(m, diag=TRUE), arr.ind=TRUE)[, 2:1]
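The three steps can be sketched on a small 3 x 3 case. The intermediate names mask, idx and swapped are only for illustration and do not appear in the answer's one-liner:

```r
n <- 3
m <- matrix(0, n, n)

# Step 1: logical mask of the lower triangle, diagonal included
mask <- lower.tri(m, diag = TRUE)

# Step 2: (row, col) pairs of the TRUE cells, listed in column-major order:
# (1,1) (2,1) (3,1) (2,2) (3,2) (3,3)
idx <- which(mask, arr.ind = TRUE)

# Step 3: swap the two columns, so the pairs become
# (1,1) (1,2) (1,3) (2,2) (2,3) (3,3): the upper triangle, row by row
swapped <- idx[, 2:1]

# A two-column numeric matrix used as an index addresses one cell per row
m[swapped] <- seq_len(n * (n + 1) / 2)
m
#      [,1] [,2] [,3]
# [1,]    1    2    3
# [2,]    0    4    5
# [3,]    0    0    6
```

The key point is that indexing with a two-column matrix picks individual cells in the order the rows are listed, which is what lets the swapped pairs impose a row-wise fill.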

I guess transposing might be costly if the matrix is large, which is why I'd consider this option. Note: Joseph Wood's answer suggests that I am wrong, since the transposing way is faster in his benchmark.


(Thanks to @JosephWood:) Instead of enumerating and summing with sum(seq(n)), you can use (n^2 - n)/2 + n.
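As a quick sanity check, both expressions count the cells of the upper triangle including the diagonal, and the closed form simplifies to n * (n + 1) / 2. A small sketch for n = 5:

```r
n <- 5

sum(seq(n))       # enumerate 1..n and add them up: 15
(n^2 - n)/2 + n   # closed form suggested above:    15
n * (n + 1) / 2   # the same count, factored:       15
```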

answered Oct 23 '22 13:10 by Frank


library(miscTools)
k <- 1:15
triang(k, 5)
answered Oct 23 '22 14:10 by George Dontas