Combine each element of a vector with another vector in R

I have two vectors

x <- c(2, 3, 4)
y <- rep(0, 5)

I want to get the following output:

> z
2, 0, 0, 0, 0, 0, 3, 0, 0, 0, 0, 0, 4, 0, 0, 0, 0, 0

How can I create z? I have tried to use paste and c, but nothing seems to work. The only thing I can think of is a for() loop, which is terribly slow. I have googled this and I am sure the solution is out there; I am just not hitting the right keywords.

UPDATE: For benchmarking purposes:

Using Nicola's solution:

 > system.time(
+ precipitation <- `[<-`(numeric(length(x)*(length(y)+1)),seq(1,by=length(y)+1,length.out=length(x)),x)
+ )
user  system elapsed 
0.419   0.407   0.827 

This is ridiculously fast, I must say! Can someone please explain this to me? My for() loop, which I know is always wrong in R, would have taken at least a day, if it finished at all.

The other suggestions:

 > length(prate)
[1] 4914594
> length(empty)
[1] 207
> system.time(
+ precipitation <- unlist(sapply(prate, FUN = function(prate) c(prate,empty), simplify=FALSE))
+ )
user  system elapsed 
16.470   3.859  28.904 

I had to kill the following after 15 minutes:

len <- length(prate)
precip2 <- c(rbind(prate, matrix(rep(empty, len), ncol = len)))

asked Apr 22 '15 10:04 by cdd


5 Answers

This seems faster for some reason:

 unlist(t(matrix(c(as.list(x),rep(list(y),length(x))),ncol=2)))
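Unpacked into steps, the one-liner above can be read roughly as follows. This is only a sketch: the intermediate names (cells, m, z) are made up for illustration, and y is given non-zero values here just to show that the approach does not depend on y being all zeroes.

```r
x <- c(2, 3, 4)
y <- c(7, 8)   # illustrative values, not the zeroes from the question

# A list holding each element of x, followed by length(x) copies of y
cells <- c(as.list(x), rep(list(y), length(x)))

# Arrange as a list-matrix: column 1 holds the x values, column 2 the copies of y
m <- matrix(cells, ncol = 2)

# After transposing, each column is (x[i], y); unlist() walks it column by column
z <- unlist(t(m))
z
# [1] 2 7 8 3 7 8 4 7 8
```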

The above solution is general, in the sense that both x and y can hold any values. In the OP's case, where y consists only of zeroes, the following is about as fast as it can be:

 `[<-`(numeric(length(x)*(length(y)+1)),seq(1,by=length(y)+1,length.out=length(x)),x)
 #[1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

Edit

I realise I've been very cryptic and the code I produced is not easy to follow, despite being just one line. I'm gonna explain in detail what the second solution does.

First of all, notice that the resulting vector will hold the values contained in x plus the zeroes in y repeated length(x) times. So in total, it will be length(x) + length(x)*length(y), or length(x)*(length(y)+1), long. So we create a vector of just zeroes, as long as needed:

  res<-numeric(length(x)*(length(y)+1))

Now we have to place the x values in res. The first value of x occupies the first position in res; the second sits length(y)+1 positions after the first, and so on, until all length(x) values are placed. We can create a vector of the indices at which to put the x values:

  indices<-seq.int(1,by=length(y)+1,length.out=length(x))

And then we make the replacement:

  res[indices]<-x

My line was just a shortcut for the three lines above. Hope this clarifies a little.
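For reuse, the three steps can be wrapped in a small helper. The function name pad_with_zeros and its argument n_zeros are hypothetical, chosen only for this sketch, and it assumes the filler is all zeroes, as in the question. The `[<-` in the one-liner is simply the indexed-assignment operator called as an ordinary function, so both forms should give the same result.

```r
# Hypothetical helper: put n_zeros zeroes after every element of x
pad_with_zeros <- function(x, n_zeros) {
  # Step 1: a vector of zeroes of the final length
  res <- numeric(length(x) * (n_zeros + 1))
  # Step 2: positions 1, n_zeros + 2, 2 * n_zeros + 3, ... receive the x values
  indices <- seq.int(1, by = n_zeros + 1, length.out = length(x))
  # Step 3: overwrite those positions with x
  res[indices] <- x
  res
}

x <- c(2, 3, 4)
y <- rep(0, 5)

z <- pad_with_zeros(x, length(y))
z
# [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

# The one-line form from the answer, for comparison
one_liner <- `[<-`(numeric(length(x) * (length(y) + 1)),
                   seq(1, by = length(y) + 1, length.out = length(x)), x)
identical(z, one_liner)
# [1] TRUE
```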

answered Oct 28 '22 23:10 by nicola


You can try this:

unlist(sapply(x, FUN = function(x) c(x,y), simplify=FALSE))
 [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

Or, more simply (from @docendodiscimus):

unlist(lapply(x, FUN = function(x) c(x,y)))

answered Oct 29 '22 01:10 by Mamoun Benghezal


You could also try to vectorize as follows

len <- length(x)
c(rbind(x, matrix(rep(y, len), ncol = len)))
## [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

A more compact, but potentially slower option (contributed by @akrun) would be

c(rbind(x, replicate(len, y)))
## [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0
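
To see why the rbind approach works, it may help to look at the intermediate matrix. In the sketch below (the name m is only illustrative), rbind stacks x on top of a matrix whose columns are copies of y, and c() then flattens the result column by column, which is how R stores matrices.

```r
x <- c(2, 3, 4)
y <- rep(0, 5)
len <- length(x)

# Row 1 is x; rows 2 to 6 are the copies of y, one copy per column
m <- rbind(x, matrix(rep(y, len), ncol = len))
dim(m)
# [1] 6 3

# Flattening column by column yields x[1], y, x[2], y, x[3], y
z <- c(m)
z
# [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0
```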

answered Oct 29 '22 00:10 by David Arenburg


You can try:

 c(sapply(x, 'c', y))
 #[1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

Or a crazy solution with gsub and paste:

library(functional)
p = Curry(paste0, collapse='')

as.numeric(strsplit(p(gsub('(.*)$', paste0('\\1',p(y)),x)),'')[[1]])
#[1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0

answered Oct 29 '22 01:10 by Colonel Beauvel


Here's another way (it relies on every element of x being a single digit and on y being five zeroes, hence the 10^5):

options(scipen=100)
as.numeric(unlist(strsplit(as.character(x * 10^5), "")))

And some benchmarks:

library(microbenchmark)
microbenchmark(
  {as.numeric(unlist(strsplit(as.character(x*10^5), "")))},
  {unlist(t(matrix(c(as.list(x),rep(list(y),length(x))),ncol=2)))},
  {unlist(sapply(x, FUN = function(x) c(x,y), simplify=FALSE))},
  times=100000)
Unit: microseconds
                                                                        expr
            {     as.numeric(unlist(strsplit(as.character(x * 10^5), ""))) }
 {     unlist(t(matrix(c(as.list(x), rep(list(y), length(x))), ncol = 2))) }
      {     unlist(sapply(x, FUN = function(x) c(x, y), simplify = FALSE)) }
   min     lq     mean median     uq       max  neval
 9.286 10.644 12.15242 11.678 12.286  1650.133 100000
 9.485 11.164 13.25424 12.288 13.067  1887.761 100000
 5.607  7.429  9.21015  8.147  8.784 30457.994 100000

And here's another idea (but it seems slow):

r = rle(1)
r$lengths = rep(c(1,5), length(x))
r$values =  as.vector(rbind(x, 0))
inverse.rle(r)
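
For readers unfamiliar with run-length encoding: inverse.rle() repeats values[i] lengths[i] times, so alternating run lengths of 1 and 5 expand each x value once and each 0 five times. Below is a sketch of the same idea using rep() directly, with length(y) in place of the hard-coded 5. That substitution is an assumption about the intent, the names values and run_lengths are illustrative, and the zero filler is kept as in the answer.

```r
x <- c(2, 3, 4)
y <- rep(0, 5)

# Interleave each x value with a single 0: 2 0 3 0 4 0
values <- as.vector(rbind(x, 0))
# Each x value is kept once; each 0 is repeated length(y) times
run_lengths <- rep(c(1, length(y)), length(x))

z <- rep(values, times = run_lengths)
z
# [1] 2 0 0 0 0 0 3 0 0 0 0 0 4 0 0 0 0 0
```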

answered Oct 29 '22 00:10 by nsheff