Repeat vector to fill down column in data frame

Seems like this very simple maneuver used to work for me, and now it simply doesn't. A dummy version of the problem:

df <- data.frame(x = 1:5) # create simple dataframe
df
  x
1 1
2 2
3 3
4 4
5 5

df$y <- c(1:5) # adding a new column with a vector of the exact same length. Works out like it should
df
  x y
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5

df$z <- c(1:4) # trying to add a new column, this time with a vector with fewer elements than there are rows in the dataframe.

Error in `$<-.data.frame`(`*tmp*`, "z", value = 1:4) : 
  replacement has 4 rows, data has 5

I was expecting this to work with the following result:

  x y z
1 1 1 1
2 2 2 2
3 3 3 3
4 4 4 4
5 5 5 1

I.e. the shorter vector should just start repeating itself automatically. I'm pretty certain this used to work for me (it's in a script that I've run a hundred times before without problems). Now I can't even get the above dummy example to work the way I want it to. What am I missing?

Morten Nielsen asked Dec 25 '22 03:12


1 Answer

If the number of rows in the data.frame is an exact multiple of the vector's length, the vector is recycled evenly and you do not get an error or a warning:

df <- data.frame(x = 1:10)
df$z <- 1:5

This may be what you were experiencing before.
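As a minimal sketch of the difference, the snippet below tries both cases on the same 10-row data.frame. The column name `w` and the `tryCatch` wrapper are only there for illustration; they capture the error message instead of stopping the script.

```r
df <- data.frame(x = 1:10)

# 10 is a multiple of 5, so the vector is recycled silently
df$z <- 1:5

# 10 is not a multiple of 4, so the assignment raises an error;
# tryCatch returns the error message instead of halting
res <- tryCatch(
  {
    df$w <- 1:4
    "ok"
  },
  error = function(e) conditionMessage(e)
)

print(df$z)
print(res)
```

After running this, `df` has the recycled column `z` but no column `w`, because the failed assignment leaves the data.frame unchanged.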

To make the vector fill the column the way you describe, even when it does not divide evenly, use rep_len:

df$y <- rep_len(1:3, length.out=10)

This results in

df
    x z y
1   1 1 1
2   2 2 2
3   3 3 3
4   4 4 1
5   5 5 2
6   6 1 3
7   7 2 1
8   8 3 2
9   9 4 3
10 10 5 1

Note that in place of rep_len, you could use the more common rep function:

df$y <- rep(1:3, length.out = 10)

From the help file for rep:

rep.int and rep_len are faster simplified versions for two common cases. They are not generic.
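To avoid hard-coding the row count, the target length can be taken from `nrow(df)`. The sketch below wraps this in a small helper; `fill_down` is a hypothetical name made up for this example, not a function from base R or any package.

```r
# fill_down() is a hypothetical helper: it repeats (or truncates) `values`
# so that it has exactly one element per row, then stores it as column `name`
fill_down <- function(df, name, values) {
  df[[name]] <- rep_len(values, nrow(df))
  df
}

df <- data.frame(x = 1:5)
df <- fill_down(df, "z", 1:4)
print(df)
```

Applied to the 5-row example from the question, this fills `z` with 1, 2, 3, 4 and then starts over at 1 for the last row.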

lmo answered Dec 26 '22 19:12