df is a frequency table, where the values in a were reported as many times as recorded in column x,y,z. I'm trying to convert the frequency table to the original data, so I use the rep() function.
How do I loop the rep() function to give me the original data for x, y, z without having to repeat the function several times like I did below?
Also, can I input the result into a data frame, bearing in mind that the output will have different column lengths:
a <- (1:10)
x <- (6:15)
y <- (11:20)
z <- (16:25)
df <- data.frame(a,x,y,z)
df
rep(df[,1], df[,2])
rep(df[,1], df[,3])
rep(df[,1], df[,4])
If you don't want to repeat the for loop, you can always try using an apply function. Note that you cannot store it in a data.frame because the objects are of different lengths, but you could store it in a list and access the elements in a similar way to a data.frame. Something like this works:
df2<-sapply(df[,2:4],function(x) rep(df[,1],x))
What this sapply function is saying is for each column in df[,2:4], apply the rep(df[,1],x) function to it where x is one of your columns ( df[,2], df[,3], or df[,4]).
The below code just makes sure the apply function is giving the same result as your original way.
identical(df2$x,rep(df[,1], df[,2]))
[1] TRUE
identical(df2$y,rep(df[,1], df[,3]))
[1] TRUE
identical(df2$z,rep(df[,1], df[,4]))
[1] TRUE
EDIT:
If you want it as a data.frame object you can do this:
res<-as.data.frame(sapply(df2, '[', seq(max(sapply(df2, length)))))
Note this introduces NAs into your data.frame so be careful!
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With