I have 4 vectors (d1, d2, d3, d4) of different lengths, from which I create a data frame like this:
df <- data.frame(
  x = c(
    seq_along(d1),
    seq_along(d2),
    seq_along(d3),
    seq_along(d4)
  ),
  y = c(
    d1,
    d2,
    d3,
    d4
  ),
  id = c(
    rep("d1", times = length(d1)),
    rep("d2", times = length(d2)),
    rep("d3", times = length(d3)),
    rep("d4", times = length(d4))
  ))
Adding a new vector means adding it in 3 different places, which is what I'd like to avoid.
Ideally I would like to pass d1, d2, d3, d4 into a function that then returns the data frame.
The first step seems to be to wrap the vectors in a list and name them.
l <- list(d1,d2,d3,d4)
names(l) <- c("d1","d2","d3","d4")
But I am struggling with the second part, which should probably be something along these lines (pseudocode):
df <- data.frame(
x = flatten(map(l, function(a) seq_along(a))),
y = flatten(l),
id = flatten(map(l, function(a) rep(a.name,times=length(a))))
)
What's the correct way to construct the data frame from the list? Or is there a better way of doing this?
UPDATE: For demonstration purposes, d1..d4 could be imagined to be
d1 <- pnorm(seq(-2, 2, 0.05))-3
d2 <- pnorm(seq(-3, 3, 0.10))
d3 <- pnorm(seq(-1, 2, 0.05))-4
d4 <- pnorm(seq(-4, 3, 0.15))
You can define a function that takes any number of vectors:
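build_df <- function(...)
{
  # Collect the (named) arguments into a list
  vec_list <- list(...)
  # lapply() always returns a list, so do.call("c", ...) also works when all
  # vectors have the same length (sapply() would simplify that case to a matrix)
  df <- data.frame(x = do.call("c", lapply(vec_list, seq_along)),
                   y = do.call("c", vec_list),
                   name = do.call("c", lapply(seq_along(vec_list),
                                              function(i) rep(names(vec_list)[i],
                                                              length(vec_list[[i]]))))
  )
  # Reset the row names to 1..n (names of the input vectors can leak into them)
  rownames(df) <- seq(nrow(df))
  df
}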
build_df(d1 = 1:3, d2 = 6:9, bananas = 4:6)
#> x y name
#> 1 1 1 d1
#> 2 2 2 d1
#> 3 3 3 d1
#> 4 1 6 d2
#> 5 2 7 d2
#> 6 3 8 d2
#> 7 4 9 d2
#> 8 1 4 bananas
#> 9 2 5 bananas
#> 10 3 6 bananas
Created on 2020-08-03 by the reprex package (v0.3.0)
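If you already have the named list l from the question, a variant that takes the list directly may be convenient. The following is only a sketch: list_to_df is a hypothetical helper name, not part of the answer above, and it assumes that every element of the list is an atomic vector and that the list is fully named. It uses the id column name from the question rather than name.

```r
# Hypothetical helper (an assumption, not from the answer above):
# builds the long-format data frame from a named list of vectors.
list_to_df <- function(l) {
  stopifnot(is.list(l), !is.null(names(l)))
  data.frame(
    x  = unlist(lapply(l, seq_along), use.names = FALSE),  # 1..length within each vector
    y  = unlist(l, use.names = FALSE),                     # values, concatenated in list order
    id = rep(names(l), times = lengths(l))                 # list name repeated once per element
  )
}

# Example vectors from the question's UPDATE
d1 <- pnorm(seq(-2, 2, 0.05)) - 3
d2 <- pnorm(seq(-3, 3, 0.10))
d3 <- pnorm(seq(-1, 2, 0.05)) - 4
d4 <- pnorm(seq(-4, 3, 0.15))

l <- list(d1 = d1, d2 = d2, d3 = d3, d4 = d4)
df <- list_to_df(l)
str(df)
```

With this version, adding a fifth vector only requires adding it to the list; unlist(..., use.names = FALSE) drops the element names, so the row names do not need to be reset afterwards.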