I want to add variables from dat2:
          concreteness familiarity typicality
amoeba            3.60        1.30       1.71
bacterium         3.82        3.48       2.13
leech             5.71        1.83       4.50
To dat1:
    ID  variable value
1    1    amoeba     0
2    2    amoeba     0
3    3    amoeba    NA
251  1 bacterium     0
252  2 bacterium     0
253  3 bacterium     0
501  1     leech     1
502  2     leech     1
503  3     leech     0
Giving the following output:
    X ID  variable value concreteness familiarity typicality
1   1  1    amoeba     0         3.60        1.30       1.71
2   2  2    amoeba     0         3.60        1.30       1.71
3   3  3    amoeba    NA         3.60        1.30       1.71
4 251  1 bacterium     0         3.82        3.48       2.13
5 252  2 bacterium     0         3.82        3.48       2.13
6 253  3 bacterium     0         3.82        3.48       2.13
7 501  1     leech     1         5.71        1.83       4.50
8 502  2     leech     1         5.71        1.83       4.50
9 503  3     leech     0         5.71        1.83       4.50
As you can see, the info from dat2 has to be replicated over several rows of dat1.
This was my failed attempt:
dat3 <- merge(dat1, dat2, by=intersect(dat1$variable(dat1), dat2$row.names(dat2)))
Giving the following error:
Error in as.vector(y) : attempt to apply non-function
Please find reproducible examples here:
dat1:
structure(list(ID = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L), variable = structure(c(1L,
1L, 1L, 2L, 2L, 2L, 3L, 3L, 3L), .Label = c("amoeba", "bacterium",
"leech", "centipede", "lizard", "tapeworm", "head lice", "maggot",
"ant", "moth", "mosquito", "earthworm", "caterpillar", "scorpion",
"snail", "spider", "grasshopper", "dust mite", "tarantula", "termite",
"bat", "wasp", "silkworm"), class = "factor"), value = c(0L,
0L, NA, 0L, 0L, 0L, 1L, 1L, 0L)), .Names = c("ID", "variable",
"value"), row.names = c(1L, 2L, 3L, 251L, 252L, 253L, 501L, 502L,
503L), class = "data.frame")
dat2:
structure(list(concreteness = c(3.6, 3.82, 5.71), familiarity = c(1.3,
3.48, 1.83), typicality = c(1.71, 2.13, 4.5)), .Names = c("concreteness",
"familiarity", "typicality"), row.names = c("amoeba", "bacterium",
"leech"), class = "data.frame")
You could add a join variable to dat2 and then use merge:
dat2$variable <- rownames(dat2)
merge(dat1, dat2)
   variable ID value concreteness familiarity typicality
1    amoeba  1     0         3.60        1.30       1.71
2    amoeba  2     0         3.60        1.30       1.71
3    amoeba  3    NA         3.60        1.30       1.71
4 bacterium  1     0         3.82        3.48       2.13
5 bacterium  2     0         3.82        3.48       2.13
6 bacterium  3     0         3.82        3.48       2.13
7     leech  1     1         5.71        1.83       4.50
8     leech  2     1         5.71        1.83       4.50
9     leech  3     0         5.71        1.83       4.50
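Note that merge moves the join column to the front and resets the row names. If you would rather keep dat1's column order and its original row names (1, 2, 3, 251, ...), one possible alternative is to look up each dat1$variable in rownames(dat2) with match() and bind the matching rows on. This is only a sketch: the small data frames below are rebuilt by hand from the question so the snippet runs on its own, and idx and dat3 are just illustrative names.

```r
# Small copies of the question's data, rebuilt so the example is self-contained
dat1 <- data.frame(
  ID = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
  variable = factor(rep(c("amoeba", "bacterium", "leech"), each = 3)),
  value = c(0L, 0L, NA, 0L, 0L, 0L, 1L, 1L, 0L),
  row.names = c(1L, 2L, 3L, 251L, 252L, 253L, 501L, 502L, 503L)
)
dat2 <- data.frame(
  concreteness = c(3.6, 3.82, 5.71),
  familiarity = c(1.3, 3.48, 1.83),
  typicality = c(1.71, 2.13, 4.5),
  row.names = c("amoeba", "bacterium", "leech")
)

# For every row of dat1, find the position of its variable in rownames(dat2);
# names that are not found give NA, which yields a row of NAs below
idx <- match(as.character(dat1$variable), rownames(dat2))

# Repeat the matching dat2 rows and bind them next to dat1
dat3 <- cbind(dat1, dat2[idx, , drop = FALSE])

# Put dat1's original row names back explicitly
rownames(dat3) <- rownames(dat1)
dat3
```

Unlike merge, this never reorders or drops rows of dat1, so it behaves like a left join on a single key.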
Try this:
merge(dat1, dat2, by.x = 2, by.y = 0, all.x = TRUE)
This assumes that if there are any rows in dat1 that are unmatched, then the dat2 columns in the result should be filled with NA, and that any unmatched rows in dat2 are disregarded. For example:
dat2a <- dat2
rownames(dat2a)[3] <- "elephant"
# the above still works:
merge(dat1, dat2a, by.x = 2, by.y = 0, all.x = TRUE)
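To see what all.x = TRUE changes, it may help to run the same merge with and without it. The sketch below rebuilds small versions of dat1 and dat2 by hand so that it runs on its own; left and inner are just illustrative names.

```r
# Small copies of the question's data, rebuilt so the example is self-contained
dat1 <- data.frame(
  ID = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L, 3L),
  variable = factor(rep(c("amoeba", "bacterium", "leech"), each = 3)),
  value = c(0L, 0L, NA, 0L, 0L, 0L, 1L, 1L, 0L)
)
dat2 <- data.frame(
  concreteness = c(3.6, 3.82, 5.71),
  familiarity = c(1.3, 3.48, 1.83),
  typicality = c(1.71, 2.13, 4.5),
  row.names = c("amoeba", "bacterium", "leech")
)

# Rename the third row so that "leech" in dat1 no longer has a match
dat2a <- dat2
rownames(dat2a)[3] <- "elephant"

# Left join: all rows of dat1 are kept, unmatched ones get NA in the dat2 columns
left <- merge(dat1, dat2a, by.x = 2, by.y = 0, all.x = TRUE)

# Default (inner join): rows of dat1 without a match are dropped
inner <- merge(dat1, dat2a, by.x = 2, by.y = 0)

nrow(left)   # all nine dat1 rows
nrow(inner)  # only the six amoeba and bacterium rows
```

In both cases the "elephant" row of dat2a is ignored, because nothing in dat1 refers to it.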
The above is known as a left join in SQL and can be done like this in sqldf (ignore the warning):
library(sqldf)
sqldf("select *
from dat1 left join dat2
on dat1.variable = dat2.row_names",
row.names = TRUE)