I have several data frames with similar (but not identical) series of variables (columns). I want to find a way for R to tell me what are the common variables across different data frames.
Example:
`a <- c(1, 2, 3)
b <- c(4, 5, 6)
c <- c(7, 8, 9)
df1 <- data.frame(a, b, c)
b <- c(1, 3, 5)
c <- c(2, 4, 6)
df2 <- data.frame(b, c)`
With df1
and df2
, I would want some way for R to tell me that the common variables are b
and c
.
1) For 2 data frames:
intersect(names(df1), names(df2))
## [1] "b" "c"
To get the names that are in df1 but not in df2:
setdiff(names(df1), names(df2))
1a) and for any number of data frames (i.e. get the names common to all of them):
L <- list(df1, df2)
Reduce(intersect, lapply(L, names))
## [1] "b" "c"
2) An alternative is to use duplicated
since the common names will be the ones that are duplicated if we concatenate the names of the two data frames.
nms <- c(names(df1), names(df2))
nms[duplicated(nms)]
## [1] "b" "c"
2a) To generalize that to n data frames use table
and look for the names that occur the same number of times as data frames:
L <- list(df1, df2)
tab <- table(unlist(lapply(L, names)))
names(tab[tab == length(L)])
## [1] "b" "c"
Use intersect
:
intersect(colnames(df1),colnames(df2))
OR
We can also check for the colname using %in%
:
colnames(df1)[colnames(df1) %in% colnames(df2)]
Output:
[1] "b" "c"
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With