
How to find common variables in different data frames?

Tags: dataframe, r

I have several data frames with similar (but not identical) sets of variables (columns). I want R to tell me which variables are common to the different data frames.

Example:

`a <- c(1, 2, 3)
b <- c(4, 5, 6)
c <- c(7, 8, 9)
df1 <- data.frame(a, b, c)
b <- c(1, 3, 5)
c <- c(2, 4, 6)
df2 <- data.frame(b, c)`

With df1 and df2, I would want some way for R to tell me that the common variables are b and c.

Marco Pastor Mayo asked Jan 28 '23 07:01


2 Answers

1) For 2 data frames:

intersect(names(df1), names(df2))
## [1] "b" "c"

To get the names that are in df1 but not in df2:

setdiff(names(df1), names(df2))
## [1] "a"

1a) For any number of data frames, i.e. to get the names common to all of them:

L <- list(df1, df2)
Reduce(intersect, lapply(L, names))
## [1] "b" "c"
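
As a small illustration of 1a), the sketch below applies the same Reduce/intersect idea to three data frames. The third data frame df3 and the helper name common_names are made up for this example and are not part of the question.

```r
# Data frames from the question
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
df2 <- data.frame(b = c(1, 3, 5), c = c(2, 4, 6))
# Hypothetical third data frame, added only for illustration
df3 <- data.frame(c = c(0, 0, 0), d = c(1, 1, 1))

# Hypothetical helper: names shared by every data frame passed in
common_names <- function(...) {
  Reduce(intersect, lapply(list(...), names))
}

common_names(df1, df2)
## [1] "b" "c"
common_names(df1, df2, df3)
## [1] "c"
```

Because intersect keeps the order of its first argument, the result follows the column order of the first data frame.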

2) An alternative is to use duplicated: if we concatenate the names of the two data frames, the common names are the ones that are duplicated (assuming no name is repeated within a single data frame).

nms <- c(names(df1), names(df2))
nms[duplicated(nms)]
## [1] "b" "c"

2a) To generalize that to n data frames, use table and keep the names whose count equals the number of data frames:

L <- list(df1, df2)
tab <- table(unlist(lapply(L, names)))
names(tab[tab == length(L)])
## [1] "b" "c"
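
One side effect of the table approach is that the counts can also answer looser questions, such as which names occur in at least two of the data frames. The following is only a sketch: df3 is a hypothetical third data frame, and the unique() call is an added safeguard in case a name is repeated within one data frame.

```r
# Data frames from the question
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
df2 <- data.frame(b = c(1, 3, 5), c = c(2, 4, 6))
# Hypothetical third data frame, added only for illustration
df3 <- data.frame(c = c(0, 0, 0), d = c(1, 1, 1))

L <- list(df1, df2, df3)

# Count in how many data frames each name appears;
# unique() ensures a name is counted at most once per data frame
counts <- table(unlist(lapply(L, function(d) unique(names(d)))))

# Names present in every data frame
names(counts[counts == length(L)])
## [1] "c"

# Names present in at least two data frames
names(counts[counts >= 2])
## [1] "b" "c"
```

Note that table sorts the names alphabetically, so the result does not necessarily follow the column order of any of the data frames.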
G. Grothendieck answered Jan 31 '23 08:01


Use intersect:

intersect(colnames(df1), colnames(df2))

Alternatively, use %in% to check which column names of df1 also appear in df2:

colnames(df1)[colnames(df1) %in% colnames(df2)]

Output:

[1] "b" "c"
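
The %in% form produces a logical mask over the columns of df1, which may be handy if the goal is to keep only the shared columns rather than just list their names. A minimal sketch, where keep and df1_shared are names chosen for this example:

```r
# Data frames from the question
df1 <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))
df2 <- data.frame(b = c(1, 3, 5), c = c(2, 4, 6))

# Logical mask over df1's columns: TRUE where the name also occurs in df2
keep <- colnames(df1) %in% colnames(df2)
keep
## [1] FALSE  TRUE  TRUE

# List-style indexing with the mask keeps the result a data frame
df1_shared <- df1[keep]
colnames(df1_shared)
## [1] "b" "c"
```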
Saurabh Chauhan answered Jan 31 '23 07:01