
Getting pairwise proportions of concordance in a binary dataframe

Tags: r, lapply, binary

I have a dataframe with binary values like so:

df<-data.frame(a=rep(c(1,0),9),b=rep(c(0,1,0),6),c=rep(c(0,1),9))

The goal is first to obtain all pairwise combinations of columns:

combos <- function(df, n) {
  unlist(lapply(n, function(x) combn(df, x, simplify = FALSE)), recursive = FALSE)
}

j <- combos(df, 2)

Next, for each two-column dataframe in list j, I want the proportion of rows in which both columns agree, i.e. the row is either (0,0) or (1,1). I can get the proportions like so:

lapply(j, function(x) data.frame(new = rowSums(x[,1:2])))->k
lapply(k, function(x) data.frame(prop1 = length(which(x==1))/18,prop2=length(which(x==0|x==2))/18))

However, this seems slow and complicated for larger lists. A couple of questions: 1) Is there a faster or better method than this? My actual list is 20 dataframes, each with dimensions 250 x 400. I tried dist(df, method = "binary"), but it looks like the binary method does not take (0,0) instances into account.

2) Also, why does dividing by length(x[1]) or lengths(x[1]) not give me 18? In the example I divided by hard-coding the length of the vector new.

Any help is very much appreciated!

thisisrg asked Nov 14 '25 11:11


1 Answer

#Get the combinations
j = combn(x = df, m = 2, simplify = FALSE)

#Get the Proportions
sapply(j, function(x) length(which(x[1] == x[2]))/NROW(x))
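On the second question, here is a small sketch of why length(x[1]) does not return 18. It reuses the example df from the question; the variable names x and prop are only illustrative. Single-bracket indexing on a data frame returns another data frame, and length() of a data frame counts columns, which is why NROW(x) is used above.

```r
df <- data.frame(a = rep(c(1, 0), 9), b = rep(c(0, 1, 0), 6), c = rep(c(0, 1), 9))

# One two-column pair, shaped like an element of the list j
x <- df[c("a", "b")]

length(x[1])    # 1: x[1] is still a data frame, so length() counts its columns
length(x[[1]])  # 18: x[[1]] extracts the column as a plain vector
NROW(x)         # 18: number of rows

# mean() of a logical vector is the proportion of TRUEs,
# so the explicit division can be skipped altogether
prop <- mean(x[[1]] == x[[2]])
```

With mean() the denominator never has to be written out, so the hard-coded 18 is not needed.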

As @thelatemail commented, if you do not need to store the intermediate combinations, you can do it all in one step:

combn(x = df, m = 2, FUN=function(x) length(which(x[1] == x[2]))/NROW(x))
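For wider data such as the 250 x 400 dataframes mentioned in the question, a matrix-based alternative may be worth trying. This is a sketch rather than a benchmarked solution, and it assumes every column contains only 0 and 1 with no missing values. The names m, n and agree are illustrative.

```r
df <- data.frame(a = rep(c(1, 0), 9), b = rep(c(0, 1, 0), 6), c = rep(c(0, 1), 9))

m <- as.matrix(df)
n <- nrow(m)

# crossprod(m) counts, for every pair of columns, the rows where both are 1;
# crossprod(1 - m) does the same for rows where both are 0.
# Their sum divided by the number of rows is the proportion of concordant rows.
agree <- (crossprod(m) + crossprod(1 - m)) / n

# Full symmetric matrix, with row and column names taken from df
agree

# One value per unordered pair of columns
pairwise <- agree[upper.tri(agree)]
```

The result is a symmetric matrix with 1 on the diagonal, so a specific pair can be looked up by name, for example agree["a", "b"]. Note that the order of values extracted with upper.tri() is column-major and is not guaranteed to match the order produced by combn() when there are more than three columns.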
d.b answered Nov 17 '25 08:11


