I would like to compare two lists (two rows of a data frame) and count how many differences I have between the two lists.
For instance:
list1=a,b,c,a
list2=a,a,d,d
Here, two elements of list1 are also in list2.
I am able to do that with a loop and sum but it is very inefficient. Is there any function to do that in R?
I have checked setdiff and the compare package but did not find anything that helps.
Thanks for your ideas,
Vincent
My function looks like:
NRebalancing=function(NamePresent)
{
  Nbexchange=NamePresent[,2]
  Nbexchange=NamePresent[1,2]=0
  for (i in 2:nrow(NamePresent))
  {
    print(i)
    compteur=0
    NameNeeded=NamePresent[i,]
    NameNeeded=unique(NameNeeded)
    NameNeeded=na.omit(NameNeeded)
    for(j in 2:length(NameNeeded))
    # j=1 corresponds to a date
    {
      compteur = compteur+(abs(sum(NamePresent[i,]==as.character(NameNeeded[j]))-sum(NamePresent[i-1,]==as.character(NameNeeded[j]))))
    }
    Nbexchange[i]=compteur
  }
  return(Nbexchange)
}
One main point: your "list" isn't an R list, which is a special data structure in R. You are using vectors (l1 is defined further down):
R> is.vector(l1)
[1] TRUE
R> is.list(l1)
[1] FALSE
So don't call your variables list1 if they are vectors.
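The distinction matters for the data-frame case in the question, because a row taken from a data frame with several columns is still a data frame, and therefore a list. A minimal sketch, assuming a made-up one-row data frame df with hypothetical column names n1 to n4:

```r
# Hypothetical data: one row holding four names
df <- data.frame(n1 = "a", n2 = "b", n3 = "c", n4 = "a",
                 stringsAsFactors = FALSE)

row1 <- df[1, ]
row_is_list <- is.list(row1)   # TRUE: a data frame is a list of columns

# unlist() flattens the row into a (named) character vector
v <- unlist(row1)
v_is_vector <- is.vector(v)    # TRUE
v_is_list <- is.list(v)        # FALSE
```

After unlist(), the row can be used with the vector functions shown in the rest of this answer.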
Since you have a vector there are lots of possibilities open.
The %in% operator:
R> l1 = c("a", "b", "c", "d")
R> l2 = c("a", "a", "d", "d")
R> l1[l1 %in% l2]
[1] "a" "d"
Or use is.element:
R> l1[is.element(l1, l2)]
[1] "a" "d"
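Since the question asks for a count rather than the matching elements themselves, one option is to sum the logical vector that %in% returns. This is only a sketch; the names in_l2, n_common and n_missing are made up for illustration:

```r
l1 <- c("a", "b", "c", "d")
l2 <- c("a", "a", "d", "d")

# l1 %in% l2 is a logical vector: TRUE where an element of l1 occurs in l2
in_l2 <- l1 %in% l2

# TRUE counts as 1, so sum() gives the number of elements of l1 found in l2
n_common <- sum(in_l2)

# Negating the vector counts the elements of l1 that do not occur in l2
n_missing <- sum(!in_l2)
```

Note that %in% is not symmetric: sum(l2 %in% l1) counts the elements of l2 found in l1, which can give a different number.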
There is also unique, which returns the distinct values:
R> unique(l2)
[1] "a" "d"
Following your comment to @mrdwab, you can count the number of occurrences using a combination of sapply and unique:
sapply(unique(l1), function(i) sum(i==l2))
Here i==l2 compares i with each element of l2, sum counts the number of times TRUE appears, and sapply is basically just a for loop over unique(l1):
R> sapply(unique(l1), function(i) sum(i==l2))
a b c d
2 0 0 2
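To make the "sapply is just a for loop" point concrete, here is a sketch of the same computation written as an explicit loop. The names keys, counts and counts_sapply are made up for illustration:

```r
l1 <- c("a", "b", "c", "d")
l2 <- c("a", "a", "d", "d")

keys <- unique(l1)

# Pre-allocate one named integer counter per distinct value of l1
counts <- setNames(integer(length(keys)), keys)

# Explicit loop: for each distinct value, count how often it appears in l2
for (k in keys) {
  counts[k] <- sum(k == l2)
}

# The sapply version from above, for comparison
counts_sapply <- sapply(keys, function(i) sum(i == l2))
```

Both counts and counts_sapply hold the same named counts; sapply simply saves the pre-allocation and the indexing.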
A very nice suggestion from @mrdwab is to use table and colSums:
R> table(l1, l2)
   l2
l1  a d
  a 1 0
  b 1 0
  c 0 1
  d 0 1
R> colSums(table(l1, l2))
a d
2 2