
Computing degree of similarity among a group of sets

Suppose there are 4 sets:

s1={1,2,3,4};
s2={2,3,4};
s3={2,3,4,5};
s4={1,3,4,5};

Is there a standard metric for expressing the degree of similarity of this group of 4 sets?

Thank you for suggesting the Jaccard method. However, it seems to be pairwise. How can I compute the degree of similarity of the whole group of sets?

Soup asked Jan 09 '10 23:01



2 Answers

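Pairwise, you can compute the Jaccard distance of two sets A and B: 1 - |A ∩ B| / |A ∪ B|, i.e. one minus the share of elements in the union that both sets have in common. Equivalently, treat the sets as vectors of booleans in a space where {1, 2, 3…} are all unit vectors; the distance is then the fraction of coordinates set in at least one vector on which the two vectors differ.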
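As a minimal sketch in Python, the pairwise Jaccard index can be computed directly with set operations. The two group-level aggregates shown (`mean_pairwise_jaccard` and `group_jaccard`) are illustrative assumptions about how one might extend the pairwise measure to a whole group, not an established standard; the function names are hypothetical.

```python
from itertools import combinations


def jaccard_index(a, b):
    """Return |A ∩ B| / |A ∪ B|; treated as 1.0 when both sets are empty."""
    union = a | b
    if not union:
        return 1.0
    return len(a & b) / len(union)


def jaccard_distance(a, b):
    """Return 1 minus the Jaccard index."""
    return 1.0 - jaccard_index(a, b)


def mean_pairwise_jaccard(sets):
    """Assumed group measure: average Jaccard index over all pairs."""
    pairs = list(combinations(sets, 2))
    if not pairs:
        raise ValueError("need at least two sets")
    return sum(jaccard_index(a, b) for a, b in pairs) / len(pairs)


def group_jaccard(sets):
    """Assumed group measure: |intersection of all| / |union of all|."""
    if not sets:
        raise ValueError("need at least one set")
    union = set().union(*sets)
    if not union:
        return 1.0
    common = set.intersection(*sets)
    return len(common) / len(union)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}
s3 = {2, 3, 4, 5}
s4 = {1, 3, 4, 5}
group = [s1, s2, s3, s4]

print(jaccard_index(s1, s2))                    # 0.75
print(jaccard_distance(s1, s2))                 # 0.25
print(round(mean_pairwise_jaccard(group), 3))   # 0.617
print(group_jaccard(group))                     # 0.4
```

The two aggregates answer different questions: the mean of pairwise values tolerates one outlier set, while the intersection-over-union ratio drops as soon as any single set is missing an element.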

Tobu answered Sep 21 '22 15:09



Your question isn't very specific, but I suppose you mean something like the "edit distance" between the sets, i.e. how much you need to change s1 to get to s2?

Check out the Wikipedia article on Edit distance.
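Edit distance is usually defined for strings. For unordered sets, one plausible reading (an assumption here, not something the Wikipedia article prescribes) is to allow only two operations, inserting or deleting a single element, in which case the distance is the size of the symmetric difference. A minimal Python sketch, with `set_edit_distance` as a hypothetical name:

```python
def set_edit_distance(a, b):
    """Number of single-element insertions or deletions turning a into b.

    Assumes insert and delete are the only allowed operations, so the
    result is the size of the symmetric difference of the two sets.
    """
    return len(a ^ b)


s1 = {1, 2, 3, 4}
s2 = {2, 3, 4}
s3 = {2, 3, 4, 5}
s4 = {1, 3, 4, 5}

print(set_edit_distance(s1, s2))  # 1: delete 1
print(set_edit_distance(s1, s3))  # 2: delete 1, insert 5
print(set_edit_distance(s2, s4))  # 3: delete 2, insert 1 and 5
```

Like the Jaccard distance, this is still a pairwise measure, so it would need to be aggregated over pairs to describe the whole group.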

adamse answered Sep 19 '22 15:09
