I want to differentiate data vectors to find those that are similar. For example:
A=[4,5,6,7,8];
B=[4,5,6,6,8];
C=[4,5,6,7,7];
D=[1,2,3,9,9];
E=[1,2,3,9,8];
In the previous example I want to distinguish that A,B,C vectors are similar (not the same) to each other and D,E are similiar to each other. The result should be something like: A,B,C are similar and D,E are similar, but the group A,B,C is not similar to the group of D,E. Matlab can do this? I was thinking using some classification algorithm or Kmeans,ROC,etc.. but I'm not sure which one will be the best one.
Any suggestion? Thanks in advance
The difference of two vector x=b−a is formed by placing the tails of the two vectors together. Then, the vector x goes from the head of a to the tail of b.
A vector difference is the result of subtracting one vector from another. A vector difference is denoted using the normal minus sign, i.e., the vector difference of vectors and is written .
One of my new favourite methods for this sort of thing is agglomerate clustering.
First, concatenate all your vectors into a matrix, where each row is a separate vector. This makes such methods much easier to use:
F = [A; B; C; D; E];
Then the linkages can be found:
Z = linkage(F, 'ward', 'euclidean');
This can be plotted using:
dendrogram(Z);
This shows a tree, where each leaf at the bottom is one of the original vectors. Lengths of the branches show similarities and dissimilarities.
As you can see, 1, 2 and 3 are shown to be very close, as are 4 and 5. This even gives a measure of closeness, and shows that vectors 1 and 3 are deemed to be closer than vectors 2 and 3 (in the sense that, percentagewise, 7 is closer to 8 than 6 is to 7).
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With