
'Similarity' in Data Mining

In the field of Data Mining, is there a specific sub-discipline called 'Similarity'? If yes, what does it deal with? Any examples, links, or references would be helpful.

Also, being new to the field, I would like the community's opinion on how closely related Data Mining and Artificial Intelligence are. Are they synonyms, or is one a subset of the other?

Thanks in advance for sharing your knowledge.

Shailesh Tainwala asked May 22 '10 09:05



2 Answers

In the field of Data Mining, is there a specific sub-discipline called 'Similarity'?

Yes. There is a specific subfield of data mining and machine learning called metric learning, which aims to learn a distance metric from the data itself, so that instances that should count as similar end up close together.

Do you know any of the following concepts?

Euclidean distance

Mahalanobis distance

Pearson correlation

Cosine similarity

Kernel functions

Once you know these, you will understand what 'similarity' means.
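To make the list above concrete, here is a minimal sketch of each measure in plain Python. The function names, the `inv_cov` argument (an inverse covariance matrix you would have to estimate from your own data), and the `gamma` parameter are illustrative assumptions, and the RBF kernel is just one common example of a kernel function. In practice you would more likely use a numerical library than hand-written loops.

```python
import math


def euclidean(x, y):
    """Straight-line distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def mahalanobis(x, y, inv_cov):
    """Distance that rescales the difference vector by an inverse covariance matrix.

    inv_cov is a list of rows; it is assumed to be supplied by the caller.
    With the identity matrix this reduces to the Euclidean distance.
    """
    d = [a - b for a, b in zip(x, y)]
    n = len(d)
    quad = sum(d[i] * inv_cov[i][j] * d[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)


def pearson(x, y):
    """Pearson correlation in [-1, 1]; undefined if either vector is constant."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors; ignores their lengths."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: 1.0 for identical points, decaying towards 0 with distance."""
    return math.exp(-gamma * euclidean(x, y) ** 2)


print(euclidean([0, 0], [3, 4]))          # 5.0
print(cosine_similarity([1, 0], [0, 1]))  # 0.0
```

Note that the first two are distances (smaller means more similar), while the last three are similarities (larger means more similar).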

I would like the community opinion on how closely related Data Mining and Artificial Intelligence are.

It is very hard to draw a line between data mining and AI. Don't worry about this question while you are new to the field. Once you have learned ten or so data mining algorithms and read some AI books, you will see both the difference and the relation.

Yin Zhu answered Oct 14 '22 14:10



Appropriate definitions of 'similarity' (which features you extract, what you do with them afterwards) are almost the definition of clustering, and clustering is a fairly wide sub-field of data mining.
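A small sketch of why the choice of similarity matters so much for grouping: the same three points pair up differently depending on the measure. The points, the names `a`, `b`, `c`, and the helper `nearest` are made up for illustration, and this is only a nearest-neighbour lookup, not a full clustering algorithm.

```python
import math


def euclidean_distance(x, y):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))


def cosine_distance(x, y):
    # 1 - cosine similarity: 0 when the vectors point in the same direction.
    dot = sum(p * q for p, q in zip(x, y))
    norm_x = math.sqrt(sum(p * p for p in x))
    norm_y = math.sqrt(sum(q * q for q in y))
    return 1.0 - dot / (norm_x * norm_y)


def nearest(query, candidates, distance):
    """Return the name of the candidate closest to query under the given distance."""
    return min(candidates, key=lambda name: distance(query, candidates[name]))


a = (1.0, 0.0)
candidates = {
    "b": (10.0, 0.0),  # same direction as a, but far away
    "c": (1.0, 1.0),   # close to a, but pointing in a different direction
}

print(nearest(a, candidates, euclidean_distance))  # c
print(nearest(a, candidates, cosine_distance))     # b
```

Under Euclidean distance `a` groups with `c`; under cosine distance it groups with `b`. Any clustering built on top of these measures would inherit that difference.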

If you make the standard cynical definition of AI as the set of problems we can't solve well (indeed, that we can't specify well enough to start solving), data mining shades into it once the space in which you're looking for correlations starts to be larger than your algorithms can handle.

Tom Womack answered Oct 14 '22 15:10
