
Is a train/test split in unsupervised learning necessary/useful?

In supervised learning I have the typical train/test split for training and evaluating a model, e.g. for regression or classification. Regarding unsupervised learning, my question is: is a train/test split necessary and useful? If yes, why?

Christoph S asked Jul 28 '15 10:07



2 Answers

It is definitely useful.

Here are a few reasons why.

When it comes to testing a model, it should always be evaluated on unseen data. So it is better to split the data first, for example with train_test_split.

Second, the data should always be shuffled before it is split. Otherwise, if the rows are ordered by type, the model may be fitted on only some of the types (say n-1 of n), which may not give good results.
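As a minimal sketch of the two points above, assuming scikit-learn is available: the data is shuffled and split, a clustering model is fitted on the training part only, and it is then scored on the unseen part. The synthetic blobs, KMeans, the number of clusters, and the silhouette score are illustrative choices, not something this answer prescribes.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.model_selection import train_test_split

# Synthetic, unlabeled data (illustrative assumption: three blobs in 2-D).
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.0, random_state=0)

# shuffle=True is the default; it is spelled out because row order matters.
X_train, X_test = train_test_split(
    X, test_size=0.25, shuffle=True, random_state=0
)

# Fit on the training part only, then assign the unseen points to clusters.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)
test_labels = model.predict(X_test)

# Silhouette lies in [-1, 1]; higher means better-separated clusters.
held_out_silhouette = silhouette_score(X_test, test_labels)
print(f"held-out silhouette: {held_out_silhouette:.3f}")
```

If the held-out score is much worse than the same score computed on X_train, the clustering probably does not generalize well to new data.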

Yuvraj Takey answered Sep 20 '22 20:09



Well, this depends on the problem, the form of the dataset, and the class of unsupervised algorithm used to solve it.

Roughly: dimensionality reduction techniques are usually tested by calculating the reconstruction error, so a k-fold cross-validation procedure can be used there.
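A rough sketch of that idea, using NumPy only: PCA is fitted on the training folds via an SVD, and the mean squared reconstruction error is measured on the held-out fold. PCA, the synthetic data, and the helper names pca_reconstruction_mse and kfold_reconstruction_mse are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 200 samples in 5-D that lie near a 2-D plane.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 5))


def pca_reconstruction_mse(X_train, X_test, n_components):
    """Fit PCA on X_train; return mean squared reconstruction error on X_test."""
    mean = X_train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data.
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    components = vt[:n_components]
    centered = X_test - mean
    # Project onto the retained directions and map back to the input space.
    reconstructed = centered @ components.T @ components
    return float(np.mean((centered - reconstructed) ** 2))


def kfold_reconstruction_mse(X, n_components, k=5, seed=0):
    """Average the held-out reconstruction error over k shuffled folds."""
    indices = np.random.default_rng(seed).permutation(len(X))
    folds = np.array_split(indices, k)
    errors = []
    for i, test_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        errors.append(
            pca_reconstruction_mse(X[train_idx], X[test_idx], n_components)
        )
    return float(np.mean(errors))


cv_errors = {m: kfold_reconstruction_mse(X, m) for m in (1, 2, 3)}
for m, err in cv_errors.items():
    print(f"n_components={m}: held-out MSE = {err:.4f}")
```

Note that this held-out error cannot increase as components are added, because each projection subspace contains the previous one. It shows how well the fitted projection generalizes to unseen rows, but by itself it does not pick the number of components.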

For clustering algorithms, I would suggest statistical testing to assess performance. There is also a somewhat time-consuming trick: split the dataset, hand-label the test set with meaningful classes, and cross-validate against those labels.
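The hand-labelling trick could look roughly like this, assuming scikit-learn. Here the generated labels y stand in for labels a person would assign to the test set, and the adjusted Rand index is one possible agreement measure; both are assumptions, not something this answer specifies.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import adjusted_rand_score
from sklearn.model_selection import train_test_split

# y stands in for hand-assigned classes (assumption); the training labels
# are discarded because the clustering itself never sees any labels.
X, y = make_blobs(n_samples=300, centers=3, cluster_std=0.5, random_state=0)
X_train, X_test, _, y_test_hand = train_test_split(
    X, y, test_size=0.2, shuffle=True, random_state=0
)

# Cluster the unlabeled training part, then assign the test points.
model = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X_train)
test_clusters = model.predict(X_test)

# The adjusted Rand index ignores how clusters are numbered:
# 1.0 means perfect agreement, values near 0 mean chance-level agreement.
ari = adjusted_rand_score(y_test_hand, test_clusters)
print(f"adjusted Rand index on hand-labelled test set: {ari:.3f}")
```

Repeating this over several different splits gives the cross-validated version of the same check.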

In any case, if an unsupervised algorithm is applied to labeled data, it is always good to cross-validate.

Overall: it is not necessary to split the data into train and test sets, but if you can do it, it is always better.

Here is an article that explains how cross-validation is a good tool for unsupervised learning: http://udini.proquest.com/view/cross-validation-for-unsupervised-pqid:1904931481/ and the full text is available here: http://arxiv.org/pdf/0909.3052.pdf

https://www.researchgate.net/post/Which_are_the_methods_to_validate_an_unsupervised_machine_learning_algorithm

Mangesh Divate answered Sep 17 '22 20:09
