Once in an interview, the interviewer asked me why a KNN classifier can be much faster than a decision tree, for example in letter recognition or face recognition.
I had no idea at the time. In what terms should I compare the speed of these two classification methods? Thanks.
The kNN algorithm has to find the nearest neighbors of the query sample in the training set. As the number of training samples and the dimensionality (number of features) of the data increase, the time needed to find nearest neighbors by brute force grows quickly.
Accuracy for Models on the Diabetic Dataset: KNN with Manhattan distance and all features as input outperformed the other models with an accuracy of 0.66, while the best decision tree model reached an accuracy of 0.63 using entropy as the split criterion and all features as input.
KNN is a lazy learning algorithm: it has no real training step and simply stores the data, using only the K nearest neighbors at prediction time. This makes KNN much faster to train than algorithms that fit a model to the whole dataset, such as support vector machines or linear regression; the cost is shifted to query time instead.
KNN is widely known as an ML algorithm that needs no training on the data. This is very different from eager learning approaches, which fit a model to a training dataset before making predictions on unseen data. With KNN, there is no training phase at all.
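The lazy-learning point can be seen in a minimal pure-Python sketch: "fitting" KNN is just storing the data, while every prediction scans the stored samples. The data points and function names below are illustrative, not from the question.

```python
import math
from collections import Counter

def knn_fit(X, y):
    # "Training" is just keeping a reference to the data: effectively O(1).
    return (X, y)

def knn_predict(model, x, k=3):
    X, y = model
    # The query does all the work: a distance to every stored sample,
    # i.e. O(N * k_features) per prediction for N samples.
    dists = sorted((math.dist(x, xi), yi) for xi, yi in zip(X, y))
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]

# Tiny hypothetical 2-D dataset with two classes.
X = [(1, 1), (1, 2), (2, 1), (8, 8), (8, 9), (9, 8)]
y = ["a", "a", "a", "b", "b", "b"]
model = knn_fit(X, y)
print(knn_predict(model, (2, 2)))  # the 3 nearest neighbors are all "a"
```

Note the asymmetry: the fast part is training, not prediction, which is exactly why the training/query split matters in the complexity comparison.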
Consider the following dataset: N samples, each with k attributes. In general:
1. Naive KNN: O(1) training time + O(Nk) query time = O(Nk) per query.
2. Naive decision tree: O(N² · k · log N) training time + O(log N) query time = O(N² · k · log N) overall. For the query time, we assume the tree is balanced.
These complexities assume a very simple implementation of each classifier; there are well-known improvements for both KNN (e.g., k-d trees) and decision trees (e.g., presorting the features).
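To make the asymptotic trade-off above concrete, here is a small sketch that plugs numbers into the two query-time formulas. The function names and the sample values of N and k are illustrative only.

```python
import math

def knn_query_ops(N, k):
    # Naive KNN query: one distance per stored sample, each over k features.
    return N * k

def tree_query_ops(N):
    # Balanced decision tree query: one comparison per level of the tree.
    return math.ceil(math.log2(N))

# Hypothetical dataset sizes with k = 16 features.
for N in (1_000, 1_000_000):
    print(N, knn_query_ops(N, k=16), tree_query_ops(N))
```

At N = 1,000,000 and k = 16, the naive KNN query costs 16,000,000 operations against 20 tree comparisons, which is why the decision tree wins at query time even though its training is far more expensive.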