In scikit learn, I make a regression of Boston House Price and get the following learning curve. But what is meaning of score(y axis) in regression?
Interpreting a Validation Curve Ideally, we would want both the validation curve and the training curve to look as similar as possible. If both scores are low, the model is likely to be underfitting. This means either the model is too simple or it is informed by too few features.
A cross-validation generator splits the whole dataset k times in training and test data. Subsets of the training set with varying sizes will be used to train the estimator and a score for each training subset size and the test set will be computed.
Through scikit-learn, we can implement various machine learning models for regression, classification, clustering, and statistical tools for analyzing these models. It also provides functionality for dimensionality reduction, feature selection, feature extraction, ensemble techniques, and inbuilt datasets.
Graph visualizes the learning curves of the model for both training and validation as the size of the training set is increased. The shaded region of a learning curve denotes the uncertainty of that curve (measured as the standard deviation). The model is scored on both the training and testing sets using R2, the coefficient of determination.
It depends on what do you want to measure, you can choose anything from following chart(may be any other metric not present here):
Reference:
http://scikit-learn.org/stable/modules/model_evaluation.html
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With