When is it preferable to use MinMaxScaler and when StandardScaler? I think it depends on the data. Are there any properties of the data I should look at to decide which preprocessing method to use? I looked at the docs, but can someone give me more insight into it?
The choice of scaler does indeed depend on the type of data you have. For most cases, StandardScaler is the scaler of choice. If you know that you have some outliers, go for the RobustScaler.
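To illustrate the outlier point, here is a minimal sketch on made-up data (the array X and the variable names are mine, not from any real dataset). With the default settings, StandardScaler uses the mean and standard deviation, which a single extreme value inflates, while RobustScaler uses the median and interquartile range.

```python
import numpy as np
from sklearn.preprocessing import RobustScaler, StandardScaler

# Made-up single feature: ten ordinary values plus one extreme outlier.
X = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1000], dtype=float).reshape(-1, 1)

standard = StandardScaler().fit_transform(X)  # (x - mean) / std
robust = RobustScaler().fit_transform(X)      # (x - median) / IQR by default

# Spread of the ten ordinary values (everything except the outlier) after scaling.
inlier_range_standard = standard[:-1].max() - standard[:-1].min()
inlier_range_robust = robust[:-1].max() - robust[:-1].min()

print("StandardScaler inlier range:", inlier_range_standard)
print("RobustScaler inlier range:  ", inlier_range_robust)
```

With StandardScaler the ordinary values end up squeezed into a very narrow range because the outlier dominates the standard deviation. RobustScaler keeps them spread out and leaves the outlier far away.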
However, if you deal with features that have an unusual distribution, such as the digits dataset, these scalers will not be the best choice. In that dataset a lot of pixels are zero, so the distribution has a peak at zero, which means that dividing by the standard deviation will not be beneficial. So basically, when the distribution of a feature is far from Normal, you need to pick an alternative.
In the case of the digits, the MinMaxScaler is a much better choice. However, if you want to keep the zeros at zero (because you use sparse matrices), go for a MaxAbsScaler.
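A small sketch of that difference, assuming made-up pixel-like data (mostly zeros, non-negative intensities; the arrays are illustrative, not the real digits dataset). MaxAbsScaler divides each column by its maximum absolute value, so zeros stay zero and sparse input stays sparse. MinMaxScaler is applied to the dense copy here.

```python
import numpy as np
from scipy import sparse
from sklearn.preprocessing import MaxAbsScaler, MinMaxScaler

# Made-up pixel-like data: mostly zeros, non-negative intensities.
X_dense = np.array(
    [[0, 0, 16],
     [0, 8, 0],
     [4, 0, 0],
     [0, 0, 8]],
    dtype=float,
)
X_sparse = sparse.csr_matrix(X_dense)

# MinMaxScaler rescales each column to [0, 1] (dense input).
X_minmax = MinMaxScaler().fit_transform(X_dense)

# MaxAbsScaler divides each column by its max absolute value;
# it accepts sparse input and does not touch the zero entries.
X_maxabs = MaxAbsScaler().fit_transform(X_sparse)

print("still sparse:", sparse.issparse(X_maxabs))
print("same number of stored non-zeros:", X_maxabs.nnz == X_sparse.nnz)
print("same values as MinMaxScaler here:", np.allclose(X_minmax, X_maxabs.toarray()))
```

The two results match in this example only because every column is non-negative and has a minimum of zero. If a column's minimum were not zero, MinMaxScaler would shift it and the zeros would no longer be preserved.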
NB: also look at the QuantileTransformer (which can map to a Uniform or a Normal distribution) and the PowerTransformer (which aims for a Normal-like one) if you want a feature to follow a given distribution whatever the original distribution was.
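As a rough sketch of those two transformers, assuming a skewed made-up feature drawn from an exponential distribution (the data, the seed, and the skewness helper are my own illustration):

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer, QuantileTransformer

rng = np.random.default_rng(0)
X = rng.exponential(scale=2.0, size=(1000, 1))  # made-up, strongly right-skewed feature


def skewness(a):
    """Sample skewness; close to 0 for a symmetric distribution."""
    a = np.asarray(a, dtype=float).ravel()
    return float(np.mean((a - a.mean()) ** 3) / a.std() ** 3)


# Map to a Uniform distribution on [0, 1].
X_uniform = QuantileTransformer(
    n_quantiles=100, output_distribution="uniform"
).fit_transform(X)

# Map to a Normal distribution via the quantiles.
X_normal_q = QuantileTransformer(
    n_quantiles=100, output_distribution="normal"
).fit_transform(X)

# Parametric alternative: Yeo-Johnson power transform, standardized by default.
X_normal_p = PowerTransformer(method="yeo-johnson").fit_transform(X)

print("skewness before:          ", skewness(X))
print("skewness after quantile:  ", skewness(X_normal_q))
print("skewness after power:     ", skewness(X_normal_p))
print("uniform output range:     ", X_uniform.min(), X_uniform.max())
```

QuantileTransformer is non-parametric and works from the empirical quantiles, while PowerTransformer fits a single power parameter per feature, so on distributions that are very far from Normal it may only get part of the way there.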