When I am trying to import a metric from sklearn, I get the following error:
from sklearn.metrics import mean_absolute_percentage_error
ImportError: cannot import name 'mean_absolute_percentage_error' from 'sklearn.metrics'
/Users/carter/opt/anaconda3/lib/python3.8/site-packages/sklearn/metrics/__init__.py)
I have used conda update all, and reinstalled scikit-learn to no avail. Any other reasons this might happen and solutions?
Use sklearn. __version__ to display the installed version of scikit-learn. Call sklearn. __version__ to return the current version of scikit-learn .
The function mean_absolute_percentage_error
is new in scikit-learn version 0.24 as noted in the documentation.
As of December 2020, the latest version of scikit-learn available from Anaconda is v0.23.2, so that's why you're not able to import mean_absolute_percentage_error
.
You could try installing the latest version from source instead, or implement the function you need yourself. The source is available here if you'd like to take a look.
The answer above is the right one. For those who cannot upgrade/install from source, below is the required code.
The function itself relies on other functions - one defined in the same module and others is from sklearn.utils.validation.
Here is the required code I pulled from the source - if anyone needs it (and I hope I am not violating any license):
from sklearn.utils.validation import check_consistent_length, check_array
def mean_absolute_percentage_error(y_true, y_pred,
sample_weight=None,
multioutput='uniform_average'):
"""Mean absolute percentage error regression loss.
Note here that we do not represent the output as a percentage in range
[0, 100]. Instead, we represent it in range [0, 1/eps]. Read more in the
:ref:`User Guide <mean_absolute_percentage_error>`.
.. versionadded:: 0.24
Parameters
----------
y_true : array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
multioutput : {'raw_values', 'uniform_average'} or array-like
Defines aggregating of multiple output values.
Array-like value defines weights used to average errors.
If input is list then the shape must be (n_outputs,).
'raw_values' :
Returns a full set of errors in case of multioutput input.
'uniform_average' :
Errors of all outputs are averaged with uniform weight.
Returns
-------
loss : float or ndarray of floats in the range [0, 1/eps]
If multioutput is 'raw_values', then mean absolute percentage error
is returned for each output separately.
If multioutput is 'uniform_average' or an ndarray of weights, then the
weighted average of all output errors is returned.
MAPE output is non-negative floating point. The best value is 0.0.
But note the fact that bad predictions can lead to arbitarily large
MAPE values, especially if some y_true values are very close to zero.
Note that we return a large value instead of `inf` when y_true is zero.
Examples
--------
>>> from sklearn.metrics import mean_absolute_percentage_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.3273...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.5515...
>>> mean_absolute_percentage_error(y_true, y_pred, multioutput=[0.3, 0.7])
0.6198...
"""
y_type, y_true, y_pred, multioutput = _check_reg_targets(
y_true, y_pred, multioutput)
check_consistent_length(y_true, y_pred, sample_weight)
epsilon = np.finfo(np.float64).eps
mape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
output_errors = np.average(mape,
weights=sample_weight, axis=0)
if isinstance(multioutput, str):
if multioutput == 'raw_values':
return output_errors
elif multioutput == 'uniform_average':
# pass None as weights to np.average: uniform mean
multioutput = None
return np.average(output_errors, weights=multioutput)
def _check_reg_targets(y_true, y_pred, multioutput, dtype="numeric"):
"""Check that y_true and y_pred belong to the same regression task.
Parameters
----------
y_true : array-like
y_pred : array-like
multioutput : array-like or string in ['raw_values', uniform_average',
'variance_weighted'] or None
None is accepted due to backward compatibility of r2_score().
Returns
-------
type_true : one of {'continuous', continuous-multioutput'}
The type of the true target data, as output by
'utils.multiclass.type_of_target'.
y_true : array-like of shape (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples, n_outputs)
Estimated target values.
multioutput : array-like of shape (n_outputs) or string in ['raw_values',
uniform_average', 'variance_weighted'] or None
Custom output weights if ``multioutput`` is array-like or
just the corresponding argument if ``multioutput`` is a
correct keyword.
dtype : str or list, default="numeric"
the dtype argument passed to check_array.
"""
check_consistent_length(y_true, y_pred)
y_true = check_array(y_true, ensure_2d=False, dtype=dtype)
y_pred = check_array(y_pred, ensure_2d=False, dtype=dtype)
if y_true.ndim == 1:
y_true = y_true.reshape((-1, 1))
if y_pred.ndim == 1:
y_pred = y_pred.reshape((-1, 1))
if y_true.shape[1] != y_pred.shape[1]:
raise ValueError("y_true and y_pred have different number of output "
"({0}!={1})".format(y_true.shape[1], y_pred.shape[1]))
n_outputs = y_true.shape[1]
allowed_multioutput_str = ('raw_values', 'uniform_average',
'variance_weighted')
if isinstance(multioutput, str):
if multioutput not in allowed_multioutput_str:
raise ValueError("Allowed 'multioutput' string values are {}. "
"You provided multioutput={!r}".format(
allowed_multioutput_str,
multioutput))
elif multioutput is not None:
multioutput = check_array(multioutput, ensure_2d=False)
if n_outputs == 1:
raise ValueError("Custom weights are useful only in "
"multi-output cases.")
elif n_outputs != len(multioutput):
raise ValueError(("There must be equally many custom weights "
"(%d) as outputs (%d).") %
(len(multioutput), n_outputs))
y_type = 'continuous' if n_outputs == 1 else 'continuous-multioutput'
return y_type, y_true, y_pred, multioutput
You can go with one of these two solutions:
!pip install scikit-learn==0.24
Then,
from sklearn.metrics import mean_absolute_percentage_error
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
But the problem with the above function is that when you have (0) true value your MAPE will go (inf). So, to solve this problem we use,
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / np.maximum(np.ones(len(y_true)), np.abs(y_true))))*100
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With