I want to make predictions in a data science project, and the error is evaluated with an asymmetric function.
Is it possible to customize the loss function of random forests or gradient boosting in sklearn?
I have read that it requires modifying a .pyx file, but I cannot find any in my sklearn folder (I am on Ubuntu 14.04 LTS).
Do you have any suggestions?
A custom loss function can be created by defining a function that takes the true values and predicted values as required parameters and returns an array of per-sample losses. In Keras, such a function is passed at the compile stage; scikit-learn's ensembles, however, do not accept an arbitrary callable as a loss. Gradient boosting libraries such as XGBoost and LightGBM do support custom objectives, as a callable that returns the gradient and Hessian of the loss.
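Since the question mentions an asymmetric error, here is a minimal sketch of an asymmetric squared error written in that gradient/Hessian form. The function name and the `alpha` parameter are hypothetical, not from the original question; `alpha > 0.5` penalizes over-predictions more than under-predictions:

```python
import numpy as np

# Sketch of an asymmetric squared error in the gradient/Hessian form
# that boosting libraries with custom objectives expect.
# `alpha` (hypothetical parameter) weights over-predictions;
# under-predictions get weight 1 - alpha.
def asymmetric_squared_objective(preds, labels, alpha=0.7):
    residual = preds - labels
    weight = np.where(residual > 0, alpha, 1.0 - alpha)
    grad = 2.0 * weight * residual  # derivative of weight * residual**2
    hess = 2.0 * weight             # second derivative
    return grad, hess
```

An over-prediction and an under-prediction of the same magnitude then receive gradients scaled by `alpha` and `1 - alpha` respectively.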
Yes, it is possible. For example, here is a custom pairwise objective in the form XGBoost expects: a callable that takes the predictions and the training `DMatrix` and returns the gradient and Hessian:
import numpy as np


class ExponentialPairwiseLoss(object):
    def __init__(self, groups):
        # `groups` holds the size of each query group, in order
        self.groups = groups

    def __call__(self, preds, dtrain):
        labels = dtrain.get_label().astype(int)  # np.int is deprecated
        rk = len(np.bincount(labels))
        plus_exp = np.exp(preds)
        minus_exp = np.exp(-preds)
        grad = np.zeros(preds.shape)
        hess = np.zeros(preds.shape)
        pos = 0
        for size in self.groups:
            # accumulate exp(+pred) and exp(-pred) per label within the group
            sum_plus_exp = np.zeros((rk,))
            sum_minus_exp = np.zeros((rk,))
            for i in range(pos, pos + size):
                sum_plus_exp[labels[i]] += plus_exp[i]
                sum_minus_exp[labels[i]] += minus_exp[i]
            for i in range(pos, pos + size):
                grad[i] = -minus_exp[i] * np.sum(sum_plus_exp[:labels[i]]) + \
                    plus_exp[i] * np.sum(sum_minus_exp[labels[i] + 1:])
                hess[i] = minus_exp[i] * np.sum(sum_plus_exp[:labels[i]]) + \
                    plus_exp[i] * np.sum(sum_minus_exp[labels[i] + 1:])
            pos += size
        return grad, hess
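For reference, on my reading of the code these expressions are the gradient and diagonal Hessian, within each group $g$, of the pairwise exponential ranking loss over mis-ordered pairs:

$$
L = \sum_{\substack{i,j \in g \\ r_j < r_i}} e^{p_j - p_i},
\qquad
\frac{\partial L}{\partial p_i}
= -e^{-p_i} \sum_{j:\, r_j < r_i} e^{p_j}
  + e^{p_i} \sum_{j:\, r_j > r_i} e^{-p_j},
\qquad
\frac{\partial^2 L}{\partial p_i^2}
= e^{-p_i} \sum_{j:\, r_j < r_i} e^{p_j}
  + e^{p_i} \sum_{j:\, r_j > r_i} e^{-p_j},
$$

where $r_i$ is the label and $p_i$ the prediction of item $i$.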