I was wondering whether, given the coefficients of an SVM regression model, one could calculate the predictions made by that model 'by hand'. More precisely, suppose:
svc = SVR(kernel='rbf', epsilon=0.3, gamma=0.7, C=64)
svc.fit(X_train, y_train)
then you can obtain the predictions very easily by using
y_pred = svc.predict(X_test)
I was wondering how one obtains this result by calculating it directly. Starting with the decision function

    f(x) = \sum_{i \in SV} \alpha_i \, K(x_i, x) + b,

where K is the RBF kernel function, b is the intercept, and the \alpha_i are the dual coefficients.
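For reference, these quantities can be read directly off the fitted model in scikit-learn (a small sketch, assuming svc is the fitted SVR from above):

# The pieces of the decision function, as exposed by a fitted sklearn SVR:
alphas   = svc.dual_coef_          # shape (1, n_SV): signed dual coefficients (alpha_i - alpha_i*)
sup_vecs = svc.support_vectors_    # shape (n_SV, n_features): the support vectors x_i
b        = svc.intercept_          # shape (1,): the intercept b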
Because I work with the RBF-kernel, I started like this:
import numpy as np

def RBF(x, z, gamma, axis=None):
    return np.exp(-gamma * np.linalg.norm(x - z, axis=axis) ** 2)
for i in len(svc.support_):
    A[i] = RBF(X_train[i], X_test[0], 0.7)
Then I calculated
np.sum(svc._dual_coef_*A)+svc.intercept_
However, the result of this calculation isn't the same as the first element of y_pred. I suspect my reasoning isn't entirely correct and/or my code isn't what it should be, so apologies if this isn't the right board to ask. I've been staring blindly at this problem for the past 2 hours, so any help would be greatly appreciated!
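(As an aside, the hand-written kernel itself can be sanity-checked against sklearn.metrics.pairwise.rbf_kernel; a minimal check with made-up 7-feature points, to rule the kernel out as the culprit:)

import numpy as np
from sklearn.metrics.pairwise import rbf_kernel

x = np.random.rand(7)   # one training point with 7 features
z = np.random.rand(7)   # one test point
print(np.exp(-0.7 * np.linalg.norm(x - z) ** 2))                         # hand-rolled RBF
print(rbf_kernel(x.reshape(1, -1), z.reshape(1, -1), gamma=0.7)[0, 0])   # sklearn's RBF; should print the same value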
UPDATE
After some more research, I've found the following posts: Replication of scikit.svm.SRV.predict(X) and Calculating decision function of SVM manually. In the first post, they talk about regression, in the second about classification, but the idea stays the same. In both cases, the OP is basically asking the same thing, but when I try to implement their code, I always run into an error in the step
diff = sup_vecs - X_test
of the form
ValueError: operands could not be broadcast together with shapes (n_support_vectors, 7) (len(X_test), 7)
I don't understand why the number of support vectors should be equal to the number of test samples. From what I understand, that's almost never the case. So can anyone shed some light on how one should approach this problem more generally, i.e. how one can improve the code so that it works for multi-dimensional arrays?
P.S. Not relevant to the problem, but just to be precise: 7 is the number of features.
The mistake you are making is in the loop for i in len(svc.support_):. You loop over the first n_SV (number of support vectors) training points, which are not necessarily the support vectors. Instead, loop over svc.support_vectors_ to get the actual support vectors (equivalently, index the training set with the support-vector indices, X_train[svc.support_]). The rest of your code remains the same. Below I provide the code with corrections.
import numpy as np
from sklearn import datasets
from sklearn.svm import SVR
# Load the IRIS dataset for demonstration
iris = datasets.load_iris()
X = iris.data
y = iris.target
# Train-test split
X_train, y_train = X[:140], y[:140]
X_test, y_test = X[140:], y[140:]
print(X.shape, X_train.shape, X_test.shape) # prints (150, 4) (140, 4) (10, 4)
# Fit an RBF-kernel SVR
svc = SVR(kernel='rbf', epsilon=0.3, gamma=0.7, C=64)
svc.fit(X_train, y_train)
# Get the prediction for a single point X_test using the trained SVR, svc
def get_pred(svc, X_test):

    def RBF(x, z, gamma, axis=None):
        return np.exp(-gamma * np.linalg.norm(x - z, axis=axis) ** 2)

    A = []
    # Loop over all support vectors to calculate K(x_i, X_test) for each support vector x_i
    for x in svc.support_vectors_:
        A.append(RBF(x, X_test, 0.7))
    A = np.array(A)

    return np.sum(svc.dual_coef_ * A) + svc.intercept_

for i in range(X_test.shape[0]):
    print(get_pred(svc, X_test[i]))
    print(svc.predict(X_test[i].reshape(1, -1)))  # the same output from both
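To address the UPDATE as well: the broadcasting error comes from subtracting an (n_SV, 7) array from an (n_test, 7) array directly. Inserting an extra axis lets NumPy broadcast the two, so the whole test set can be predicted in one go. A vectorized sketch of the same computation (one possible way, not the only one):

def get_pred_vectorized(svc, X_test, gamma=0.7):
    # Pairwise differences: (n_SV, 1, n_features) - (1, n_test, n_features) -> (n_SV, n_test, n_features)
    diff = svc.support_vectors_[:, np.newaxis, :] - X_test[np.newaxis, :, :]
    K = np.exp(-gamma * np.sum(diff ** 2, axis=2))     # RBF kernel matrix, shape (n_SV, n_test)
    # dual_coef_ has shape (1, n_SV); the matrix product sums over the support vectors
    return (svc.dual_coef_ @ K).ravel() + svc.intercept_

print(get_pred_vectorized(svc, X_test))
print(svc.predict(X_test))  # should match up to floating-point error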