I want to use Gaussian processes to solve a regression task. My data is as follows: each X vector has a length of 37, and each Y vector has a length of 8.
I'm using the sklearn package in Python, but trying to fit a Gaussian process raises an exception:
from sklearn import gaussian_process
print "x :", x__
print "y :", y__
gp = gaussian_process.GaussianProcess(theta0=1e-2, thetaL=1e-4, thetaU=1e-1)
gp.fit(x__, y__)
x : [[ 136. 137. 137. 132. 130. 130. 132. 133. 134. 135. 135. 134. 134. 1139. 1019. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 70. 24. 55. 0. 9. 0. 0.]
 [ 136. 137. 137. 132. 130. 130. 132. 133. 134. 135. 135. 134. 134. 1139. 1019. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 70. 24. 55. 0. 9. 0. 0.]
 [ 82. 76. 80. 103. 135. 155. 159. 156. 145. 138. 130. 122. 122. 689. 569. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [ 156. 145. 138. 130. 122. 118. 113. 111. 105. 101. 98. 95. 95. 759. 639. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [ 112. 111. 111. 114. 114. 113. 114. 114. 112. 111. 109. 109. 109. 1109. 989. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [ 133. 130. 125. 124. 124. 123. 103. 87. 96. 121. 122. 123. 123. 399. 279. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [ 104. 109. 111. 106. 91. 86. 117. 123. 123. 120. 121. 115. 115. 549. 429. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
 [ 144. 138. 126. 122. 119. 118. 116. 114. 107. 105. 106. 119. 119. 479. 359. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
y : [[ 7. 9. 13. 30. 34. 37. 36. 41. ]
 [ 7. 9. 13. 30. 34. 37. 36. 41. ]
 [ -4. -9. -17. -21. -27. -28. -28. -20. ]
 [ -1. -1. -4. -5. 20. 28. 31. 23. ]
 [ -1. -2. -3. -1. -4. -7. 8. 58. ]
 [ -1. -2. -14.33333333 -14. -13.66666667 -32. -26.66666667 -1. ]
 [ 1. 3.33333333 0. -0.66666667 3. 6. 22. 54. ]
 [ -2. -8. -11. -17. -17. -16. -16. -23. ]]
---------------------------------------------------------------------------
Exception                                 Traceback (most recent call last)
<ipython-input> in <module>()
     11 gp = gaussian_process.GaussianProcess(theta0=1e-2, thetaL=1e-4, thetaU=1e-1)
     12
---> 13 gp.fit(x__, y__)

/usr/local/lib/python2.7/site-packages/sklearn/gaussian_process/gaussian_process.pyc in fit(self, X, y)
    300         if (np.min(np.sum(D, axis=1)) == 0.
    301                 and self.corr != correlation.pure_nugget):
--> 302             raise Exception("Multiple input features cannot have the same"
    303                             " target value.")
    304

Exception: Multiple input features cannot have the same target value.
I've found some topics related to a scikit-learn issue, but my version is up to date.
This is a known issue, and it still has not actually been resolved.
It happens because if your data contains duplicate input points (in your data, the first two rows of x__ are identical), the covariance matrix is singular, i.e. not invertible, and computing its inverse is part of the GP solution.
To work around it, add some small Gaussian noise (jitter) to your inputs, or use another GP library.
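The jitter workaround can be sketched as follows. This is just an illustration on toy data; the noise scale of 1e-6 is a hypothetical choice and should be negligible relative to your feature values:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy inputs with an exact duplicate row, mimicking the failing case.
X = np.array([[1.0, 2.0],
              [1.0, 2.0],   # duplicate of row 0
              [3.0, 4.0]])

# Add tiny Gaussian jitter so that no two rows are exactly identical;
# the scale (1e-6) is arbitrary and should be small relative to X.
X_jittered = X + rng.normal(scale=1e-6, size=X.shape)

# After jittering, no pair of rows is exactly equal any more.
assert not np.array_equal(X_jittered[0], X_jittered[1])
```

With the duplicates broken, the covariance matrix of the training inputs is no longer singular and the fit can proceed.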
You can always try to implement it yourself; it is actually not that hard. The most important thing in a GP is the kernel function, for example a Gaussian (squared-exponential) kernel:
exponential_kernel = lambda x, y, params: params[0] * \
    np.exp(-0.5 * params[1] * np.sum((x - y) ** 2))
Now we need to build the covariance matrix, like this:
covariance = lambda kernel, x, y, params: \
np.array([[kernel(xi, yi, params) for xi in x] for yi in y])
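As a quick sketch of how these two definitions fit together, here is the covariance matrix for three 1-D training inputs; the hyperparameters in theta are arbitrary values chosen for illustration:

```python
import numpy as np

# Squared-exponential kernel and covariance builder as defined above.
exponential_kernel = lambda x, y, params: params[0] * \
    np.exp(-0.5 * params[1] * np.sum((x - y) ** 2))

covariance = lambda kernel, x, y, params: \
    np.array([[kernel(xi, yi, params) for xi in x] for yi in y])

theta = [1.0, 10.0]                   # hypothetical hyperparameters
x = np.array([[0.0], [0.5], [1.0]])   # three distinct 1-D training inputs

sigma = covariance(exponential_kernel, x, x, theta)
# sigma is a symmetric 3x3 matrix; its diagonal equals theta[0],
# since the distance of each point to itself is zero.
```

Note that if x contained two identical rows, sigma would have two identical rows as well and would be singular — exactly the situation the exception above guards against.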
So, when you want to predict a new point x, first calculate the covariance matrix of your training inputs:
sigma1 = covariance(exponential_kernel, x, x, theta)
and apply following:
def predict(x, data, kernel, params, sigma, t):
    # Cross-covariance between the new point x and the training data.
    k = [kernel(x, y, params) for y in data]
    Sinv = np.linalg.inv(sigma)
    # Posterior mean and variance of the GP at x.
    y_pred = np.dot(k, Sinv).dot(t)
    sigma_new = kernel(x, x, params) - np.dot(k, Sinv).dot(k)
    return y_pred, sigma_new
This is a very naive implementation, and for large datasets the runtime will be high. The most expensive step is Sinv = np.linalg.inv(sigma), which is O(N^3) in the number of training points; in practice you would use a Cholesky decomposition or np.linalg.solve instead of an explicit inverse.
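Putting the pieces together, here is an end-to-end sketch on a toy 1-D dataset. The kernel hyperparameters and training points are made up for illustration, and the targets are samples of sin(x):

```python
import numpy as np

exponential_kernel = lambda x, y, params: params[0] * \
    np.exp(-0.5 * params[1] * np.sum((x - y) ** 2))

covariance = lambda kernel, x, y, params: \
    np.array([[kernel(xi, yi, params) for xi in x] for yi in y])

def predict(x, data, kernel, params, sigma, t):
    # Cross-covariance between the new point x and the training data.
    k = [kernel(x, y, params) for y in data]
    Sinv = np.linalg.inv(sigma)
    # Posterior mean and variance of the GP at x.
    y_pred = np.dot(k, Sinv).dot(t)
    sigma_new = kernel(x, x, params) - np.dot(k, Sinv).dot(k)
    return y_pred, sigma_new

theta = [1.0, 10.0]                     # hypothetical hyperparameters
data = np.array([[0.0], [0.5], [1.0]])  # distinct training inputs
t = np.sin(data).ravel()                # training targets

sigma = covariance(exponential_kernel, data, data, theta)
y_pred, var = predict(np.array([0.25]), data, exponential_kernel,
                      theta, sigma, t)
# y_pred approximates sin(0.25); var is the (positive) predictive variance.
```

Because the training inputs are distinct, sigma is invertible and the prediction goes through; feeding in duplicate rows would make np.linalg.inv fail or return garbage, which is the same root cause as the sklearn exception above.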