Error with Sklearn Random Forest Regressor

Tags:

python

machine-learning

numpy

scikit-learn

random-forest

When trying to fit a Random Forest Regressor model with y data that looks like this:

[  0.00000000e+00   1.36094276e+02   4.46608221e+03   8.72660888e+03
   1.31375786e+04   1.73580193e+04   2.29420671e+04   3.12216341e+04
   4.11395711e+04   5.07972062e+04   6.14904935e+04   7.34275322e+04
   7.87333933e+04   8.46302456e+04   9.71074959e+04   1.07146672e+05
   1.17187952e+05   1.26953374e+05   1.37736003e+05   1.47239359e+05
   1.53943242e+05   1.78806710e+05   1.92657725e+05   2.08912711e+05
   2.22855152e+05   2.34532982e+05   2.41391255e+05   2.48699216e+05
   2.62421197e+05   2.79544300e+05   2.95550971e+05   3.13524275e+05
   3.23365158e+05   3.24069067e+05   3.24472999e+05   3.24804951e+05

And X data that looks like this:

[ 735233.27082176  735234.27082176  735235.27082176  735236.27082176
  735237.27082176  735238.27082176  735239.27082176  735240.27082176
  735241.27082176  735242.27082176  735243.27082176  735244.27082176
  735245.27082176  735246.27082176  735247.27082176  735248.27082176

With the following code:

regressor = RandomForestRegressor(n_estimators=150, min_samples_split=1)
rgr = regressor.fit(X,y)

I get this error:

ValueError: Number of labels=600 does not match number of samples=1

I assume one of my sets of values is in the wrong format, but it's not clear to me from the documentation which one.

asked Aug 25 '15 07:08

BLL27

2 Answers

The shape of X should be [n_samples, n_features]. Your X is one-dimensional, so you can add a feature axis with

X = X[:, None]
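
As a minimal sketch of the fix in context: the arrays below are synthetic stand-ins for the question's data (assumed values, not the asker's actual X and y), and min_samples_split is left at its default because recent scikit-learn releases require an integer value of at least 2.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic stand-ins for the question's data (assumed, for illustration only).
X = 735233.27082176 + np.arange(36, dtype=float)  # 1-D, shape (36,)
y = np.linspace(0.0, 3.2e5, 36)                   # 1-D, shape (36,)

# Add a feature axis: (36,) -> (36, 1), i.e. 36 samples with 1 feature each.
# X.reshape(-1, 1) produces the same result.
X_2d = X[:, None]

regressor = RandomForestRegressor(n_estimators=150, random_state=0)
rgr = regressor.fit(X_2d, y)

# Prediction input must be 2-D as well.
pred = rgr.predict(X_2d[:3])
```

The same reshaping applies to anything later passed to predict.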

answered Oct 07 '22 18:10

yangjie

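Your 1-D X is being treated as a single sample with many features, so each value needs to be wrapped in its own list. The following works (a list comprehension is used because map returns an iterator in Python 3, which fit will not accept):

rgr = regressor.fit([[x] for x in X], y)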

There might be a more efficient way of doing this in numpy with vstack.
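
A small sketch of the NumPy route mentioned above, assuming X is a 1-D NumPy array (the three values are taken from the question's sample for illustration). It compares the per-element wrapping with np.vstack and reshape, all of which should give the same column-shaped array.

```python
import numpy as np

# A short 1-D array of samples (first three values from the question's X).
X = np.array([735233.27082176, 735234.27082176, 735235.27082176])

# Wrap each value in its own list, then convert to an array.
as_list = np.array([[x] for x in X])

# np.vstack stacks each scalar as its own row, giving one column.
stacked = np.vstack(X)

# reshape(-1, 1) asks for one column and infers the number of rows.
reshaped = X.reshape(-1, 1)
```

Any of the three can be passed to fit in place of the original X; reshape avoids the Python-level loop.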

answered Oct 07 '22 18:10

Francisco Vargas
