
ValueError: Wrong number of items passed - Meaning and suggestions?

I am receiving the error ValueError: Wrong number of items passed 3, placement implies 1, and I am struggling to figure out where it comes from and how to begin addressing it.

I don't really understand what the error means, which makes it difficult to troubleshoot. I have included the block of code that triggers the error in my Jupyter Notebook.

The data is tough to attach, so I am not asking anyone to re-create the error. I am just looking for feedback on how I could address it.

KeyError                                  Traceback (most recent call last)
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1944             try:
-> 1945                 return self._engine.get_loc(key)
   1946             except KeyError:

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'predictedY'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in set(self, item, value, check)
   3414         try:
-> 3415             loc = self.items.get_loc(item)
   3416         except KeyError:

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\indexes\base.py in get_loc(self, key, method, tolerance)
   1946             except KeyError:
-> 1947                 return self._engine.get_loc(self._maybe_cast_indexer(key))
   1948

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4154)()

pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4018)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12368)()

pandas\hashtable.pyx in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:12322)()

KeyError: 'predictedY'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
<ipython-input-95-476dc59cd7fa> in <module>()
     26     return gp, results
     27
---> 28 gp_dailyElectricity, results_dailyElectricity = predictAll(3, 0.04, trainX_dailyElectricity, trainY_dailyElectricity, testX_dailyElectricity, testY_dailyElectricity, testSet_dailyElectricity, 'Daily Electricity')

<ipython-input-95-476dc59cd7fa> in predictAll(theta, nugget, trainX, trainY, testX, testY, testSet, title)
      8
      9     results = testSet.copy()
---> 10     results['predictedY'] = predictedY
     11     results['sigma'] = sigma
     12

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in __setitem__(self, key, value)
   2355         else:
   2356             # set column
-> 2357             self._set_item(key, value)
   2358
   2359     def _setitem_slice(self, key, value):

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\frame.py in _set_item(self, key, value)
   2422         self._ensure_valid_index(value)
   2423         value = self._sanitize_column(key, value)
-> 2424         NDFrame._set_item(self, key, value)
   2425
   2426         # check if we are modifying a copy

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\generic.py in _set_item(self, key, value)
   1462
   1463     def _set_item(self, key, value):
-> 1464         self._data.set(key, value)
   1465         self._clear_item_cache()
   1466

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in set(self, item, value, check)
   3416         except KeyError:
   3417             # This item wasn't present, just insert at end
-> 3418             self.insert(len(self.items), item, value)
   3419             return
   3420

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in insert(self, loc, item, value, allow_duplicates)
   3517
   3518         block = make_block(values=value, ndim=self.ndim,
-> 3519                            placement=slice(loc, loc + 1))
   3520
   3521         for blkno, count in _fast_count_smallints(self._blknos[loc:]):

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in make_block(values, placement, klass, ndim, dtype, fastpath)
   2516                      placement=placement, dtype=dtype)
   2517
-> 2518     return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
   2519
   2520 # TODO: flexible with index=None and/or items=None

C:\Users\brennn1\AppData\Local\Continuum\Anaconda3\lib\site-packages\pandas\core\internals.py in __init__(self, values, placement, ndim, fastpath)
     88             raise ValueError('Wrong number of items passed %d, placement '
     89                              'implies %d' % (len(self.values),
---> 90                                              len(self.mgr_locs)))
     91
     92     @property

ValueError: Wrong number of items passed 3, placement implies 1

My code is as follows:

def predictAll(theta, nugget, trainX, trainY, testX, testY, testSet, title):

    gp = gaussian_process.GaussianProcess(theta0=theta, nugget =nugget)
    gp.fit(trainX, trainY)

    predictedY, MSE = gp.predict(testX, eval_MSE = True)
    sigma = np.sqrt(MSE)

    results = testSet.copy()
    results['predictedY'] = predictedY
    results['sigma'] = sigma

    print ("Train score R2:", gp.score(trainX, trainY))
    print ("Test score R2:", sklearn.metrics.r2_score(testY, predictedY))

    plt.figure(figsize = (9,8))
    plt.scatter(testY, predictedY)
    plt.plot([min(testY), max(testY)], [min(testY), max(testY)], 'r')
    plt.xlim([min(testY), max(testY)])
    plt.ylim([min(testY), max(testY)])
    plt.title('Predicted vs. observed: ' + title)
    plt.xlabel('Observed')
    plt.ylabel('Predicted')
    plt.show()

    return gp, results

gp_dailyElectricity, results_dailyElectricity = predictAll(
    3, 0.04,
    trainX_dailyElectricity, trainY_dailyElectricity,
    testX_dailyElectricity, testY_dailyElectricity,
    testSet_dailyElectricity, 'Daily Electricity')
Gary asked Apr 04 '17 01:04



1 Answer

In general, the error ValueError: Wrong number of items passed 3, placement implies 1 means that you are attempting to put too many pigeons into too few pigeonholes. In this case, the value on the right-hand side of the assignment

results['predictedY'] = predictedY

is trying to put 3 "things" into a container that allows only one. The left side is a single dataframe column, which can accept many items along its length (the row dimension), so the surplus must be on the other dimension: the value has too many columns.

Here, it appears you are using sklearn for modeling, which is where gaussian_process.GaussianProcess() is coming from (I'm guessing, but correct me and revise the question if this is wrong).

Now, you generate predicted values for y here:

predictedY, MSE = gp.predict(testX, eval_MSE = True)

However, as we can see from the documentation for GaussianProcess, predict() returns two items. The first is y, which is array-like (emphasis mine). That means that it can have more than one dimension, or, to be concrete for thick-headed people like me, it can have more than one column: the documented shape is (n_samples, n_targets), where, as far as I can tell, n_samples comes from testX and n_targets from the y the model was fit on. So it could be (1000, 3) (just to pick numbers). Thus, if trainY has 3 columns, your predictedY might have 3 columns too.
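To see where a second dimension on the prediction can come from, here is a minimal sketch. It uses an ordinary least-squares fit in NumPy purely as a stand-in, not the sklearn GaussianProcess from the question, and every array name and size in it is made up for illustration. The point it tries to show is that a model fit on a target with three columns tends to hand back predictions with three columns, while the same model fit on a single target column hands back a flat array.

```python
import numpy as np

# Made-up data: 50 training rows, 2 features, 3 target columns.
rng = np.random.default_rng(0)
train_X = rng.normal(size=(50, 2))
train_Y = rng.normal(size=(50, 3))
test_X = rng.normal(size=(10, 2))

# Stand-in model (least squares), fit on all three target columns at once.
coef, *_ = np.linalg.lstsq(train_X, train_Y, rcond=None)
predicted_Y = test_X @ coef
print(predicted_Y.shape)   # one row per test sample, one column per target

# Same stand-in model, fit on a single target column.
coef_1d, *_ = np.linalg.lstsq(train_X, train_Y[:, 0], rcond=None)
predicted_1d = test_X @ coef_1d
print(predicted_1d.shape)  # one value per test sample
```

In the question's code, printing np.shape(trainY) and np.shape(predictedY) just before the failing assignment would be a quick way to check whether this is what is happening.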

If so, when you try to put something with three "columns" into a single dataframe column, you are passing 3 items where only 1 would fit.
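A small sketch of the failure and one possible way around it, assuming the prediction really is a two-dimensional array. The frame, the array and the predictedY_0, predictedY_1, ... column names are hypothetical, and the exact wording of the ValueError differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-ins for testSet and a 3-column prediction.
results = pd.DataFrame({"observed": [1.0, 2.0, 3.0, 4.0]})
predicted = np.arange(12, dtype=float).reshape(4, 3)

# Assigning a (4, 3) array to ONE column is rejected with a ValueError.
try:
    results["predictedY"] = predicted
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(predicted.shape, "->", error_message)

# One way out: store each target in its own column.
for i in range(predicted.shape[1]):
    results[f"predictedY_{i}"] = predicted[:, i]

print(results.columns.tolist())
```

If only one target is actually wanted, the alternative is to fit on that single column of trainY (or keep just predicted[:, 0]), so that the value being assigned is one-dimensional.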

Savage Henry answered Sep 22 '22 17:09
