
pandas ValueError: Cannot setitem on a Categorical with a new category, set the categories first

I am updating a column in a DataFrame by replacing Yes with 1 and No with 0. My code previously worked fine, but it fails now that I have made some changes to address a memory problem.

Previous code (it raised the error shown in the traceback below):

df.loc[df[df.decision == 'Yes'].index, 'decision'] = 1
df.loc[df[df.decision == 'No'].index, 'decision'] = 0

Changed to:

df.loc['Yes', "decision"] = 1
df.loc['No', "decision"] = 0

Still, the problem remains the same.

Traceback

Traceback (most recent call last):
  File "/snap/pycharm-community/226/plugins/python-ce/helpers/pydev/pydevd.py", line 1477, in _exec
    pydev_imports.execfile(file, globals, locals)  # execute the script
  File "/snap/pycharm-community/226/plugins/python-ce/helpers/pydev/_pydev_imps/_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "/home/khawar/deepface/tests/Ensemble-Face-Recognition.py", line 148, in <module>
    df.loc['Yes', "decision"] = 1
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 670, in __setitem__
    iloc._setitem_with_indexer(indexer, value)
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1763, in _setitem_with_indexer
    isetter(loc, value)
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1689, in isetter
    ser._mgr = ser._mgr.setitem(indexer=plane_indexer, value=v)
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 543, in setitem
    return self.apply("setitem", indexer=indexer, value=value)
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/managers.py", line 409, in apply
    applied = getattr(b, f)(**kwargs)
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/internals/blocks.py", line 1688, in setitem
    self.values[indexer] = value
  File "/home/khawar/.local/lib/python3.6/site-packages/pandas/core/arrays/categorical.py", line 2011, in __setitem__
    "Cannot setitem on a Categorical with a new "
ValueError: Cannot setitem on a Categorical with a new category, set the categories first
python-BaseException

As suggested, I implemented this new code:

df['decision'] = (df['decision'] == 'Yes').astype(int)

Traceback (the script now fails later, during LightGBM training):

Traceback (most recent call last):
  File "/home/khawar/deepface/tests/Ensemble-Face-Recognition.py", line 174, in <module>
    gbm = lgb.train(params, lgb_train, num_boost_round=1000, early_stopping_rounds=15, valid_sets=lgb_test)
  File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/engine.py", line 231, in train
    booster = Booster(params=params, train_set=train_set)
  File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 2053, in __init__
    train_set.construct()
  File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1325, in construct
    categorical_feature=self.categorical_feature, params=self.params)
  File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1123, in _lazy_init
    self.__init_from_np2d(data, params_str, ref_dataset)
  File "/home/khawar/.local/lib/python3.6/site-packages/lightgbm/basic.py", line 1162, in __init_from_np2d
    data = np.array(mat.reshape(mat.size), dtype=np.float32)
ValueError: could not convert string to float: 'deepface/tests/dataset/029A33.JPG'
Khawar Islam asked Dec 04 '25 01:12


2 Answers

The problem is that the decision column is categorical, so when you replace only some of its rows, pandas expects the new values to be among the existing categories. Because 0 and 1 are not existing categories, an error is raised.
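To make the cause concrete, here is a minimal sketch on a made-up two-row frame (the data is an assumption for illustration, not the asker's). It shows a partial assignment being rejected while 1 is not a category, and the same assignment succeeding once the new categories are registered with cat.add_categories. Both ValueError and TypeError are caught because, as far as I know, the exception type differs between pandas versions.

```python
import pandas as pd

# Hypothetical sample data; the real frame has many more rows.
df = pd.DataFrame({'decision': pd.Categorical(['Yes', 'No'])})

# Compute the row masks up front, while the column still holds Yes/No.
is_yes = df['decision'] == 'Yes'
is_no = df['decision'] == 'No'

# 1 is not an existing category, so a partial assignment is rejected.
try:
    df.loc[is_yes, 'decision'] = 1
    failed = False
except (ValueError, TypeError) as err:
    failed = True
    print('assignment rejected:', err)

# Register the new categories first, then the same assignment works.
df['decision'] = df['decision'].cat.add_categories([1, 0])
df.loc[is_yes, 'decision'] = 1
df.loc[is_no, 'decision'] = 0

# Drop the now-unused 'Yes'/'No' categories.
df['decision'] = df['decision'].cat.remove_unused_categories()
print(df['decision'].tolist())
```

Note that the boolean mask is passed straight to df.loc as the row selector. Writing df.loc['Yes', 'decision'] instead treats 'Yes' as a row label, not as a value to filter on.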

Sample data with categorical column:

import pandas as pd

df = pd.DataFrame({'decision':['Yes','No']})

df['decision'] = pd.Categorical(df['decision'])

Solutions using Series.map and Series.cat.rename_categories, both of which keep the categorical dtype:

df['decision1'] = df['decision'].map({'Yes':1, 'No':0})
df['decision2'] = df['decision'].cat.rename_categories({'Yes':1, 'No':0})

If only Yes and No values are possible, you can recreate the whole column by comparing it with Yes and casting the result to integer, which maps True and False to 1 and 0. As @arhr mentioned, the categorical dtype is lost:

df['decision3'] = (df['decision'] == 'Yes').astype(int)
print(df)
  decision decision1  decision2 decision3
0      Yes         1          1         1
1       No         0          0         0

print(df.dtypes)
decision     category
decision1    category
decision2    category
decision3       int32
dtype: object
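The "only Yes and No" condition matters for the comparison approach. The sketch below uses a hypothetical third value, 'Maybe', which is not in the question's data, to show the difference: the comparison silently turns it into 0, while Series.map leaves it as a missing value that you can detect.

```python
import pandas as pd

# Hypothetical column with a third value besides Yes/No.
s = pd.Series(['Yes', 'No', 'Maybe'], dtype='category')

# Comparison: everything that is not 'Yes' becomes 0, including 'Maybe'.
by_compare = (s == 'Yes').astype(int)
print(by_compare.tolist())  # [1, 0, 0]

# Mapping: values missing from the dict become NaN, so they stay visible.
by_map = s.map({'Yes': 1, 'No': 0})
print(by_map.isna().tolist())  # [False, False, True]
```

If unexpected values are a possibility in your data, checking by_map.isna().any() before training is one cheap way to catch them.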
jezrael answered Dec 06 '25 17:12


I got this error when running a whole model on data containing different dtypes ("category" and "float64"). I solved it by converting the "category" columns to strings at the beginning, like this:

for col in df.select_dtypes(include=['category']).columns:
    df[col] = df[col].astype('str')
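As a rough end-to-end sketch of this approach, assuming a small made-up frame (the 'score' column and its values are hypothetical): once the categorical column is converted to strings, it no longer restricts which values can be written, and the labels can then be mapped to numbers. Keep in mind that plain strings usually take more memory than categories, which may matter if memory was the reason for using categories in the first place.

```python
import pandas as pd

# Hypothetical frame mixing a categorical and a float column.
df = pd.DataFrame({
    'decision': pd.Categorical(['Yes', 'No']),
    'score': [0.9, 0.1],
})

# Convert every categorical column to plain strings.
for col in df.select_dtypes(include=['category']).columns:
    df[col] = df[col].astype('str')

# The column is no longer categorical, so nothing restricts later values.
print(df['decision'].dtype)

# Replace the labels with numbers before handing the frame to a model.
df['decision'] = df['decision'].map({'Yes': 1, 'No': 0})
print(df['decision'].tolist())  # [1, 0]
```

String columns that are left as they are still have to be encoded or dropped before training, since a model that expects floats cannot consume them.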
Filippo answered Dec 06 '25 15:12


