Example of data that I want to replace
Data has the following attributes
Here is what I did:
#Buying price generalization
df["Buying_Price"]=df["Buying_Price"].replace({"vhigh":4})
df["Buying_Price"]=df["Buying_Price"].replace({"high":3})
df["Buying_Price"]=df["Buying_Price"].replace({"med":2})
df["Buying_Price"]=df["Buying_Price"].replace({"low":1})
#Maintenance generalization
df["Maintanance_price"]=df["Maintanance_price"].replace({"vhigh":4})
df["Maintanance_price"]=df["Maintanance_price"].replace({"high":3})
df["Maintanance_price"]=df["Maintanance_price"].replace({"med":2})
df["Maintanance_price"]=df["Maintanance_price"].replace({"low":1})
#lug_boot generalization
df["Lug_boot"]=df["Lug_boot"].replace({"small":1})
df["Lug_boot"]=df["Lug_boot"].replace({"med":2})
df["Lug_boot"]=df["Lug_boot"].replace({"big":3})
#Safety Generalization
df["Safety"]=df["Safety"].replace({"low":1})
df["Safety"]=df["Safety"].replace({"med":2})
df["Safety"]=df["Safety"].replace({"big":3})
print(df.head())
While printing, it showed:
Cannot compare types 'ndarray(dtype=int64)' and 'str'
The column you are calling replace() on does not hold the strings you are trying to match. It holds only int64 values (reported here as ndarray(dtype=int64)).
See the documentation for pandas.DataFrame.replace(). replace() searches the column and compares its current values with the str keys you passed. For example,
df["Buying_Price"]=df["Buying_Price"].replace({"vhigh":4})
looks for every "vhigh" value by comparing it with the values the column currently contains, then replaces each match with 4.
The comparison step is where it fails, because it tries to compare str data with int64 data ('ndarray(dtype=int64)').
A brief example to simulate this:
import pandas as pd
import numpy as np
a = np.array([1])
df = pd.DataFrame({"Maintanance_price": a})
df["Maintanance_price"] = df["Maintanance_price"].replace({"a":1})
print(df)
Out:
TypeError: Cannot compare types 'ndarray(dtype=int64)' and 'str'
I was facing the same issue and what worked for me was converting the datatype of the feature to an object type.
train['Some_feature']=train.Some_feature.astype(object)
Hope it helps.
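For illustration, here is a minimal sketch of that conversion applied to the column from the question. The sample values are made up, and whether the original call raises at all depends on your pandas version, so treat this as a sketch and not a guaranteed reproduction.

```python
import numpy as np
import pandas as pd

# Hypothetical column that already holds integers (e.g. after an earlier run)
df = pd.DataFrame({"Maintanance_price": np.array([1, 2, 3])})

# Convert to object dtype first, so the values are compared element by element
# as Python objects instead of as an int64 array against a str
df["Maintanance_price"] = df["Maintanance_price"].astype(object)

# "vhigh" is not present, so nothing is replaced and the values are unchanged
df["Maintanance_price"] = df["Maintanance_price"].replace({"vhigh": 4})
print(df)
```

Note that this only avoids the comparison error. If the column is already numeric, there is nothing left to replace.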
You could try the following code:
df['Maintanance_price'] = df['Maintanance_price'].replace(to_replace=['low', 'med', 'high', 'vhigh'], value=[1, 2, 3, 4])
df.head()
Also, as suggested by @ouiemboughrra, check if the values have already been converted to numeric, in case you have rerun the cell.
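One way to make the cell safe to rerun is to check the column's dtype before mapping. This is only a sketch: the helper name encode_once and the sample data are hypothetical, and it uses Series.map instead of replace.

```python
import pandas as pd

# Hypothetical sample data with the categories from the question
df = pd.DataFrame({"Buying_Price": ["vhigh", "high", "med", "low"]})
price_levels = {"low": 1, "med": 2, "high": 3, "vhigh": 4}

def encode_once(series, mapping):
    """Map categories to numbers, skipping columns that are already numeric."""
    if pd.api.types.is_numeric_dtype(series):
        return series  # already converted, e.g. the cell was rerun
    return series.map(mapping)

df["Buying_Price"] = encode_once(df["Buying_Price"], price_levels)
# Running the same line again leaves the column unchanged
df["Buying_Price"] = encode_once(df["Buying_Price"], price_levels)
print(df)
```

Be aware that map turns any value missing from the mapping into NaN, whereas replace leaves unmatched values as they are.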