I have a pandas DataFrame in Python that looks something like this:
   contest_login_count  contest_participation_count  ipn_ratio
0                    1                            1   0.000000
1                    3                            3   0.083333
2                    3                            3   0.000000
3                    3                            3   0.066667
4                    5                           13   0.102804
5                    2                            3   0.407407
6                    1                            3   0.000000
7                    1                            2   0.000000
8                   53                           91   0.264151
9                    1                            2   0.000000
Now I want to apply a function to each row of this DataFrame. The function is written like this:
def findCluster(clusterModel, data):
    return clusterModel.predict(data)
I apply this function to each row like this:
df_fil.apply(lambda x : findCluster(cluster_all,x.reshape(1,-1)),axis=1)
When I run this code, I get the following warning:
DataConversionWarning: Data with input dtype object was converted to float64.
warnings.warn(msg, DataConversionWarning)
This warning is printed once for each row. Since I have around 450K rows in my DataFrame, my computer hangs while printing all these warnings in the IPython notebook.
To test my function, I created a dummy DataFrame and applied the same function to it, and that works fine. Here is the code:
t = pd.DataFrame([[10.35,100.93,0.15],[10.35,100.93,0.15]])
t.apply(lambda x:findCluster(cluster_all,x.reshape(1,-1)),axis=1)
The output is:
   0  1  2
0  4  4  4
1  4  4  4
Can anyone suggest what I am doing wrong, or what I can change to make this warning go away?
Use apply() when you want to update every row of a pandas DataFrame by calling a custom function. To apply a function to each row, pass axis=1 to apply(). Applying a function row by row lets you create a new column from the values in each row, update the row, and so on.
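As a minimal sketch of row-wise apply(), the snippet below builds a new column from two existing ones. The column participation_per_login and the sample values are made up for illustration; they are not part of the original data.

```python
import pandas as pd

# Small illustrative frame; the column names mirror the question,
# the derived column name is hypothetical.
df = pd.DataFrame({
    "contest_login_count": [1, 3, 5],
    "contest_participation_count": [1, 3, 13],
})

# axis=1 hands each row to the function as a Series indexed by column name.
df["participation_per_login"] = df.apply(
    lambda row: row["contest_participation_count"] / row["contest_login_count"],
    axis=1,
)
```

For simple arithmetic like this, dividing the two columns directly is usually faster than apply(); the row-wise form is shown only to illustrate axis=1.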
A reliable way to convert one or more columns of a DataFrame to numeric values is pandas.to_numeric(). This function tries to convert non-numeric objects (such as strings) into integers or floating-point numbers as appropriate.
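A short sketch of how that might look on object-typed columns. The sample values, including the "n/a" entry, are invented for illustration; errors="coerce" turns anything unparseable into NaN instead of raising.

```python
import pandas as pd

# Columns stored as strings, so their dtype is object (illustrative data).
df = pd.DataFrame({
    "contest_login_count": ["1", "3", "5"],
    "ipn_ratio": ["0.0", "0.083333", "n/a"],
}, dtype=object)

# to_numeric works on one column at a time, so apply it column by column.
# errors="coerce" replaces values that cannot be parsed with NaN.
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted.dtypes)
```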
The Python TypeError is an exception that occurs when the data type of an object in an operation is inappropriate. This can happen when an operation is performed on an object of an incorrect type, or it is not supported for the object.
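I think the problem is that the dtype of some column is not float; the warning says the input dtype is object, and you can check each column with df.dtypes. You need to cast the column with astype: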
df['colname'] = df['colname'].astype(float)
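The same idea can be applied to the whole frame at once. The sketch below casts every column to float and then calls predict a single time on all rows instead of once per row, which should also avoid repeating the warning. NearestCentroidStub is a hypothetical stand-in for cluster_all (the real model is not shown in the question); it only assumes that the model exposes predict() on a 2-D array, as scikit-learn estimators do.

```python
import numpy as np
import pandas as pd


class NearestCentroidStub:
    """Hypothetical stand-in for cluster_all; it only mimics .predict()."""

    def __init__(self, centers):
        self.centers = np.asarray(centers, dtype=float)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        # Distance from every row to every center, then pick the closest.
        dists = np.linalg.norm(X[:, None, :] - self.centers[None, :, :], axis=2)
        return dists.argmin(axis=1)


# Illustrative frame with object dtype, as the warning in the question suggests.
df_fil = pd.DataFrame({
    "contest_login_count": [1, 3, 53],
    "contest_participation_count": [1, 3, 91],
    "ipn_ratio": ["0.0", "0.083333", "0.264151"],
}, dtype=object)

cluster_all = NearestCentroidStub([[2, 2, 0.05], [50, 90, 0.25]])

# Cast all feature columns to float in one step.
features = df_fil.astype(float)

# One predict call for all rows instead of one call per row.
df_fil["cluster"] = cluster_all.predict(features.to_numpy())
```

With around 450K rows, a single vectorized predict call is also likely to be much faster than a row-by-row apply().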