Setting the value of a new dataframe column:
df.loc[df["Measure"] == metric.label, "source_data_url"] = metric.source_data_url
now (as of Pandas version 2.1.0) gives a warning,
FutureWarning:
Setting an item of incompatible dtype is deprecated and will raise in a future error of pandas. Value ' metric_3' has dtype incompatible with float64, please explicitly cast to a compatible dtype first.
The Pandas documentation discusses how the problem can be solved for a Series, but it is not clear how to do this when assigning a new DataFrame column iteratively (the line above is called in a loop over metrics, and it is the final metric that triggers the warning). How can this be done?
I had the same problem. My understanding is that when you first set a value in source_data_url, the column does not yet exist, so pandas creates it and fills every element with NaN. That makes pandas treat the column's dtype as float64, and assigning a string to it then triggers the warning.
My solution was to create the column with a default value, e.g. an empty string, before adding values to it:
df["source_data_url"] = ""
Using None also seems to work:
df["source_data_url"] = None
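As a minimal sketch of that approach applied to the loop from the question: the DataFrame contents, the metric labels, and the example.org URLs below are invented for illustration, and the (label, url) tuples stand in for the asker's metric objects. The column is created up front with an explicit object dtype, so every later string assignment matches the column's dtype. Turning FutureWarning into an error inside the loop is only there to show that nothing is raised.

```python
import warnings

import pandas as pd

# Hypothetical data standing in for the asker's DataFrame.
df = pd.DataFrame(
    {
        "Measure": ["metric_1", "metric_2", "metric_3", "metric_4"],
        "value": [1.0, 2.0, 3.0, 4.0],
    }
)

# Hypothetical (label, source_data_url) pairs standing in for the metric objects.
metrics = [
    ("metric_1", "https://example.org/metric_1"),
    ("metric_2", "https://example.org/metric_2"),
    ("metric_3", "https://example.org/metric_3"),
]

# Create the column before the loop, with an explicit object dtype,
# so pandas never has to upcast it from float64.
df["source_data_url"] = pd.Series([None] * len(df), index=df.index, dtype="object")

with warnings.catch_warnings():
    # Escalate FutureWarning to an exception to show the loop stays silent.
    warnings.simplefilter("error", FutureWarning)
    for label, url in metrics:
        df.loc[df["Measure"] == label, "source_data_url"] = url

print(df)
```

Rows whose Measure matches no metric (metric_4 here) simply keep the missing placeholder.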
Since pandas 2.1.0, setitem-like operations on Series (or DataFrame columns) that silently upcast the dtype are deprecated and show a warning.
In a future version, these will raise an error and you should cast to a common dtype first.
Previous behavior:
In [1]: ser = pd.Series([1, 2, 3])
In [2]: ser
Out[2]:
0 1
1 2
2 3
dtype: int64
In [3]: ser[0] = 'not an int64'
In [4]: ser
Out[4]:
0 not an int64
1 2
2 3
dtype: object
New behavior:
In [1]: ser = pd.Series([1, 2, 3])
In [2]: ser
Out[2]:
0 1
1 2
2 3
dtype: int64
In [3]: ser[0] = 'not an int64'
FutureWarning:
Setting an item of incompatible dtype is deprecated and will raise an error in a future version of pandas.
Value 'not an int64' has dtype incompatible with int64, please explicitly cast to a compatible dtype first.
In [4]: ser
Out[4]:
0 not an int64
1 2
2 3
dtype: object
To retain the current behaviour, you could cast ser to object dtype first:
In [21]: ser = pd.Series([1, 2, 3])
In [22]: ser = ser.astype('object')
In [23]: ser[0] = 'not an int64'
In [24]: ser
Out[24]:
0 not an int64
1 2
2 3
dtype: object
Source: https://pandas.pydata.org/docs/dev/whatsnew/v2.1.0.html#deprecated-silent-upcasting-in-setitem-like-series-operations
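The same cast can be applied to a DataFrame column that already exists with the wrong dtype. The sketch below assumes a source_data_url column that is currently all NaN (and therefore float64); the data and the example.org URL are invented for illustration. Casting the column to object before assigning mirrors the Series example from the release notes.

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical frame in which the target column already exists as float64 NaN.
df = pd.DataFrame(
    {
        "Measure": ["metric_1", "metric_2"],
        "source_data_url": [np.nan, np.nan],
    }
)
print(df["source_data_url"].dtype)  # float64 at this point

# Explicitly cast the column to object before assigning strings to it.
df["source_data_url"] = df["source_data_url"].astype("object")

with warnings.catch_warnings():
    # Escalate FutureWarning to an exception to show the assignment stays silent.
    warnings.simplefilter("error", FutureWarning)
    df.loc[df["Measure"] == "metric_2", "source_data_url"] = "https://example.org/metric_2"

print(df)
```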