
FutureWarning: specifying 'categories' or 'ordered' in .astype() is deprecated; pass a CategoricalDtype instead

Tags: python, pandas

The warning in the title is produced by pandas 0.21.0 on Python 3.6.3 with code such as pd.Series(["a", "b", "b"]).astype("category", categories = ["a", "b", "c"]). How exactly is one supposed to write this now?

Kodiologist asked Nov 28 '17 17:11



2 Answers

The CategoricalDtype mentioned in the warning is available as pd.api.types.CategoricalDtype. Build the dtype with the categories you want and pass it to .astype() as the only argument: pd.Series(["a", "b", "b"]).astype(pd.api.types.CategoricalDtype(categories=["a", "b", "c"])).
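
For anyone who wants to check the result, here is a minimal sketch of the same conversion. The variable names (dtype, s, t) are only illustrative, and the second Series with the value "z" is my own addition to show what happens to values outside the declared categories.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Declare the allowed categories up front, including "c",
# which does not occur in the data.
dtype = CategoricalDtype(categories=["a", "b", "c"])
s = pd.Series(["a", "b", "b"]).astype(dtype)

print(s.dtype)                     # category
print(s.cat.categories.tolist())   # ['a', 'b', 'c']
print(s.cat.codes.tolist())        # [0, 1, 1]

# Illustrative extra case: a value outside the declared
# categories ("z") becomes NaN instead of raising.
t = pd.Series(["a", "z"]).astype(dtype)
print(t.isna().tolist())           # [False, True]
```

The unused category "c" is kept in the dtype even though no row contains it, which is the point of passing categories explicitly.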

Kodiologist answered Sep 22 '22 05:09


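You can also build the categorical directly with pd.Categorical:

pd.Categorical(pd.Series(['a', 'b', 'b']), categories=['a', 'b', 'c'])

Pass ordered=True as well to make the categories ordered (a < b < c):

result = pd.Categorical(pd.Series(['a', 'b', 'b']), categories=['a', 'b', 'c'], ordered=True)

Update: pd.Categorical returns a Categorical, not a Series. To get a Series with the categorical dtype, wrap the result:

pd.Series(result)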
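
A short sketch of what the ordered flag changes in practice, and a check that this route produces the same dtype as .astype() with a CategoricalDtype. The names values, cat, s, dtype and same are only illustrative.

```python
import pandas as pd

values = ["a", "b", "b"]

# Build an ordered categorical directly, then wrap it in a Series.
cat = pd.Categorical(values, categories=["a", "b", "c"], ordered=True)
s = pd.Series(cat)

# With ordered=True, comparisons and min/max follow the category order.
print((s < "b").tolist())   # [True, False, False]
print(s.max())              # b

# The astype route with an ordered CategoricalDtype gives an equal dtype.
dtype = pd.api.types.CategoricalDtype(categories=["a", "b", "c"], ordered=True)
same = pd.Series(values).astype(dtype)
print(s.dtype == same.dtype)          # True
print(s.tolist() == same.tolist())    # True
```

Which of the two to use is mostly a matter of taste: pd.Categorical is convenient when starting from raw values, while .astype() fits better when the data is already in a Series or DataFrame column.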

dernk answered Sep 20 '22 05:09