Get mapping of categorical variables in pandas

Tags:

pandas

I'm doing this to convert categorical variables to numbers:

>>> import pandas as pd
>>> df = pd.DataFrame({'x': ['good', 'bad', 'good', 'great']}, dtype='category')
>>> df
       x
0   good
1    bad
2   good
3  great

How can I get the mapping between the original values and the new values?


asked May 28 '15 15:05

1 Answer

Method 1

You can build the mapping as a dictionary by enumerating the categories, just as you would build a dictionary from a list by using the list indices as keys:

dict( enumerate(df['x'].cat.categories ) )  # {0: 'bad', 1: 'good', 2: 'great'}
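As a minimal sketch of Method 1 on the question's data, the snippet below builds the code-to-label dictionary and then inverts it to get label-to-code. The names `code_to_label` and `label_to_code` are illustrative and do not come from the original answer.

```python
import pandas as pd

df = pd.DataFrame({'x': ['good', 'bad', 'good', 'great']}, dtype='category')

# Method 1: the position of a label in .cat.categories is its integer code
code_to_label = dict(enumerate(df['x'].cat.categories))

# Inverse direction (label -> code); both names are illustrative
label_to_code = {label: code for code, label in code_to_label.items()}

print(code_to_label)               # {0: 'bad', 1: 'good', 2: 'great'}
print(label_to_code)               # {'bad': 0, 'good': 1, 'great': 2}
print(df['x'].cat.codes.tolist())  # [1, 0, 1, 2]
```

The inverse dictionary is handy when you need to encode new string values using the same codes.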

Method 2

Alternatively, you could map the values and codes in every row:

dict( zip( df['x'].cat.codes, df['x'] ) )  # {0: 'bad', 1: 'good', 2: 'great'}

What is happening here is a little more transparent, which arguably makes this approach safer. It is also much less efficient: the arguments to zip() have length len(df), whereas df['x'].cat.categories contains only the unique values and is generally much shorter than len(df).
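One practical difference between the two methods shows up when the data contains missing values or declared categories that never occur. The sketch below uses assumed example data (not from the question) to illustrate it: Method 1 lists every declared category, while Method 2 only sees the codes present in the rows, including the code -1 that pandas uses for missing values.

```python
import pandas as pd

# Assumed example data: 'great' is a declared category that never occurs,
# and one value is missing.
s = pd.Series(pd.Categorical(['good', 'bad', None],
                             categories=['bad', 'good', 'great']))

method1 = dict(enumerate(s.cat.categories))  # built from the categories
method2 = dict(zip(s.cat.codes, s))          # built from the rows

codes_seen = sorted(int(code) for code in method2)

print(method1)      # {0: 'bad', 1: 'good', 2: 'great'}
print(codes_seen)   # [-1, 0, 1]
```

Here Method 2 has no entry for code 2 and maps -1 to NaN, so the two dictionaries agree only when every category occurs and nothing is missing.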

Additional Discussion

The reason Method 1 works is that the categories have type Index:

type( df['x'].cat.categories )  # pandas.core.indexes.base.Index

and each code is simply the integer position of its value in that Index, so you can look up values in it just as you would in a list.
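To make the list-like behaviour concrete, here is a small sketch of looking up labels and codes directly on the categories Index. The variable name `cats` is illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': ['good', 'bad', 'good', 'great']}, dtype='category')
cats = df['x'].cat.categories

print(cats[1])               # good  (code -> label, list-style indexing)
print(cats.get_loc('good'))  # 1     (label -> code)
print(list(cats))            # ['bad', 'good', 'great']
```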

There are a couple of ways to verify that Method 1 works. First, you can just check that a round trip retains the correct values:

(df['x'] == df['x'].cat.codes.map( dict( enumerate(df['x'].cat.categories) ) ).astype('category')).all()  # True

or you can check that Method 1 and Method 2 give the same answer:

(dict( enumerate(df['x'].cat.categories ) ) == dict( zip( df['x'].cat.codes, df['x'] ) ))  # True
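A further check, offered here as an optional sketch rather than part of the original answer, is to rebuild the column from the codes and categories alone with `pd.Categorical.from_codes` and compare the resulting values with the original. The names `rebuilt` and `roundtrip_ok` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'x': ['good', 'bad', 'good', 'great']}, dtype='category')

codes = df['x'].cat.codes
categories = df['x'].cat.categories

# Rebuild the column from the integer codes and the category labels alone
rebuilt = pd.Series(pd.Categorical.from_codes(codes, categories=categories))

# Compare plain Python lists to sidestep categorical comparison rules
roundtrip_ok = rebuilt.tolist() == df['x'].tolist()
print(roundtrip_ok)  # True
```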

answered Oct 01 '22 12:10

JohnE
