
Get a list of categories of categorical variable (Python Pandas)

I have a pandas DataFrame with a column representing a categorical variable. How can I get a list of the categories? I tried .values on the column but that does not return the unique levels.

Thanks!

Mona asked Sep 19 '18 11:09



2 Answers

I believe you need Series.cat.categories or Series.unique:

import numpy as np
import pandas as pd

np.random.seed(1245)

a = ['No', 'Yes', 'Maybe']
df = pd.DataFrame(np.random.choice(a, size=(10, 3)), columns=['Col1', 'Col2', 'Col3'])
# Convert only Col1 to the category dtype; Col2 and Col3 stay object
df['Col1'] = pd.Categorical(df['Col1'])

print(df.dtypes)
Col1    category
Col2      object
Col3      object
dtype: object

print(df['Col1'].cat.categories)
Index(['Maybe', 'No', 'Yes'], dtype='object')

print(df['Col2'].unique())
['Yes' 'Maybe' 'No']

print(df['Col1'].unique())
[Maybe, No, Yes]
Categories (3, object): [Maybe, No, Yes]
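Since the question asks for a list, either result can be converted to a plain Python list. The following is only a minimal sketch with made-up data (the Series `col` and its values are illustrative assumptions, not from the original post):

```python
import pandas as pd

# Illustrative data (an assumption for this sketch, not from the original post)
col = pd.Series(['No', 'Yes', 'Maybe', 'Yes', 'No'], dtype='category')

# All categories of the categorical column, as a plain Python list.
# Categories inferred from strings are stored in sorted order.
categories = col.cat.categories.tolist()

# Distinct values actually present, in order of first appearance.
# unique() also works on non-categorical (object) columns.
observed = list(col.unique())

print(categories)  # ['Maybe', 'No', 'Yes']
print(observed)    # ['No', 'Yes', 'Maybe']
```

Note that cat.categories lists every declared category, including ones that never occur in the data, while unique() only returns values that are present.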
jezrael answered Sep 22 '22 06:09


You can also call value_counts() on a single column, which gives you each category together with its count. Example:

dataframe['Column name'].value_counts()

Alternatively, if you want the number of distinct categories in a column, you can do this:

dataframe['Column name'].value_counts().count()
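To turn the value_counts() result into a list of categories, its index can be converted with tolist(); nunique() is a shorter way to get the number of distinct values. A rough sketch, assuming a hypothetical DataFrame with a column named 'answer' (both are placeholders for illustration):

```python
import pandas as pd

# Hypothetical data and column name, used only for illustration
dataframe = pd.DataFrame({'answer': ['No', 'Yes', 'Maybe', 'Yes', 'No', 'Yes']})

# Counts per value, sorted from most to least frequent
counts = dataframe['answer'].value_counts()

# The index of the result holds the category labels
labels = counts.index.tolist()

# Number of distinct values, without building the counts first
n_categories = dataframe['answer'].nunique()

print(labels)        # ['Yes', 'No', 'Maybe']
print(n_categories)  # 3
```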

Anand answered Sep 23 '22 06:09