I have a pandas DataFrame with a column representing a categorical variable. How can I get a list of the categories? I tried .values on the column but that does not return the unique levels.
Thanks!
A categorical variable is a variable that takes one of a fixed set of two or more categories. It is sometimes loosely called a discrete variable, and it comes in two main kinds: nominal (the categories have no inherent order) and ordinal (the categories are ordered).
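To make the nominal/ordinal distinction concrete in pandas, here is a minimal sketch. The colour and size labels are made-up illustration data, not taken from the question.

```python
import pandas as pd

# Nominal: the categories have no inherent order (hypothetical colour labels).
colour = pd.Series(pd.Categorical(["red", "green", "red", "blue"]))

# Ordinal: the categories carry an explicit order (hypothetical size labels).
size = pd.Series(
    pd.Categorical(
        ["small", "large", "medium", "small"],
        categories=["small", "medium", "large"],
        ordered=True,
    )
)

print(colour.cat.ordered)      # False
print(size.cat.ordered)        # True
print(size.min(), size.max())  # small large (only meaningful for ordered data)
```

Calling min() or max() on the unordered colour column would raise a TypeError, since pandas has no order to compare by.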
The frequency distribution of a categorical variable is best displayed with a bar chart. First compute the frequency of each category with value_counts(), then plot the result directly with pandas' plot.bar(), or with matplotlib if you prefer.
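A rough sketch of both routes, assuming matplotlib is installed. The DataFrame and its column name "answer" are invented for the example.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical example data.
df = pd.DataFrame({"answer": ["Yes", "No", "Yes", "Maybe", "Yes", "No"]})

# Frequency of each category, most frequent first.
counts = df["answer"].value_counts()
print(counts)

# Route 1: plot straight from pandas.
ax = counts.plot.bar()
ax.set_xlabel("answer")
ax.set_ylabel("frequency")

# Route 2: the same chart with matplotlib directly.
fig, ax2 = plt.subplots()
ax2.bar(counts.index.tolist(), counts.values)
ax2.set_xlabel("answer")
ax2.set_ylabel("frequency")

# Call plt.show() or fig.savefig("some_file.png") to display or save the chart.
```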
I believe you need Series.cat.categories or unique():
    import numpy as np
    import pandas as pd

    np.random.seed(1245)
    a = ['No', 'Yes', 'Maybe']
    df = pd.DataFrame(np.random.choice(a, size=(10, 3)), columns=['Col1', 'Col2', 'Col3'])
    # Convert only Col1 to the category dtype; Col2 and Col3 stay as object
    df['Col1'] = pd.Categorical(df['Col1'])

    print(df.dtypes)
    # Col1    category
    # Col2      object
    # Col3      object
    # dtype: object

    # All category levels of a categorical column
    print(df['Col1'].cat.categories)
    # Index(['Maybe', 'No', 'Yes'], dtype='object')

    # unique() works on any column, categorical or not
    print(df['Col2'].unique())
    # ['Yes' 'Maybe' 'No']

    print(df['Col1'].unique())
    # [Maybe, No, Yes]
    # Categories (3, object): [Maybe, No, Yes]
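Since the question asks for a plain list, a short sketch of how the two approaches differ may help. The data here is made up, with 'Maybe' declared as a level but never observed: cat.categories reports every declared level, while unique() reports only the values that actually occur.

```python
import pandas as pd

# Hypothetical column: 'Maybe' is a declared category but never appears.
s = pd.Series(
    pd.Categorical(["Yes", "No", "Yes"], categories=["Maybe", "No", "Yes"])
)

all_levels = s.cat.categories.tolist()  # every declared level
observed = s.unique().tolist()          # only values present, in order of appearance

print(all_levels)  # ['Maybe', 'No', 'Yes']
print(observed)    # ['Yes', 'No']
```

Note that .cat is only available on columns with the category dtype; on an object column, unique() is the one to use.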
You can also call value_counts() on a single column, which gives you the count of each category along with its label. Example:
dataframe['Column name'].value_counts()
Alternatively, if you want the total number of categories in a variable, you can do this:
dataframe['Column name'].value_counts().count()
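One caveat worth sketching, using invented data: on a column with the category dtype, value_counts() also lists declared levels that never occur (with a count of 0), so value_counts().count() and nunique() can disagree.

```python
import pandas as pd

# Hypothetical data with one missing value.
df = pd.DataFrame({"answer": ["Yes", "No", "Yes", None]})

# Plain object column: missing values are dropped, two distinct values remain.
n_counts = df["answer"].value_counts().count()
n_unique = df["answer"].nunique()
print(n_counts, n_unique)  # 2 2

# Categorical column with an unobserved level 'Maybe'.
cat = df["answer"].astype(pd.CategoricalDtype(["Maybe", "No", "Yes"]))

n_cat_counts = cat.value_counts().count()  # counts 'Maybe' too, with frequency 0
n_cat_unique = cat.nunique()               # observed values only
n_cat_levels = len(cat.cat.categories)     # declared levels
print(n_cat_counts, n_cat_unique, n_cat_levels)  # 3 2 3
```

So nunique() answers "how many distinct values occur", while len(cat.categories) answers "how many levels are declared".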