How to fix a CategoricalIndex columns problem in pandas?

Tags:

python

pandas

I am working with Chicago crime data and want the aggregated count of the top 5 crimes for each region/community area. My code works, but the resulting DataFrame has an unwanted index, and its columns are a CategoricalIndex, which stops me from accessing particular columns for further data manipulation.

What I did:

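import pandas as pd

# error_bad_lines=False was removed in pandas 2.0; on_bad_lines='skip' is its replacement
crimes_2012 = pd.read_csv('Chicago_Crimes_2012_to_2017.csv', sep=',', on_bad_lines='skip')
df = crimes_2012[['Primary Type', 'Location Description', 'Community Area']]
# Count crimes per community and crime type, then pivot the crime types into columns
crime_catg = df.groupby(['Community Name', 'Primary Type'])['Primary Type'].count().unstack()
# Keep only the top 5 crime types and drop communities with missing counts
crime_catg = crime_catg[['THEFT', 'BATTERY', 'CRIMINAL DAMAGE', 'NARCOTICS', 'ASSAULT']]
crime_catg = crime_catg.dropna()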

Here is my current output, which needs to be improved:

[image not available]

Here is my attempt:

When I tried the code below, I still didn't get a new index, and the index name is displayed strangely in the output DataFrame. Why does this happen, and how can I fix it?

[image not available]

Even when I tried to reindex the DataFrame, it still didn't get a new index:

crime_catg.reindex(inplace=True, drop=True)

Any ideas on how to fix this? Thanks.

beyond_inifinity asked Apr 18 '19 15:04



1 Answer

There are a couple of ways to handle this.

1) Keep the CategoricalIndex type and then use the .add_categories method to register the new label as a valid category, e.g. to fix your .reindex problem:

crime_catg.columns = crime_catg.columns.add_categories(['Community Name'])

2) Cast the columns to a plain pandas.Index:

crime_catg.columns = pd.Index(list(crime_catg.columns))
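
To make both options concrete, here is a minimal sketch on a toy frame. The community names and counts are made up, and the frame is built directly with CategoricalIndex columns rather than through the groupby/unstack in the question. It also assumes the call in the question was meant to be reset_index (reindex accepts neither inplace nor drop). Depending on the pandas version, reset_index() on the unmodified frame may raise a TypeError about inserting an item that is not an existing category, so that call is left out here.

```python
import pandas as pd

# Toy stand-in for the unstacked result (hypothetical values).
crime_catg = pd.DataFrame(
    [[10, 4], [7, 9]],
    index=pd.Index(["Austin", "Loop"], name="Community Name"),
    columns=pd.CategoricalIndex(["THEFT", "BATTERY"], name="Primary Type"),
)

# Option 1: keep the CategoricalIndex, but register the index name as a
# valid category so reset_index() can insert it as a new column label.
fixed1 = crime_catg.copy()
fixed1.columns = fixed1.columns.add_categories(["Community Name"])
flat1 = fixed1.reset_index()

# Option 2: replace the columns with a plain Index, after which any
# new column label can be added without touching categories.
fixed2 = crime_catg.copy()
fixed2.columns = pd.Index(list(fixed2.columns))
flat2 = fixed2.reset_index()

print(list(flat1.columns))
print(list(flat2.columns))
```

Option 2 also drops the columns name ("Primary Type"), because the new Index is created without a name. That may be what removes the odd-looking label above the index in the printed frame.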
Chris Adams answered Oct 09 '22 07:10
