Combining similar rows within a dataframe column into one

I'm working on the Chicago crimes dataset and I created a dataframe called primary which contains only the type of crime. Then I grouped by the type of crime and got the count for each type. This is the relevant code.

primary = crimes2012[['Primary Type']].copy()
test = primary.groupby('Primary Type').size().sort_values().reset_index(name='Count')

Now I have a dataframe 'test' which has the crimes and their counts. What I want to do is merge certain crimes together, for example "Non-Criminal", "Non - Criminal" and "Non-Criminal (Subject Specified)". But because they're rows now, I don't know how to do it. I was trying to use .loc[].

I also tried using

test['Primary Type'=='NON-CRIMINAL'] = test['Primary Type'=='NON - CRIMINAL']+test['Primary Type'=='NON-CRIMINAL']+test['Primary Type'=='NON-CRIMINAL (SUBJECT SPECIFIED)']

but of course that only returned a Boolean value of False.

Mustafa Moiz asked Dec 10 '25 21:12



1 Answer

You can use Series.map (or apply) for this: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.map.html

You will have to create a dictionary that maps each of your input labels to the desired output label:

desired_output = {"NON CRIMINAL": "NON-CRIMINAL", "NC": "NON-CRIMINAL", ...}

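Then map it onto the 'Primary Type' column of primary. Note that map turns any label missing from the dictionary into NaN, so fall back to the original value for those (or use replace instead, which leaves unmapped labels unchanged):

primary['Primary Type'] = primary['Primary Type'].map(desired_output).fillna(primary['Primary Type'])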

Then group by and count as you are doing now.
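To make the whole flow concrete, here is a minimal sketch of normalizing the labels before counting. The small crimes2012 frame below is a made-up stand-in for the real dataset, and the dictionary keys are just the three spellings mentioned in the question, so adjust both to your data. It uses Series.replace rather than map, which is a swapped-in choice so that labels not listed in the dictionary pass through unchanged.

```python
import pandas as pd

# Hypothetical stand-in for crimes2012; the real dataset is not reproduced here.
crimes2012 = pd.DataFrame({
    "Primary Type": [
        "THEFT", "NON-CRIMINAL", "NON - CRIMINAL",
        "NON-CRIMINAL (SUBJECT SPECIFIED)", "THEFT", "BATTERY",
    ]
})

# Map each variant spelling to the label it should be merged into.
desired_output = {
    "NON - CRIMINAL": "NON-CRIMINAL",
    "NON-CRIMINAL (SUBJECT SPECIFIED)": "NON-CRIMINAL",
}

# replace() leaves labels that are not in the dictionary unchanged,
# whereas map() with a dict would turn them into NaN.
primary = crimes2012["Primary Type"].replace(desired_output)

# Count occurrences per (normalized) crime type, smallest first.
test = (
    primary.groupby(primary).size()
    .sort_values()
    .reset_index(name="Count")
)
print(test)
```

Doing the normalization before the groupby means the variants are already one label when they are counted, so there is nothing left to merge row by row afterwards.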

Mortz answered Dec 12 '25 11:12
