
How to extract unique permutations from a pandas DataFrame?

Working in Jupyter with a pandas DataFrame, I have a dataset with rows like this:

color: white
engineType: diesel
make: Ford
manufacturingYear: 2004
accidentCount: 123

I need to plot accident counts (y-axis) against manufacturing year (x-axis) for every permutation (unique combination) of color/engineType/make. Any ideas on how to proceed?

To speed things up I have this initial setup:

import numpy as np
import pandas as pd
from pandas import DataFrame, Series
import random


colors = ['white', 'black','silver']
engineTypes = ['diesel', 'petrol']
makes = ['ford', 'mazda', 'subaru']
years = range(2000,2005)

rowCount = 100

def randomEl(data):
    rand_items = [data[random.randrange(len(data))] for item in range(rowCount)]
    return rand_items


df = DataFrame({
    'color': Series(randomEl(colors)),
    'engineType': Series(randomEl(engineTypes)),
    'make': Series(randomEl(makes)),
    'year': Series(randomEl(years)),
    'accidents': Series([int(1000*random.random()) for i in range(rowCount)])
})
wciesiel asked Apr 15 '17 12:04





1 Answer

You can get the number of accidents by unique color, engineType, and make combinations using groupby():

accident_counts = df.groupby(['color', 'engineType', 'make'])['accidents'].sum()
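If only the list of unique combinations is needed, rather than the summed counts, something like the following may help. It is a minimal sketch on a tiny hand-made DataFrame; the sample values and the names `keys` and `unique_combos` are assumptions for illustration and are not part of the original post.

```python
import pandas as pd

# Tiny hand-made sample (hypothetical values) using the question's column names.
df = pd.DataFrame({
    'color': ['white', 'white', 'black', 'white'],
    'engineType': ['diesel', 'diesel', 'petrol', 'petrol'],
    'make': ['ford', 'ford', 'mazda', 'ford'],
    'year': [2000, 2001, 2000, 2000],
    'accidents': [10, 20, 30, 40],
})

keys = ['color', 'engineType', 'make']

# One row per distinct (color, engineType, make) combination present in the data.
unique_combos = df[keys].drop_duplicates().reset_index(drop=True)
print(unique_combos)

# groupby exposes the same combinations as group keys, along with each group's rows.
for combo, group in df.groupby(keys):
    print(combo, group['accidents'].sum())
```

Note that this only finds combinations that actually occur in the data, not every theoretically possible one.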

Matplotlib is one way of plotting the results:

import matplotlib.pyplot as plt
accident_counts.plot(kind='bar')
plt.show()
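The bar chart above sums accidents across all years. Since the question asks for accident counts by year for each combination, one possible extension is to add `year` to the grouping and pivot the combinations into columns. The sketch below rebuilds sample data like the question's; the fixed seed, the helper name `plot_by_combination`, and the choice to fill missing years with 0 are assumptions, not something the original answer specifies.

```python
import random

import pandas as pd

random.seed(0)  # assumption: fixed seed so the sample data is reproducible

# Rebuild random sample data shaped like the question's DataFrame.
colors = ['white', 'black', 'silver']
engine_types = ['diesel', 'petrol']
makes = ['ford', 'mazda', 'subaru']
years = list(range(2000, 2005))
row_count = 100

df = pd.DataFrame({
    'color': random.choices(colors, k=row_count),
    'engineType': random.choices(engine_types, k=row_count),
    'make': random.choices(makes, k=row_count),
    'year': random.choices(years, k=row_count),
    'accidents': [random.randrange(1000) for _ in range(row_count)],
})

# Sum accidents per (color, engineType, make, year), then move the three
# combination levels into columns: one row per year, one column per combination.
# Assumption: a year with no rows for a combination is shown as 0.
per_year = (
    df.groupby(['color', 'engineType', 'make', 'year'])['accidents']
      .sum()
      .unstack(['color', 'engineType', 'make'], fill_value=0)
      .sort_index()
)


def plot_by_combination(table):
    """Draw one accidents-vs-year chart per combination (hypothetical helper)."""
    import matplotlib.pyplot as plt  # imported lazily; only needed when plotting

    for combo in table.columns:
        fig, ax = plt.subplots()
        table[combo].plot(ax=ax, marker='o', title=' / '.join(combo))
        ax.set_xlabel('year')
        ax.set_ylabel('accidents')
        plt.show()
```

Calling `plot_by_combination(per_year)` in a Jupyter cell should produce one chart per combination; `per_year.plot()` would instead draw all combinations on a single set of axes, which gets crowded with this many lines.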
ASGM answered Nov 04 '22 16:11
