Plotting Pandas DataFrame from pivot

Question

I am trying to plot a line graph comparing the Murder Rates of particular States through the years 1960-1962 using Pandas in a Jupyter Notebook.

A little context about where I am now, and how I arrived here:

I'm using a crime csv file, which looks like this: enter image description here

I'm only interested in 3 columns for the time being: State, Year, and Murder Rate. Specifically I was interested in only 5 states - Alaska, Michigan, Minnesota, Maine, Wisconsin.

So to produce the desired table, I did this (only showing top 5 row entries):

al_mi_mn_me_wi = crimes[(crimes['State'] == 'Alaska') | (crimes['State'] =='Michigan') | (crimes['State'] =='Minnesota') | (crimes['State'] =='Maine') | (crimes['State'] =='Wisconsin')]
control_df = al_mi_mn_me_wi[['State', 'Year', 'Murder Rate']]

enter image description here

From here I used the pivot function

df = control_1960_to_1962.pivot(index = 'Year', columns = 'State',values= 'Murder Rate' )

enter image description here

And this is where I get stuck. I received KeyError when doing (KeyError was Year):

df.plot(x='Year', y='Murder Rate', kind='line')

and when attempting just

df.plot()

I get this wonky graph.

enter image description here

How do I get my desired graph?

tel · Accepted Answer

Given a dataframe in a long (tidy) format, pandas.DataFrame.pivot is used to transform to a wide format, which can be plotted directly with pandas.DataFrame.plot

Tested in python 3.8.11, pandas 1.3.3, matplotlib 3.4.3

import numpy as np
import pandas as pd

control_1960_to_1962 = pd.DataFrame({
    'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota', 'Wisconsin'], 3),
    'Year': [1960, 1961, 1962]*5,
    'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]
})

df = control_1960_to_1962.pivot(index='Year', columns='State', values='Murder Rate')

# display(df)
State  Alaska  Maine  Michigan  Minnesota  Wisconsin
Year                                                
1960     10.2    1.7       4.5        1.2        1.3
1961     11.5    1.6       4.1        1.0        1.6
1962      4.5    1.4       3.4        0.9        0.9

The plots

You can tell Pandas (and through it the matplotlib package that actually does the plotting) what xticks you want explicitly:

ax = df.plot(xticks=df.index, ylabel='Murder Rate')

Output:

enter image description here

ax is a matplotlib.axes.Axes object, and there are many, many customizations you can make to your plot through it.

Here's how to plot with the States on the x axis:

ax = df.T.plot(kind='bar', ylabel='Murder Rate')

Output:

enter image description here

Isura Nimalasiri · Answer

try this you can explore more

   pip install pivottablejs

   import pandas as pd
   import numpy as np
   from pivottablejs import pivot_ui
   df = pd.DataFrame({
      'State': np.repeat(['Alaska', 'Maine', 'Michigan', 'Minnesota','Wisconsin'], 3),
      'Year': [1960, 1961, 1962]*5,
      'Murder Rate': [10.2, 11.5, 4.5, 1.7, 1.6, 1.4, 4.5, 4.1, 3.4, 1.2, 1.0, .9, 1.3, 1.6, .9]})

pivot_ui(df)

enter image description here

Plotting Pandas DataFrame from pivot

Tags:

python

pandas

matplotlib

pivot

Chumbawoo

2 Answers

The plots

tel

Isura Nimalasiri

Recent Activity

Donate For Us

Plotting Pandas DataFrame from pivot

Tags:

python

pandas

matplotlib

pivot

Chumbawoo

2 Answers

The plots

tel

Isura Nimalasiri

Related questions

Recent Activity

Donate For Us