I'm trying to build a population pyramid from a pandas df using seaborn. The problem is that some of the data isn't displayed: as you can see from the plot I created, the bars for some age classes are missing. There are 21 ticks on the y-axis and 21 age classes in the df, so why don't they match? What am I missing?
Here's the code I wrote:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
df = pd.DataFrame({'Age': ['0-4','5-9','10-14','15-19','20-24','25-29','30-34','35-39','40-44','45-49','50-54','55-59','60-64','65-69','70-74','75-79','80-84','85-89','90-94','95-99','100+'],
'Male': [-49228000, -61283000, -64391000, -52437000, -42955000, -44667000, -31570000, -23887000, -22390000, -20971000, -17685000, -15450000, -13932000, -11020000, -7611000, -4653000, -1952000, -625000, -116000, -14000, -1000],
'Female': [52367000, 64959000, 67161000, 55388000, 45448000, 47129000, 33436000, 26710000, 25627000, 23612000, 20075000, 16368000, 14220000, 10125000, 5984000, 3131000, 1151000, 312000, 49000, 4000, 0]})
AgeClass = ['100+','95-99','90-94','85-89','80-84','75-79','70-74','65-69','60-64','55-59','50-54','45-49','40-44','35-39','30-34','25-29','20-24','15-19','10-14','5-9','0-4']
bar_plot = sns.barplot(x='Male', y='Age', data=df, order=AgeClass)
bar_plot = sns.barplot(x='Female', y='Age', data=df, order=AgeClass)
bar_plot.set(xlabel="Population (tens of millions)", ylabel="Age-Group", title="Population Pyramid")
Population structure and population pyramids: the most common way to show the structure of a population is a population pyramid. This graph is made by placing two bar graphs (one for males, one for females) side by side. From it you can read off what share of a population falls in a given sex and age range.
There are five stages of population pyramids: high fluctuating, early expanding, late expanding, low fluctuating, and natural decrease.
As explained by JohanC, the data is not missing; the bars for the oldest age classes are just very small compared to the other bars.
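To make the size difference concrete, here is a minimal sketch that expresses each bar's length as a percentage of the longest bar in the plot. It reuses the question's data; the names `longest` and `rel` are just illustrative and not part of the original code.

```python
import pandas as pd

# Same data as in the question
df = pd.DataFrame({
    'Age': ['0-4', '5-9', '10-14', '15-19', '20-24', '25-29', '30-34',
            '35-39', '40-44', '45-49', '50-54', '55-59', '60-64', '65-69',
            '70-74', '75-79', '80-84', '85-89', '90-94', '95-99', '100+'],
    'Male': [-49228000, -61283000, -64391000, -52437000, -42955000,
             -44667000, -31570000, -23887000, -22390000, -20971000,
             -17685000, -15450000, -13932000, -11020000, -7611000,
             -4653000, -1952000, -625000, -116000, -14000, -1000],
    'Female': [52367000, 64959000, 67161000, 55388000, 45448000,
               47129000, 33436000, 26710000, 25627000, 23612000,
               20075000, 16368000, 14220000, 10125000, 5984000,
               3131000, 1151000, 312000, 49000, 4000, 0],
})

# Absolute bar lengths, indexed by age class
lengths = df.set_index('Age')[['Male', 'Female']].abs()

# Longest bar across both sexes (illustrative name)
longest = lengths.to_numpy().max()

# Each bar as a percentage of the longest bar (illustrative name)
rel = lengths / longest * 100

# The four oldest age classes are the ones that seem to be "missing"
print(rel.tail(4).round(3))
```

Every bar from 90-94 upwards comes out at well under 1% of the longest bar, so at typical figure sizes those bars are about a pixel wide or less, and any outline drawn around them can easily cover them.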
Another factor is that there seems to be a white border around each of your bars, which hides the very small bars at the top. Try passing lw=0 in your call to barplot. This is what I get:
bar_plot = sns.barplot(x='Male', y='Age', data=df, order=AgeClass, lw=0)
bar_plot = sns.barplot(x='Female', y='Age', data=df, order=AgeClass, lw=0)