I have created a matplotlib pie chart:
df.plot(kind='pie', subplots=True, figsize=(6, 4))
My dataframe consists of two columns, Country and Value (% distribution), and lists about 25 countries. I would like to plot only the top 10 countries by value (highest %) and, within the same plot, sum the remaining countries' % values into a single slice labelled 'All Other Countries'. How do I do this with matplotlib using the .plot function?
Country Value
Albania 4%
Brazil 3%
Denmark 5%
France 10%
Mexico 3%
Nigeria 15%
Spain 4%
U.S. 5%
Use a pie chart if: You have a total number that can be split up into 2-5 categories. One category outweighs the other by a significant margin.
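As already stated in the comments, the best way to do this is probably to do the manipulations before plotting. Here is one way to do it. The example uses nine countries with random values and keeps the top 5; for your data, change the slice from 5 to 10.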
import pandas as pd
from matplotlib import pyplot as plt
import numpy as np
countries = [
    'Albania',
    'Brazil',
    'Denmark',
    'France',
    'Mexico',
    'Nigeria',
    'Spain',
    'Germany',
    'Finland',
]
# the full dataframe, sorted from largest to smallest value
df = pd.DataFrame(
    data={'country': countries, 'value': np.random.rand(len(countries))},
).sort_values('value', ascending=False)
# the top 5 rows (df is already sorted)
df2 = df.iloc[:5].copy()
# everything after the top 5, summed into a single 'others' row
new_row = pd.DataFrame(data={
    'country': ['others'],
    'value': [df['value'].iloc[5:].sum()]
})
# combining top 5 with others
df2 = pd.concat([df2, new_row], ignore_index=True)
# plotting: all countries on the left for comparison,
# top 5 plus the combined 'others' slice on the right
fig, axes = plt.subplots(nrows=1, ncols=2, figsize=(9, 4))
df.plot(kind='pie', y='value', labels=df['country'], ax=axes[0])
df2.plot(kind='pie', y='value', labels=df2['country'], ax=axes[1])
axes[0].set_title('all countries')
axes[1].set_title('top 5')
plt.show()
The result looks like this.
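Since the question asks for the top 10 plus an 'All Other Countries' slice, the same idea can be wrapped in a small helper. This is only a sketch under a few assumptions: top_n_with_other is a made-up name (not a pandas function), the columns are assumed to be called Country and Value as in the question, and the values are assumed to possibly be stored as strings such as '4%'. The sample rows are the eight shown in the question, so n=3 is used here just to have something left over for the combined slice.

```python
import pandas as pd


def top_n_with_other(df, n=10, label_col="Country", value_col="Value",
                     other_label="All Other Countries"):
    """Keep the n largest rows and fold the rest into one summary row.

    Hypothetical helper for illustration; column names are assumptions.
    """
    data = df[[label_col, value_col]].copy()
    # Assumption: values may be strings such as "4%"; convert them to floats.
    if not pd.api.types.is_numeric_dtype(data[value_col]):
        data[value_col] = (
            data[value_col].astype(str).str.rstrip("%").astype(float)
        )
    data = data.sort_values(value_col, ascending=False)
    top = data.iloc[:n]
    # Only add the summary row if something is actually left over.
    if len(data) > n:
        other = pd.DataFrame({
            label_col: [other_label],
            value_col: [data[value_col].iloc[n:].sum()],
        })
        top = pd.concat([top, other], ignore_index=True)
    return top.reset_index(drop=True)


# Sample rows taken from the table in the question.
sample = pd.DataFrame({
    "Country": ["Albania", "Brazil", "Denmark", "France",
                "Mexico", "Nigeria", "Spain", "U.S."],
    "Value": ["4%", "3%", "5%", "10%", "3%", "15%", "4%", "5%"],
})

grouped = top_n_with_other(sample, n=3)
print(grouped)
```

With the real 25-row dataframe you would call it with n=10 and then plot the result, for example grouped.set_index('Country').plot(kind='pie', y='Value', legend=False), so that the country names are used as the slice labels.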
Hope this helps.