I have a pandas dataframe like:
In [61]: df = DataFrame(np.random.rand(3,4), index=['art','mcf','mesa'],
columns=['pol1','pol2','pol3','pol4'])
In [62]: df
Out[62]:
          pol1      pol2      pol3      pol4
art   0.661592  0.479202  0.700451  0.345085
mcf   0.235517  0.665981  0.778774  0.610344
mesa  0.838396  0.035648  0.424047  0.866920
and I want to generate a row with the average for the policies across benchmarks and then plot it.
Currently, the way I do this is:
df = df.T
df['average'] = df.apply(average, axis=1)
df = df.T
df.plot(kind='bar')
Is there an elegant way to avoid the double transposition?
I tried:
df.append(DataFrame(df.apply(average)).T)
df.plot(kind='bar')
This will append the correct values but does not update the index properly and the graph is messed up.
A clarification: the result of the code with the double transposition is what I want, i.e. a plot that shows both the benchmarks and the average of the policies, not just the average. I was just curious whether I can do it better.
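Note that the legend is usually messed up. For a fix:
ax = df.plot(kind='bar')
# Fetch the bar handles so that `patches` is defined before it is used
patches, _ = ax.get_legend_handles_labels()
ax.legend(patches, list(df.columns), loc='best')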
To get the column average (mean) of a pandas DataFrame, use either the mean() or the describe() method. The DataFrame.mean() method returns the mean of the values over the requested axis.
Pandas uses the plot() method to create diagrams. We can use Pyplot, a submodule of the Matplotlib library, to display the diagram on the screen.
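As a rough illustration of the two methods mentioned here, the sketch below rebuilds the question's DataFrame from its printed values (the variable names `col_means`, `row_means` and `desc_means` are just illustrative) and computes the column means both ways:

```python
import pandas as pd

# Values copied from the DataFrame printed in the question.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=['art', 'mcf', 'mesa'],
    columns=['pol1', 'pol2', 'pol3', 'pol4'],
)

col_means = df.mean()                   # default axis=0: one mean per column (policy)
row_means = df.mean(axis=1)             # axis=1: one mean per row (benchmark)
desc_means = df.describe().loc['mean']  # the 'mean' row of the summary table

print(col_means)
```

For this question the default axis is the one that matters: it yields one average per policy, taken across the benchmarks.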
mean()['weight']
# Create a plot as the variable "ax"
ax = data.plot(kind='bar', title="Mean weight by plot", figsize=(10,4))
# Set axis labels for the "ax" plot
ax.set(xlabel='Plot ID', ylabel='Mean weight (g)');
You can simply use the instance method mean of the DataFrame and then plot the results. There is no need for transposition.
In [14]: df.mean()
Out[14]:
pol1 0.578502
pol2 0.393610
pol3 0.634424
pol4 0.607450
In [15]: df.mean().plot(kind='bar')
Out[15]: <matplotlib.axes.AxesSubplot at 0x4a327d0>
If you want to plot the bars of all columns and the mean you can append
the mean:
In [95]: average = df.mean()
In [96]: average.name = 'average'
In [97]: df = df.append(average)
In [98]: df
Out[98]:
             pol1      pol2      pol3      pol4
art      0.661592  0.479202  0.700451  0.345085
mcf      0.235517  0.665981  0.778774  0.610344
mesa     0.838396  0.035648  0.424047  0.866920
average  0.578502  0.393610  0.634424  0.607450
In [99]: df.plot(kind='bar')
Out[99]: <matplotlib.axes.AxesSubplot at 0x52f4390>
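One caveat for readers on current pandas: DataFrame.append was deprecated in pandas 1.4 and removed in 2.0, so the session above only runs on older versions. A minimal sketch of the same idea without append, assuming the DataFrame from the question (the names `with_avg` and `with_avg2` are just illustrative):

```python
import pandas as pd

# Values copied from the DataFrame printed in the question.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=['art', 'mcf', 'mesa'],
    columns=['pol1', 'pol2', 'pol3', 'pol4'],
)

# Option 1: add the row by label; .loc creates the 'average' row in place.
with_avg = df.copy()
with_avg.loc['average'] = df.mean()

# Option 2: name the Series, turn it into a one-row frame, and concatenate.
average = df.mean().rename('average')
with_avg2 = pd.concat([df, average.to_frame().T])

print(with_avg)
```

Either result can be passed to plot(kind='bar') just like the appended frame above; no transposition is involved.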
If your layout doesn't fit into the figure, tight_layout will adjust the matplotlib subplot parameters.
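For completeness, here is a small end-to-end sketch of the plot with tight_layout applied. It assumes the DataFrame from the question and selects the non-interactive Agg backend (my assumption, so that the script also runs without a display); the average row is added with .loc rather than append.

```python
import matplotlib
matplotlib.use('Agg')  # assumption: headless backend; drop this line for interactive use
import pandas as pd

# Values copied from the DataFrame printed in the question.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=['art', 'mcf', 'mesa'],
    columns=['pol1', 'pol2', 'pol3', 'pol4'],
)
df.loc['average'] = df.mean()  # benchmarks plus the per-policy average

ax = df.plot(kind='bar')       # one group of bars per row, one bar per policy
ax.figure.tight_layout()       # refit subplot parameters so tick labels are not clipped
```

Calling tight_layout on ax.figure avoids relying on pyplot's notion of the "current" figure, which is convenient when several figures are open.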