
pandas: generate and plot average

I have a pandas dataframe like:

In [61]: df = DataFrame(np.random.rand(3,4), index=['art','mcf','mesa'],
                        columns=['pol1','pol2','pol3','pol4'])

In [62]: df
Out[62]: 
          pol1      pol2      pol3      pol4
art   0.661592  0.479202  0.700451  0.345085
mcf   0.235517  0.665981  0.778774  0.610344
mesa  0.838396  0.035648  0.424047  0.866920

and I want to generate a row with the average for the policies across benchmarks and then plot it.

Currently, the way I do this is:

df = df.T
df['average'] = df.apply(average, axis=1)
df = df.T
df.plot(kind='bar')

Is there an elegant way to avoid the double transposition?

I tried:

df.append(DataFrame(df.apply(average)).T)
df.plot(kind='bar')

This will append the correct values but does not update the index properly and the graph is messed up.

A clarification: the result of the code with the double transposition is this:

[image: bar plot of the benchmarks and the average]

This is what I want: to show both the benchmarks and the average of the policies, not just the average. I was just curious whether I can do it better.

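Note that the legend is usually messed up. A possible fix:

ax = df.plot(kind='bar')
# Fetch the legend handles from the axes; 'patches' was undefined otherwise
patches, labels = ax.get_legend_handles_labels()
ax.legend(patches, list(df.columns), loc='best')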
asked Dec 15 '12 08:12 by vkontori



1 Answer

You can simply use the DataFrame's mean method and then plot the result. There is no need for transposition.

In [14]: df.mean()
Out[14]: 
pol1    0.578502
pol2    0.393610
pol3    0.634424
pol4    0.607450

In [15]: df.mean().plot(kind='bar')
Out[15]: <matplotlib.axes.AxesSubplot at 0x4a327d0>

[image: policies.png]
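
As a minimal sketch of why no transposition is needed: mean() averages down each column by default (axis=0), which here means across benchmarks for each policy. The frame below is rebuilt from the values printed in the question; the variable names per_policy and per_benchmark are only illustrative.

```python
import pandas as pd

# Values copied from the question's example output.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# axis=0 (the default) averages down each column: one value per policy.
per_policy = df.mean()

# axis=1 averages along each row: one value per benchmark.
per_benchmark = df.mean(axis=1)

print(per_policy)
print(per_benchmark)
```

Passing axis=1 gives the other direction (one average per benchmark), which is what the transposed apply in the question computes before transposing back.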

Update

If you want to plot the bars of all columns and the mean you can append the mean:

In [95]: average = df.mean()

In [96]: average.name = 'average'

In [97]: df = df.append(average)

In [98]: df
Out[98]: 
             pol1      pol2      pol3      pol4
art      0.661592  0.479202  0.700451  0.345085
mcf      0.235517  0.665981  0.778774  0.610344
mesa     0.838396  0.035648  0.424047  0.866920
average  0.578502  0.393610  0.634424  0.607450

In [99]: df.plot(kind='bar')
Out[99]: <matplotlib.axes.AxesSubplot at 0x52f4390>

[image: second plot]
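
Note that DataFrame.append was deprecated in pandas 1.4 and removed in pandas 2.0, so the append call above only works on older versions. A sketch of two equivalent ways to add the row on current pandas follows; the names with_avg and with_avg_concat are illustrative, and the data is rebuilt from the question's output.

```python
import pandas as pd

# Values copied from the question's example output.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)

# Option 1: assign the column means to a new row label with .loc.
# Working on a copy keeps the original frame unchanged.
with_avg = df.copy()
with_avg.loc["average"] = df.mean()

# Option 2: turn the named Series into a one-row frame and concatenate.
average = df.mean().rename("average")
with_avg_concat = pd.concat([df, average.to_frame().T])

print(with_avg)
```

Both options label the new row 'average' in the index, which is what the bar plot uses for the x-axis tick labels.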

If the plot does not fit into the figure (for example, the tick labels are cut off), matplotlib's tight_layout() will adjust the subplot parameters.
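
A short sketch of the full flow with tight_layout, assuming a script without a display (hence the Agg backend; drop that line in an interactive session). The data is rebuilt from the question's output and the row is added with .loc rather than append.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; not needed interactively
import pandas as pd

# Values copied from the question's example output.
df = pd.DataFrame(
    [[0.661592, 0.479202, 0.700451, 0.345085],
     [0.235517, 0.665981, 0.778774, 0.610344],
     [0.838396, 0.035648, 0.424047, 0.866920]],
    index=["art", "mcf", "mesa"],
    columns=["pol1", "pol2", "pol3", "pol4"],
)
df.loc["average"] = df.mean()

# One group of bars per row (benchmarks plus 'average'), one bar per policy.
ax = df.plot(kind="bar")

# Recompute the subplot margins so tick labels and legend fit the figure.
ax.figure.tight_layout()
```

From here, ax.figure.savefig(...) or matplotlib.pyplot.show() can be used to write or display the figure.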

answered Oct 21 '22 15:10 by bmu