
Plot a pandas dataframe grouped by column

I have the following pandas dataframe 'df':

---------------------------------------------------
             C1     C2     C3      C4      Type
---------------------------------------------------
    Name 
---------------------------------------------------
     x1       a1     b1      c1      d1     'A'
     x2       a2     b2      c2      d2     'A'
     x3       a3     b3      c3      d3     'B'
     x4       a4     b4      c4      d4     'B'
     x5       a5     b5      c5      d5     'A'
     x6       a6     b6      c6      d6     'B'
     x7       a7     b7      c7      d7     'B'
---------------------------------------------------

There are 6 columns in this dataframe: Name, C1, C2, C3, C4, and Type. I would like to generate two line plots (separate plots, not two lines on the same plot) from this dataframe, grouped by the 'Type' column. Basically, I want to plot the values of C1 against Name, grouped by Type. So I want (x1, a1), (x2, a2), (x5, a5) on one plot, and (x3, a3), (x4, a4), (x6, a6), (x7, a7) on the other.

Please note that Name is the index, which is why its header appears on a separate row from the other column headers.

I found a similar question on SO about plotting a boxplot here, so I tried adapting it for a line plot. I tried df.plot(column='C1', by='Type'), but it seems plot() has no 'column' argument.

Any ideas on how I can achieve my objective?

BajajG asked Nov 27 '15 05:11



1 Answer

You can add the column "Type" to the index, and unstack it so as to have the values of C1 split in two columns according to the value of Type, and then plot them, e.g.:

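import pandas
from numpy.random import randn

# Sample data: ten random values, each tagged with category A or B
df = pandas.DataFrame({'Values': randn(10), 'Categories': list('AABABBABAB')}, index=range(10))
# Move Categories into the index, unstack to one column per category, fill the gaps, plot one subplot per column
df.set_index('Categories', append=True).unstack().interpolate().plot(subplots=True)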

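Notice that a line plot needs the interpolate() call: after unstacking, each column holds NaN in the rows that belong to the other category, and a line plot leaves gaps at NaN values.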
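To see what unstacking does, it may help to inspect the intermediate frame on a tiny example. The sketch below uses made-up values (they are not from the question) and only prints the frames before and after interpolation.

```python
import pandas as pd

# Hypothetical toy data, chosen only to illustrate the NaN pattern
df = pd.DataFrame({'Values': [1.0, 2.0, 3.0, 4.0],
                   'Categories': list('ABAB')})

# One column per category; rows belonging to the other category become NaN
wide = df.set_index('Categories', append=True).unstack()
print(wide)

# Linear interpolation fills the gaps between known values,
# so each column can be drawn as a continuous line
filled = wide.interpolate()
print(filled)
```

With the default settings, a NaN that comes before the first known value in a column is left as NaN, so a line may still start a few rows late.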

Alternatively, you can select the data according to the value of "Type" (the "Categories" column in these examples) and plot each selection separately, e.g.:

import matplotlib.pyplot as plt

# One subplot per category, side by side
fig, axes = plt.subplots(ncols=2)
df[df.Categories=='A'].Values.plot(ax=axes[0])
df[df.Categories=='B'].Values.plot(ax=axes[1])
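The same selection can also be written with groupby, which avoids hard-coding the category values. This is a minimal sketch shaped like the frame in the question, assuming Name is the index and using placeholder numbers for C1, since the question does not give the actual values.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Placeholder numbers standing in for a1..a7; the real values are not given in the question
df = pd.DataFrame(
    {'C1': [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0],
     'Type': list('AABBABB')},
    index=pd.Index(['x1', 'x2', 'x3', 'x4', 'x5', 'x6', 'x7'], name='Name'),
)

groups = df.groupby('Type')
fig, axes = plt.subplots(ncols=groups.ngroups, figsize=(8, 3))

# One line plot per Type value, with Name on the x axis
for ax, (type_value, group) in zip(axes, groups):
    group['C1'].plot(ax=ax, title=f'Type {type_value}')

fig.tight_layout()
```

In an interactive session, drop the matplotlib.use('Agg') line and call plt.show() at the end. Because each group is plotted on its own axes, no interpolate() is needed here.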
faltarell answered Sep 18 '22 18:09
