Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Pandas Dataframe to Seaborn Grouped Barchart

I have the following dataframe which I have obtained from a larger dataframe which lists the worst 10 "Benchmark Returns" and their corresponding portfolio returns and dates:

enter image description here

I've managed to create a Seaborn bar plot which lists Benchmark Returns against their corresponding dates with this script:

import pandas as pd
import seaborn as sns

df = pd.read_csv('L:\\My Documents\\Desktop\\Data NEW.csv', parse_dates = True)

df = df.nsmallest(10, columns = 'Benchmark Returns')
df = df[['Date', 'Benchmark Returns', 'Portfolio Returns']]
p6 = sns.barplot(x = 'Date', y = 'Benchmark Returns', data = df)
p6.set(ylabel = 'Return (%)')
for x_ticks in p6.get_xticklabels():
    x_ticks.set_rotation(90)

And it produces this plot:

enter image description here

However, what I'd like is a grouped bar plot that contains both Benchmark Returns and Portfolio Returns, where two different colours are used to distinguish between these two categories.

I've tried several different methods but nothing seems to work.

Thanks in advance for all your help!

like image 663
Vladimir Nabokov Avatar asked Jul 27 '16 09:07

Vladimir Nabokov


People also ask

How do I make a stacked bar chart in Seaborn?

A stacked Bar plot is a kind of bar graph in which each bar is visually divided into sub bars to represent multiple column data at once. To plot the Stacked Bar plot we need to specify stacked=True in the plot method. We can also pass the list of colors as we needed to color each sub bar in a bar.

How do you plot multiple columns in Seaborn?

In Seaborn, we will plot multiple graphs in a single window in two ways. First with the help of Facetgrid() function and other by implicit with the help of matplotlib. data: Tidy dataframe where each column is a variable and each row is an observation.

How do you display values in Seaborn barplot?

In seaborn barplot with bar, values can be plotted using sns. barplot() function and the sub-method containers returned by sns. barplot(). Import pandas, numpy, and seaborn packages.


1 Answers

The trick is to transform the pandas df from wide to long format

Step 1: Prepare data

import seaborn as sns

np.random.seed(123)
index = np.random.randint(1,100,10)

x1 = pd.date_range('2000-01-01','2015-01-01').map(lambda t: t.strftime('%Y-%m-%d'))
dts = np.random.choice(x1,10)

benchmark = np.random.randn(10)
portfolio = np.random.randn(10)

df = pd.DataFrame({'Index': index,
                   'Dates': dts,
                   'Benchmark': benchmark,
                   'Portfolio': portfolio},
                    columns = ['Index','Dates','Benchmark','Portfolio'])

Step 2: From "wide" to "long" format

df1 = pd.melt(df, id_vars=['Index','Dates']).sort_values(['variable','value'])
df1

    Index   Dates   variable    value
9   48  2012-06-13  Benchmark   -1.410301
1   93  2002-07-31  Benchmark   -1.301489
8   97  2005-01-21  Benchmark   -1.100985
0   67  2011-06-01  Benchmark   0.126526
4   84  2003-09-25  Benchmark   0.465645
3   18  2009-07-13  Benchmark   0.522742
5   58  2007-12-04  Benchmark   0.724915
7   98  2002-12-28  Benchmark   0.746581
6   87  2009-02-07  Benchmark   1.495827
2   99  2000-04-21  Benchmark   2.207427
16  87  2009-02-07  Portfolio   -2.750224
14  84  2003-09-25  Portfolio   -1.855637
15  58  2007-12-04  Portfolio   -1.779455
19  48  2012-06-13  Portfolio   -1.774134
11  93  2002-07-31  Portfolio   -0.984868
12  99  2000-04-21  Portfolio   -0.748569
10  67  2011-06-01  Portfolio   -0.747651
18  97  2005-01-21  Portfolio   -0.695981
17  98  2002-12-28  Portfolio   -0.234158
13  18  2009-07-13  Portfolio   0.240367

Step 3: Plot

sns.barplot(x='Dates', y='value', hue='variable', data=df1)
plt.xticks(rotation=90)
plt.ylabel('Returns')
plt.title('Portfolio vs Benchmark Returns');

enter image description here

like image 51
Sergey Bushmanov Avatar answered Oct 13 '22 01:10

Sergey Bushmanov