
Seaborn barplot with two y-axis

Consider the following pandas DataFrame:

          labels  values_a  values_b  values_x  values_y
  0       date1      1         3        150       170
  1       date2      2         6        200       180

It is easy to plot this with Seaborn (see the example code below). However, because values_x/values_y are so much larger than values_a/values_b, the bars for values_a and values_b are barely visible (the dataset above is just a sample; in my real dataset the difference is even bigger). Therefore, I would like to use two y-axes: one for values_a/values_b and one for values_x/values_y. I tried plt.twinx() to get a second axis, but the plot shows only the two bars for values_x and values_y per label, even though both y-axes appear with the right scaling. Do you have an idea how to fix this and get four bars per label, with the values_a/values_b bars relating to the left y-axis and the values_x/values_y bars relating to the right y-axis?

Thanks in advance!

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
                                       ("date2", 2, 6, 200, 180)],\
                                       columns=columns)

# working example but with unreadable values_a and values_b
test_data_melted = pd.melt(test_data, id_vars=columns[0],\
                           var_name="source", value_name="value_numbers")
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
                data=test_data_melted)
plt.show()

# values_a and values_b are not displayed
values1_melted = pd.melt(test_data, id_vars=columns[0],\
                         value_vars=["values_a", "values_b"],\
                         var_name="source1", value_name="value_numbers1")
values2_melted = pd.melt(test_data, id_vars=columns[0],\
                         value_vars=["values_x", "values_y"],\
                         var_name="source2", value_name="value_numbers2")
g1 = sns.barplot(x=columns[0], y="value_numbers1", hue="source1",\
                 data=values1_melted)
ax2 = plt.twinx()
g2 = sns.barplot(x=columns[0], y="value_numbers2", hue="source2",\
                 data=values2_melted, ax=ax2)
plt.show()

[Images: the working example plot; the twinx plot with the values_a/values_b bars missing]

Flo1895 asked Aug 16 '18 17:08




1 Answer

This is probably best suited for multiple sub-plots, but if you are truly set on a single plot, you can scale the data before plotting, create another axis and then modify the tick values.
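For the sub-plot route, a minimal sketch could look like the following. It is not taken from the code further down; the helper name plot_two_panels and the side-by-side layout are assumptions made for illustration. Each group of columns gets its own panel, so each panel picks its own y-axis range.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records(
    [("date1", 1, 3, 150, 170), ("date2", 2, 6, 200, 180)],
    columns=columns,
)


def plot_two_panels(data):
    """Draw the small and the large columns in separate panels (hypothetical helper)."""
    groups = [["values_a", "values_b"], ["values_x", "values_y"]]
    fig, axes = plt.subplots(1, 2, figsize=(10, 4))
    for ax, value_vars in zip(axes, groups):
        # Melt only the columns that belong to this panel
        melted = pd.melt(data, id_vars="labels", value_vars=value_vars,
                         var_name="source", value_name="value_numbers")
        sns.barplot(x="labels", y="value_numbers", hue="source",
                    data=melted, ax=ax)
        ax.set_ylabel(" / ".join(value_vars))
    fig.tight_layout()
    return fig, axes


fig, axes = plot_two_panels(test_data)
plt.show()
```

The trade-off is that the four bars per label no longer sit next to each other, which is why the single-plot approach below may still be preferable.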

Sample Data

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import numpy as np

columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records([("date1", 1, 3, 150, 170),\
                                       ("date2", 2, 6, 200, 180)],\
                                       columns=columns)

test_data_melted = pd.melt(test_data, id_vars=columns[0],\
                           var_name="source", value_name="value_numbers")

Code:

# Scale the data, just a simple example of how you might determine the scaling
mask = test_data_melted.source.isin(['values_a', 'values_b'])
scale = int(test_data_melted[~mask].value_numbers.mean()
            /test_data_melted[mask].value_numbers.mean())
test_data_melted.loc[mask, 'value_numbers'] = test_data_melted.loc[mask, 'value_numbers']*scale

# Plot
fig, ax1 = plt.subplots()
g = sns.barplot(x=columns[0], y="value_numbers", hue="source",\
                data=test_data_melted, ax=ax1)

# Create a second y-axis with the scaled ticks
ax1.set_ylabel('X and Y')
ax2 = ax1.twinx()

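# Pin the ticks to the same positions as on ax1 (avoids a tick/label
# mismatch and the matplotlib warning about set_yticklabels without
# fixed ticks), then relabel them in A/B units
ax2.set_yticks(ax1.get_yticks())
ax2.set_ylim(ax1.get_ylim())
ax2.set_yticklabels(np.round(ax1.get_yticks()/scale,1))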
ax2.set_ylabel('A and B')

plt.show()
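If your matplotlib version provides Axes.secondary_yaxis (added in matplotlib 3.1, as far as I know), the manual tick relabelling can probably be replaced by a derived axis. The sketch below is a variation on the code above, not the same code: it assumes a float scale factor rather than the truncated integer, and it passes a forward/inverse function pair so that the right-hand axis is computed from the left-hand one.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

columns = ["labels", "values_a", "values_b", "values_x", "values_y"]
test_data = pd.DataFrame.from_records(
    [("date1", 1, 3, 150, 170), ("date2", 2, 6, 200, 180)],
    columns=columns,
)
melted = pd.melt(test_data, id_vars="labels",
                 var_name="source", value_name="value_numbers")

# Scale the small columns up so they are visible next to the large ones
mask = melted["source"].isin(["values_a", "values_b"])
scale = (melted.loc[~mask, "value_numbers"].mean()
         / melted.loc[mask, "value_numbers"].mean())
melted["value_numbers"] = melted["value_numbers"].astype(float)
melted.loc[mask, "value_numbers"] *= scale

fig, ax1 = plt.subplots()
sns.barplot(x="labels", y="value_numbers", hue="source", data=melted, ax=ax1)
ax1.set_ylabel("X and Y")

# Right-hand axis: left-axis value -> A/B units, and the inverse mapping
ax2 = ax1.secondary_yaxis(
    "right", functions=(lambda v: v / scale, lambda v: v * scale)
)
ax2.set_ylabel("A and B")

plt.show()
```

Because the right-hand axis is derived from the left one, its ticks should stay consistent if the limits of ax1 change later, which the fixed tick labels above do not do.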


ALollz answered Sep 28 '22 00:09