I'm experiencing strange behavior when overlaying two plots in seaborn. The bar plot appears to work fine, but the regplot appears to be off by one. Note the lack of a regplot data point for x=1, and compare the x=2 value with the value in the table below: it's clearly off by one.

My pandas DataFrame looks like this:
   Threshold per Day  # Alarms  Percent Reduction
0                  1       791              96.72
1                  2       539              93.90
2                  3       439              91.94
3                  4       361              89.82
4                  5       317              88.26
5                  6       263              85.94
6                  7       233              84.41
7                  8       205              82.78
8                  9       196              82.17
9                 10       176              80.66
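For anyone who wants to reproduce this, here is a minimal sketch that rebuilds the table above as a DataFrame. The name results_df is the one used in the plotting code; the construction itself is an assumption, since the post does not show how the frame was originally created.

```python
import pandas as pd

# Values copied from the table above; how the original frame was built
# is not shown in the post, so this constructor call is an assumption.
results_df = pd.DataFrame({
    "Threshold per Day": list(range(1, 11)),
    "# Alarms": [791, 539, 439, 361, 317, 263, 233, 205, 196, 176],
    "Percent Reduction": [96.72, 93.90, 91.94, 89.82, 88.26,
                          85.94, 84.41, 82.78, 82.17, 80.66],
})

print(results_df)
```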
The code I'm using here is:
%matplotlib inline
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt
fig, ax = plt.subplots()
ax2 = ax.twinx()
sns.barplot(x='Threshold per Day', y="# Alarms", data=results_df, ax=ax, color='lightblue')
sns.regplot(x='Threshold per Day', y='Percent Reduction', data=results_df, marker='x', fit_reg=False, ax=ax2)
Any ideas what's going on or how to fix it?
Caveat: this only offers a possible fix; I don't know why this is happening in seaborn (but see the edit and comment below).
If you're looking just to get a decent plot in the meantime, I would recommend just switching to pure matplotlib, at least just for this plot and any others with similarly strange behaviour. You can get a very similar plot with the following code:
fig, ax = plt.subplots(1,1, sharex=True)
ax2 = ax.twinx()
ax.bar(results_df['Threshold per Day'], results_df['# Alarms'], color='lightblue')
ax2.scatter(results_df['Threshold per Day'], results_df['Percent Reduction'], marker='x')
ax.set_ylabel('# of Alarms')
ax2.set_ylabel('Percent Reduction')
ax.set_xlabel('Threshold Per Day')
plt.xticks(range(1,11))
plt.show()

Edit to take into account ImportanceOfBeingErnest's comment:
You can obtain this plot in seaborn using:
import numpy as np

fig, ax = plt.subplots()
ax2 = ax.twinx()
sns.barplot(x=results_df['Threshold per Day'],
            y=results_df["# Alarms"], ax=ax, color='lightblue')
# barplot draws its bars at positions 0..n-1, so plot the points there too
sns.regplot(x=np.arange(0, len(results_df)),
            y=results_df['Percent Reduction'], marker='x',
            fit_reg=False, ax=ax2)
plt.show()
It turns out that matplotlib's bar treats the x values as numbers when it can, whereas seaborn's barplot treats them as categories (strings) and places the bars at positions 0, 1, 2, and so on by default. Since your regplot points are evenly spaced on the x axis, you can force them onto the same positions by passing a range from 0 to the length of your DataFrame, as above.
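To see where the off-by-one comes from, here is a small sketch in plain Python (no plotting) that lists where each threshold ends up on the two kinds of axis. The variable names are made up for illustration; only the threshold values come from the question.

```python
# Values of the 'Threshold per Day' column from the question.
thresholds = list(range(1, 11))

# A categorical axis (what seaborn's barplot uses) puts the k-th category
# at position k, counting from 0, regardless of what its label says.
categorical_positions = list(range(len(thresholds)))

# A numeric axis (what regplot uses) puts each point at its own value.
numeric_positions = thresholds

# Difference between where the marker lands and where its bar sits.
shifts = [num - cat for num, cat in zip(numeric_positions, categorical_positions)]

for label, cat, num in zip(thresholds, categorical_positions, numeric_positions):
    print(f"threshold={label}: bar at x={cat}, marker at x={num}")
```

Every marker sits one position to the right of its bar, so the bar labelled 1 (at position 0) has no marker above it, which matches what the question describes.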