How to plot multiple linear regressions in the same figure

Tags: python, pandas, matplotlib, plot, seaborn

Given the following:

import numpy as np
import pandas as pd
import seaborn as sns

np.random.seed(365)
x1 = np.random.randn(50)
y1 = np.random.randn(50) * 100
x2 = np.random.randn(50)
y2 = np.random.randn(50) * 100

df1 = pd.DataFrame({'x1':x1, 'y1': y1})
df2 = pd.DataFrame({'x2':x2, 'y2': y2})

sns.lmplot(x='x1', y='y1', data=df1, fit_reg=True, ci=None)
sns.lmplot(x='x2', y='y2', data=df2, fit_reg=True, ci=None)

This will create 2 separate plots. How can I add the data from df2 onto the SAME graph? All the seaborn examples I have found online seem to focus on how you can create adjacent graphs (say, via the 'hue' and 'col_wrap' options). Also, I prefer not to use the dataset examples where an additional column might be present as this does not have a natural meaning in the project I am working on.

If there is a mixture of matplotlib/seaborn functions that are required to achieve this, I would be grateful if someone could help illustrate.

asked Mar 16 '16 03:03

laszlopanaflex

2 Answers

Option 1: `sns.regplot`

In this case, the easiest solution to implement is sns.regplot, an axes-level function, because it does not require combining df1 and df2.

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# create the figure and axes
fig, ax = plt.subplots(figsize=(6, 6))

# add the plots for each dataframe
sns.regplot(x='x1', y='y1', data=df1, fit_reg=True, ci=None, ax=ax, label='df1')
sns.regplot(x='x2', y='y2', data=df2, fit_reg=True, ci=None, ax=ax, label='df2')
ax.set(ylabel='y', xlabel='x')
ax.legend()
plt.show()
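
With more than two DataFrames, the same axes-level approach can be written as a loop. The following is only a sketch under assumptions: the data is randomly generated in a way similar to the question, df3 and the frames dictionary are hypothetical names introduced for illustration, and each DataFrame is assumed to hold exactly two columns (x first, then y).

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# sample data similar to the question; df3 is a hypothetical extra dataframe
rng = np.random.default_rng(365)
frames = {
    'df1': pd.DataFrame({'x1': rng.standard_normal(50), 'y1': rng.standard_normal(50) * 100}),
    'df2': pd.DataFrame({'x2': rng.standard_normal(50), 'y2': rng.standard_normal(50) * 100}),
    'df3': pd.DataFrame({'x3': rng.standard_normal(50), 'y3': rng.standard_normal(50) * 100}),
}

# one figure and one axes shared by every regression
fig, ax = plt.subplots(figsize=(6, 6))

for name, frame in frames.items():
    # assumption: each dataframe has exactly two columns, x then y
    x_col, y_col = frame.columns
    sns.regplot(x=x_col, y=y_col, data=frame, ci=None, ax=ax, label=name)

# regplot labels the axes with the last column names, so set common labels
ax.set(xlabel='x', ylabel='y')
ax.legend()
```

Outside a notebook, call plt.show() afterwards to display the figure. Because every call receives the same ax, no DataFrame has to be renamed or combined.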

Option 2: `sns.lmplot`

The sns.FacetGrid documentation recommends using a figure-level function, such as sns.lmplot, rather than using FacetGrid directly.
Combine df1 and df2 into a long format, and then use sns.lmplot with the hue parameter.
When working with seaborn, the data almost always needs to be in a long format.
- It's customary to use pandas.DataFrame.stack or pandas.melt to convert DataFrames from wide to long.
- For this reason, the columns of df1 and df2 must be renamed to common names, and each DataFrame needs an additional identifying column. This allows them to be concatenated on axis=0 (the default, which gives a long format) instead of axis=1 (which gives a wide format).
There are a number of ways to combine the DataFrames:
1. The combination method in the answer from Primer is fine when combining a few DataFrames.
2. However, a function, as shown below, is better for combining many DataFrames.

def fix_df(data: pd.DataFrame, name: str) -> pd.DataFrame:
    """return a copy with renamed columns and an identifying column"""
    # work on a copy so the original dataframe is not modified
    data = data.copy()
    # rename columns to a common name
    data.columns = ['x', 'y']
    # add an identifying value to use with hue
    data['df'] = name
    return data


# create a list of the dataframes
df_list = [df1, df2]

# update the dataframes by calling the function in a list comprehension
df_update_list = [fix_df(v, f'df{i}') for i, v in enumerate(df_list, 1)]

# combine the dataframes
df = pd.concat(df_update_list).reset_index(drop=True)

# plot the dataframe
sns.lmplot(data=df, x='x', y='y', hue='df', ci=None)
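
As an alternative sketch of the same wide-to-long combination, pd.concat can build the identifying column itself when it is given a dictionary of DataFrames. The names frames, long_df, and the index level name 'row' are assumptions introduced for illustration, and the data is randomly generated in a way similar to the question.

```python
import numpy as np
import pandas as pd

# sample data similar to the question
rng = np.random.default_rng(365)
df1 = pd.DataFrame({'x1': rng.standard_normal(50), 'y1': rng.standard_normal(50) * 100})
df2 = pd.DataFrame({'x2': rng.standard_normal(50), 'y2': rng.standard_normal(50) * 100})

# rename to common column names; rename returns new dataframes,
# so df1 and df2 are left unchanged
frames = {
    'df1': df1.rename(columns={'x1': 'x', 'y1': 'y'}),
    'df2': df2.rename(columns={'x2': 'x', 'y2': 'y'}),
}

# the dictionary keys become the outer index level, named 'df' here;
# moving that level into a column gives the identifier to use with hue
long_df = (
    pd.concat(frames, names=['df', 'row'])
    .reset_index(level='df')
    .reset_index(drop=True)
)

print(long_df.head())
```

The resulting long_df has the columns df, x, and y, so it can be passed to sns.lmplot(data=long_df, x='x', y='y', hue='df', ci=None) in the same way as the combined DataFrame above.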

Notes

Package versions used for this answer:
- pandas v1.2.4
- seaborn v0.11.1
- matplotlib v3.3.4

answered Oct 27 '22 06:10

Trenton McKinney

You could use seaborn's FacetGrid class to get the desired result. You would need to replace your plotting calls with these lines:

import matplotlib.pyplot as plt

# sns.lmplot('x1', 'y1', df1, fit_reg=True, ci = None)
# sns.lmplot('x2', 'y2', df2, fit_reg=True, ci = None)

# rename the columns to common names, add an identifying column, and stack the rows
df = pd.concat([df1.rename(columns={'x1':'x','y1':'y'})
                .join(pd.Series(['df1']*len(df1), name='df')), 
                df2.rename(columns={'x2':'x','y2':'y'})
                .join(pd.Series(['df2']*len(df2), name='df'))],
               ignore_index=True)

pal = dict(df1="red", df2="blue")
# height replaces the size parameter, which was renamed in newer seaborn versions
g = sns.FacetGrid(df, hue='df', palette=pal, height=5);
g.map(plt.scatter, "x", "y", s=50, alpha=.7, linewidth=.5, edgecolor="white")
# robust=1 requires the statsmodels package
g.map(sns.regplot, "x", "y", ci=None, robust=1)
g.add_legend();

This yields a single plot containing both regressions, which, if I understand correctly, is what you need.

Note that you will need to pay attention to the regplot parameters and may want to change the values I have used as an example.

- The ; at the end of a line suppresses the output of the command (I use an IPython notebook, where it would otherwise be visible).
- The docs give some explanation of the .map() method. In essence, it does just that: it maps a plotting command onto the data. However, it works with 'low-level' plotting commands like regplot, and not with lmplot, which itself calls regplot behind the scenes.
- Normally plt.scatter would take the parameters c='none', edgecolor='r' to make non-filled markers. However, seaborn interferes with this and enforces a color on the markers, so I don't see an easy, straightforward way to fix it other than manipulating the ax elements after seaborn has produced the plot, which is best addressed as a separate question.

answered Oct 27 '22 08:10

Primer
