Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

how to add regression line and regression line equation on graph

I have the below input file and the code/script to add the regression line on the graph but the code gives this error:ValueError: x and y must have same first dimension. I couldn't figure out that error.

How can I add the regression line and regression line equation on graph?

Input file:

-5.06   -4.27
-6.69   -7.28
-3.80   -3.51
-3.88   -2.79
-0.90   -0.81
 2.10    2.59
-1.08    0.28
-5.00   -3.39
 2.67    2.92
 2.48    2.85
-5.10   -3.49
 2.88    3.09
 2.30    2.67
-3.47   -2.20
-0.90   -0.79

Script:

#!/usr/bin/python
import numpy as np
import pylab as plot
import matplotlib.pyplot as plt
import numpy, scipy, pylab, random
from matplotlib.ticker import MultipleLocator
import matplotlib as mpl
from matplotlib.ticker import MaxNLocator
from scipy import stats

with open("input.txt", "r") as f:
    x=[]
    y=[]

    for line in f:
        if not line.strip() or line.startswith('@') or line.startswith('#'): continue
        row = line.split()
        x.append(float(row[0]))
        y.append(float(row[1]))

fig = plt.figure(figsize=(2.2,2.2), dpi=300)
ax = plt.subplot(111)

plt.xlim(4, -8)
plt.ylim(4, -8)

ax.xaxis.set_major_locator(MaxNLocator(6))
ax.yaxis.set_major_locator(MaxNLocator(6))

ax.xaxis.set_minor_locator(MultipleLocator(1))
ax.yaxis.set_minor_locator(MultipleLocator(1))


#regression part
slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)
line = slope*x+intercept
plt.plot(x, line, 'r', label='fitted line')
#end

plt.scatter(x,y,color=['black','black','black','black','black','black','black','black','black','black','black','black','black','black','black'], s=3.5)


plt.savefig("output.png", dpi=300)
like image 346
qasim Avatar asked Jan 21 '18 19:01

qasim


People also ask

How do you add a regression line to a graph?

To add a regression line, choose "Add Chart Element" from the "Chart Design" menu. In the dialog box, select "Trendline" and then "Linear Trendline". To add the R2 value, select "More Trendline Options" from the "Trendline menu. Lastly, select "Display R-squared value on chart".

How do you plot a regression line and determine the regression equation?

The Linear Regression Equation The equation has the form Y= a + bX, where Y is the dependent variable (that's the variable that goes on the Y axis), X is the independent variable (i.e. it is plotted on the X axis), b is the slope of the line and a is the y-intercept.

How do you write the equation of a regression line?

A linear regression line has an equation of the form Y = a + bX, where X is the explanatory variable and Y is the dependent variable. The slope of the line is b, and a is the intercept (the value of y when x = 0).


1 Answers

You cannot multiply a list with a float number. So you may create a numpy array from the input list x,

line = slope*np.array(x)+intercept

More ideally you would read in your data using np.genfromtxt,

x,y = np.genfromtxt("input.txt", unpack=True) 

Complete example:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MultipleLocator
from matplotlib.ticker import MaxNLocator
from scipy import stats

x,y = np.genfromtxt("input.txt", unpack=True) 

fig = plt.figure(figsize=(2.2,2.2), dpi=300)
ax = plt.subplot(111)

plt.xlim(4, -8)
plt.ylim(4, -8)

ax.xaxis.set_major_locator(MaxNLocator(6))
ax.yaxis.set_major_locator(MaxNLocator(6))

ax.xaxis.set_minor_locator(MultipleLocator(1))
ax.yaxis.set_minor_locator(MultipleLocator(1))


#regression part
slope, intercept, r_value, p_value, std_err = stats.linregress(x,y)

line = slope*x+intercept
plt.plot(x, line, 'r', label='y={:.2f}x+{:.2f}'.format(slope,intercept))
#end

plt.scatter(x,y, color="k", s=3.5)
plt.legend(fontsize=9)

plt.show()

enter image description here

like image 81
ImportanceOfBeingErnest Avatar answered Nov 14 '22 22:11

ImportanceOfBeingErnest