Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Least square method in python?

I have these values:

T_values = (222, 284, 308.5, 333, 358, 411, 477, 518, 880, 1080, 1259) (x values)
C/(3Nk)_values = (0.1282, 0.2308, 0.2650, 0.3120 , 0.3547, 0.4530, 0.5556, 0.6154, 0.8932, 0.9103, 0.9316) (y values)

I know they follow the model:

C/(3Nk)=(h*w/(k*T))**2*(exp(h*w/(k*T)))/(exp(h*w/(k*T)-1))**2

I also know that k=1.38*10**(-23) and h=6.626*10**(-34). I have to find the w that best describes the measurement data. I'd like to solve this using the least square method in python, however I don't really understand how this works. Can anyone help me?

like image 205
Philipp Avatar asked Apr 25 '17 17:04

Philipp


People also ask

What is least squares linear regression in Python?

Least Squares Linear Regression In Python | by Cory Maklin | Towards Data Science As the name implies, the method of Least Squares minimizes the sum of the squares of the residuals between the observed targets in the… Open in app Home Notifications Lists Stories Write Published in Towards Data Science Cory Maklin Follow Aug 16, 2019 6 min read

What is the least squares method?

The least-squares method is one of the most effective ways used to draw the line of best fit. It is based on the idea that the square of the errors obtained must be minimized to the most possible extent and hence the name least squares method.

How do you do a least squares regression on artificial data?

Consider the artificial data created by x = np.linspace (0, 1, 101) and y = 1 + x + x * np.random.random (len (x)). Do a least squares regression with an estimation function defined by y ^ = α 1 x + α 2. Plot the data points along with the least squares regression. Note that we expect α 1 = 1.5 and α 2 = 1.0 based on this data.

How does SciPy find the solution of a linear least-squares problem?

It uses the iterative procedure scipy.sparse.linalg.lsmr for finding a solution of a linear least-squares problem and only requires matrix-vector product evaluations. If None (default), the solver is chosen based on the type of Jacobian returned on the first iteration. Keyword options passed to trust-region solver.


2 Answers

This answer provides a walk-through on using Python to determine fitting parameters for a general exponential pattern. See also a related posts on linearization techniques and using the lmfit library.

Data Cleaning

First, let's input and organize the sampling data as numpy arrays, which will later help with computation and clarity.

import matplotlib.pyplot as plt
import scipy.optimize as opt
import numpy as np


#% matplotlib inline

# DATA ------------------------------------------------------------------------
T_values = np.array([222, 284, 308.5, 333, 358, 411, 477, 518, 880, 1080, 1259])
C_values = np.array([0.1282, 0.2308, 0.2650, 0.3120 , 0.3547, 0.4530, 0.5556, 0.6154, 0.8932, 0.9103, 0.9316])

x_samp = T_values
y_samp = C_values    

There are many curve fitting functions in scipy and numpy and each is used differently, e.g. scipy.optimize.leastsq and scipy.optimize.least_squares. For simplicity, we will use scipy.optimize.curve_fit, but it is difficult to find an optimized regression curve without selecting reasonable starting parameters. A simple technique will later be demonstrated on selecting starting parameters.

Review

First, although the OP provided an expected fitting equation, we will approach the problem of using Python to curve fit by reviewing the general equation for an exponential function:

enter image description here

Now we build this general function, which will be used a few times:

# GENERAL EQUATION ------------------------------------------------------------
def func(x, A, c, d):
    return A*np.exp(c*x) + d

Trends

  • amplitude: a small A gives a small amplitude
  • shape: a small c controls the shape by flattening the "knee" of the curve
  • position: d sets the y-intercept
  • orientation: a negative A flips the curve across a horizontal axis; a negative c flips the curve across a vertical axis

The latter trends are illustrated below, highlighting the control (black line) compared to a line with a varied parameter (red line):

enter image description here

enter image description here

Selecting Initial Parameters

Using the latter trends, let us next look at the data and try to emulate the curve by adjusting these parameters. For demonstration, we plot several trial equations against our data:

# SURVEY ----------------------------------------------------------------------
# Plotting Sampling Data
plt.plot(x_samp, y_samp, "ko", label="Data")

x_lin = np.linspace(0, x_samp.max(), 50)                   # a number line, 50 evenly spaced digits between 0 and max

# Trials
A, c, d = -1, -1e-2, 1
y_trial1 = func(x_lin,  A,     c, d)
y_trial2 = func(x_lin, -1, -1e-3, 1)
y_trial3 = func(x_lin, -1, -3e-3, 1)

plt.plot(x_lin, y_trial1, "--", label="Trial 1")
plt.plot(x_lin, y_trial2, "--", label="Trial 2")
plt.plot(x_lin, y_trial3, "--", label="Trial 3")
plt.legend()

enter image description here

From simple trial and error, we can approximate the shape, amplitude, position and orientation of the curve better. For instance, we know the first two parameters (A and c) must be negative. We also have a reasonable guess for the order of magnitude for c.

Computing Estimated Parameters

We will now use the parameters of the best trial for our initial guesses:

# REGRESSION ------------------------------------------------------------------
p0 = [-1, -3e-3, 1]                                        # guessed params
w, _ = opt.curve_fit(func, x_samp, y_samp, p0=p0)     
print("Estimated Parameters", w)  

# Model
y_model = func(x_lin, *w)

# PLOT ------------------------------------------------------------------------
# Visualize data and fitted curves
plt.plot(x_samp, y_samp, "ko", label="Data")
plt.plot(x_lin, y_model, "k--", label="Fit")
plt.title("Least squares regression")
plt.legend(loc="upper left")

# Estimated Parameters [-1.66301087 -0.0026884   1.00995394]

enter image description here

How Does this Work?

curve_fit is one of many optimization functions offered by scipy. Given an initial value, the resulting estimated parameters are iteratively refined so that the resulting curve minimizes the residual error, or difference between the fitted line and sampling data. A better guess reduces the number of iterations and speeds up the result. With these estimated parameters for the fitted curve, one can now calculate the specific coefficients for a particular equation (a final exercise left to the OP).

like image 70
pylang Avatar answered Oct 17 '22 12:10

pylang


You want to use scipy:

import scipy.optimize.curve_fit

def my_model(T,w):
    return (hw/(kT))**2*(exp(hw/(kT)))/(exp(hw/(kT)-1))**2
w= 0 #initial guess
popt, pcov = curve_fit(my_model, T_values, C_values,p0=[w])      
like image 23
Mohammad Athar Avatar answered Oct 17 '22 12:10

Mohammad Athar