Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python: two-curve gaussian fitting with non-linear least-squares

My knowledge of maths is limited which is why I am probably stuck. I have a spectra to which I am trying to fit two Gaussian peaks. I can fit to the largest peak, but I cannot fit to the smallest peak. I understand that I need to sum the Gaussian function for the two peaks but I do not know where I have gone wrong. An image of my current output is shown:

Current Output

The blue line is my data and the green line is my current fit. There is a shoulder to the left of the main peak in my data which I am currently trying to fit, using the following code:

import matplotlib.pyplot as pt
import numpy as np
from scipy.optimize import leastsq
from pylab import *

time = []
counts = []


for i in open('/some/folder/to/file.txt', 'r'):
    segs = i.split()
    time.append(float(segs[0]))
    counts.append(segs[1])

time_array = arange(len(time), dtype=float)
counts_array = arange(len(counts))
time_array[0:] = time
counts_array[0:] = counts


def model(time_array0, coeffs0):
    a = coeffs0[0] + coeffs0[1] * np.exp( - ((time_array0-coeffs0[2])/coeffs0[3])**2 )
    b = coeffs0[4] + coeffs0[5] * np.exp( - ((time_array0-coeffs0[6])/coeffs0[7])**2 ) 
    c = a+b
    return c


def residuals(coeffs, counts_array, time_array):
    return counts_array - model(time_array, coeffs)

# 0 = baseline, 1 = amplitude, 2 = centre, 3 = width
peak1 = np.array([0,6337,16.2,4.47,0,2300,13.5,2], dtype=float)
#peak2 = np.array([0,2300,13.5,2], dtype=float)

x, flag = leastsq(residuals, peak1, args=(counts_array, time_array))
#z, flag = leastsq(residuals, peak2, args=(counts_array, time_array))

plt.plot(time_array, counts_array)
plt.plot(time_array, model(time_array, x), color = 'g') 
#plt.plot(time_array, model(time_array, z), color = 'r')
plt.show()
like image 539
Harpal Avatar asked Apr 13 '12 15:04

Harpal


1 Answers

This code worked for me providing that you are only fitting a function that is a combination of two Gaussian distributions.

I just made a residuals function that adds two Gaussian functions and then subtracts them from the real data.

The parameters (p) that I passed to Numpy's least squares function include: the mean of the first Gaussian function (m), the difference in the mean from the first and second Gaussian functions (dm, i.e. the horizontal shift), the standard deviation of the first (sd1), and the standard deviation of the second (sd2).

import numpy as np
from scipy.optimize import leastsq
import matplotlib.pyplot as plt

######################################
# Setting up test data
def norm(x, mean, sd):
  norm = []
  for i in range(x.size):
    norm += [1.0/(sd*np.sqrt(2*np.pi))*np.exp(-(x[i] - mean)**2/(2*sd**2))]
  return np.array(norm)

mean1, mean2 = 0, -2
std1, std2 = 0.5, 1 

x = np.linspace(-20, 20, 500)
y_real = norm(x, mean1, std1) + norm(x, mean2, std2)

######################################
# Solving
m, dm, sd1, sd2 = [5, 10, 1, 1]
p = [m, dm, sd1, sd2] # Initial guesses for leastsq
y_init = norm(x, m, sd1) + norm(x, m + dm, sd2) # For final comparison plot

def res(p, y, x):
  m, dm, sd1, sd2 = p
  m1 = m
  m2 = m1 + dm
  y_fit = norm(x, m1, sd1) + norm(x, m2, sd2)
  err = y - y_fit
  return err

plsq = leastsq(res, p, args = (y_real, x))

y_est = norm(x, plsq[0][0], plsq[0][2]) + norm(x, plsq[0][0] + plsq[0][1], plsq[0][3])

plt.plot(x, y_real, label='Real Data')
plt.plot(x, y_init, 'r.', label='Starting Guess')
plt.plot(x, y_est, 'g.', label='Fitted')
plt.legend()
plt.show()

Results of the code.

like image 83
Usagi Avatar answered Sep 27 '22 23:09

Usagi