Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Cosine with a Gaussian envelope fit failing in python using scipy.optimize.curve_fit

I'm trying to fit data that resembles a cosine function with a decaying Gaussian envelope using python, and in particular scipy.optimize.curve_fit. However, it seems like my code is not varying the parameters enough.

For example, in this data, the center of the interferogram is slightly negative (around -0.1), but even when I provide a guess that is very close, the optimized parameters are clearly off. The horizontal offset just stays very close to my initial guess even if it is clearly not an optimal point. I checked that by manually varying the horizontal offset and calculating the coefficient of determination and the values offered by curve_fit are not the most optimal.

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from scipy.optimize import curve_fit

id_ = '0' 

filename = '250516_FF_01_coincidences_'+id_+'mm.txt'


def FF_function(x,l,s,A,xoff):
    d = x - xoff 
    return A*np.cos(2*np.pi/l*d)*np.exp(-0.5*(d/s)**2)+A #

data = pd.read_csv(filename, sep=' ',header= None).to_numpy()

# data = data.astype(dtype=np.float32)

fig = plt.subplots(figsize=(10,5))
ax = plt.subplot(111)

ax.plot(data[:,0],data[:,1],alpha = 0.3)

guess = [4.40673558e-02,  2.63973377e-01,  1.63702697e+02, -0.2]

xx=np.linspace(min(data[:,0]),max(data[:,0]),1000)
# ax.plot(xx,FF_function(xx,*guess)

popt,cov = curve_fit(FF_function,data[:,0],data[:,1],p0 = guess)
ax.plot(xx,FF_function(xx,*popt), color ='C0',alpha = 0.5)

SSE = np.sum(np.square(data[:,1]-FF_function(data[:,0], *popt)))
SST = np.sum(np.square(data[:,1]-np.mean(data[:,1])))
r2 = 1 - SSE/SST
print(r2)


print(popt)

# 
ax.set_xlabel('Displacement (mm)')
ax.set_ylabel('Coincidences')

Example

I can try doing a manual minimization of the determination coefficient, but I'm curious about what is going wrong in my fit.

like image 393
QuantumExplorer Avatar asked Sep 15 '25 12:09

QuantumExplorer


1 Answers

Lets confirm the model can be fitted.

We import libraries and your dataset:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import optimize
data = pd.read_csv("250516_FF_01_coincidences_0mm.txt", sep=" ", header=None, names=["x", "y"])

We define the model:

def model(x, A, omega, sigma, x0):
    return A * (np.sin(2 * np.pi * omega * (x - x0)) * np.exp(-0.5 * np.power((x - x0) / sigma, 2.)) + 1.)

We apply curve fit with some raw guess:

popt, pcov = optimize.curve_fit(model, data['x'], data['y'], p0=[200, 30, 1, 0])
# [ 1.63791101e+02,  2.26881496e+01,  2.62294794e-01, -1.01404332e-02]

We can regress using the parameters:

yhat = model(data['x'], *popt)

Regression looks like:

enter image description here

Luckily the pulsation omega of the curve is found without any trouble. That's a good point because curve_fit is not good at fitting period function such sin.

But your problem mostly reside in the value of x0 which is a very sensitive parameters in this fit because it shapes multiple minima error landscape. Lets investigate how the Loss function varies wrt x0 while other parameters are kept constant:

def loss_factory(x, y, p):
    @np.vectorize
    def wrapped(x0):
        return 0.5 * np.sum(np.power((y - model(x, *p, x0)), 2.))
    return wrapped

loss = loss_factory(data['x'], data['y'], popt[:-1])

x0lin = np.linspace(-0.25, 0.25, 500)

L = loss(x0lin)

The result looks like:

enter image description here

You can see that there is multiple local minima for the Loss function that gives different offset (origin shift) to your regressed parameters depending on the initial guess. This is why you will need to refine the initial guess by hand or choose a global minimum optimizer.

Update

As suggested by @Reinderien, we may decouple x0 into two parameters phase offset phi and envelope mean mu. Such a model is defined as follow:

def model2(x, A, omega, sigma, phi, mu):
    return A * (np.sin(2 * np.pi * omega * x + phi) * np.exp(-0.5 * np.power((x - mu) / sigma, 2.)) + 1.)

Regression performs well provided a viable raw guess:

popt2, pcov2 = optimize.curve_fit(model2, data['x'], data['y'], p0=[160, 22, 0.1, 0, 0])
# array([ 1.63807815e+02,  2.26888064e+01,  2.62230722e-01,  1.44459161e+00, -3.42999984e-02])

yhat2 = model2(data['x'], *popt2)

enter image description here

The error landscape for the position parameters behaves way better:

def loss_factory_2(x, y, p):
    @np.vectorize
    def wrapped(phi, mu):
        return 0.5 * np.sum(np.power((y - model2(x, *p, phi, mu)), 2.))
    return wrapped

philin = np.linspace(-2 * np.pi, 2 * np.pi, 200)
mulin = np.linspace(-2, 2, 200)
Phi, Mu = np.meshgrid(philin, mulin)

loss2 = loss_factory_2(data['x'], data['y'], popt2[:-2])

L2 = loss2(Phi, Mu)

enter image description here

Parameter phi indeed exhibit a 2 pi periodicity, while mu does not.

like image 184
jlandercy Avatar answered Sep 17 '25 03:09

jlandercy