I have a noisy signal and a model function, for example:
x=linspace(0,20);
w=[2 6 -4 5];
y=w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
y(randi(length(y),[1 10]))=10*rand(1,10)-5;
plot(x,y,'x')
I thought of using RANSAC to find the weights w in my model, since that method is robust to noise when fitting lines. However, this is not a linear problem, and I couldn't get a proper fit, probably because of the oscillatory nature of the function I'm trying to fit.
I saw MATLAB has a fitPolynomialRansac function, but even this fails for a simple a+b*x+c*x^2+d*x^3 case (between -1 and 1).
Any ideas on how to tame RANSAC, or a different noise-robust approach?
If you mean a robust method for linear regression, you can just change the loss function from least squares to least absolute values, an M-estimator, or another robust alternative.
This is just implementing @mikuszefski's comment to use a soft L1 loss function. It does appear to be more resistant to noise:
x = linspace(0,20);
% model function
yFun = @(w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
% generate training data
N = 20; % number of noisy elements
w = [2 6 -4 5]; % true parameter values
y = yFun(w); % evaluate true model function
y(randi(length(y),[1 N])) = 10*rand(1,N)-5; % replace N random samples with outliers
% residual for the standard least-squares loss
d = @(w) yFun(w)-y;
v1 = lsqnonlin(d,[1 1 1 1]); % normal least squares
v2 = lsqnonlin(@(w) sqrt(2*(sqrt(1+d(w).^2)-1)),[1 1 1 1]) % soft L1 loss
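For reference, the same soft-L1 trick is built into SciPy: `least_squares` accepts `loss='soft_l1'`, which applies the rho(z) = 2*(sqrt(1+z)-1) transform to the squared residuals, exactly as the `v2` line does by hand. A minimal Python sketch of the equivalent fit (variable names are my own, not from the MATLAB code):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import jv  # Bessel function of the first kind, J_v(x)

x = np.linspace(0, 20, 100)
w_true = np.array([2.0, 6.0, -4.0, 5.0])

def y_fun(w):
    # model: w[0]*J0(x) + w[1]*J1(x) + w[2]*J2(x) + w[3]*J3(x)
    return sum(w[k] * jv(k, x) for k in range(4))

rng = np.random.default_rng(0)
y = y_fun(w_true)
idx = rng.integers(0, len(y), 20)        # up to 20 outlier positions
y[idx] = rng.uniform(-5, 5, len(idx))    # replace them with uniform noise

residual = lambda w: y_fun(w) - y
fit_ls  = least_squares(residual, np.ones(4))                  # plain least squares
fit_sl1 = least_squares(residual, np.ones(4), loss='soft_l1')  # robust soft-L1 fit
print(fit_sl1.x)
```

With enough clean samples, `fit_sl1.x` lands close to the true weights while the plain least-squares fit is pulled around by the outliers.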
I modified the cost function to test a harder L1 assumption and got a much more robust fit than David's answer (I was testing with a higher number of noisy elements; see my addition for v3):
x = linspace(0,20);
% model function
yFun = @(w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
% generate training data
N = 50; % number of noisy elements
w = [2 6 -4 5]; % true parameter values
y = yFun(w); % evaluate true model function
y(randi(length(y),[1 N])) = 10*rand(1,N)-5; % replace N random samples with outliers
% residual for the standard least-squares loss
d = @(w) yFun(w)-y;
v1 = lsqnonlin(d,[1 1 1 1]); % normal least squares
v2 = lsqnonlin(@(w) sqrt(2*(sqrt(1+d(w).^2)-1)),[1 1 1 1]) % soft L1 loss
v3 = lsqnonlin(@(w) sqrt(2*(sqrt(1+abs(d(w)))-1)),[1 1 1 1]) % the new cost function
plot(x,y,'x',x,yFun(v1),x,yFun(v2),x,yFun(v3))
legend('data','v1','v2','v3')
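To see why v3 tolerates outliers even better: lsqnonlin squares each returned residual internally, so the effective per-point losses for v1, v2, and v3 are d^2, 2*(sqrt(1+d^2)-1), and 2*(sqrt(1+|d|)-1), which grow quadratically, linearly, and like sqrt(|d|) for large residuals. A quick numerical check (Python, purely illustrative):

```python
import numpy as np

d = np.array([0.1, 1.0, 10.0, 100.0])        # residual magnitudes
l2      = d**2                                # v1: plain least squares
soft_l1 = 2*(np.sqrt(1 + d**2) - 1)           # v2: ~2|d| for large d
hard_l1 = 2*(np.sqrt(1 + np.abs(d)) - 1)      # v3: ~2*sqrt(|d|) for large d

for name, loss in [('L2', l2), ('soft L1', soft_l1), ('v3', hard_l1)]:
    print(f"{name:8s}", np.round(loss, 2))
```

At d = 100 the three penalties are roughly 10000, 198, and 18, so a single gross outlier contributes far less to the v3 objective and distorts the fit less.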
The `ransac` implementation in MATLAB seems to work natively only with predefined fitting functions. However, you can create a workaround by wrapping your own fitting function in a function handle. The `ransac` function is part of the Computer Vision Toolbox or the Automated Driving Toolbox.
x = linspace(0,20).';
w = [2 6 -4 5];
y = w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
y(randi(length(y),[1 10])) = 10*rand(1,10)-5;
plot(x,y,'x')
sampleSize = 5; % number of points to sample per trial
maxDistance = 2; % max allowable distance for inliers
% the model function you want to fit
bFnc = @(x,w) w(1)*besselj(0,x)+w(2)*besselj(1,x)+w(3)*besselj(2,x)+w(4)*besselj(3,x);
% wrap this function into a cost function with two inputs (the points
% and the "model")
cstFmin = @(xy,w) sum((xy(:,2) - bFnc(xy(:,1),w)).^2);
% function handle that fits the function and returns a single
% object as the model
fitFnc = @(xy) {fminsearch(@(w)cstFmin(xy,w),ones(4,1))}; % returns the "model" as a cell
% function that returns a per-point distance measure
evlFnc = @(model,xy)(( xy(:,2) - bFnc(xy(:,1),model{1}) ).^2);
% call RANSAC
[modelRANSAC, inlierIdx] = ransac([x,y],fitFnc,evlFnc,sampleSize,maxDistance);
% plot results
xIN = x(inlierIdx);
yIN = bFnc(xIN,modelRANSAC{1}); % unwrap the parameter vector from the cell
hold on
plot(xIN,yIN,'r-')
hold off
title(num2str(sampleSize,'sampleSize = %d'))
Note that the `fminsearch` algorithm always starts at `ones(4,1)`. You could also integrate a different optimization algorithm here.
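For readers without those toolboxes, the wrap-a-custom-fitter idea generalizes to a hand-rolled RANSAC loop: repeatedly fit the model on a random minimal sample, count inliers within a distance threshold, keep the best consensus set, and refit on it. A sketch in Python (the trial count, sample size, and all names are my own choices, not MATLAB's API):

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.special import jv  # Bessel function of the first kind

rng = np.random.default_rng(1)
x = np.linspace(0, 20, 100)
w_true = np.array([2.0, 6.0, -4.0, 5.0])
model = lambda w, xs: sum(w[k] * jv(k, xs) for k in range(4))
y = model(w_true, x)
idx = rng.integers(0, len(y), 10)
y[idx] = rng.uniform(-5, 5, len(idx))   # inject outliers

def fit(xs, ys):
    # the model is linear in w, so this converges to the global optimum
    return least_squares(lambda w: model(w, xs) - ys, np.ones(4)).x

sample_size, max_dist, n_trials = 8, 2.0, 200
best_w, best_count = None, 0
for _ in range(n_trials):
    pick = rng.choice(len(x), sample_size, replace=False)
    w = fit(x[pick], y[pick])                     # fit on the minimal sample
    count = int((np.abs(model(w, x) - y) < max_dist).sum())
    if count > best_count:                        # keep the largest consensus set
        best_count, best_w = count, w

inliers = np.abs(model(best_w, x) - y) < max_dist
w_ransac = fit(x[inliers], y[inliers])            # final refit on all inliers
print(w_ransac)
```

The structure mirrors the MATLAB workaround above: `fit` plays the role of `fitFnc`, and the absolute-residual threshold plays the role of `evlFnc` plus `maxDistance`.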