I'm trying to fit an exponential curve to data sets containing damped harmonic oscillations. The data is a bit complicated in the sense that the sinusoidal oscillations contain many frequencies as seen below:
I need to find the rate of decay in the data. The method I am using can be found here. How it works, is it takes the log of the y values above the steady state value and then uses:
lsqlin(A,y1(:),-A,-y1(:),[],[],[],[],[],optimset('algorithm','active-set','display','off'))
To fit it.
However, this results in the following data fits:
I tried using a linear regression fit which obviously didn't work because it took the average. I also tried RANSAC thinking that there is more data near the peaks. It worked a bit better than the linear regression but the method is flawed as there are times when more points exist at the wrong regions.
Does anyone know of a good method to just fit the peaks for this data?
Currently, I'm thinking of dividing the 500 data points into 10 different regions and in each region find the largest value. At the end, I should have 50 points that I can fit using any of the exponential fitting methods mentioned above. What do you think of this method?
Thought I'd give everyone an update of potential solutions that may work. As mentioned earlier, the data is complicated by the varying sinusoidal frequencies, so certain methods may not work because of this. The methods listed below can be good depending on the data and the frequencies involved.
First off, I assume that the data has the form:
y = average + b*e^-(c*x)
In my case, the average is 290 so we have:
y = 290 + b*e^-(c*x)
With that being said, let's dive into the different methods that I tried:
findpeaks() Method
This is the method that Alexander Büse suggested. It's a pretty good method for most data, but for my data, since there's multiple sinusoidal frequencies, it gets the wrong peaks. The red x's show the peaks.
% Find Peaks Method
[max_num,max_ind] = findpeaks(y(ind));
plot(max_ind,max_num,'x','Color','r'); hold on;
x1 = max_ind;
y1 = log(max_num-290);
coeffs = polyfit(x1,y1,1)
b = exp(coeffs(2));
c = coeffs(1);
RANSAC
RANSAC is good if you have most of your data at the peaks. You see that in mine, because of the multiple frequencies, more peaks exist near the top. However, the problem with my data is that not all the data sets are like this. Hence, it occasionally worked.
% RANSAC Method
ind = (y > avg);
x1 = x(ind);
y1 = log(y(ind) - avg);
iterNum = 300;
thDist = 0.5;
thInlrRatio = .1;
[t,r] = ransac([x1;y1'],iterNum,thDist,thInlrRatio);
k1 = -tan(t);
b1 = r/cos(t);
% plot(x1,k1*x1+b1,'r'); hold on;
b = exp(b1);
c = k1;
Lsqlin Method
This method is the one used here. It uses Lsqlin to constrain the system. However, it seems to ignore the data in the middle. Depending on your data set, this could work really well as it did for the person in the original post.
% Lsqlin Method
avg = 290;
ind = (y > avg);
x1 = x(ind);
y1 = log(y(ind) - avg);
A = [ones(numel(x1),1),x1(:)]*1.00;
coeffs = lsqlin(A,y1(:),-A,-y1(:),[],[],[],[],[],optimset('algorithm','active-set','display','off'));
b = exp(coeffs(2));
c = coeffs(1);
Find Peaks in Period
This is the method I mentioned in my post where I get the peak in each region, . This method works pretty well and from this I realized that my data may not actually have a perfect exponential fit. We see that it is unable to fit the large peaks at the beginning. I was able to make this a bit better by only using the first 150 data points and ignoring the steady state data points. Here I found the peak every 25 data points.
% Incremental Method 2 Unknowns
x1 = [];
y1 = [];
max_num=[];
max_ind=[];
incr = 25;
for i=1:floor(size(y,1)/incr)
[max_num(end+1),max_ind(end+1)] = max(y(1+incr*(i-1):incr*i));
max_ind(end) = max_ind(end) + incr*(i-1);
if max_num(end) > avg
x1(end+1) = max_ind(end);
y1(end+1) = log(max_num(end)-290);
end
end
plot(max_ind,max_num,'x','Color','r'); hold on;
coeffs = polyfit(x1,y1,1)
b = exp(coeffs(2));
c = coeffs(1);
Using all 500 data points:
Using the first 150 data points:
Find Peaks in Period With b Constrained
Since I want it to start at the first peak, I constrained the b value. I know the system is y=290+b*e^-c*x
and I constrain it such that b=y(1)-290
. By doing so, I just need to solve for c where c=(log(y-290)-logb)/x
. I can then take the average or median of c. This method is quite good as well, it doesn't fit the value near the end as well but that isn't as big of a deal since the change there is minimal.
% Incremental Method 1 Unknown (b is constrained y(1)-290 = b)
b = y(1) - 290;
c = [];
max_num=[];
max_ind=[];
incr = 25;
for i=1:floor(size(y,1)/incr)
[max_num(end+1),max_ind(end+1)] = max(y(1+incr*(i-1):incr*i));
max_ind(end) = max_ind(end) + incr*(i-1);
if max_num(end) > avg
c(end+1) = (log(max_num(end)-290)-log(b))/max_ind(end);
end
end
c = mean(c); % Or median(c) works just as good
Here I take the peak for every 25 data points and then take the mean of c
Here I take the peak for every 25 data points and then take the median of c
Here I take the peak for every 10 data points and then take the mean of c
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With