How to efficiently find correlation and discard points outside 3-sigma range in MATLAB?

I have a data file m.txt that looks something like this (with a lot more points):

286.842995
3.444398
3.707202
338.227797
3.597597
283.740414
3.514729
3.512116
3.744235
3.365461
3.384880

Some of the values (like 338.227797) are very different from the values I generally expect (smaller numbers).

  • So, I am thinking that I will remove all the points that lie outside the 3-sigma range. How can I do that in MATLAB?

  • Also, the bigger problem is that this file has a separate file t.txt associated with it which stores the corresponding time values for these numbers. So, I'll have to remove the corresponding time values from the t.txt file also.

I am still learning MATLAB, and I assume there is a better way of doing this than storing the indices of the elements removed from m.txt and then removing those same elements from t.txt.

Lazer asked Dec 22 '22 07:12


1 Answer

@Amro's answer is close, but the call to FIND is unnecessary (look up logical subscripting), and you need to subtract the mean to get a true +/-3 sigma range. I would go with the following:

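%# load the measurements and their corresponding time values
m = load('m.txt');
t = load('t.txt');

%# logical mask: true where a value lies within z standard deviations of the mean
z = 3;
meanM = mean(m);
sigmaM = std(m);
I = abs(m - meanM) <= z * sigmaM;

%# apply the same mask to both vectors so they stay aligned
m = m(I);
t = t(I);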
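To try the same idea without the files, here is a minimal sketch on made-up in-memory data. The values, the variable names keep, mFiltered and tFiltered, and the use of sample indices as time stamps are all assumptions for illustration; they are not taken from m.txt or t.txt.

```matlab
% Hypothetical data: 19 ordinary readings plus one obvious outlier (338.2).
m = [3.4; 3.7; 3.6; 3.5; 3.5; 3.7; 3.4; 3.4; 3.6; 3.5; ...
     3.5; 3.6; 3.4; 3.7; 3.5; 3.6; 3.5; 3.4; 3.6; 338.2];

% Hypothetical time stamps: here simply the sample index of each reading.
t = (1:numel(m))';

% Logical mask: true where a reading lies within z sigma of the mean.
z = 3;
keep = abs(m - mean(m)) <= z * std(m);

% Index both vectors with the same mask so each value keeps its time stamp.
mFiltered = m(keep);
tFiltered = t(keep);
```

Because one logical mask indexes both vectors, no list of removed indices needs to be stored. Note that the check depends on the outliers being a small fraction of the data: on the eleven sample values shown in the question alone, the three large readings inflate the mean and standard deviation so much that no point falls outside the 3-sigma range.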
Nzbuu answered Dec 29 '22 01:12