The uniquetol
function, introduced in R2015a, computes "unique elements within tolerance". Specifically,
C = uniquetol(A,tol)
returns the unique elements inA
using tolerancetol
.
But the problem of finding unique elements with a given tolerance has several solutions. Which one is actually produced?
Let's see two examples:
Let A = [3 5 7 9]
with absolute tolerance 2.5
. The output can be [3 7]
, or it can be [5 9]
. Both solutions satisfy the requirement.
For A = [1 3 5 7 9]
with absolute tolerance 2.5
, the output can be [1 5 9]
or [3 7]
. So even the number of elements in the output can vary.
See this nice discussion about the transitivity issue that lies at the heart of the problem.
So, how does uniquetol
work? What output does it produce among the several existing solutions?
M = min( A ) returns the minimum elements of an array. If A is a vector, then min(A) returns the minimum of A . If A is a matrix, then min(A) is a row vector containing the minimum value of each column of A .
The tol input argument lets you specify what constitutes "close enough".
n = numel( A ) returns the number of elements, n , in array A , equivalent to prod(size(A)) .
To simplify, I consider the one-output, two-input version of uniquetol
,
C = uniquetol(A, tol);
where the first input is a double
vector A
. In particular, this implies that:
'ByRows'
option of uniquetol
is not used.uniquetol
would implicitly linearize to a column, as usual.The second input, which defines the tolerance, is interpreted as follows:
Two values,
u
andv
, are within tolerance ifabs(u-v) <= tol*max(abs(A(:)))
That is, the specified tolerance is relative by default. The actual tolerance used in the comparisons is obtained by scaling by the maximum absolute value in A
.
With these considerations, it seems that the approach that uniquetol
uses is:
A
.A
, and set this as reference value (this value will have to be updated later).C
.A
until one is found that is not within tolerance of the reference value. When that entry is found, take it as the new reference value and go back to step 3.Of course, I'm not saying that this is what uniquetol
internally does. But the output seems to be the same. So this is functionally equivalent to what uniquetol
does.
The following code implements the approach described above (inefficient code, just to illustrate the point).
% Inputs A, tol
% Output C
tol_scaled = tol*max(abs(A(:))); % scale tolerance
C = []; % initiallize output. Will be extended
ref = NaN; % initiallize reference value to NaN. This will immediately cause
% A(1) to become the new reference
for a = sort(A(:)).';
if ~(a-ref <= tol_scaled)
ref = a;
C(end+1) = ref;
end
end
To verify this, let's generate some random data and compare the output of uniquetol
and of the above code:
clear
N = 1e3; % number of realizations
S = 1e5; % maximum input size
for n = 1:N;
% Generate inputs:
s = randi(S); % input size
A = (2*rand(1,S)-1) / rand; % random input of length S; positive and
% negative values; random scaling
tol = .1*rand; % random tolerance (relative). Change value .1 as desired
% Compute output:
tol_scaled = tol*max(abs(A(:))); % scale tolerance
C = []; % initiallize output. Will be extended
ref = NaN; % initiallize reference value to NaN. This will immediately cause
% A(1) to become the new reference
for a = sort(A(:)).';
if ~(a-ref <= tol_scaled)
ref = a;
C(end+1) = ref;
end
end
% Check if output is equal to that of uniquetol:
assert(isequal(C, uniquetol(A, tol)))
end
In all my tests this has run without the assertion failing.
So, in summary, uniquetol
seems to sort the input, pick its first entry, and keep skipping entries for as long as it can.
For the two examples in the question, the outputs are as follows. Note that the second input is specified as 2.5/9
, where 9
is the maximum of the first input, to achieve an absolute tolerance of 2.5
:
>> uniquetol([1 3 5 7 9], 2.5/9)
ans =
1 5 9
>> uniquetol([3 5 7 9], 2.5/9)
ans =
3 7
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With