Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What does "uniquetol" do, exactly?

The uniquetol function, introduced in R2015a, computes "unique elements within tolerance". Specifically,

C = uniquetol(A,tol) returns the unique elements in A using tolerance tol.

But the problem of finding unique elements with a given tolerance has several solutions. Which one is actually produced?

Let's see two examples:

  1. Let A = [3 5 7 9] with absolute tolerance 2.5. The output can be [3 7], or it can be [5 9]. Both solutions satisfy the requirement.

  2. For A = [1 3 5 7 9] with absolute tolerance 2.5, the output can be [1 5 9] or [3 7]. So even the number of elements in the output can vary.

See this nice discussion about the transitivity issue that lies at the heart of the problem.

So, how does uniquetol work? What output does it produce among the several existing solutions?

like image 503
Luis Mendo Avatar asked Sep 07 '17 16:09

Luis Mendo


People also ask

How do you find the smallest number in an array in Matlab?

M = min( A ) returns the minimum elements of an array. If A is a vector, then min(A) returns the minimum of A . If A is a matrix, then min(A) is a row vector containing the minimum value of each column of A .

What is Tol Matlab?

The tol input argument lets you specify what constitutes "close enough".

How do you find the number of elements in an array in Matlab?

n = numel( A ) returns the number of elements, n , in array A , equivalent to prod(size(A)) .


1 Answers

To simplify, I consider the one-output, two-input version of uniquetol,

 C = uniquetol(A, tol);

where the first input is a double vector A. In particular, this implies that:

  • The 'ByRows' option of uniquetol is not used.
  • The first input is a vector. If it were not, uniquetol would implicitly linearize to a column, as usual.
  • The second input, which defines the tolerance, is interpreted as follows:

    Two values, u and v, are within tolerance if abs(u-v) <= tol*max(abs(A(:)))

    That is, the specified tolerance is relative by default. The actual tolerance used in the comparisons is obtained by scaling by the maximum absolute value in A.

With these considerations, it seems that the approach that uniquetol uses is:

  1. Sort A.
  2. Pick the first entry of sorted A, and set this as reference value (this value will have to be updated later).
  3. Write the reference value into the output C.
  4. Skip subsequent entries of sorted A until one is found that is not within tolerance of the reference value. When that entry is found, take it as the new reference value and go back to step 3.

Of course, I'm not saying that this is what uniquetol internally does. But the output seems to be the same. So this is functionally equivalent to what uniquetol does.

The following code implements the approach described above (inefficient code, just to illustrate the point).

% Inputs A, tol
% Output C
tol_scaled = tol*max(abs(A(:))); % scale tolerance
C = []; % initiallize output. Will be extended
ref = NaN; % initiallize reference value to NaN. This will immediately cause
           % A(1) to become the new reference
for a = sort(A(:)).';
    if ~(a-ref <= tol_scaled)
        ref = a;
        C(end+1) = ref;
    end
end

To verify this, let's generate some random data and compare the output of uniquetol and of the above code:

clear
N = 1e3; % number of realizations
S = 1e5; % maximum input size
for n = 1:N;
    % Generate inputs:
    s = randi(S); % input size
    A = (2*rand(1,S)-1) / rand; % random input of length S; positive and 
                                % negative values; random scaling
    tol = .1*rand; % random tolerance (relative). Change value .1 as desired

    % Compute output:
    tol_scaled = tol*max(abs(A(:))); % scale tolerance
    C = []; % initiallize output. Will be extended
    ref = NaN; % initiallize reference value to NaN. This will immediately cause
               % A(1) to become the new reference 
    for a = sort(A(:)).';
        if ~(a-ref <= tol_scaled)
            ref = a;
            C(end+1) = ref;
        end
    end

    % Check if output is equal to that of uniquetol:
    assert(isequal(C, uniquetol(A, tol)))
end

In all my tests this has run without the assertion failing.

So, in summary, uniquetol seems to sort the input, pick its first entry, and keep skipping entries for as long as it can.

For the two examples in the question, the outputs are as follows. Note that the second input is specified as 2.5/9, where 9 is the maximum of the first input, to achieve an absolute tolerance of 2.5:

>> uniquetol([1 3 5 7 9], 2.5/9)
ans =
     1     5     9
>> uniquetol([3 5 7 9], 2.5/9)
ans =
     3     7
like image 73
Luis Mendo Avatar answered Oct 09 '22 22:10

Luis Mendo