Find elements meeting any of a number of criteria

Question

I am trying to find the indices of elements in a vector that correspond to another vector, preferably without using loops. For example, my input might be:

DJiSet = [5 7 8];                   % elements of which I need the indices
JiSet = [3 4 5 6 7 8 9 11 12 20];   % vector to search

The output here would be:

[3 5 6]

The fastest I've come up with so far is this:

Output = find(ismember(JiSet,DJiSet));

However I have a feeling this could be done faster, especially since I thought the find command is rather slow.

Things to note:

The values in DJiSet are guaranteed to all be present in JiSet
JiSet is always sorted in ascending order, without repeated entries
The vector DJiSet is not guaranteed to be found contiguously in JiSet

Divakar · Accepted Answer

Approach #1

You can avoid find by reversing the places of DJiSet and JiSet inside ismember and then use the second output that gives us the matching indices -

[~,out] = ismember(DJiSet,JiSet)

Approach #2

Loopy approach catering to the specific conditions set in the question could be tried out, not sure if this will be more efficient though -

intv_idx = zeros(1,numel(DJiSet));
intv_idx(1) = find(JiSet==DJiSet(1),1);
start = intv_idx(1)+1;
for k = 2:numel(DJiSet)
    idx = find(JiSet(start:end)==DJiSet(k),1);
    start = idx+start;
    intv_idx(k) = idx;
end
out = cumsum(intv_idx);

Luis Mendo · Answer

Divakar's answer is the way to go. But in case you want to do it more manually:

[~, Output] = max(bsxfun(@eq, DJiSet(:).', JiSet(:)), [], 1);

This finds the first occurrence if there are more than one.

If the values in DJiSet were not guaranteed to be present in JiSet, you could use a small modification:

[val, Output] = max(bsxfun(@eq, DJiSet(:).', JiSet(:))); %'
Output(~val) = 0; %// 0 indicates "not found"

Wouter Kuijsters · Answer

For small datasets, it seems my original approach was faster than both the ismember solution proposed by Divakar and the intersect solution proposed by qmeeeeeee, but all three get beaten by Luis Mendo's solution using good old bsxfun. See below code, which times each approach:

function somescript()

IsmemberTime = timeit(@membersol)
IntersectTime = timeit(@intersectsol)
FindTime = timeit(@naivesol)
BsxTime = timeit(@bsxfunsol)

    function membersol()
        rng(1)
        set = randi(30,[1000 15]);             % generate 1000 vectors of length 15, containing random integers
        for i=1:1000
            [~,out] = ismember(set(i,1:5),set(i,6:end));      % first 5 random integers are the values to be found in the remainder of the vector
        end

    end

    function intersectsol()
        rng(1)
        set = randi(30,[1000 15]);
        for i=1:1000
            [~,~,Output] = intersect(set(i,1:5),set(i,6:end));
        end
    end

    function naivesol()
        rng(1)
        set = randi(30,[1000 15]);
        for i=1:1000
            Output = find(ismember(set(i,6:end),set(i,1:5)));
        end
    end

    function bsxfunsol()
        rng(1)
        set = randi(30,[1000 15]);
        for i=1:1000
            [~, Output] = max(bsxfun(@eq, set(i,1:5).', set(i,6:end)), [], 1);
        end
    end
end

Which on my machine (running R2014b) returns the following timings:

IsmemberTime =

    0.1101


IntersectTime =

    0.2008


FindTime =

    0.0698


BsxTime =

    0.0218

This suggests that, for small data sets at least, using find and ismember on the inverted order of vectors is actually faster than using ismember alone. Since there is also some fixed overhead for all methods from the generation of the datasets set that are used to test with, the difference seems to be pretty big. More thorough tests can ben found in the comments below.

GameOfThrows · Answer

Perhaps you could try intersect? It is suppose to be a lot faster:

[Intersect,indDJiSet,indJiSet] = intersect(DJiSet,JiSet)

The ordering does not matter, as long as the element exist in both list, the ind elements gives the index.

Find elements meeting any of a number of criteria

Tags:

matlab

Wouter Kuijsters

4 Answers

Divakar

Luis Mendo

Wouter Kuijsters

GameOfThrows

Recent Activity

Donate For Us

Find elements meeting any of a number of criteria

Tags:

matlab

Wouter Kuijsters

4 Answers

Divakar

Luis Mendo

Wouter Kuijsters

GameOfThrows

Related questions

Recent Activity

Donate For Us