C/C++ - Efficient way to compare two lists and find missing elements

Question

I have two lists, L₁ and L₂, of data containing multiple elements, each unique, of an abstract data type (ie: structs). Each of the two lists:

May contain between zero and one-hundred (inclusive) elements.
Contains no duplicate elements (each element is unique).
May or may not contain elements in the other list (ie: L₁ and L₂ might be identical, or contain completely different elements).
Is not sorted.
At the lowest level, is stored withing a std::vector<myStruct> container.

What I am typically expecting is that periodically, a new element is added to L₂, or an element is subtracted/removed from it. I am trying to detect the differences in the two lists as efficiently (ie: with minimal comparisons) as possible:

If an entry is not present in L₂ and is present in L₁, carry out one operation: Handle_Missing_Element().
If an entry is present in L₂ and not present in L₁, carry out another operation: Handle_New_Element().

Once the above checks are carried out, L₁ is set to be equal to L₂, and at some time in the future, L₂ is checked again.

How could I go about finding out the differences between the two lists? There are two approaches I can think of:

Compare both lists via every possible combination of elements. Possibly O(n²) execution complexity (horrible).

bool found;
for i in 1 .. L2->length()
  found = false;
  for j in 1 .. L1->length()
    if (L1[j] == L2[i]
      // Found duplicate entry
      found = true;
    fi
  endfor
endfor

Sort the lists, and compare the two lists element-wise until I find a difference. This seems like it would be in near-linear time. The problem is that I would need the lists to be sorted. It would be impractical to manually sort the underlying vector after each addition/removal for the list. It would only be reasonable to do this if it were somehow possible to force vector::push_back() to automatically insert elements such that insertions preseve the sorting of the list.

Is there a straightforward way to accomplish this efficiently in C++? I've found similar such problems, but I need to do more than just find the intersection of two sets, or do such a test with just a set of integers, where sum-related tricks can be used, as I need to carry out different operations for "new" vs "missing" elements.

Thank you.

paddy · Accepted Answer

It would be impractical to manually sort the underlying vector after each addition/removal for the list. It would only be reasonable to do this if it were somehow possible to force vector::push_back() to automatically insert elements such that insertions preseve the sorting of the list.

What you're talking about here is an ordered insert. There are functions in <algorithm> that allow you do do this. Rather than using std::vector::push_back you would use std::vector::insert, and call std::lower_bound which does a binary search for the first element not less than than a given value.

auto insert_pos = std::lower_bound( L2.begin(), L2.end(), value );
if( insert_pos == L2.end() || *insert_pos != value )
{
    L2.insert( insert_pos, value );
}

This makes every insertion O(logN) but if you are doing fewer than N insertions between your periodic checks, it ought to be an improvement.

The zipping operation might look something like this:

auto it1 = L1.begin();
auto it2 = L2.begin();

while( it1 != L1.end() && it2 != L2.end() )
{
    if( *it1 < *it2 ) {
        Handle_Missing( *it1++ );
    } else if( *it2 < *it1 ) {
        Handle_New( *it2++ );
    } else {
        it1++;
        it2++;
    }
}

while( it1 != L1.end() ) Handle_Missing( *it1++ );
while( it2 != L2.end() ) Handle_New( *it2++ );

C/C++ - Efficient way to compare two lists and find missing elements

Tags:

c++

list

sorting

compare

Cloud

1 Answers

paddy

Recent Activity

Donate For Us

C/C++ - Efficient way to compare two lists and find missing elements

Tags:

c++

list

sorting

compare

Cloud

1 Answers

paddy

Related questions

Recent Activity

Donate For Us