Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

How much "if" statements effect on performance?

There are a some IPTables with different sizes (e.g 255 or 16384 or 512000!!).Every entry of each table, holds a unique IP Address (hex format) and some other values. The total number of IPs is 8 millions. All IPs of all IPTables are sorted

We need to search IPTable 300,000 times per sec. Our current Algorithm for finding an IP is as follow:

// 10 <number of IPTables <20
//_rangeCount = number of IPTables 
s_EntryItem* searchIPTable(const uint32_t & ip) {
        for (int i = 0; i < _rangeCount; i++) {
            if (ip > _ipTable[i].start && ip < _ipTable[i].end) {
                int index = ip - _ipTable[i].start;
                    return (_ipTable[i].p_entry + index);
                }
            }
            return NULL;
        }

As it can be seen, in worst case, number of comparisons for a given IP address is _rangeCount *2 and number of "if" statement checking is _rangeCount.

Suppose i want to change the searchIPTable and use more efficient way to find an IP address in IPTables. as far as i know, for a sorted array, the best software implementation of a famous search algorithm like binary search needs log(n) comparisons( in worst case).

So, the number of comparisons to find an IP address is log(8000000) that is equal to ~23.

Question 1:

As it can bee seen there is a little gap between the number of comparison needed by two algorithm ( _rangeCount vs 23) but in first method, there are some "if" statement that could effect on performance. if you want to run first algorithm for 10 times, obviously the first algorithm has better performance, but i have know idea about running two algorithm for 3000,000 times! what is your idea?

Question 2:

Is there a more efficient algorithm or solution to search IPs?

like image 772
m.r226 Avatar asked Aug 20 '16 13:08

m.r226


1 Answers

curiosity piqued, I wrote a test program (below) and ran it on my macbook.

It's suggesting that a naiive solution, based on a std::unordered_map (lookup time == constant time) is able to search an ip4 address table with 8 million entries 5.6 million times per second.

This easily outperforms the requirements.

update: responding to my critics, I have increased the test space to the required 8m ip addresses. I have also increased the test size to 100 million searches, 20% of which will be a hit.

With a test this large we can clearly see the performance benefits of using an unordered_map when compared to an ordered map (logarithmic time lookups).

All test parameters are configurable.

#include <iostream>
#include <vector>
#include <algorithm>
#include <chrono>
#include <unordered_map>
#include <unordered_set>
#include <map>
#include <random>
#include <tuple>
#include <iomanip>
#include <utility>

namespace detail
{
    template<class T>
    struct has_reserve
    {
        template<class U> static auto test(U*p) -> decltype(p->reserve(std::declval<std::size_t>()), void(), std::true_type());
        template<class U> static auto test(...) -> decltype(std::false_type());

        using type = decltype(test<T>((T*)0));
    };
}

template<class T>
using has_reserve = typename detail::has_reserve<T>::type;


using namespace std::literals;

struct data_associated_with_ip {};
using ip_address = std::uint32_t;

using candidate_vector = std::vector<ip_address>;

static constexpr std::size_t search_space_size = 8'000'000;
static constexpr std::size_t size_of_test = 100'000'000;

std::vector<ip_address> make_random_ip_set(std::size_t size)
{
    std::unordered_set<ip_address> results;
    results.reserve(size);

    std::random_device rd;
    std::default_random_engine eng(rd());
    auto dist = std::uniform_int_distribution<ip_address>(0, 0xffffffff);
    while (results.size() < size)
    {
        auto candidate = dist(eng);
        results.emplace(candidate);
    }

    return { std::begin(results), std::end(results) };
}

template<class T, std::enable_if_t<not has_reserve<T>::value> * = nullptr>
void maybe_reserve(T& container, std::size_t size)
{
    // nop
}

template<class T, std::enable_if_t<has_reserve<T>::value> * = nullptr>
decltype(auto) maybe_reserve(T& container, std::size_t size)
{
    return container.reserve(size);
}

template<class MapType>
void build_ip_map(MapType& result, candidate_vector const& chosen)
{
    maybe_reserve(result, chosen.size());
    result.clear();

    for (auto& ip : chosen)
    {
        result.emplace(ip, data_associated_with_ip{});
    }
}

// build a vector of candidates to try against our map
// some percentage of the time we will select a candidate that we know is in the map
candidate_vector build_candidates(candidate_vector const& known)
{
    std::random_device rd;
    std::default_random_engine eng(rd());
    auto ip_dist = std::uniform_int_distribution<ip_address>(0, 0xffffffff);
    auto select_known = std::uniform_int_distribution<std::size_t>(0, known.size() - 1);
    auto chance = std::uniform_real_distribution<double>(0, 1);
    static constexpr double probability_of_hit = 0.2;

    candidate_vector result;
    result.reserve(size_of_test);
    std::generate_n(std::back_inserter(result), size_of_test, [&]
                    {
                        if (chance(eng) < probability_of_hit)
                        {
                            return known[select_known(eng)];
                        }
                        else
                        {
                            return ip_dist(eng);
                        }
                    });

    return result;
}


int main()
{

    candidate_vector known_candidates = make_random_ip_set(search_space_size);
    candidate_vector random_candidates = build_candidates(known_candidates);


    auto run_test = [&known_candidates, &random_candidates]
    (auto const& search_space)
    {

        std::size_t hits = 0;
        auto start_time = std::chrono::high_resolution_clock::now();
        for (auto& candidate : random_candidates)
        {
            auto ifind = search_space.find(candidate);
            if (ifind != std::end(search_space))
            {
                ++hits;
            }
        }
        auto stop_time = std::chrono::high_resolution_clock::now();
        using fns = std::chrono::duration<long double, std::chrono::nanoseconds::period>;
        using fs = std::chrono::duration<long double, std::chrono::seconds::period>;
        auto interval = fns(stop_time - start_time);
        auto time_per_hit = interval / random_candidates.size();
        auto hits_per_sec = fs(1.0) / time_per_hit;

        std::cout << "ip addresses in table: " << search_space.size() << std::endl;
        std::cout << "ip addresses searched: " << random_candidates.size() << std::endl;
        std::cout << "total search hits    : " << hits << std::endl;
        std::cout << "searches per second  : " << std::fixed << hits_per_sec << std::endl;
    };

    {
        std::cout << "building unordered map:" << std::endl;
        std::unordered_map<ip_address, data_associated_with_ip> um;
        build_ip_map(um, known_candidates);
        std::cout << "testing with unordered map:" << std::endl;
        run_test(um);
    }

    {
        std::cout << "\nbuilding ordered map  :" << std::endl;
        std::map<ip_address, data_associated_with_ip> m;
        build_ip_map(m, known_candidates);
        std::cout << "testing with ordered map  :" << std::endl;
        run_test(m);
    }

}

example results:

building unordered map:
testing with unordered map:
ip addresses in table: 8000000
ip addresses searched: 100000000
total search hits    : 21681856
searches per second  : 5602458.505577

building ordered map  :
testing with ordered map  :
ip addresses in table: 8000000
ip addresses searched: 100000000
total search hits    : 21681856
searches per second  : 836123.513710

Test conditions:

MacBook Pro (Retina, 15-inch, Mid 2015)
Processor: 2.2 GHz Intel Core i7
Memory: 16 GB 1600 MHz DDR3
Release build (-O2)

Running on mains power.

like image 161
Richard Hodges Avatar answered Oct 04 '22 05:10

Richard Hodges