Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Handling #ifdef's that were used to create multiple versions of an algorithm

I am trying to benchmark many (about 25) variations of an algorithm written in C++.

I implemented these variations using a combination of three methods:

  1. copying code and making minor changes to the copied version

  2. subclassing the base algorithm class

  3. using #ifdefs to switch between snippets of code

The variations that arise from options 1 and 2 are okay because I can select which variation of the algorithm to run in a configuration file. I can then iterate through different configuration files and keep a record of "configuration:results" pairs - keeping these records is very important to my work.

I am currently having a problem with the #ifdefs because I have to compile multiple versions of the code to access these variations, making it much harder to run automated experiment scripts and to keep accurate records of results. The #ifdefs, however, are very useful because if I find a mistake in one copy of the code, then I do not have to remember to correct this mistake in multiple copies.

The #ifdefs expand six variations that I created by both copying code and subclassing into 24 total variations (4 variations for each basic variation).

Here is an example - mostly the I am using the #ifdefs to avoid replicating too much code:

    ....

    double lasso_gam=*gamma;
    *lasso_idx=-1;
    for(int aj=0;aj<(int)a_idx.size();aj++){
        int j=a_idx[aj];
        assert(j<=C*L);
        double inc=wa[aj]*(*gamma)*signs[aj];
        if( (beta_sp(j)>0 && beta_sp(j)+inc<0)
#ifdef ALLOW_NEG_LARS
            || (beta_sp(j)<0 && beta_sp(j)+inc>0)
#else
            || (beta_sp(j)==0 && beta_sp(j)+inc<0)
#endif
            ){
            double tmp_gam=-beta_sp(j)/wa[aj]*signs[aj];

            if(tmp_gam>=0 && tmp_gam<lasso_gam) {
                *lasso_idx=aj;
                *next_active=j;
                lasso_gam=tmp_gam;
            }
        }
    }

    if(lasso_idx>=0){
        *gamma=lasso_gam;
    }

    ....

Question: What is the best way to allow the multiple variations of the algorithm, which are currently specified by #ifdefs, to be run given a configuration file that specifies which variation of the algorithm to run.

Ideally I would like to compile the code only once and select an algorithm variation at runtime using the config file.

like image 601
user1149913 Avatar asked Mar 26 '13 18:03

user1149913


2 Answers

You can augment your algorithm with a (possibly additional) template argument like this:

enum class algorithm_type
{
    type_a,
    type_b,
    type_c
};

template <algorithm_type AlgorithmType>
void foo(int usual, double args)
{
    std::cout << "common code" << std::endl;

    if (AlgorithmType == algorithm_type::type_a)
    {
        std::cout << "doing type a..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_b)
    {
        std::cout << "doing type b..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_c)
    {
        std::cout << "doing type c..." << usual << ", " << args << std::endl;
    }

    std::cout << "more common code" << std::endl;
}

Now you can select your behavior via this template argument:

foo<algorithm_type::type_a>(11, 0.1605);
foo<algorithm_type::type_b>(11, 0.1605);
foo<algorithm_type::type_c>(11, 0.1605);

The type, being a constant expression, yields a compile-time decided branch (that is, the others are known to be dead code and removed). In fact, your compiler should warn you about this (how you deal with that is up to you).

But you can still dispatch off a runtime value just fine:

#include <stdexcept>

void foo_with_runtime_switch(algorithm_type algorithmType,
                             int usual, double args)
{
    switch (algorithmType)
    {
    case algorithm_type::type_a:
        return foo<algorithm_type::type_a>(usual, args);
    case algorithm_type::type_b:
        return foo<algorithm_type::type_b>(usual, args);
    case algorithm_type::type_c:
        return foo<algorithm_type::type_c>(usual, args);
    default:
        throw std::runtime_error("wat");
    }
}

foo_with_runtime_switch(algorithm_type::type_a, 11, 0.1605);
foo_with_runtime_switch(algorithm_type::type_b, 11, 0.1605);
foo_with_runtime_switch(algorithm_type::type_c, 11, 0.1605);

The internals of the algorithm remain the same (dead branches eliminated, same optimizations), just how you get there has changed. (Note that it's possible to generalize the enum idea so that this switch is generated automatically; if you find yourself with handfuls of variations, this might be good to learn.)

And of course you still can #define a specific algorithm as a default:

#define FOO_ALGORITHM algorithm_type::type_a

void foo_with_define(int usual, double args)
{
    return foo<FOO_ALGORITHM>(usual, args);
}

foo_with_define(11, 0.1605);

All these together give you the advantages of all three, with no repetition.

In practice, you can have all three as overloads: one for users who know which algorithm to use at compile-time, those who need to select it at runtime, and those who just want a default (which you can override via a project-wide #define):

// foo.hpp

enum class algorithm_type
{
    type_a,
    type_b,
    type_c
};

// for those who know which algorithm to use
template <algorithm_type AlgorithmType>
void foo(int usual, double args)
{
    std::cout << "common code" << std::endl;

    if (AlgorithmType == algorithm_type::type_a)
    {
        std::cout << "doing type a..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_b)
    {
        std::cout << "doing type b..." << usual << ", " << args << std::endl;
    }
    else if (AlgorithmType == algorithm_type::type_c)
    {
        std::cout << "doing type c..." << usual << ", " << args << std::endl;
    }

    std::cout << "more common code" << std::endl;
}

// for those who will know at runtime
void foo(algorithm_type algorithmType, int usual, double args)
{
    switch (algorithmType)
    {
    case algorithm_type::type_a:
        return foo<algorithm_type::type_a>(usual, args);
    case algorithm_type::type_b:
        return foo<algorithm_type::type_b>(usual, args);
    case algorithm_type::type_c:
        return foo<algorithm_type::type_c>(usual, args);
    default:
        throw std::runtime_error("wat");
    }
}

#ifndef FOO_ALGORITHM
    // chosen to be the best default by profiling
    #define FOO_ALGORITHM algorithm_type::type_b
#endif

// for those who just want a good default
void foo(int usual, double args)
{
    return foo<FOO_ALGORITHM>(usual, args);
}

Of course, if some implementation types are always worse than some other, get rid of it. But if you find there are two useful implementations, there's no harm in keeping both around this way.

like image 186
GManNickG Avatar answered Sep 26 '22 01:09

GManNickG


If you have multiple versions with #ifdefs, its usually best to build multiple executables and have your configuration script decide which executable(s) to run when benchmarking. You then have rules in your Makefile to build the various configurations:

%-FOO.o: %.cc
        $(CXX) -c $(CFLAGS) -DFOO -o $@ $<

%-BAR.o: %.cc
        $(CXX) -c $(CFLAGS) -DBAR -o $@ $<

test-FOO: $(SRCS:%.cc=%-FOO.o)
        $(CXX) $(LDFLAGS) -DFOO -o $@ $^ $(LDLIBS)
like image 39
Chris Dodd Avatar answered Sep 25 '22 01:09

Chris Dodd