Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

What would be a good implementation of iota_n (missing algorithm from the STL)

With C++11, the STL has now a std::iota function (see a reference). In contrast to std::fill_n, std::generate_n, there is no std::iota_n, however. What would be a good implementation for that? A direct loop (alternative 1) or delegation to std::generate_n with a simple lambda expression (alternative 2)?

Alternative 1)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        while (n--)
                *first++ = value++;
        return first;
}

Alternative 2)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        return std::generate_n(first, n, [&](){ return value++; });
}    

Would both alternatives generate equivalent code with optimizing compilers?

UPDATE: incorporated the excellent point of @Marc Mutz to also return the iterator at its destination point. This is also how std::generate_n got updated in C++11 compared to C++98.

like image 730
TemplateRex Avatar asked Aug 01 '12 21:08

TemplateRex


2 Answers

As a random example, I compiled the following code with g++ -S -O2 -masm=intel (GCC 4.7.1, x86_32):

void fill_it_up(int n, int * p, int val)
{
    asm volatile("DEBUG1");
    iota_n(p, n, val);
    asm volatile("DEBUG2");
    iota_m(p, n, val);
    asm volatile("DEBUG3");
    for (int i = 0; i != n; ++i) { *p++ = val++; }
    asm volatile("DEBUG4");
}

Here iota_n is the first version and iota_m the second. The assembly is in all three cases this:

    test    edi, edi
    jle .L4
    mov edx, eax
    neg edx
    lea ebx, [esi+edx*4]
    mov edx, eax
    lea ebp, [edi+eax]
    .p2align 4,,7
    .p2align 3
.L9:
    lea ecx, [edx+1]
    cmp ecx, ebp
    mov DWORD PTR [ebx-4+ecx*4], edx
    mov edx, ecx
    jne .L9

With -O3, the three versions are also very similar, but a lot longer (using conditional moves and punpcklqdq and such like).

like image 60
Kerrek SB Avatar answered Nov 11 '22 07:11

Kerrek SB


You're so focused on code generation that you forgot to get the interface right.

You correctly require OutputIterator, but what happens if you want to call it a second time?

list<double> list(2 * N);
iota_n(list.begin(), N, 0);
// umm...
iota_n(list.begin() + N, N, 0); // doesn't compile!
iota_n(list.rbegin(), N, 0); // works, but create 0..N,N-1..0, not 0..N,0..N
auto it = list.begin();
std::advance(it, N);
iota_n(it, N, 0); // works, but ... yuck and ... slow (O(N))

inside iota_n, you still know where you are, but you've thrown that information away, so the caller cannot get at it in constant time.

General principle: don't throw away useful information.

template <typename OutputIterator, typename SizeType, typename ValueType>
auto iota_n(OutputIterator dest, SizeType N, ValueType value) {
    while (N) {
        *dest = value;
        ++dest;
        ++value;
        --N;
    }
    // now, what do we know that the caller might not know?
    // N? No, it's zero.
    // value? Maybe, but it's just his value + his N
    // dest? Definitely. Caller cannot easily compute his dest + his N (O(N))
    //       So, return it:
    return dest;
}

With this definition, the above example becomes simply:

list<double> list(2 * N);
auto it = iota_n(list.begin(), N, 0);
auto end = iota_n(it, N, 0);
assert(end == list.end());
like image 21
Marc Mutz - mmutz Avatar answered Nov 11 '22 05:11

Marc Mutz - mmutz