Question

With C++11, the STL has now a std::iota function (see a reference). In contrast to std::fill_n, std::generate_n, there is no std::iota_n, however. What would be a good implementation for that? A direct loop (alternative 1) or delegation to std::generate_n with a simple lambda expression (alternative 2)?

Alternative 1)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        while (n--)
                *first++ = value++;
        return first;
}

Alternative 2)

template<class OutputIterator, class Size, class T>
OutputIterator iota_n(OutputIterator first, Size n, T value)
{
        return std::generate_n(first, n, [&](){ return value++; });
}    

Would both alternatives generate equivalent code with optimizing compilers?

UPDATE: incorporated the excellent point of @Marc Mutz to also return the iterator at its destination point. This is also how std::generate_n got updated in C++11 compared to C++98.

Was it helpful?

Solution

As a random example, I compiled the following code with g++ -S -O2 -masm=intel (GCC 4.7.1, x86_32):

void fill_it_up(int n, int * p, int val)
{
    asm volatile("DEBUG1");
    iota_n(p, n, val);
    asm volatile("DEBUG2");
    iota_m(p, n, val);
    asm volatile("DEBUG3");
    for (int i = 0; i != n; ++i) { *p++ = val++; }
    asm volatile("DEBUG4");
}

Here iota_n is the first version and iota_m the second. The assembly is in all three cases this:

    test    edi, edi
    jle .L4
    mov edx, eax
    neg edx
    lea ebx, [esi+edx*4]
    mov edx, eax
    lea ebp, [edi+eax]
    .p2align 4,,7
    .p2align 3
.L9:
    lea ecx, [edx+1]
    cmp ecx, ebp
    mov DWORD PTR [ebx-4+ecx*4], edx
    mov edx, ecx
    jne .L9

With -O3, the three versions are also very similar, but a lot longer (using conditional moves and punpcklqdq and such like).

OTHER TIPS

You're so focused on code generation that you forgot to get the interface right.

You correctly require OutputIterator, but what happens if you want to call it a second time?

list<double> list(2 * N);
iota_n(list.begin(), N, 0);
// umm...
iota_n(list.begin() + N, N, 0); // doesn't compile!
iota_n(list.rbegin(), N, 0); // works, but create 0..N,N-1..0, not 0..N,0..N
auto it = list.begin();
std::advance(it, N);
iota_n(it, N, 0); // works, but ... yuck and ... slow (O(N))

inside iota_n, you still know where you are, but you've thrown that information away, so the caller cannot get at it in constant time.

General principle: don't throw away useful information.

template <typename OutputIterator, typename SizeType, typename ValueType>
auto iota_n(OutputIterator dest, SizeType N, ValueType value) {
    while (N) {
        *dest = value;
        ++dest;
        ++value;
        --N;
    }
    // now, what do we know that the caller might not know?
    // N? No, it's zero.
    // value? Maybe, but it's just his value + his N
    // dest? Definitely. Caller cannot easily compute his dest + his N (O(N))
    //       So, return it:
    return dest;
}

With this definition, the above example becomes simply:

list<double> list(2 * N);
auto it = iota_n(list.begin(), N, 0);
auto end = iota_n(it, N, 0);
assert(end == list.end());

A hypothetical iota_n

std::iota_n(first, count, value)

can be replaced by a one liner.

std::generate_n(first, count, [v=value]()mutable{return v++;})

I prefer this to have a lingering function that is not in the standard. Having said that, I think a std::iota_n should be in the standard.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top