Compile-time recursive function to compute the next power of two of an integer?

Question 1

This ought to implement the algorithm you give:

template<typename T>
constexpr T roundup_helper( T value, unsigned maxb, unsigned curb ) {
    return maxb<=curb
            ? value
            : roundup_helper( ((value-1) | ((value-1)>>curb))+1, maxb, curb << 1 )
            ;
}

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup( T value ) {
    return roundup_helper( value, sizeof(T)*CHAR_BIT, 1 );
}

At least, it seems to work fine in my test program.

Alternatively, you can move the v-1 and v+1 out of the helper function like so:

template<typename T>
constexpr T roundup_helper( T value, unsigned maxb, unsigned curb ) {
    return maxb<=curb
            ? value
            : roundup_helper( value | (value>>curb), maxb, curb << 1 )
            ;
}

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup( T value ) {
    return roundup_helper( value-1, sizeof(T)*CHAR_BIT, 1 )+1;
}

Another possibility is to take advantage of default arguments and put it all in a single function:

template<typename T,
        typename = typename enable_if<is_integral<T>::value>::type,
        typename = typename enable_if<is_unsigned<T>::value>::type>
constexpr T roundup(
        T value,
        unsigned maxb = sizeof(T)*CHAR_BIT,
        unsigned curb = 1
        ) {
    return maxb<=curb
            ? value
            : roundup( ((value-1) | ((value-1)>>curb))+1, maxb, curb << 1 )
            ;
}

Question 2

This may not be what you can do unfortunately. But if by any chance you have a constexpr count leading zeros compiler intrinsic, the following is very efficient both at compile time, and at run time if you happen to give it run time arguments:

#include <climits>

template <class Int>
inline
constexpr
Int
clp2(Int v)
{
    return v > 1 ? 1 << (sizeof(Int)*CHAR_BIT - __builtin_clz(v-1)) : v;
}

int
main()
{
    static_assert(clp2(0) == 0, "");
    static_assert(clp2(1) == 1, "");
    static_assert(clp2(2) == 2, "");
    static_assert(clp2(3) == 4, "");
    static_assert(clp2(4) == 4, "");
    static_assert(clp2(5) == 8, "");
    static_assert(clp2(6) == 8, "");
    static_assert(clp2(7) == 8, "");
    static_assert(clp2(8) == 8, "");
    static_assert(clp2(42) == 64, "");
}

I compiled the above with tip-of-trunk clang. It is not without its issues. You need to decide what you want to do with negative arguments. But many architectures and compilers have an intrinsic like this (shame it isn't standard C/C++ by now). And some of those may make the intrinsic constexpr.

Without such an intrinsic, I would fall back to something along the lines of Adam H. Peterson's algorithm. But the nice thing about this one is its simplicity and efficiency.

Question 3

Although less efficient in the general, this algorithm will do the job rather concisely:

template <typename T,
          typename = typename std::enable_if<std::is_integral<T>::value>::type,
          typename = typename std::enable_if<std::is_unsigned<T>::value>::type>
constexpr T upperPowerOfTwo(T value, size_t pow = 0)
{
    return (value >> pow) ? upperPowerOfTwo(value, pow + 1)
                          : T(1) << pow;
}

This also allows you to specify a minimum power of 2 - i.e. upperPowerOfTwo(1, 3) return 8.

The reason this is less efficient for most cases is that it makes O(sizeof(Type)*CHAR_BIT) calls whereas the algorithm you linked performs O(log(sizeof(Type)*CHAR_BIT)) operations. The caveat is that this algorithm will terminate after log(v) calls, so if v is small enough (i.e. < log(max value of v-type)) it will be faster.