Here is my attempt at answering my own question: a user-defined literal implemented as a constexpr function,
applied to a string literal, effectively lets you specify a parser for the relevant value! Two parts are slightly ugly:
- The actual literal is enclosed by double quotes.
- There is a user-defined literal sitting at the end of said string.
Other than that, this approach does allow integer literals with digit separators. I'm not currently in a position to write a fancier version that verifies the separators sit in the correct locations, but that should be a minor detail of proper programming. Below is a program implementing a corresponding user-defined literal, which allows uses like
use_mask<"0b0101,0101,0101,0101,0011,0101"_sep> um1;
use_mask<"0x0123,4567,89ab,cdef"_sep> um2;
Of course, these lines already put the literals to use as template arguments. The literals themselves are
"0b0101,0101,0101,0101,0011,0101"_sep
"0x0123,4567,89ab,cdef"_sep
... and they can be used entirely on their own. The error messages the compiler produces when the literal operator throws an exception aren't necessarily as pretty as I would like them to be. It was also pointed out that using a user-defined literal to deal with the digit separators precludes applying any other user-defined literal to the same value.
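To see the mechanism in isolation: a string literal operator receives a pointer to the characters and their count, and when it is constexpr its result is a constant expression that can be used as a template argument. The following is only a minimal sketch; the `_dec` suffix, `dec_digits` and `constant` are hypothetical names made up for this illustration, and the sketch handles decimal digits and commas only.

```cpp
#include <cstddef>

// Hypothetical helper: accumulate decimal digits, skipping ',' separators.
// A single return expression keeps this a valid C++11 constexpr function.
constexpr unsigned long long dec_digits(unsigned long long acc,
                                        char const* s, std::size_t n)
{
    return n == 0
        ? acc
        : (*s == ','
            ? dec_digits(acc, s + 1, n - 1)
            : ('0' <= *s && *s <= '9'
                ? dec_digits(acc * 10 + (*s - '0'), s + 1, n - 1)
                : throw "not a decimal digit"));
}

// Hypothetical literal suffix, used only for this illustration.
constexpr unsigned long long operator""_dec(char const* s, std::size_t n)
{
    return dec_digits(0ull, s, n);
}

// Hypothetical consumer taking the value as a template argument.
template <unsigned long long V>
struct constant { static constexpr unsigned long long value = V; };

// The literal is evaluated at compile time, so both uses are accepted.
static_assert("1,000,000"_dec == 1000000ull, "usable in static_assert");
static_assert(constant<"12,345"_dec>::value == 12345ull,
              "usable as a template argument");
```

If a character other than a digit or a comma appears, evaluation reaches the `throw`, which is not allowed in a constant expression, so the compiler rejects the literal.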
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <stdexcept>
// Consumes the parsed value as a non-type template argument.
template <unsigned long long Value>
struct use_mask {
    static constexpr unsigned long long value = Value;
};
// ------------------------------------------------------------------------
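// True if c is a digit '0'..'9' that is valid in the given base
// (for base 16 this covers '0'..'9' only).
constexpr bool is_digit(char c, unsigned base)
{
    return '0' <= c && c < '0' + int(base < 10u? base: 10u);
}

// True if c is one of the letter digits 'a'..'f' or 'A'..'F'.
constexpr bool is_hexdigit(char c)
{
    return ('a' <= c && c <= 'f') || ('A' <= c && c <= 'F');
}

// Value (10..15) of a letter digit; assumes is_hexdigit(c).
constexpr unsigned long long hex_value(char c)
{
    return c - (('a' <= c && c <= 'f')? 'a': 'A') + 10;
}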
// ------------------------------------------------------------------------
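// Recursively folds the n characters at str into value, skipping ','
// separators. It is a single return expression so that it is a valid
// C++11 constexpr function; the throw is only reached for ill-formed
// input, which becomes a compile-time error in a constant expression.
constexpr unsigned long long decode(unsigned long long value,
                                    unsigned base,
                                    char const* str, std::size_t n)
{
    return n == 0
        ? value
        : (str[0] == ','
            ? decode(value, base, str + 1, n - 1)
            : (is_digit(str[0], base)
                ? decode(value * base + str[0] - '0', base, str + 1, n - 1)
                : (base == 16u && is_hexdigit(str[0])
                    ? decode(value * base + hex_value(str[0]),
                             base, str + 1, n - 1)
                    : throw "ill-formed constant with digit separators"
                )
            )
        );
}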
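// Picks the base from the prefix: 0b/0B binary, 0x/0X hexadecimal,
// any other leading 0 octal, everything else decimal. Prefixes are only
// recognized when the string has more than two characters.
constexpr unsigned long long operator"" _sep(char const* value,
                                             std::size_t n)
{
    return 2 < n && value[0] == '0'
        ? ((value[1] == 'b' || value[1] == 'B')
            ? decode(0ull, 2, value + 2, n - 2)
            : ((value[1] == 'x' || value[1] == 'X')
                ? decode(0ull, 16, value + 2, n - 2)
                : decode(0ull, 8, value + 1, n - 1)))
        : decode(0ull, 10, value, n);
}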
int main()
{
    // The same value, 170, written in binary, octal, decimal and hex.
    std::cout << use_mask<"0b1010,1010"_sep>::value << "\n";
    std::cout << use_mask<"02,52"_sep>::value << "\n";
    std::cout << use_mask<"1,70"_sep>::value << "\n";
    std::cout << use_mask<"0xA,A"_sep>::value << "\n";
#ifdef ERROR
    // Ill-formed on purpose: fails to compile when ERROR is defined.
    std::cout << use_mask<"0xx,A"_sep>::value << "\n";
#endif
    std::cout << use_mask<"0b0101,0101,0101,0101,0011,0101"_sep>::value
              << '\n';
    std::cout << use_mask<"0x0123,4567,89ab,cdef"_sep>::value << '\n';
}
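As for verifying that the separators sit in sensible places, one possible sketch is shown here. The rule is my assumption, not something the program above enforces: every ',' must be preceded and followed by a non-separator character, so leading, trailing and doubled commas are rejected, as is an empty digit sequence. The name `separators_ok` is hypothetical, and the check does not look at group sizes.

```cpp
#include <cstddef>

// Assumed rule (not enforced by decode above): each ',' must sit between
// two non-separator characters. prev_was_digit records whether the
// previous character was something other than ','.
constexpr bool separators_ok(char const* s, std::size_t n,
                             bool prev_was_digit = false)
{
    return n == 0
        ? prev_was_digit   // rejects a trailing ',' and an empty sequence
        : (*s == ','
            ? prev_was_digit && separators_ok(s + 1, n - 1, false)
            : separators_ok(s + 1, n - 1, true));
}

static_assert(separators_ok("0101,0101", 9), "well-placed separator");
static_assert(!separators_ok(",0101", 5), "leading separator");
static_assert(!separators_ok("0101,", 5), "trailing separator");
static_assert(!separators_ok("01,,01", 6), "doubled separator");
```

One way to wire this in would be for the literal operator to test `separators_ok` on the characters after the prefix and throw if the test fails, before handing the same range to `decode`.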