Question

Say one has 8 unsigned chars x1, x2, ..., x8 and wants to calculate:

    abs((x1 + x2 + x3 + x4) - (x5 + x6 + x7 + x8)) / 4

What would be the best way to ensure the most accurate results, without introducing large overflow or underflow errors?

I'm using this in a template class, which is why I cannot just convert the unsigned values to signed ones.


Solution

The built-in operator + only operates on int and larger types. When you use it with objects of type char or unsigned char (which are smaller than int), the values are automatically promoted to int before the operation happens (integral promotion).

Thus

    abs((x1 + x2 + x3 + x4) - (x5 + x6 + x7 + x8)) / 4

is converted by the compiler to:

    abs((static_cast<int>(x1) + static_cast<int>(x2) + static_cast<int>(x3) + static_cast<int>(x4)) -
        (static_cast<int>(x5) + static_cast<int>(x6) + static_cast<int>(x7) + static_cast<int>(x8))) / 4

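Assuming 8-bit chars, each sum of four unsigned char values is at most 4 * 255 = 1020 and the difference lies between -1020 and 1020, which fits comfortably in an int. So unless you are adding up a very large number of chars, you are not going to overflow.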

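There is an issue when assigning a result back to unsigned char. If the result of an expression is negative, the conversion wraps it around to a positive value (well defined, but probably not the value you want). With abs() applied before the assignment the value is never negative, and with 8-bit chars it is at most 255 after the division by 4, so it fits.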
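
The promotion described above can be checked with a small sketch. The function name `quad_diff_u8` is made up for illustration, and the range comments assume 8-bit chars and an int that is wider than unsigned char, which holds on common platforms.

```cpp
#include <cassert>
#include <cstdlib>
#include <type_traits>

// Adding two unsigned char values yields an int, not an unsigned char.
static_assert(std::is_same<decltype(static_cast<unsigned char>(1) +
                                    static_cast<unsigned char>(1)), int>::value,
              "unsigned char operands are promoted to int");

// Hypothetical helper (name chosen for this example only).
unsigned char quad_diff_u8(unsigned char x1, unsigned char x2,
                           unsigned char x3, unsigned char x4,
                           unsigned char x5, unsigned char x6,
                           unsigned char x7, unsigned char x8)
{
    // Every operand is promoted to int, so each sum lies in [0, 1020].
    const int lhs = x1 + x2 + x3 + x4;
    const int rhs = x5 + x6 + x7 + x8;

    // The difference lies in [-1020, 1020]; abs and /4 bring it to [0, 255].
    const int result = std::abs(lhs - rhs) / 4;
    assert(result >= 0 && result <= 255);

    // The value is known to fit, so the narrowing conversion is lossless.
    return static_cast<unsigned char>(result);
}
```

Calling `quad_diff_u8(255, 255, 255, 255, 0, 0, 0, 0)` hits the largest possible result, 255, and that still fits in the return type.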

OTHER TIPS

It seems like you want a metafunction to tell you what intermediate data type to use in your calculations.

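    #include <cstdint>

    // Default: integer types are accumulated in a wide signed integer.
    template <class T>
    struct arithmetic_type
    {
        typedef std::int64_t type;
    };

    // Floating-point types are accumulated in double.
    template <>
    struct arithmetic_type<float>
    {
        typedef double type;
    };

    template <>
    struct arithmetic_type<double>
    {
        typedef double type;
    };

    // 64-bit unsigned values do not fit in int64_t, so they stay unsigned.
    // Note: here the subtraction wraps instead of going negative,
    // so abs cannot recover the magnitude.
    template <>
    struct arithmetic_type<std::uint64_t>
    {
        typedef std::uint64_t type;
    };

    // Inside the template that is parameterized on T:
    typedef typename arithmetic_type<T>::type ar_type;
    abs(((ar_type)x1 + x2 + x3 + x4) - ((ar_type)x5 + x6 + x7 + x8)) / 4;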

You can of course tweak the specializations and add/remove as per your needs, but this should give you the right idea.
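
For completeness, here is one way the metafunction above might be wired into a function template. This is only a sketch: the names `quad_diff` and `demo` are invented for the example, and instead of calling abs it subtracts the smaller sum from the larger one, a swapped-in technique that also behaves for the unsigned 64-bit specialization.

```cpp
#include <cassert>
#include <cstdint>

// Same metafunction as above, repeated so the example is self-contained.
template <class T> struct arithmetic_type                { typedef std::int64_t  type; };
template <>        struct arithmetic_type<float>         { typedef double        type; };
template <>        struct arithmetic_type<double>        { typedef double        type; };
template <>        struct arithmetic_type<std::uint64_t> { typedef std::uint64_t type; };

// Hypothetical function: |(x1+x2+x3+x4) - (x5+x6+x7+x8)| / 4,
// computed in the intermediate type selected for T.
template <class T>
typename arithmetic_type<T>::type
quad_diff(T x1, T x2, T x3, T x4, T x5, T x6, T x7, T x8)
{
    typedef typename arithmetic_type<T>::type ar_type;

    // Casting the first operand makes the whole sum use ar_type.
    const ar_type lhs = static_cast<ar_type>(x1) + x2 + x3 + x4;
    const ar_type rhs = static_cast<ar_type>(x5) + x6 + x7 + x8;

    // Subtract the smaller from the larger so the result is never negative,
    // even when ar_type is unsigned.
    const ar_type diff = lhs > rhs ? lhs - rhs : rhs - lhs;
    return diff / 4;
}

// Usage example (hypothetical).
void demo()
{
    assert(quad_diff<unsigned char>(255, 255, 255, 255, 0, 0, 0, 0) == 255);
    assert(quad_diff<float>(1.0f, 1.0f, 1.0f, 1.0f, 0.0f, 0.0f, 0.0f, 0.5f) == 0.875);
    assert(quad_diff<std::uint64_t>(0, 0, 0, 0, 4, 4, 4, 4) == 4u);
}
```

Note that for std::uint64_t the sums themselves can still wrap around if the inputs are large enough, because no wider intermediate type is selected.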

As with any fixed-size data, the best approach is to cast the values to a type large enough for the worst-case scenario. In this case, casting them to int is good enough: it covers the range of every possible intermediate value and lets you handle a negative difference.

Note that you have to be careful with the subtraction. The result depends on the semantics you want to attach to it: either you assume the difference never goes negative (so any negative value is an error or should be floored at 0), or a negative value is meaningful and you take its absolute value.
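
The three choices can be written out side by side. The helper names below are hypothetical, and the operands are assumed to be sums that have already been widened to int.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdlib>

// Negative difference is treated as a programming error.
int diff_checked(int lhs, int rhs)
{
    assert(lhs >= rhs);
    return lhs - rhs;
}

// Negative difference is floored at 0.
int diff_floored(int lhs, int rhs)
{
    return std::max(lhs - rhs, 0);
}

// Negative difference is meaningful; only its magnitude is kept.
int diff_absolute(int lhs, int rhs)
{
    return std::abs(lhs - rhs);
}
```

For lhs = 3 and rhs = 10, `diff_floored` gives 0 and `diff_absolute` gives 7, while `diff_checked` would trip its assertion in a debug build.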

Licensed under: CC-BY-SA with attribution