C++ templates optimization [closed]

https://stackoverflow.com/questions/15012896

10-03-2022

Question

At what point will (or can) the compiler optimize the template method below? Will it remove unreachable code and unroll unnecessary loops? (Bits uses unsigned int blocks; Integer uses unsigned long ones.)

Also, is there a C++ data type that means "I'm an integer the size of your processor's registers"?

template<size_t bits> class IntegerFactoryImpl : public IntegerFactory<Integer<bits>>{
private:
    template<int sizeOfLong, int sizeOfInt> Integer<bits> getOne(const Bits& b) const{

        Integer<bits> integer = this->getOne();
        size_t roof = (b.blocks() > integer.size()*(sizeOfLong/sizeOfInt))? integer.size()*(sizeOfLong/sizeOfInt) : b.blocks();
        for(size_t i = 0; i < roof; ++i){
            integer.at(i/(sizeOfLong/sizeOfInt)) = 0;
            for(size_t j = 0; j < (sizeOfLong/sizeOfInt); ++j){
                if(i % (sizeOfLong/sizeOfInt) == j){
                    integer.at(i/(sizeOfLong/sizeOfInt)) |= ((unsigned long)b.block(b.blocks()-i-1)) << (sizeOfInt*j);
                    break;
                }
            }
        }
        for(size_t i = roof; i < integer.size()*(sizeOfLong/sizeOfInt); ++i){
            if(i % (sizeOfLong/sizeOfInt) == 0){
                integer.at(i/(sizeOfLong/sizeOfInt)) = 0;
            }
        }
        return integer;
    }

public:

    virtual ~IntegerFactoryImpl() throw(){}

    virtual Integer<bits> getOne() const{
        return Integer<bits>();
    }

    virtual Integer<bits> getOne(const Bits& b) const{
        return this->getOne<sizeof(unsigned long)*8, sizeof(unsigned int)*8>(b);
    }
};

Will there be any difference compared with this version, which does not use a template method?

template<size_t bits> class IntegerFactoryImpl : public IntegerFactory<Integer<bits>>{

public:

    virtual ~IntegerFactoryImpl() throw(){}

    virtual Integer<bits> getOne() const{
        return Integer<bits>();
    }

    virtual Integer<bits> getOne(const Bits& b) const{

        Integer<bits> integer = this->getOne();
        size_t roof = (b.blocks() > integer.size()*(sizeof(unsigned long)/sizeof(unsigned int)))? integer.size()*(sizeof(unsigned long)/sizeof(unsigned int)) : b.blocks();
        for(size_t i = 0; i < roof; ++i){
            integer.at(i/(sizeof(unsigned long)/sizeof(unsigned int))) = 0;
            for(size_t j = 0; j < (sizeof(unsigned long)/sizeof(unsigned int)); ++j){
                if(i % (sizeof(unsigned long)/sizeof(unsigned int)) == j){
                    integer.at(i/(sizeof(unsigned long)/sizeof(unsigned int))) |= ((unsigned long)b.block(b.blocks()-i-1)) << ((sizeof(unsigned int)*8)*j);
                    break;
                }
            }
        }
        for(size_t i = roof; i < integer.size()*(sizeof(unsigned long)/sizeof(unsigned int)); ++i){
            if(i % (sizeof(unsigned long)/sizeof(unsigned int)) == 0){
                integer.at(i/(sizeof(unsigned long)/sizeof(unsigned int))) = 0;
            }
        }
            }
        }
        return integer;
    }
};

(Edit: I just discovered that the code didn't work properly and fixed it, but the original question still applies.)

Solution

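Right: the compiler will optimise away anything it can calculate at compile time, and if you have a loop that only iterates once (e.g. for(i = 0; i < 1; i++)), it will remove the loop completely. Both versions hand the optimiser the same compile-time constants, since sizeof is a constant expression just like a template argument, so with optimisation enabled there should be no meaningful difference between them.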
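
As a rough illustration of what "calculated at compile time" means for the question's code, the sketch below isolates the inner for/if/break search. The names kRatio, slotWithLoop and slotDirect are hypothetical and do not appear in the original code. The loop only ever finds j == i % ratio, so it reduces to a single modulo, which is the kind of simplification an optimising compiler may perform. Whether a particular compiler actually does so depends on the compiler and its flags; inspecting the generated assembly is the only way to be sure.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for sizeOfLong/sizeOfInt from the question.
// sizeof is a constant expression, so this value is fixed at compile time
// whether it arrives through a template argument or is written inline.
constexpr std::size_t kRatio = sizeof(unsigned long) / sizeof(unsigned int);
static_assert(kRatio >= 1, "unsigned long is at least as wide as unsigned int");

// Mirrors the inner loop of the question: search for the j equal to i % kRatio.
inline std::size_t slotWithLoop(std::size_t i) {
    for (std::size_t j = 0; j < kRatio; ++j) {
        if (i % kRatio == j) {
            return j;
        }
    }
    assert(false && "unreachable: i % kRatio is always below kRatio");
    return 0;
}

// What the loop reduces to: one modulo by a compile-time constant.
inline std::size_t slotDirect(std::size_t i) {
    return i % kRatio;
}
```

When kRatio is 1 (for example where unsigned long and unsigned int have the same size), the loop body runs exactly once and the modulo is always 0, which is the single-iteration case described above.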

As for integer sizes, whether long or int is better really depends on what you are trying to achieve. On x86-64, for example, a 64-bit operation needs an extra prefix byte (the REX prefix) to mark the instruction as 64-bit rather than 32-bit. If the compiler made int 64 bits wide, the code would become (a little) larger and so fit less well in the instruction cache. For the vast majority of operations there is no speed difference between 16-, 32- and 64-bit operands on x86-64. Multiply and divide are the obvious exceptions: wider operands can take longer, most noticeably for division, whose latency on many processors also depends on the operand values.

Of course, if you are using the values for bitmask operations and the like, a 64-bit type handles twice as many bits per operation, so the same work takes half as many operations. That is a clear advantage, so using long is "right" in this case even though it adds an extra byte per instruction. Note that this assumes long is 64 bits, which holds on LP64 platforms such as Linux on x86-64 but not on 64-bit Windows, where long is 32 bits.
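
To make the "half as many operations" point concrete, here is a minimal sketch that intersects two 256-bit masks stored once as 32-bit words and once as 64-bit words. Mask32, Mask64 and intersect are hypothetical names for illustration, and fixed-width types are used instead of int and long so the word sizes do not depend on the platform. The returned count is only the number of scalar AND operations in the source; a compiler that auto-vectorises the loop may emit fewer instructions for both variants.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical 256-bit mask stored two ways.
using Mask32 = std::array<std::uint32_t, 8>; // 8 words of 32 bits
using Mask64 = std::array<std::uint64_t, 4>; // 4 words of 64 bits

// Intersects two masks word by word and returns how many
// word-sized AND operations were performed.
template <typename Word, std::size_t N>
std::size_t intersect(const std::array<Word, N>& a,
                      const std::array<Word, N>& b,
                      std::array<Word, N>& out) {
    std::size_t ops = 0;
    for (std::size_t i = 0; i < N; ++i) {
        out[i] = a[i] & b[i];
        ++ops;
    }
    return ops;
}
```

Calling intersect on two Mask32 values performs 8 AND operations, while the same 256 bits held in Mask64 values need only 4.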

Also bear in mind that int is very often used for "smaller numbers", so for a lot of things the extra size of a 64-bit int would simply be wasted and would take up extra data-cache space. So int also remains 32 bits to keep large integer arrays and the like at a reasonable size.
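
The array-size argument can be checked with a small sketch. The helper footprint is a hypothetical name, and std::int32_t and std::int64_t stand in for a 32-bit and a 64-bit int. On the question of a "register-sized" type: as far as the standard goes, no type is guaranteed to match the register width. std::size_t, std::uintptr_t and std::uint_fast32_t are common approximations on mainstream platforms, but that is a property of the platform, not a guarantee.

```cpp
#include <cstddef>
#include <cstdint>

// Bytes occupied by the elements of an array of `count` values of type T.
template <typename T>
constexpr std::size_t footprint(std::size_t count) {
    return count * sizeof(T);
}

// Doubling the element width doubles the memory (and cache) footprint.
static_assert(footprint<std::int64_t>(1000) == 2 * footprint<std::int32_t>(1000),
              "64-bit elements take twice the space of 32-bit elements");
```

With 8-bit bytes, a thousand 32-bit values occupy 4000 bytes, while the same thousand values stored as 64-bit integers occupy 8000 bytes.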

Licensed under: CC-BY-SA with attribution
