I might not be right, but after I saw this:
and with a little experiment and using an old friend called "register math", I realised that when you add two 8-bit numbers, you need to consider a 8+1+1 bit register to store the sum because it potentially has carry output. so any result of reduce should have bigger space than the source i.e. if the source is 8-bit unsigned, it should be at least 16-bit unsigned or signed; might as well be 32-bit if it is going to be used for some product calculation and stuff...
NOTE: The destination type must be EXPLICITLY stated in the cv::reduce method. Please follow my openCV link for further information.