Is T Min(T, T); always better than const T& Min(const T&, const T&); if sizeof(T) <= sizeof(void*)?

Question 1

I can't say for sure, but given that

template<class T>
const T& Min(const T& a, const T& b)
{
    return a < b ? a : b;
}

...is basically replaced with (let's say for T = bool):

const bool & Min(const bool & a, const bool & b)
{
    return a < b ? a : b;
}

I would say it is a fair assumption that the compiler will pass the bool in the most efficient way it can, at least when the call is inlined or the function body is otherwise visible at the call site. I.e., just because we're using & doesn't mean that a pointer actually has to be passed.

Another couple of ideas: When a function is called, arguments are either pushed onto the stack, or they're passed via a register. The only way that

bool Min(bool a, bool b)
{
    return a < b ? a : b;
}

would be "better"/"faster" than

const bool & Min(const bool & a, const bool & b)
{
    return a < b ? a : b;
}

is if passing a bool were faster than passing a const bool & (ignoring the dereference at the start of the function). I can't see this being true: unless the argument is pushed onto the stack (which depends on your calling convention), it travels in a register, and general-purpose registers are all at least the size of a pointer on the host architecture (e.g. rax is 64-bit, eax is 32-bit).

Further, I presume it would be easier for the compiler to inline, since (just from the function signature) we can be guaranteed that the function never locally modifies the values of a and b, and thus needs no space for them.

For user defined types, there are two cases.

The type fits in a register, and we can treat it just like a basic type.
The type does not fit in a register and must be passed by reference (if we use const T&) or as a copy (if we use just T). Since copying a class object invokes its copy constructor, const T& will probably be faster in every such case.
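
To make the second case concrete, here is a minimal sketch. Big, MinByValue, MinByRef and the helper functions are hypothetical names made up for this illustration; Big simply counts how often its copy constructor runs, so the cost of each signature shows up as a number.

```cpp
#include <cassert>

// Hypothetical type for illustration: too big for a register, counts its copies.
struct Big {
    int payload[64];
    static int copies;

    Big() : payload() {}
    Big(const Big& other) {
        for (int i = 0; i < 64; ++i) payload[i] = other.payload[i];
        ++copies;
    }
    bool operator<(const Big& rhs) const { return payload[0] < rhs.payload[0]; }
};
int Big::copies = 0;

template <class T>
T MinByValue(T a, T b) { return a < b ? a : b; }

template <class T>
const T& MinByRef(const T& a, const T& b) { return a < b ? a : b; }

// Number of copy-constructor calls made by a single by-value call.
int CopiesForByValue() {
    Big x, y;
    y.payload[0] = 1;
    Big::copies = 0;
    MinByValue(x, y);   // copies both arguments, then copies the result out
    return Big::copies;
}

// Number of copy-constructor calls made by a single by-reference call.
int CopiesForByRef() {
    Big x, y;
    y.payload[0] = 1;
    Big::copies = 0;
    MinByRef(x, y);     // only binds references, nothing is copied
    return Big::copies;
}

// Sanity check of the counts described in the comments above.
void CheckCopyCounts() {
    assert(CopiesForByValue() == 3);
    assert(CopiesForByRef() == 0);
}
```

Because the copy constructor here has a visible side effect, the compiler is not allowed to optimise those copies away, which is the situation where the const T& version should come out ahead.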

But, I'm really just speculating here. To check if there is a difference in bool vs const bool &, the best way would be to check for your specific compiler by outputting assembly and seeing if there is any difference.
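
As a starting point for that check, something like the following could be put in a file of its own. The function names are hypothetical; they are just the two signatures from above written out for bool, plus a small helper confirming that both return the same result.

```cpp
#include <cassert>

// Hypothetical names: the two signatures from above, written out for bool.
bool MinByValue(bool a, bool b)
{
    return a < b ? a : b;
}

const bool& MinByRef(const bool& a, const bool& b)
{
    return a < b ? a : b;
}

// Both versions must agree on every input; only the generated code may differ.
void CheckSameResults()
{
    for (int a = 0; a < 2; ++a)
        for (int b = 0; b < 2; ++b)
            assert(MinByValue(a != 0, b != 0) == MinByRef(a != 0, b != 0));
}
```

With g++, compiling this with -O2 -S writes the assembly to a .s file instead of producing an object file, and the two function bodies can then be compared side by side. Keeping them in a separate translation unit from their callers matters, because otherwise the optimiser is likely to inline both and hide any difference.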

HTH.

Question 2

I have seen g++ do this automatically when optimising: it can generate a version of Min taking its arguments by value when it can tell that this is more efficient and wouldn't violate the as-if rule (which I think basically means small types with trivial copy constructors).

However, I have no idea what this optimisation is called, or whether other compilers implement it.

As an aside, I wonder if you can get the best of both worlds in C++11:

template <typename T, typename U>
constexpr auto Min(T&& l, U&& r) -> decltype(l < r ? l : r)
{
    return l < r ? l : r;
}

If my understanding of "universal references" is correct, this should never be less efficient than using lvalue references as in your example, and possibly more efficient if l or r are rvalues (such as integer literals). As an added bonus, you get to compare things of different types where this makes sense (int vs long or what-have-you).
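
Here is a sketch of what that Min actually returns in a few cases (CheckReturnTypes is a hypothetical helper added only for illustration). Note that inside the function l and r are named, so they are always lvalues: for two arguments of the same type the result is a non-const reference, even when the arguments were rvalues, while for mixed arithmetic types the conditional operator converts both sides and the result comes back by value.

```cpp
#include <cassert>
#include <type_traits>

template <typename T, typename U>
constexpr auto Min(T&& l, U&& r) -> decltype(l < r ? l : r)
{
    return l < r ? l : r;
}

void CheckReturnTypes()
{
    int a = 3, b = 5;

    // Same-type lvalues: the result is a (non-const) reference to an argument.
    static_assert(std::is_same<decltype(Min(a, b)), int&>::value, "int& expected");
    assert(&Min(a, b) == &a);

    // Mixed types: the conditional operator converts both sides, so a long
    // is returned by value.
    static_assert(std::is_same<decltype(Min(1, 2L)), long>::value, "long expected");
    assert(Min(7, 2L) == 2L);

    // Same-type rvalues: still int&, now referring to a temporary that dies
    // at the end of the full expression, so copy the result immediately.
    static_assert(std::is_same<decltype(Min(1, 2)), int&>::value, "int& expected");
    int smallest = Min(1, 2);
    assert(smallest == 1);
}
```

So the rvalue case has the same lifetime caveat as the const T& version: binding the result to a reference that outlives the call expression would leave it dangling.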