Because having the matrices multiplied in the vertex shader makes the GPU do the full computation for each and every vertex that goes into it (note that recent GLSL compilers will detect the product to be uniform over all vertices and may move the calculation off the GPU to the CPU).
Also when it comes to performing a single 4×4 matrix computation a CPU actually outperforms a GPU, because there's no data transfer and command queue overhead.
The general rule for GPU computing is: If it's uniform over all vertices, and you can easily precompute it on the CPU, do it on the CPU.