The double-precision units are separate hardware floating-point units dedicated to double-precision arithmetic. They are independent of the "CUDA cores", which, roughly speaking, can be thought of as the single-precision units.
So for single-precision arithmetic, the throughput is computed from the number of "CUDA cores" (the single-precision units). For double-precision arithmetic, the throughput must be computed from the number of double-precision units.
In a Kepler K20 SMX, the ratio of double-precision units to single-precision units is 1:3 (64 double-precision units to 192 single-precision units). The throughput for each type of arithmetic therefore follows the same ratio. By "arithmetic" I mean here a floating-point multiply or a floating-point add.
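
As a rough illustration of that calculation, here is a minimal sketch that computes peak throughput from unit counts. The per-SMX counts (192 single-precision, 64 double-precision) are the GK110 figures behind the 1:3 ratio. The SMX count (13) and core clock (706 MHz) are my assumptions for a K20 board, and `peak_gops` is a hypothetical helper name, not part of any CUDA API. Substitute the values for your own device.

```python
# Per-SMX unit counts for GK110 (Kepler K20); their ratio is the 1:3 discussed above.
SP_UNITS_PER_SMX = 192
DP_UNITS_PER_SMX = 64

# Assumed device-level figures for a K20 board; replace with your device's values.
NUM_SMX = 13
CLOCK_MHZ = 706.0


def peak_gops(units_per_smx, num_smx, clock_mhz, ops_per_unit_per_clock=1):
    """Peak throughput in giga-operations per second.

    Each unit is assumed to retire one instruction per clock.
    ops_per_unit_per_clock=1 models a plain add or multiply;
    pass 2 to count a fused multiply-add as two floating-point operations.
    """
    # units * MHz gives mega-operations per second; divide by 1000 for giga.
    return units_per_smx * num_smx * clock_mhz * ops_per_unit_per_clock / 1000.0


sp_gops = peak_gops(SP_UNITS_PER_SMX, NUM_SMX, CLOCK_MHZ)
dp_gops = peak_gops(DP_UNITS_PER_SMX, NUM_SMX, CLOCK_MHZ)

print(f"single precision: {sp_gops:.1f} Gop/s")
print(f"double precision: {dp_gops:.1f} Gop/s")
print(f"SP:DP throughput ratio: {sp_gops / dp_gops:.1f}")
```

Whatever clock and SMX count you plug in, the SP:DP ratio stays at 3 because both cancel out. Passing `ops_per_unit_per_clock=2` counts fused multiply-adds as two operations each, which under the assumed figures gives roughly 3.52 and 1.17 TFLOP/s.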