It depends on the instruction executed.
As a rule of thumb, simple ALU instructions take no more cycles on Q registers than on D registers, but multiply and permute instructions need twice the cycles when operating on Q registers. Also be aware that the results in the lower 64 bits of Qd are very often available earlier than those in the upper half.
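To picture why the lower half can show up first, remember that on ARMv7 NEON each Q register is simply a pair of adjacent D registers (Qn aliases D2n and D2n+1). The sketch below is a plain C model of that layout, not NEON code and not a timing measurement. The names `qreg_u16`, `dmul_u16` and `qmul_u16` are made up for illustration, and treating a Q multiply as two D-sized passes is an assumption about a 64-bit-wide multiply unit; actual pipelines vary from core to core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative model only: one 128-bit Q register viewed as its two
   64-bit D halves (on ARMv7 NEON, Qn aliases D2n and D2n+1). */
typedef struct {
    uint16_t lo[4]; /* lower 64 bits, i.e. D2n   */
    uint16_t hi[4]; /* upper 64 bits, i.e. D2n+1 */
} qreg_u16;

/* One D-sized pass: four 16-bit lane multiplies (results wrap modulo 2^16). */
static void dmul_u16(uint16_t out[4], const uint16_t a[4], const uint16_t b[4])
{
    for (int i = 0; i < 4; i++) {
        out[i] = (uint16_t)((uint32_t)a[i] * b[i]);
    }
}

/* Q-sized multiply modelled as two D-sized passes. `passes` counts them;
   it stands in for "twice the work" and is NOT a real cycle count. */
static qreg_u16 qmul_u16(qreg_u16 a, qreg_u16 b, int *passes)
{
    qreg_u16 r;
    assert(passes != NULL);
    dmul_u16(r.lo, a.lo, b.lo); /* lower half is complete after the first pass */
    dmul_u16(r.hi, a.hi, b.hi); /* upper half follows one pass later */
    *passes = 2;
    return r;
}
```

In this model a dependent instruction that only needs `r.lo` could start after the first pass, which is the effect described above. For real numbers, check the instruction timing tables for your specific core.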
I don't think Apple's A6 behaves much differently from the "original" Cortex-A15 when it comes to cycle counts. And since they share the very same ISA, you can be assured that the registers are the same: the register set is defined by the ARMv7 architecture, not by the individual core.