TL;DR answer: in a GPU, a much larger fraction of the transistors are actually working on the computation than in a CPU.
The big power-efficiency killer in today's CPUs is a trade-off made to allow general computation on the chip. Whether it is a RISC, x86, or other CPU architecture, there is extra hardware dedicated to the general-purpose usage of the CPU. Those transistors consume power even though they are not doing any actual math.
Fast CPUs require advanced branch-prediction hardware and large caches to avoid lengthy processing that could be discarded later in the pipeline. For the most part, a CPU executes its instructions one at a time (per core; SIMD extensions help CPUs here as well), and it handles conditional branches extremely well. GPUs rely on performing the same operation on many pieces of data at once (SIMD/vector operation), and they suffer greatly with even the simple conditions found in 'if' and 'for' statements; the toy kernel below shows why.
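To make that concrete, here is a minimal CUDA sketch (the kernel and variable names are my own, purely for illustration): all 32 threads of a warp share one instruction stream, so when a data-dependent 'if' splits them, the hardware runs both paths one after the other with the non-participating lanes masked off.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Toy kernel: threads within one 32-lane warp pick different paths based
// on their data. The warp executes BOTH paths back to back with inactive
// lanes masked off, so a divergent branch wastes ALU cycles.
__global__ void divergent(const float* in, float* out, int n)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;
    if (in[i] > 0.0f)            // data-dependent branch -> divergence
        out[i] = in[i] * 2.0f;   // pass 1: lanes where the condition holds
    else
        out[i] = -in[i];         // pass 2: the remaining lanes
}

int main()
{
    const int n = 64;
    float h_in[n], h_out[n];
    for (int i = 0; i < n; ++i)
        h_in[i] = (i % 2) ? 1.0f : -1.0f;  // alternating signs: every warp diverges

    float *d_in, *d_out;
    cudaMalloc(&d_in, n * sizeof(float));
    cudaMalloc(&d_out, n * sizeof(float));
    cudaMemcpy(d_in, h_in, n * sizeof(float), cudaMemcpyHostToDevice);

    divergent<<<1, n>>>(d_in, d_out, n);
    cudaMemcpy(h_out, d_out, n * sizeof(float), cudaMemcpyDeviceToHost);

    printf("out[0]=%g out[1]=%g\n", h_out[0], h_out[1]);
    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

A CPU running the same loop would just predict the branch and keep its pipeline full; the GPU pays for its simplicity whenever the lanes disagree.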
There is also a lot of hardware used to fetch, decode, and schedule instructions -- this is true for CPUs and GPUs. The big difference is that the ratio of fetch+decode+schedule transistors to computing transistors tends to be much lower for a GPU: one fetched and decoded instruction is issued to an entire wide SIMD unit, so the front-end cost is amortized over many arithmetic operations.
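As a back-of-envelope sketch of that ratio (the per-issue counts below are rough numbers I picked for a generic design, not measured figures):

```cuda
#include <cstdio>

// Rough, assumed numbers (not measurements) to show the amortization:
// one fetch+decode+schedule on a GPU feeds an entire 32-lane warp, while
// a superscalar CPU core's front end feeds only a handful of ops.
int main()
{
    const double gpu_lanes_per_issue = 32.0; // one warp instruction -> 32 ALU lanes
    const double cpu_ops_per_issue   = 4.0;  // guess for a wide superscalar core

    printf("front-end work per arithmetic op: GPU does ~%.0fx less than CPU\n",
           gpu_lanes_per_issue / cpu_ops_per_issue);
    return 0;
}
```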
Here is an AMD presentation from 2011 about how their GPUs have changed over time, though it really applies to most GPUs in general: PDF link. It helped me understand the power advantage of GPUs by giving a bit of the history behind how GPUs got to be so good at certain computations.
I gave an answer to a similar question a while ago: SO link.