Question

I don't understand the difference between an accumulator-based CPU architecture and a register-based CPU architecture. I know x86 is register-based but it has an accumulator-like register. I only ever hear people asking about the difference between stack-based and register-based, but not register-based and accumulator-based. What are the advantages and disadvantages of each? And can I get some example assembly code for each, showing where they differ, as well?

Solution

A register-based CPU architecture has one or more general purpose registers (where "general purpose register" excludes special purpose registers, like stack pointer and instruction pointer).

An accumulator-based CPU architecture is a register-based CPU architecture that only has one general purpose register (the accumulator).

The main advantages of having more than one general purpose register are that the compiler doesn't have to "spill" as many temporary values onto the stack, and that it's easier for the CPU to do more independent instructions in parallel.

For an example, imagine you want to do a = (b - c) + (d - f) + 123. For an "apples vs. apples" comparison I'll use Intel syntax 32-bit 80x86 assembly for both examples (but only use EAX for the accumulator-based CPU architecture).

For accumulator-based CPU architecture this may be:

    mov eax,[b]     ;Group 1

    sub eax,[c]     ;Group 2

    add eax,123     ;Group 3

    mov [a],eax     ;Group 4
    mov eax,[d]

    sub eax,[f]     ;Group 5

    add [a],eax     ;Group 6

Note that most of these instructions depend on the result from the previous instruction, and therefore can't be done in parallel. The ";Group N" comments are there to indicate which groups of instructions can be done in parallel (and show that, assuming some form of internal "register renaming" ability, "group 4" is the only group where 2 instructions are likely to be done in parallel).

Using multiple registers might give you:

    mov eax,[b]           ;Group 1
    mov ebx,[d]

    sub eax,[c]           ;Group 2
    sub ebx,[f]

    lea eax,[eax+ebx+123] ;Group 3        

    mov [a],eax           ;Group 4

In this case, there's one less instruction and two fewer groups of instructions (more instructions are likely to be done in parallel). That might mean "25% faster" in practice.

Of course in practice code does more than a relatively simple calculation; so there's even more chance of "more instructions in parallel". For example, with only 2 more registers (e.g. ECX and EDX) it should be easy to see that you could do a = (b - c) + (d - f) + 123 and g = (h - i) + (j - k) + 456 in the same amount of time (by doing both calculations in parallel with different registers); and it should also be easy to see that for an accumulator-based CPU architecture you can't do the calculations in parallel (two calculations would take twice as long as one calculation).
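
To make that concrete, here's a rough sketch (same style and grouping as the examples above, and still just an illustration) of both calculations interleaved:

    mov eax,[b]           ;Group 1
    mov ebx,[d]
    mov ecx,[h]
    mov edx,[j]

    sub eax,[c]           ;Group 2
    sub ebx,[f]
    sub ecx,[i]
    sub edx,[k]

    lea eax,[eax+ebx+123] ;Group 3
    lea ecx,[ecx+edx+456]

    mov [a],eax           ;Group 4
    mov [g],ecx

With enough registers the second calculation adds no extra groups at all; with a single accumulator it would have to wait until the first calculation was finished.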

Note: There is at least one "potential technical inaccuracy" in what I've written here (mostly involving the theoretical capabilities of register renaming and its application on accumulator-based CPU architectures). This is deliberate. I find that going into too much detail (in an attempt to be "100% technically correct" and cover all the little corner cases) makes it significantly harder for people to understand the relevant parts.

OTHER TIPS

The difference, in my mind, is in the way that operands are specified.

At one end of the spectrum, a register architecture can support specifying all three operands of a binary operation like add -- namely input 1, input 2, and the output target -- whereas another may allow specifying only two operands, and at the far extreme an architecture may support only one explicit operand. (A stack design is even more extreme, specifying none of the operands!)

Many architectures will blend one, two, and/or three operand binary operation instructions (e.g. add) into their ISA, making it harder to characterise the architecture as a whole, so we'd have to look more at individual instructions.
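
As a rough illustration of that spectrum (generic pseudo-assembly, not any particular real ISA), the same addition could be written at each point along it:

    add r3, r1, r2    ;three operands: r3 = r1 + r2 (both inputs preserved)
    add r1, r2        ;two operands:   r1 = r1 + r2 (first input overwritten)
    add r2            ;one operand:    acc = acc + r2 (accumulator is implicit)
    add               ;zero operands:  pop two values, push their sum (stack machine)

Each step down the list needs fewer bits to encode the instruction, but gives less control over where the inputs come from and where the result ends up.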


The basic trade-off is the size of instructions vs. the ability to reuse intermediate computations. In the style using fewer specified operands, instruction encodings are shorter, but results overwrite source operands, which also limits other options.

For some code sequences, two-operand instructions require two instructions to accomplish the same result as one three-operand instruction. For other code sequences, both take the same number of instructions, in which case the fewer-operand version potentially gives a shorter encoding.

For example, let's say that we have values in registers A and B. To add them using a two-operand form, we either add A into B (destroying B) or add B into A (destroying A). If this overwrite/destruction is OK, then fine. But if not, we have to copy one operand (say A) to register C, then add B into C -- two instructions!

A three-register (three-operand) instruction would have been able to add A and B into C without destroying either input source.
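
As a minimal sketch in the same 32-bit 80x86 syntax used in the earlier answer (mapping A, B, C onto EAX, EBX, ECX; note that ordinary integer add here is two-operand, so lea stands in for a non-destructive three-operand add):

    ;Two-operand form: copy first so neither input is destroyed
    mov ecx,eax           ;C = A
    add ecx,ebx           ;C = C + B   (two instructions, EAX and EBX preserved)

    ;"Three-operand" form via lea
    lea ecx,[eax+ebx]     ;C = A + B   (one instruction, nothing destroyed)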

In an accumulator-based design, operations go through the accumulator register at every step of the way. For example, (a + b) - d is done as: put a in the accumulator, add b to the accumulator (the accumulator now contains the sum of a and b), then subtract d from the accumulator to get the final result. For complex operations there is a significant time sacrifice. A register-based design, on the other hand, can store values in individual registers and operate between registers: (a + b) - d involves putting a in register X, b in register Y and d in register Z, then adding X and Y and subtracting Z. This is more convenient and generates shorter code in complex programs, saving time.
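
As a sketch in the same 80x86 syntax used in the first answer (treating a, b and d as memory locations), those two sequences of steps might look like:

    ;Accumulator style: every step goes through EAX
    mov eax,[a]           ;accumulator = a
    add eax,[b]           ;accumulator = a + b
    sub eax,[d]           ;accumulator = (a + b) - d

    ;Register style: load into separate registers, then operate between registers
    mov eax,[a]
    mov ebx,[b]
    mov ecx,[d]
    add eax,ebx           ;EAX = a + b
    sub eax,ecx           ;EAX = (a + b) - d

For a single simple expression like this the register version is actually longer; the benefit described above shows up when values already sitting in registers get reused across a larger calculation instead of being reloaded.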

The practical differences between accumulator-based and register-based architectures are the following:

accumulator-based: the code is smaller and runs slower.

register-based: the code is larger and runs faster.

Licensed under: CC-BY-SA with attribution