Coding in C: efficiency of temporary local variables

Question 1

I ran a small test, generating assembly code for the two versions. A diff of the output showed that the first version has two more instructions than the second.
If you want to try it yourself, compile with these commands:

gcc -S main.c -o asmout.s
gcc -S main2.c -o asmout2.s

and then check the differences with

diff asmout.s asmout2.s

The first version has these two extra instructions:

movl    %eax, -8(%rbp)
movl    -8(%rbp), %eax
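
Those two instructions store the result of the first call in a stack slot and immediately load it back, which is what a named temporary looks like without optimization. As a minimal sketch of the two shapes being compared (the bodies of get_a_value and calculate_something are hypothetical stand-ins, and version_one, version_two and check_versions_agree are names made up for illustration, not the original code):

```c
#include <assert.h>

/* Hypothetical stand-ins; the real functions are not shown in this thread. */
static int get_a_value(void) { return 21; }
static int calculate_something(int x) { return x * 2; }

/* Version 1: keep the intermediate result in a named temporary. */
static int version_one(void)
{
    int a = get_a_value();
    int b = calculate_something(a);
    return b;
}

/* Version 2: pass the result of the first call straight to the second. */
static int version_two(void)
{
    int b = calculate_something(get_a_value());
    return b;
}

/* The two versions must return the same value; only the generated
   code may differ, and only when optimization is off. */
static void check_versions_agree(void)
{
    assert(version_one() == version_two());
}
```

To reproduce the diff, each version would go into its own file as the body of main.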

EDIT:
As Keith Thompson suggested, when compiled with optimization enabled, the generated assembly code is the same for both versions.

Question 2

I copied both versions, compiled each with gcc -S to get the assembly output, and used sdiff to compare them side by side.

Results using gcc version 4.1.2 20070115 (SUSE Linux):

No optimization:

main:                                         main:
.LFB2:                                        .LFB2:
        pushq   %rbp                                  pushq   %rbp
.LCFI0:                                       .LCFI0:
        movq    %rsp, %rbp                            movq    %rsp, %rbp
.LCFI1:                                       .LCFI1:
        subq    $16, %rsp                             subq    $16, %rsp
.LCFI2:                                       .LCFI2:
        movl    $0, %eax                              movl    $0, %eax
        call    get_a_value                           call    get_a_value
        movl    %eax, -8(%rbp)              |         movl    %eax, %edi
        movl    -8(%rbp), %edi              <
        movl    $0, %eax                              movl    $0, %eax
        call    calculate_something                   call    calculate_something
        movl    %eax, -4(%rbp)                        movl    %eax, -4(%rbp)
        movl    -4(%rbp), %eax                        movl    -4(%rbp), %eax
        leave                                         leave
        ret                                           ret

Basically, one extra move instruction. Both allocate the same amount of stack space (subq $16, %rsp reserves 16 bytes for the stack), so memory-wise there's no difference.

Level 1 optimization (-O1):

main:                                       main:
.LFB2:                                      .LFB2:
        subq    $8, %rsp                              subq    $8, %rsp
.LCFI0:                                     .LCFI0:
        movl    $0, %eax                              movl    $0, %eax
        call    get_a_value                           call    get_a_value
        movl    %eax, %edi                            movl    %eax, %edi
        movl    $0, %eax                              movl    $0, %eax
        call    calculate_something                   call    calculate_something
        addq    $8, %rsp                              addq    $8, %rsp
        ret                                           ret

No differences.

Results using gcc version 2.96 20000731 (Red Hat Linux 7.2 2.96-112.7.2):

No optimization:

main:                                         main:
        pushl   %ebp                                  pushl   %ebp
        movl    %esp, %ebp                            movl    %esp, %ebp
        subl    $8, %esp                              subl    $8, %esp
                                             >        subl    $12, %esp
                                             >        subl    $4, %esp
        call    get_a_value                           call    get_a_value
                                             >        addl    $4, %esp
        movl    %eax, %eax                            movl    %eax, %eax
        movl    %eax, -4(%ebp)               |        pushl   %eax
        subl    $12, %esp                    <
        pushl   -4(%ebp)                     <
        call    calculate_something                   call    calculate_something
        addl    $16, %esp                             addl    $16, %esp
        movl    %eax, %eax                            movl    %eax, %eax
        movl    %eax, -8(%ebp)               |        movl    %eax, -4(%ebp)
        movl    -8(%ebp), %eax               |        movl    -4(%ebp), %eax
        movl    %eax, %eax                            movl    %eax, %eax
        leave                                         leave
        ret                                           ret

Roughly the same number of instructions, ordered slightly differently.

Level 1 optimization (-O1):

main:                                         main:
        pushl   %ebp                                    pushl   %ebp
        movl    %esp, %ebp                              movl    %esp, %ebp
        subl    $8, %esp                      |         subl    $24, %esp
        call    get_a_value                             call    get_a_value
        subl    $12, %esp                     |         movl    %eax, (%esp)
        pushl   %eax                          <
        call    calculate_something                     call    calculate_something
        leave                                           leave
        ret                                             ret

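The second version reserves more stack space up front (subl $24 versus subl $8), but the first version then adjusts the stack by another 12 bytes and pushes 4 more before the call, so both end up using the same 24 bytes.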

So, for this particular example with these particular compilers, there's no huge difference between the two versions. In that case, I'd favor the first version for the following reasons:

Easier to trace in a debugger; you can examine the value returned from get_a_value before passing it to calculate_something;
It gives you a place to do some sanity checking, in case calculate_something isn't well-behaved for certain inputs;
It's a little easier on the eyes.
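
The first two reasons can be sketched as follows. This is only an illustration under stated assumptions: the function bodies, the accepted range 0..1000, the next_value variable and the compute_checked wrapper are all hypothetical and not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins so the sketch compiles on its own. */
static int next_value = 7;
static int get_a_value(void) { return next_value; }
static int calculate_something(int x) { return x * x; }

/* Returns 0 and stores the result on success, or -1 if the
   intermediate value is outside the (assumed) accepted range. */
static int compute_checked(int *result)
{
    assert(result != NULL);

    /* The named temporary can be inspected in a debugger here. */
    int a = get_a_value();

    /* Sanity check before handing the value on; the range is an assumption. */
    if (a < 0 || a > 1000)
        return -1;

    *result = calculate_something(a);
    return 0;
}
```

With the nested call calculate_something(get_a_value()) there is no natural place to put either the breakpoint or the range check.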

Just remember that terse doesn't necessarily mean fast or efficient, and what's fast/efficient under one particular compiler/hardware combination may be hopelessly busted under a different compiler/hardware combination. Some compilers actually have an easier time optimizing code that's written in a clear manner.

Your code should be, in order:

Correct - it doesn't matter how fast it is or how little memory it uses if it doesn't meet its requirements;
Secure - it doesn't matter how fast it is or how little memory it uses if it's a malware vector or risks exposing sensitive data to unauthorized parties (yes, I'm talking about Heart-frickin'-bleed);
Robust - it doesn't matter how fast it is or how little memory it uses if it dumps core because somebody sneezed in a different room;
Maintainable - it doesn't matter how fast it is or how little memory it uses if it has to be scrapped and rewritten because the requirements changed (which they do);
Efficient - now you can start worrying about performance and efficiency.

Question 3

It really depends on the platform and the compiler, but with optimization on, both versions should usually generate the same code. At worst, version one will allocate space for an extra int. If placing the result of get_a_value in a variable makes your code more readable, I would go ahead and do that. The only time I would advise against it is in a deeply recursive function, where an extra int in every stack frame can add up.