Question

I'm trying to understand some of the intricacies of the preprocessor and of the C compiler (specifically, GNU GCC) with regard to string literals. Is it more efficient to define a global variable, so that the string literal occupies only one place in memory, than to use a #define preprocessor directive?

In this example, the string literal is in one place in memory and is accessed several times:

#include <stdio.h>
#include <string.h>
char OUTPUT[20] = "Hello, world!!!";
int main (){
    printf("%s is %d characters long.\n", OUTPUT, (int) strlen(OUTPUT));
    return 0;
}

vs doing it with the preprocessor:

#include <stdio.h>
#include <string.h>
#define OUTPUT "Hello, world!!!"

int main (){
    printf("%s is %d characters long.\n", OUTPUT, (int) strlen(OUTPUT));
    return 0;
}

which, after macro expansion, is equivalent to:

#include <stdio.h>
#include <string.h>
#define OUTPUT "Hello, world!!!"

int main (){
    printf("%s is %d characters long.\n", "Hello, world!!!", (int) strlen("Hello, world!!!"));
    return 0;
}

What I'm really asking is: in the last two examples, which use the preprocessor, does the compiler keep two separate instances of "Hello, world!!!" in two separate memory locations, or is it smart enough to use a single memory location?

If it is two separate memory locations, then isn't it more resource-friendly to use a global variable rather than macro expansion for program constants?


Answer

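Your compiler should be smart enough to store only one instance of the string. The C standard leaves it unspecified whether identical string literals share storage, but GCC does merge them here. You can verify this by checking the assembly output for your programs.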

For example, using GCC:

Assume your first example is called "global.c".

gcc -Wall -S global.c

.file   "global.c"
.globl  OUTPUT
.data
.align 16
.type   OUTPUT, @object
.size   OUTPUT, 20
OUTPUT:
.string "Hello, world!!!"
.zero   4
.section    .rodata
.LC0:
.string "%s is %d characters long.\n"
.text
.globl  main
.type   main, @function
main: 
// More code...

Assume your preprocessor example is called "preproc.c".

gcc -Wall -S preproc.c

.file   "preproc.c"
.section    .rodata
.LC0:
.string "%s is %d characters long.\n"
.LC1:
.string "Hello, world!!!"
.text
.globl  main
.type   main, @function
main:
// More code...

In both cases, only one copy each of "Hello, world!!!" and "%s is %d characters long.\n" exists. In the first example, space is reserved for 20 characters because your code declares a modifiable 20-element array. If you changed this

char OUTPUT[20] = "Hello, world!!!";

to

const char * const OUTPUT = "Hello, world!!!";

you would get:

.file   "global.c"
.globl  OUTPUT
.section    .rodata
.LC0:
.string "Hello, world!!!"
.align 8
.type   OUTPUT, @object
.size   OUTPUT, 8
OUTPUT:
.quad   .LC0
.LC1:
.string "%s is %d characters long.\n"
.text
.globl  main
.type   main, @function
main:
// More code...

Now space is reserved only for the pointer (8 bytes here) and the string literal itself.

The difference between the two approaches is negligible in this situation, though I would recommend using the preprocessor, because the macro does not add an externally visible global symbol to your program.

With optimizations enabled, both versions emit almost identical code.

global.c (with const char * const OUTPUT):

gcc -Wall -O3 -S global.c

.file   "global.c"
.section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Hello, world!!!"
.LC1:
.string "%s is %d characters long.\n"
.section    .text.startup,"ax",@progbits
.p2align 4,,15
.globl  main
.type   main, @function
main:
.LFB44:
.cfi_startproc
subq    $8, %rsp
.cfi_def_cfa_offset 16
movl    $15, %ecx
movl    $.LC0, %edx
movl    $.LC1, %esi
movl    $1, %edi
xorl    %eax, %eax
call    __printf_chk
xorl    %eax, %eax
addq    $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE44:
.size   main, .-main
.globl  OUTPUT
.section    .rodata
.align 8
.type   OUTPUT, @object
.size   OUTPUT, 8
OUTPUT:
.quad   .LC0
.ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section    .note.GNU-stack,"",@progbits

preproc.c:

gcc -Wall -O3 -S preproc.c

.file   "preproc.c"
.section    .rodata.str1.1,"aMS",@progbits,1
.LC0:
.string "Hello, world!!!"
.LC1:
.string "%s is %d characters long.\n"
.section    .text.startup,"ax",@progbits
.p2align 4,,15
.globl  main
.type   main, @function
main:
.LFB44:
.cfi_startproc
subq    $8, %rsp
.cfi_def_cfa_offset 16
movl    $15, %ecx
movl    $.LC0, %edx
movl    $.LC1, %esi
movl    $1, %edi
xorl    %eax, %eax
call    __printf_chk
xorl    %eax, %eax
addq    $8, %rsp
.cfi_def_cfa_offset 8
ret
.cfi_endproc
.LFE44:
.size   main, .-main
.ident  "GCC: (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3"
.section    .note.GNU-stack,"",@progbits

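Looking at both main functions, you can see that the instructions are identical. In both, the strlen call has been folded into the constant 15 (movl $15, %ecx). The only difference between the two files is the 8-byte OUTPUT pointer object that global.c still emits in .rodata.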

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow