Question

I see that on Linux systems with GCC the addresses of string literals seem to be much smaller than those of other variables. For instance, the following code generates the output shown below it.

#include <stdio.h>

int main()
{
    char *str1 = "Mesg 1";
    char *str2 = "Mesg 2";
    char str3[] = "Mesg 3";
    char str4[] = "Mesg 4";

    printf("str1 = %p\n", (void *) str1);
    printf("str2 = %p\n", (void *) str2);
    printf("&str3 = %p\n", (void *) str3);
    printf("&str4 = %p\n", (void *) str4);

    return 0;
}

Output:

str1 = 0x400668
str2 = 0x40066f
&str3 = 0x7fffcc990b10
&str4 = 0x7fffcc990b00

Is there a constant address space separate for such usage?


Solution

The standard does not specify where string literals reside, but most likely they are placed in the read-only data section. For example, on a Unix system you can inspect the read-only data section with objdump:

objdump -s -j .rodata a.out

and using a Live Example we can see output similar to this:

Contents of section .rodata:
 400758 01000200 4d657367 20310073 74723120  ....Mesg 1.str1 
 400768 3d202570 0a004d65 73672032 00737472  = %p..Mesg 2.str
 400778 32203d20 25700a00 26737472 33203d20  2 = %p..&str3 = 
 400788 25700a00 26737472 34203d20 25700a00  %p..&str4 = %p..

The C99 draft standard section 6.4.5 String literals paragraph 5 says:

[...] The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence.[...]

which means the lifetime of the string literal is the lifetime of the program and paragraph 6 says:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

So we don't know whether they are distinct, since that is an implementation choice, but we do know that we must not modify them. Beyond that, the standard does not specify how they should be stored.

OTHER TIPS

char *str1 = "Mesg 1";
char *str2 = "Mesg 2";
char str3[] = "Mesg 3";
char str4[] = "Mesg 4";

str1 and str2 are pointer objects, pointing to string literals -- or, more precisely, to the anonymous static array objects associated with those string literals. Those arrays have static storage duration, which means that they exist for the entire execution of the program. They're also effectively read-only (attempting to modify them is undefined behavior), which can affect where the implementation chooses to store them. (BTW, since string literals must not be modified, the pointers to them should be declared as pointers to const char, e.g. const char *str1.)

str3 and str4 are not pointers; they're array objects initialized with the specified values. They have automatic storage duration, which means that they exist only during the execution of the nearest enclosing block (in this case, while the main function is executing). For main, there's not much practical difference unless you play tricks with recursive calls or atexit handlers, but for other functions it matters. Objects with automatic storage duration are typically allocated on the stack, and deallocated when the function returns.

(An array expression, in most contexts, is implicitly converted to a pointer to the array's first element. See section 6 of the comp.lang.c FAQ for details.)

On your system, apparently read-only static objects are allocated at low addresses around 0x400000, and the stack is at much higher addresses just below 0x800000000000 (2^47). This can vary from one system to another.

It's important to note that all these addresses have the same length. You seem to be using a 64-bit system. 0x400668 is not a 32-bit address; it's a 64-bit address that happens to have a small numeric value. The output format used by printf for %p is implementation-defined; it could have printed:

str1 = 0x0000000000400668
str2 = 0x000000000040066f
&str3 = 0x00007fffcc990b10
&str4 = 0x00007fffcc990b00

Is there a constant address space separate for such usage?

No, this is completely implementation-dependent. The only guarantees are:

  • A string literal remains alive throughout the lifetime of your program, and
  • You get undefined behavior if you modify a string literal in any way.

Some implementations put string literals in a read-only data segment, which will likely have significantly different addresses from regular data. This varies by implementation, though, so don't assume it's universal.

Licensed under: CC-BY-SA with attribution