Question

Possible Duplicate:
strtok wont accept: char *str

When using the strtok function, using a char * instead of a char [] results in a segmentation fault.

This runs properly:

char string[] = "hello world";
char *result = strtok(string, " ");

This causes a segmentation fault:

char *string = "hello world";
char *result = strtok(string, " ");

Can anyone explain what causes this difference in behaviour?

Solution

char string[] = "hello world";

This line initializes string to be a big-enough array of characters (in this case char[12]). It copies those characters into your local array as though you had written out

char string[] = { 'h', 'e', 'l', 'l', 'o', ' ', 'w', 'o', 'r', 'l', 'd', '\0' };

The other line:

char* string = "hello world";

does not initialize a local array; it only initializes a local pointer. The compiler is allowed to set it to point to an array which you're not allowed to change, as though the code were

const char literal_string[] = "hello world";
char* string = (char*) literal_string;

The reason C allows this without a cast is mainly to let ancient code continue compiling. You should pretend that the type of a string literal in your source code is const char[], which can convert to const char*, and never convert it to char*.
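As a minimal sketch of that advice, the literal can be held through a const char* and copied into a writable buffer before tokenizing. The helper name first_word and its parameters are made up for this illustration; they are not part of the original answer.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into buf (capacity cap) and returns the
   first space-delimited token of the copy. Returns NULL if src does not fit
   in buf or contains no token. src itself is never modified. */
char *first_word(const char *src, char *buf, size_t cap)
{
    size_t len = strlen(src);
    if (len + 1 > cap)
        return NULL;              /* not enough room for the text plus '\0' */
    memcpy(buf, src, len + 1);    /* writable copy, including the terminator */
    return strtok(buf, " ");      /* strtok only touches the copy */
}
```

Called as first_word("hello world", buf, sizeof buf) with a local char buf[32], this should leave the literal untouched, because strtok only writes into buf.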

OTHER TIPS

In the second example:

char *string = "hello world";
char *result = strtok(string, " ");

the pointer string is pointing to a string literal, which cannot be modified (as strtok() would like to do).

You could do something along the lines of:

char *string = strdup("hello world");
char *result = strtok(string, " ");

so that string is pointing to a modifiable copy of the literal.
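Two things are worth keeping in mind with this approach: the copy has to be freed, and strdup is a POSIX function that only became part of ISO C with C23. The sketch below therefore makes the copy with malloc and memcpy, which does the same job; the helper name count_tokens is an assumption made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the space-separated tokens in text without
   modifying text itself. Returns -1 if the allocation fails. */
int count_tokens(const char *text)
{
    size_t size = strlen(text) + 1;
    char *copy = malloc(size);        /* same effect as strdup(text) */
    if (copy == NULL)
        return -1;
    memcpy(copy, text, size);

    int count = 0;
    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (char *tok = strtok(copy, " "); tok != NULL; tok = strtok(NULL, " "))
        count++;

    free(copy);                       /* the copy is ours to release */
    return count;
}
```

Because the function tokenizes its own heap copy, it can safely be called with a string literal such as count_tokens("hello world").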

strtok modifies the string you pass to it (or tries to anyway). In your first code, you're passing the address of an array that's been initialized to a particular value -- but since it's a normal array of char, modifying it is allowed.

In the second code, you're passing the address of a string literal. Attempting to modify a string literal gives undefined behavior.
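To make the modification visible, here is a small sketch: strtok ends the token by writing a '\0' over the delimiter that follows it, so the buffer looks shorter to strlen afterwards. The function name length_after_first_token is hypothetical, and the buffer passed in must be writable (an array, not a literal).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: extracts the first space-delimited token from buf in
   place, then returns the length of buf as strlen now sees it. */
size_t length_after_first_token(char *buf)
{
    strtok(buf, " ");     /* writes '\0' over the delimiter ending the token */
    return strlen(buf);   /* now stops at the '\0' that strtok inserted */
}
```

With char s[] = "hello world", strlen(s) is 11 before the call and the function returns 5, since the space at index 5 has been overwritten.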

In the second case (char *), the string is typically in read-only memory. String constants should be treated as const char *: if you declared the variable with that type, the compiler would warn you when you passed it to a function that modifies it, such as strtok. For historical reasons, you're allowed to use string constants to initialize variables of type char * even though they can't be modified. (Some compilers let you turn this historic license off, e.g. with gcc's -Wwrite-strings.)

The first case creates a (non const) char array that is big enough to hold the string and initializes it with the contents of the string. The second case creates a char pointer and initializes it to point at the string literal, which is probably stored in read only memory.

Since strtok wants to modify the memory pointed at by the argument you pass it, the latter case causes undefined behavior (you're passing in a pointer that points at a (const) string literal), so it's unsurprising that it crashes.

Because the second one declares a pointer (that can change) to a constant string...

So, depending on your compiler, platform, OS, and memory map, the "hello world" string may be stored as a constant (in an embedded system, it may be stored in ROM), and trying to modify it will cause the segmentation fault.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow