Question

I'm studying the code of the strtok function from BSD's libc. When I ran it on my machine, the program received signal SIGSEGV at s[-1] = 0. Here's the link to the code.

Is s[-1] = 0 right?

This is my code:

#include <stdio.h>
#include <stdlib.h>
#include "strtok.c"

int main(int argc, char* argv[]) {
    char* str = "xxxx xxxyy fdffd";
    const char* s = " ";

    char* token = strtok(str, s);

    while (token != NULL) {
        printf("%s\n", token);
        token = strtok(NULL, s);
    }

    return 0;
}

Solution

s[-1]

Is expanded to:

*( s - 1 )

Therefore, the code is well defined as long as s - 1 points to a valid object (and, for an assignment such as s[-1] = 0, an object that may be modified).
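To make the equivalence concrete, here is a minimal sketch. The helper name terminate_before is hypothetical and not part of any library; it assumes s points past the first character of a writable char array.

```c
#include <stddef.h>

/* Hypothetical helper: overwrite the character just before s with NUL.
   Only valid if s - 1 still points into the same writable array. */
void terminate_before(char *s) {
    s[-1] = '\0';   /* exactly the same as *(s - 1) = '\0' */
}
```

For example, with char buf[] = "ab cd" and s = buf + 3, calling terminate_before(s) overwrites the space, so buf now reads as "ab" while s still reads as "cd".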

OTHER TIPS

This is okay because s is a pointer. Section 6.5.2.1 (Array subscripting) of the draft C99 standard says that E1[E2] is identical to (*((E1)+(E2))):

A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

If s were an array, however, this would not be valid code, since we would be accessing memory that does not belong to the array, which is undefined behavior.

s[-1] refers to the object preceding the object that s points to.

Per the rules of C, s[-1] is equivalent to *(s-1). This:

  • Calculates s-1. The result is a pointer to the object before the object that s points to, the same way that s+1 is a pointer to the object after s.
  • Dereferences it, which produces an lvalue for the pointed-to object.

Thus, s[-1] = 0 assigns 0 to the object before the object that s points to.

s[-1] is legal code if s is pointing to an element of an array after the first element (thus ensuring there is an element before it) or s is pointing one beyond the end of the array. (It is also legal if s points one beyond an individual object that is not in an array, which is something of an unusual use.)
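The two common legal cases can be sketched as follows. Both helper names are hypothetical, and each assumes the caller passes a pointer that satisfies the stated condition.

```c
#include <stddef.h>

/* Hypothetical helper: p points to any element after the first one,
   so p[-1] is the element just before it. */
int previous_element(const int *p) {
    return p[-1];
}

/* Hypothetical helper: end points one past the last element of a
   non-empty array, so end[-1] is the last element. */
int last_element(const int *end) {
    return end[-1];
}
```

With int a[3] = {10, 20, 30}, previous_element(&a[1]) reads a[0] and last_element(a + 3) reads a[2]. Passing a itself to either helper would read before the array, which is undefined behavior.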

It should be OK because a few lines above it the code does s++, so in the worst case we are working with (s + 1) - 1.
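A simplified sketch of that pattern is below. It is not the BSD source; end_token is a hypothetical name, and the function only illustrates why s[-1] is safe right after s++: the increment has already moved s one past the character that was just read.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, simplified scan: s points at the start of a token in a
   writable string. Terminates the token in place and returns a pointer
   to the rest of the string, or NULL if the token ran to the end. */
char *end_token(char *s, const char *delim) {
    for (;;) {
        char c = *s++;              /* s now points one past c */
        if (c == '\0')
            return NULL;            /* end of string, nothing to overwrite */
        if (strchr(delim, c) != NULL) {
            s[-1] = '\0';           /* overwrite the delimiter just read */
            return s;               /* remainder after the delimiter */
        }
    }
}
```

Because s[-1] always refers to the character read in the same iteration, the index never reaches before the start of the string. The write can still fail if the string itself is not writable.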

Whether s[-1] = 0 is "right" or "wrong" will depend on the run-time value of s.

There's nothing inherently wrong or unusual about s[-1] = 0 by itself.

Just a hunch, but I'm fairly certain that FreeBSD's strtok(3) is both pretty stable and pretty well tested.

s is a char*; s[-1] sets the character preceding that pointed to by s to NUL.

Could we see your actual code that's invoking strtok(3)? The problem is likely in your set, so to speak. Further, did you read the man page?

The first time that strtok() is called, str should be specified; subsequent calls, wishing to obtain further tokens from the same string, should pass a null pointer instead. The separator string, sep, must be supplied each time, and may change between calls.
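The calling convention described in the man page can be sketched like this. The helper count_tokens is hypothetical; it uses only the standard strtok from <string.h> and assumes str is writable, because strtok overwrites delimiters in place.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: count the tokens in a writable string.
   The first call passes str; later calls pass NULL to continue
   with the same string. */
size_t count_tokens(char *str, const char *sep) {
    size_t n = 0;
    for (char *tok = strtok(str, sep); tok != NULL; tok = strtok(NULL, sep))
        n++;
    return n;
}
```

A caller would pass an array, for example char str[] = "xxxx xxxyy fdffd"; rather than a pointer to a string literal. Modifying a string literal is undefined behavior in C, and on many systems literals are placed in read-only memory, which would be consistent with a SIGSEGV at s[-1] = 0.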

Licensed under: CC-BY-SA with attribution