Question

I'm using strtok() to parse a string I get from fgets() that is separated by the ~ character

e.g. data_1~data_2

Here's a sample of my code:

fgets(buff, LINELEN, stdin);
pch = strtok(buff, " ~\n");
//do stuff
pch = strtok(NULL, " ~\n");
//do stuff

The first call to strtok works fine: I get data_1 as is, and strlen(data_1) gives the correct length. However, the second call returns the token with something appended to it.

With an input of andrewjohn ~ jamessmith, I printed out each character and the index, and I get this output:

a0
n1
d2
r3
e4
w5
j6
o7
h8
n9

j0
a1
m2
e3
s4
s5
m6
i7
t8
h9
10

What does that "11th" character correspond to?

EDIT:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main()
{
    char buff[100];
    char * pch;
    int i; /* loop index; it was missing from the original snippet */
    fgets(buff, 100, stdin);
    pch = strtok(buff, " ~\n");
    printf("FIRST NAME\n");
    for(i = 0; i < strlen(pch); i++)
    {
        printf("%c %d %d\n", *(pch+i), *(pch+i), i);
    }
    printf("SECOND NAME\n");
    pch = strtok(NULL, " ~\n");
    for(i = 0; i < strlen(pch); i++)
    {
        printf("%c %d %d\n", *(pch+i), *(pch+i), i);
    }
}

I ran it by:

cat sample.in | ./myfile

Where sample.in had

andrewjohn ~ johnsmith

Output was:

FIRST NAME
a 97 0
n 110 1
d 100 2
r 114 3
e 101 4
w 119 5
j 106 6
o 111 7
h 104 8
n 110 9
SECOND NAME
j 106 0
o 111 1
h 104 2
n 110 3
s 115 4
m 109 5
i 105 6
t 116 7
h 104 8
 13 9

So the last character is ASCII value 13, which says it's a carriage return ('\r'). Why is this coming up?


Solution

Based on your edit, the input line ends in \r\n. As a workaround you could just add \r to the list of delimiters you pass to strtok.
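A minimal sketch of that workaround, assuming the same delimiters as in the question plus \r. The helper name split_names and the NAME_DELIMS macro are made up for illustration and are not from the question's code:

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Delimiters from the question plus '\r', so a CRLF line ending
   is consumed as a separator instead of ending up in a token. */
#define NAME_DELIMS " ~\r\n"

/* Hypothetical helper: splits `line` in place and stores up to `max`
   token pointers in `out`. Returns the number of tokens stored. */
static int split_names(char *line, char *out[], int max)
{
    int n = 0;
    char *tok;

    assert(line != NULL && out != NULL);
    for (tok = strtok(line, NAME_DELIMS); tok != NULL && n < max;
         tok = strtok(NULL, NAME_DELIMS)) {
        out[n++] = tok;
    }
    return n;
}
```

You would call it on the buffer filled by fgets, e.g. split_names(buff, names, 2) with char *names[2]. Like strtok itself, it modifies the buffer, and it also guards against strtok returning NULL when a line has fewer tokens than expected.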

However, this should be investigated further. \r\n is the line ending in a Windows text file, but on Windows stdin is a text stream, so \r\n in a file would normally be converted to just \n in the fgets result.

Are you perhaps piping in a file that contains something weird like \r\r\n? Try hex-dumping the file you're piping in to check this.
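If a hex-dump tool is not at hand, you can do a similar check from inside the program by making the bytes that fgets returned visible. This is only a sketch: show_bytes is a hypothetical helper, and it inspects the buffer rather than the file itself.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes a visible form of `s` into `out`.
   Printable characters are copied as-is; anything else becomes \xNN.
   Output is silently truncated if `out` is too small. */
static void show_bytes(const char *s, char *out, size_t outsz)
{
    size_t used = 0;

    assert(s != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    /* Each step writes at most 4 characters plus the terminating NUL. */
    for (; *s != '\0' && used + 5 <= outsz; s++) {
        unsigned char c = (unsigned char)*s;
        if (isprint(c))
            used += (size_t)snprintf(out + used, outsz - used, "%c", c);
        else
            used += (size_t)snprintf(out + used, outsz - used, "\\x%02X", c);
    }
}
```

Calling it on buff right after fgets and printing the result shows a trailing \x0D\x0A for a \r\n line ending, or just \x0A for a plain \n.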

Another possible explanation might be that your Cygwin (or whatever) environment has somehow been configured not to translate line endings in a file piped in.

Edit: Joachim's suggestion is much more likely: you are using a \r\n file on a non-Windows system, where no line-ending translation takes place. If this is the case, you can fix it by running dos2unix on the file. But in accordance with the principle "accept everything, generate correctly", it would be useful for your program to handle such a file anyway.
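One way to make the program tolerant of both line endings is to cut the line at the first \r or \n before tokenizing it. A rough sketch, where chomp is a made-up name and the approach assumes nothing meaningful follows the first line-ending character:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: truncates `line` at the first '\r' or '\n'
   and returns the new length. Lines without a line ending are
   left unchanged. */
static size_t chomp(char *line)
{
    size_t len;

    assert(line != NULL);
    len = strcspn(line, "\r\n"); /* index of first CR or LF, or strlen */
    line[len] = '\0';
    return len;
}
```

After fgets(buff, 100, stdin) succeeds, calling chomp(buff) leaves a buffer that behaves the same for Unix and Windows line endings, so the strtok delimiters only need to cover the separators within the line.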

Licensed under: CC-BY-SA with attribution