Question

EDIT: Minimal compiling code replicating the behaviour.

This code reads a crappy dictionary file, in order to try to extract some interesting info out of it. Each line translates to a struct entry. A word is always extracted, hence newentry() doesn't check for the validity of its word argument.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct entry {
char *word;
char *cat;
char *gen;
} entry;

entry *newentry(char *word, char *cat, char *gen) {
entry *w = malloc(sizeof(entry));
w->word = malloc(sizeof(strlen(word)) + 1);
strcpy(w->word, word);
if (cat) {
    w->cat = malloc(sizeof(strlen(cat)) + 1);
    strcpy(w->cat, cat);
}
else {
    w->cat = "";
}
if (gen) {
    w->gen = malloc(sizeof(strlen(gen)) + 1);
    strcpy(w->gen, gen);
}
else {
    w->gen = "";
}
return w;
}


int main() {
FILE *original = fopen("French.txt", "r");
char *line = NULL;
size_t len = 0;
ssize_t read;
while ((read = getline(&line, &len, original)) != -1) {
    char *word = strtok(strdup(line), "\t");
    char *tmp = strtok(NULL, "[\n");
    char *cat = strtok(NULL, "]\n");
    newentry(word, cat, tmp);           //bugs here
}
return 0;
}

This code fails on the line marked //bugs here, and I have absolutely no idea why. If I replace tmp with word, cat, or a constant, it works every single time. If I change the order of newentry()'s arguments, it still fails every single time, so long as tmp is an argument. I tried to debug by breaking at the while. The file being parsed is about 4 thousand lines, so I imagined some line (it is a very crappy file) was corrupted somehow; I tried continue 1000 and got an exception. So I restarted and tried other values of continue, but by doing continue 100 eleven times, I was able to get past the former 1000.

My conclusion is that tmp is somehow corrupted by the strtok call that follows it. Therefore I tried char *tmp = strdup(strtok(NULL, "[\n")); and it did not help.

Replacing the failing newentry() line with printf("%s %s %s", word, tmp, cat); works 100% of the time, though I can't check the 4000 values by eye.

I have really no idea how to get out of this mess, and would appreciate any pointers.

EDIT: a few lines from the data file:

courthouse  palais de justice[Noun]
courtier    courtisan[Noun]
courtliness e/le/gance[Adjective]
courtly e/le/gant[Adjective]
courtmartial    conseil de guerre[Noun]
courtroom   salle d'audience[Noun]

Thanks.

Whole input file, in case someone is really curious : http://pastebin.com/VPp8WpuK


Solution 2

The delimiter strings you pass to strtok may need to be adjusted. With the example input file, I get NULL from the second strtok call because the entire line is consumed by the first call (i.e. there is no "\t" in the line):

char *word = strtok(StrDup(line), "\t");//reads entire line of input
char *tmp = strtok(NULL, "[\n");//NULL is returned here
char *cat = strtok(NULL, "]\n");

Therefore, you are passing NULL into your function newentry().

Would it work to change the delimiter string to:

char *word = strtok(StrDup(line), "\t ");//added space  
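
Whichever delimiters end up being used, each strtok result can be checked before it is handed on. The following is only a minimal sketch of that idea: parse_line and its parameter names are hypothetical, and the line layout (word, then a tab or space, then the translation, then the category in square brackets) is assumed from the sample lines in the question.

```c
#include <string.h>

/* Hypothetical helper: splits "word<TAB or space>translation[Category]"
   in place (strtok overwrites the delimiters in buf with '\0').
   Returns 1 when all three fields were found, 0 otherwise. */
int parse_line(char *buf, char **word, char **trans, char **cat) {
    *word  = strtok(buf, "\t ");   /* up to the first tab or space */
    *trans = strtok(NULL, "[\n");  /* up to the opening bracket */
    *cat   = strtok(NULL, "]\n");  /* up to the closing bracket */
    return *word && *trans && *cat;
}
```

A caller could skip or report any line for which parse_line returns 0 instead of passing NULL along. If the separator is a run of spaces rather than a tab, the second token keeps the leftover leading spaces, since a space is not in its delimiter set.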

Also the following lines

w->word = malloc(sizeof(strlen(word)) + 1);  
w->cat = malloc(sizeof(strlen(cat)) + 1);
w->gen = malloc(sizeof(strlen(gen)) + 1);

should be:

w->word = malloc(strlen(word) + 1);
w->cat = malloc(strlen(cat) + 1);
w->gen = malloc(strlen(gen) + 1);  
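
To make the difference concrete: sizeof(strlen(word)) does not call strlen at all. It yields the size of strlen's return type, size_t, so every buffer gets the same small size no matter how long the string is. A small sketch of the two computations follows; the helper names buggy_size and correct_size are made up for illustration.

```c
#include <stddef.h>
#include <string.h>

/* What the question's code requests: sizeof is applied to the type of
   strlen's result (size_t), and strlen itself is never evaluated. */
size_t buggy_size(const char *s) {
    return sizeof(strlen(s)) + 1;   /* always sizeof(size_t) + 1 */
}

/* What is needed: the string length plus one byte for the terminator. */
size_t correct_size(const char *s) {
    return strlen(s) + 1;
}
```

For "palais de justice" (17 characters), correct_size asks for 18 bytes, while buggy_size asks for sizeof(size_t) + 1, which is 9 on a platform where size_t is 8 bytes. strcpy then writes past the end of the buffer, which is undefined behaviour and would likely explain why the crash follows the longer tmp strings around.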

One other thing: the memory allocated in newentry() is never freed. Because newentry() returns the struct w, it cannot free it itself; the caller has to. I would suggest allocating it all in main(), passing the struct as a pointer, then freeing it all when the call returns.

This is how to do that...
Create an array of struct entry:

typedef struct {
    char *word;
    char *cat;
    char *gen;
} ENTRY;  
ENTRY entry[linesInFile], *pEntry;  

Then in main() initialize it:

int main(void)
{
    pEntry = &entry[0];
    //allocate memory 
    //call redefined newentry() function
    //use results of newentry() function
    //free memory
}  

Now, because pEntry points to the first element of the entry array, it can easily be passed as an argument after calling malloc for the char * members of each entry (don't forget free() when it returns).
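
As a hedged alternative to moving all allocation into main(): newentry() can keep returning an allocated struct, provided there is a matching function that releases it. The sketch below is an assumption-laden variant, not the code from the question. It copies an empty string onto the heap instead of pointing at the literal "", so that every member can be passed to free(). The names copy_or_empty and freeentry are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

typedef struct entry {
    char *word;
    char *cat;
    char *gen;
} entry;

/* Hypothetical helper: heap copy of s, or of "" when s is NULL. */
static char *copy_or_empty(const char *s) {
    if (!s) s = "";
    char *p = malloc(strlen(s) + 1);
    if (p) strcpy(p, s);
    return p;
}

/* Releases an entry built by newentry(); accepts NULL. */
void freeentry(entry *w) {
    if (!w) return;
    free(w->word);
    free(w->cat);
    free(w->gen);
    free(w);
}

/* Returns a fully heap-allocated entry, or NULL if any allocation fails. */
entry *newentry(const char *word, const char *cat, const char *gen) {
    entry *w = malloc(sizeof *w);
    if (!w) return NULL;
    w->word = copy_or_empty(word);
    w->cat  = copy_or_empty(cat);
    w->gen  = copy_or_empty(gen);
    if (!w->word || !w->cat || !w->gen) {
        freeentry(w);   /* free(NULL) is a no-op, so partial results are fine */
        return NULL;
    }
    return w;
}
```

The caller would then pair each newentry(word, cat, tmp) with a later freeentry() on the returned pointer.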

Here are the edits I had to make to get it to run (this does not include the rewrite needed to add the free() calls):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef struct entry {
    char *word;
    char *cat;
    char *gen;
} entry;

entry *newentry(char *word, char *cat, char *gen) 
{
    entry *w = malloc(sizeof(entry));
    w->word = malloc(strlen(word) + 1);
    strcpy(w->word, word);
    if (cat) 
    {
        w->cat = malloc(strlen(cat) + 1);
        strcpy(w->cat, cat);
    }
    else 
    {
        w->cat = "";
    }
    if (gen) 
    {
        w->gen = malloc(strlen(gen) + 1);
        strcpy(w->gen, gen);
    }
    else 
    {
        w->gen = "";
    }
    return w;

}


int main() 
{
    FILE *original = fopen("French.txt", "r");
    char line[260];
    int len = 260;
    //ssize_t read;
    if (!original) return 1;                       //file could not be opened
    while ( fgets(line, len, original))            
    {
        //char *word = strtok(StrDup(line), "\t ");//I dont have strdup, had to use this
        char *word = strtok(strdup(line), "\t ");
        char *tmp = strtok(NULL, "[\n");
        char *cat = strtok(NULL, "]\n");

        if((!word)||(!tmp)||(!cat)) return 0;      //stops at the first line that does not parse
        word[strlen(word)]=0;                      //these three are no-ops:
        tmp[strlen(tmp)]=0;                        //strtok already null-terminates
        cat[strlen(cat)]=0;                        //each token

        newentry(word, cat, tmp);                  //result (and the strdup copy) is never freed
    }
    return 0;
}

OTHER TIPS

This is wrong:

entry *w = malloc(sizeof(entry *));

You want:

entry *w = malloc( sizeof *w );

or:

entry *w = malloc( sizeof( entry ));
Licensed under: CC-BY-SA with attribution