Question

I'm finally finishing K&R, but I've run into some more code I don't understand, in sections 6.3/6.4.

  1. Regarding getword: how can it return int when it is supposed to return a word? I understand that it returns word[0], which is the first letter. But if I wanted to return a word, I'd declare something like char *getword. Am I right?
    How can an int indicate a word?

  2. Still about getword: suppose I enter "in " and press Enter after the space. getword reads 'i', which is not a space and satisfies isalpha, so the !isalpha branch is skipped. What happens then?

  3. I marked the line in binsearch with [3]. Don't you think it should be high = mid - 1; there?

int getword(char *word, int lim) {
    char *w = word;
    int c;

    while (isspace(c = getch()))
    {}
    if (c != EOF) {
        *w++ = c;
    }

    if (!isalpha(c)) {
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++) {
        if (!isalnum(*w = getch())) {
            ungetch(*w);
            break;
        }
    }
    *w = '\0';
    return word[0];
}

/* binsearch: find word in tab[0]...tab[n-1] */
struct key *binsearch(char *word, struct key *tab, int n)
{
    int cond;
    struct key *low = &tab[0];
    struct key *high = &tab[n];
    struct key *mid;
    while (low < high) {
        mid = low + (high-low) / 2;
        if ((cond = strcmp(word, mid->word)) < 0)
            high = mid; /* [3] */
        else if (cond > 0)
            low = mid + 1;
        else
            return mid;
    }
    return NULL;
}

Solution

You are correct: if the function returned the word itself, it would be declared char *getword(). However, according to K&R:

The function value is the first character of the word, or EOF for end of file, or the character itself if it is not alphabetic

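Returning an int is fine: in C a char is a small integer type (at least 8 bits; whether plain char is signed is implementation-defined), so any character value fits in an int. The wider type is also needed because EOF is a negative int that must stay distinguishable from every valid character.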
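To make the EOF point concrete, here is a minimal sketch. The helper name fits_in_unsigned_char is hypothetical and not from K&R; it only checks whether a value survives being narrowed to a character type.

```c
#include <stdio.h>

/* Hypothetical helper (not from K&R): returns 1 if c survives a round
   trip through unsigned char, 0 if narrowing changes its value. */
int fits_in_unsigned_char(int c)
{
    unsigned char narrowed = (unsigned char)c;
    return (int)narrowed == c;
}
```

Ordinary characters such as 'i' come back unchanged, while EOF does not, because it is negative. That is why getword, like getchar, hands its result back as an int.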

So where is the word returned?
In the buffer that the char *word parameter points to. Initially w gets a copy of the word pointer, and each character read is then stored into the memory w points to.

With "in " in the input buffer, isspace returns false for 'i', so the while loop ends with c holding that first non-space character. Then *w++ = c stores the character at word[0] and increments w, so word[0] contains 'i'.

isalpha('i') is true, so the !isalpha(c) test is false and that block is skipped.

Characters are then read from the input and stored at successive positions of w until a non-alphanumeric character is read or the limit lim is reached. In the non-alphanumeric case, the character is pushed back onto the input with ungetch, and w, which still points at the slot holding that unwanted character, is not incremented because break skips the loop's w++. The following *w = '\0' then overwrites the unwanted character and terminates the C string (C strings end with a character whose value is 0).

In your example, the loop stores 'n' through w, increments w, then stores ' ' through w and runs the !isalnum branch, which breaks out of the loop. Since w was not incremented after storing ' ', *w = '\0' replaces the space and terminates the string, leaving "in" in word; getword then returns 'i'.
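The trace above can be reproduced with a small sketch. The getword body is the one from the question; getch and ungetch here are hypothetical stand-ins that read from a string rather than stdin (K&R's own versions use a pushback buffer over getchar), and set_input is an invented helper for loading that string.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical stand-ins for K&R's getch/ungetch: they read from a
   string so the example is reproducible. Not the book's versions. */
static const char *input = "";
static int pushed;          /* the single pushed-back character */
static int has_pushed = 0;  /* nonzero while `pushed` is valid */

static void set_input(const char *s) { input = s; has_pushed = 0; }

static int getch(void)
{
    if (has_pushed) {
        has_pushed = 0;
        return pushed;
    }
    return *input ? (unsigned char)*input++ : EOF;
}

static void ungetch(int c) { pushed = c; has_pushed = 1; }

/* getword as in the question: the word goes into the caller's buffer,
   the return value is only its first character (or EOF). */
int getword(char *word, int lim)
{
    char *w = word;
    int c;

    while (isspace(c = getch()))
        ;                       /* skip leading whitespace */
    if (c != EOF)
        *w++ = c;
    if (!isalpha(c)) {          /* EOF or a non-letter: return it alone */
        *w = '\0';
        return c;
    }
    for ( ; --lim > 0; w++) {
        if (!isalnum(*w = getch())) {
            ungetch(*w);        /* give the terminator back to the input */
            break;              /* w still points at the unwanted char */
        }
    }
    *w = '\0';                  /* overwrite it and end the string */
    return word[0];
}
```

With the input "in \n", the first call should leave "in" in the buffer and return 'i'. A second call skips the pushed-back space and the newline, then returns EOF with an empty buffer.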

[the other half of the question has already been answered by someone else]

OTHER TIPS

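high = mid is correct, because the right boundary is exclusive: the search range is [low, high). Notice that initially high = &tab[n], i.e. it points one past the last element of tab. Setting high = mid - 1 would wrongly drop tab[mid - 1] from the range, and when mid == &tab[0] it would form a pointer before the start of the array, which is undefined behavior.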
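A minimal sketch of the half-open search follows. The struct key layout (a word plus a count) is assumed for illustration, and the parameter is const char * rather than the question's char *.

```c
#include <stddef.h>
#include <string.h>

/* Assumed layout for illustration: a keyword and an occurrence count. */
struct key {
    const char *word;
    int count;
};

/* binsearch over the half-open range [low, high): high always points
   one past the last candidate, so it is never dereferenced. */
struct key *binsearch(const char *word, struct key *tab, int n)
{
    int cond;
    struct key *low = &tab[0];
    struct key *high = &tab[n];   /* one past the end: legal to form */
    struct key *mid;

    while (low < high) {
        mid = low + (high - low) / 2;
        if ((cond = strcmp(word, mid->word)) < 0)
            high = mid;           /* mid is excluded; mid - 1 stays in range */
        else if (cond > 0)
            low = mid + 1;        /* low is inclusive, so step past mid */
        else
            return mid;
    }
    return NULL;
}
```

Because high is exclusive, the loop ends as soon as low == high, and an empty table (n == 0) returns NULL without touching any element.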

Licensed under: CC-BY-SA with attribution