Question

I've included an example program using getchar() below, for reference (not that anyone probably needs it), and feel free to address concerns with it if you desire. But my question is:

What exactly is going on when the program calls getchar()?

Here is my understanding (please clarify or correct me):

  1. When getchar is called, it checks the STDIN buffer to see if there is any input.
  2. If there isn't any input, getchar sleeps.
  3. Upon wake, getchar checks to see if there is any input, and if not, puts itself to sleep again.
  4. Steps 2 and 3 repeat until there is input.
  5. Once there is input (which by convention includes an 'EOF' at the end), getchar returns the first character of this input and does something to indicate that the next call to getchar should return the second letter from the same buffer? I'm not really sure what that is.
  6. When there are no more characters left other than EOF, does getchar flush the buffer?

The terms I used are probably not quite correct.

#include <stdio.h>

int getLine(char buffer[], int maxChars);

#define MAX_LINE_LENGTH 80

int main(void){

    char line[MAX_LINE_LENGTH];
    int errorCode;

    errorCode = getLine(line, sizeof(line));
    if(errorCode == 1)
        printf("Input exceeded maximum line length of %d characters.\n", MAX_LINE_LENGTH);
    printf("%s\n", line);

    return 0;

}

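int getLine(char buffer[], int maxChars){
    int c, i = 0;
    /* Check for room before reading, so no character is consumed and then dropped. */
    while(i < maxChars - 1 && (c = getchar()) != EOF && c != '\n')
        buffer[i++] = c;
    buffer[i] = '\0';
    /* Buffer full: report an overflow only if the line really continues. */
    if(i == maxChars - 1){
        c = getchar();
        if(c != EOF && c != '\n'){
            ungetc(c, stdin);   /* leave the extra character for the next read */
            return 1;
        }
    }
    return 0;
}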

Solution

Steps 2-4 are slightly off.

If there is no input in the standard I/O buffer, getchar() calls a function to reload the buffer. On a Unix-like system, that normally ends up calling the read() system call, and read() puts the process to sleep until there is input to be processed, or until the kernel knows there will be no more input (EOF). When read() returns, the code adjusts the data structures so that getchar() knows how much data is available. Your description implies polling; the standard I/O system does not poll for input.
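To make the refill step concrete, here is a minimal sketch of a getchar()-like function built directly on read(). It is not how any particular C library is written: the names mini_stream and mini_getc are hypothetical, the buffer size is arbitrary, and treating a read error the same as end of input is a simplification.

```c
#include <stdio.h>    /* EOF */
#include <unistd.h>   /* read(), ssize_t */

/* Hypothetical miniature of a stdio input stream; real FILE objects differ. */
struct mini_stream {
    int fd;                   /* descriptor to read from (0 is standard input) */
    unsigned char buf[4096];  /* characters fetched by the last read() */
    size_t pos;               /* index of the next character to hand out */
    size_t len;               /* number of valid characters in buf */
    int at_eof;               /* end-of-input (or error) state, kept once set */
};

int mini_getc(struct mini_stream *s)
{
    if (s->at_eof)
        return EOF;
    if (s->pos == s->len) {
        /* Buffer is empty. read() blocks inside the kernel until data
           arrives or the end of input is known; there is no polling loop. */
        ssize_t n = read(s->fd, s->buf, sizeof s->buf);
        if (n <= 0) {         /* 0 means end of input; <0 is an error (simplified) */
            s->at_eof = 1;
            return EOF;
        }
        s->pos = 0;
        s->len = (size_t)n;
    }
    /* Hand out one character and advance, so the next call gets the next one. */
    return s->buf[s->pos++];
}
```

A stream set up as `struct mini_stream in = { .fd = 0 };` would read standard input. The pos and len fields are the "adjusted data structures": one read() may deliver a whole line, and later calls simply step through the buffer without touching the kernel.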

Step 5 uses the adjusted pointers to return the correct values.

There really isn't an EOF character; it is a state, not a character. Even though you type Control-D or Control-Z to indicate 'EOF', that character is not inserted into the input stream. In fact, those characters cause the system to release any typed characters that it is still holding for 'line editing' operations (like backspace), so that they are made available to the read() system call. If there are no such characters, then read() returns 0 as the number of available characters, which means EOF. Then getchar() returns the value EOF, which is usually -1 and is guaranteed to be negative, whereas valid characters are guaranteed to be non-negative.
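Because EOF is a return value rather than a byte in the stream, the result of getchar() or getc() has to be stored in an int, and the stream itself can be asked why reading stopped. A small sketch, where count_chars is a hypothetical helper written only for this illustration:

```c
#include <stdio.h>

/* Hypothetical helper: count the characters left on a stream.
   Returns -1 if reading stopped because of an error. */
long count_chars(FILE *in)
{
    int c;          /* int, not char: must hold every byte value plus EOF */
    long n = 0;

    while ((c = getc(in)) != EOF)
        n++;
    /* EOF only says "no character was returned"; ferror() and feof()
       report which state the stream is actually in. */
    if (ferror(in))
        return -1;
    return n;
}
```

With `int c`, a byte with value 255 comes back as 255 and is counted like any other character. Storing the result in a char instead could make that byte compare equal to EOF on some platforms and end the loop early.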

So basically, rather than polling, is it that hitting Return causes a certain I/O interrupt, and then when the OS receives this, it wakes up any processes that are sleeping for I/O?

Yes, hitting Return triggers interrupts and the OS kernel processes them and wakes up processes that are waiting for the data. The terminal driver is woken by the kernel when an interrupt occurs, and decides what to do with the character(s) that were just received. They may be stashed for further processing (canonical mode) or made available immediately (raw mode), etc. Assuming, of course, that the input is a terminal; if the input is from a disk file, it is simpler in many ways (or if it is a pipe, or …).
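On a POSIX system the choice between those modes is made through the termios interface. The following is a rough sketch rather than a complete raw-mode setup: it only turns off canonical processing and echo, and the helper names make_noncanonical and enter_noncanonical are hypothetical.

```c
#include <termios.h>
#include <unistd.h>

/* Hypothetical helper: edit a settings struct so that read() no longer
   waits for Return. Only the struct is changed, not the terminal. */
void make_noncanonical(struct termios *t)
{
    t->c_lflag &= ~(tcflag_t)(ICANON | ECHO);  /* no line editing, no echo */
    t->c_cc[VMIN] = 1;     /* read() returns once at least one byte is available */
    t->c_cc[VTIME] = 0;    /* no inter-byte timeout */
}

/* Hypothetical helper: apply the change to a terminal, saving the old
   settings in *saved. Returns 0 on success, or -1 if fd is not a
   terminal or a termios call fails. */
int enter_noncanonical(int fd, struct termios *saved)
{
    struct termios t;

    if (!isatty(fd) || tcgetattr(fd, saved) != 0)
        return -1;
    t = *saved;
    make_noncanonical(&t);
    return tcsetattr(fd, TCSANOW, &t);
}
```

After `enter_noncanonical(STDIN_FILENO, &saved)` succeeds, each keypress reaches read() without waiting for Return; note that stdio may still buffer its own output. The saved settings should be restored with `tcsetattr(STDIN_FILENO, TCSANOW, &saved)` before the program exits, or the shell inherits the changed terminal.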

Nominally, it isn't the terminal app that gets woken by the interrupt; it is the kernel that wakes first, then the shell running in the terminal app that is woken because there's data for it to read, and only when there's output does the terminal app get woken.

I say 'nominally' because there's an outside chance that in fact the terminal app does mediate the I/O via a pty (pseudo-tty), but I think it happens at the kernel level and the terminal application is involved fairly late in the process. There's a huge disconnect really between the keyboard where you type and the display where what you type appears.

See also Canonical vs non-canonical terminal input.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow