Question

Please have a look at this code snippet:

char line1[10], line2[10];
int rtn;
rtn = scanf("%9[a]%9[^\n]", line1, line2);
printf("line1 = %s|\nline2 = %s|\n", line1, line2);
printf("rtn = %d\n", rtn);

Output:

$ gcc line.c -o line
$ ./line
abook
line1 = a|
line2 = book|
rtn = 2
$./line
book
line1 = |
line2 = �Js�|
rtn = 0
$

For input abook, %9[a] fails at b from the book and stores previously parsed a+\0 at line1.
Then %9[^\n] parses the remaining line and stores just now parsed book+\0 at line2.
Please note 2 points here:

  1. At the time of storing the parsed input, \0 is appended at the end of it since %[] is a conversion specifier for a string.
  2. When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.

Now for input book, %9[a] should fail at b from the book and should store just \0 at line1 since here nothing was parsed. Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.

Now, let's see what exactly happened:
Here return value is 0 that means scanf didn't assign value to any variable. scanf simply exited without assigning any values. So garbage data at line2. And in the case of line1 that garbage data happen to be a NULL character.

But this is quite strange! Isn't it?
I mean scanf exits if %[...] fails at the very first character of input. (Even if additional conversion specifier is there in scanf statement.)
But if the same %[...] fails at any other character other than first one then scanf simply continues scanning the further input. (If additional conversion specifier is there of course.) It doesn't exit.

So why this inconsistency? Why not let scanf statement continue scan the input (if additional conversion specifier is there of course) even if %[...] fails at the very first char of input? Exactly like what happens in other case.
Is there any special reason behind this inconsistency?

$ gcc --version
gcc (Ubuntu 4.4.3-4ubuntu5.1) 4.4.3
Was it helpful?

Solution

2) When %9[a] failed at b, scanf didn't exit. It simply went on scanning further input.

Yes, the %9[a] directive means "store up to 9 'a's, but at least one"(1), so the conversion %9[a] did not fail, it succeeded. It found fewer 'a's than it could have consumed, but that's not a failure. The input matching failed at the 'b', but the conversion succeeded.

(1) Specified in 7.21.6.2 (12) where the conversions are described:

[ Matches a nonempty sequence of characters from a set of expected characters (the scanset).

Now for input book, %9[a] should fail at b from the book and should store just '\0' at line1 since here nothing was parsed. Then %9[^\n] should parse the remaining line and should store just now parsed book+\0 at line2.

No. It is supposed to exit when a conversion fails. The first conversion %9[a] failed, so scanf is supposed to stop and return 0, since no conversion succeeded.

Always check the return value of scanf.

That is specified (for fscanf, but scanf is equivalent to fscanf with stdin as input stream) in 7.21.6.2 (16):

The fscanf function returns the value of the macro EOF if an input failure occurs before the first conversion (if any) has completed. Otherwise, the function returns the number of input items assigned, which can be fewer than provided for, or even zero, in the event of an early matching failure.

Here output for line1 is nothing which is exactly what we expected. An empty string!

You can't expect anything. The arrays line1 and line2 aren't initialised, so when the conversion fails, their contents is still indeterminate. In this case, line1 contained no printable character before the first 0 byte.

But for line2 it's garbage chars! We didn't expect this. So how did this happen ?

That's what happened to be the contents of line2. There were never any values assigned to the elements, so they are whatever they happened to be before the call to scanf.

OTHER TIPS

Transferred from comments to the question since the response to the reply question requires more space than the comments allow.

This comment refers to an earlier version of the code:

Since you didn't check the return value from scanf(), you've no idea whether it said "I failed" or not. You can't blame it when you ignore its error returns; in the second example, it will have said '0 items scanned successfully', which means that none of the variables were set to anything useful at all. You must always check the return value from scanf() so you know whether it did what you expected.

The reply question is:

I updated the code and output to show the return value of scanf. And yes for case 2 the return value is 0. But this doesn't answer the question. Clearly scanf exited in case 2. But for case 1, return value is 2 which means scanf successfully assigned values to both the variables. So why this inconsistency?

I don't see any inconsistency. The fscanf() specification (copied from ISO/IEC 9899:2011, but the URL links to POSIX rather than the C standard) says:

¶3 [...] Each conversion specification is introduced by the character %. After the %, the following appear in sequence:

— An optional assignment-suppressing character *.
— An optional decimal integer greater than zero that specifies the maximum field width (in characters).
— An optional length modifier that specifies the size of the receiving object.
— A conversion specifier character that specifies the type of conversion to be applied.

Later, it says:

¶8 [...] Input white-space characters (as specified by the isspace function) are skipped, unless the specification includes a [, c, or n specifier.284)

¶9 An input item is read from the stream, unless the specification includes an n specifier. An input item is defined as the longest sequence of input characters which does not exceed any specified field width and which is, or is a prefix of, a matching input sequence.285) The first character, if any, after the input item remains unread. If the length of the input item is zero, the execution of the directive fails; this condition is a matching failure unless end-of-file, an encoding error, or a read error prevented input from the stream, in which case it is an input failure.

¶12 [...]

[ Matches a nonempty sequence of characters from a set of expected characters (the scanset).286)

[Bold italic emphasis added. I've left the footnote references in place, but the contents of the footnotes are not material to the discussion so I've omitted them.]

So, the behaviour you are seeing is exactly what the standard demands. When %9[a] is applied to the string abook, there is a sequence of one a which matches the %9[a] conversion specification, so the directive is successful, and the scan continues with book. When %9[a] is applied to the string book, there are zero characters matching the item, so the execution of the directive fails and it is a matching error and since it is the first conversion specification, the return value of 0 is correct.

Note that the length specifies a maximum field width, so the 9 in %9[a] means 1-9 letters a.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top