Why do both clang and gcc only give a warning when there is a space after a backslash if the C standard says that whitespace is forbidden?

StackOverflow https://stackoverflow.com/questions/21635063

Question

When compiling the following snippet, both gcc and clang only issue a warning. Notice the space after the \ that follows int:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int \ 
        a = 10;

    printf("%d\n", a);
}

gcc:

main.c:7:6: warning: backslash and newline separated by space [enabled by default]

clang:

main.c:7:7: warning: backslash and newline separated by space
    int \
      ^

Section 5.1.1.2 of the C99 standard says:

Each instance of a backslash character (\) immediately followed by a new-line character is deleted, splicing physical source lines to form logical source lines.

Why don't C compilers conform to the C standard here? I think it is simply their creators' decision not to. I found a message on the gcc mailing list that I think introduced this behavior: http://gcc.gnu.org/ml/gcc-patches/2000-09/msg00430.html. It says this was done because trailing whitespace is common and they did not want to treat it as an error. How common is that?


Solution

A compiler is allowed to extend the language as long as it documents the change, which gcc does in its docs in section 6.21, Slightly Looser Rules for Escaped Newlines:

Recently, the preprocessor has relaxed its treatment of escaped newlines. Previously, the newline had to immediately follow a backslash. The current implementation allows whitespace in the form of spaces, horizontal and vertical tabs, and form feeds between the backslash and the subsequent newline. The preprocessor issues a warning, but treats it as a valid escaped newline and combines the two lines to form a single logical line. This works within comments and tokens, as well as between tokens. Comments are not treated as whitespace for the purposes of this relaxation, since they have not yet been replaced with spaces.

and clang strives to support gcc extensions and points to the gcc docs on them:

This document describes the language extensions provided by Clang. In addition to the language extensions listed here, Clang aims to support a broad range of GCC extensions. Please see the GCC manual for more information on these extensions.

So their obligations with respect to the standard are fulfilled; in fact, Linux depends on many gcc extensions. That extensions are permitted can be seen in the draft C99 standard, section 4 (Conformance), paragraph 6, which says:

[...]A conforming implementation may have extensions (including additional library functions), provided they do not alter the behavior of any strictly conforming program.3)

footnote 3 says:

This implies that a conforming implementation reserves no identifiers other than those explicitly reserved in this International Standard.

and paragraph 8:

An implementation shall be accompanied by a document that defines all implementation-defined and locale-specific characteristics and all extensions.

gcc also documents that you can use the -pedantic flag to generate a warning when extensions are used, and the -pedantic-errors flag to turn that warning into an error:

[...] to obtain all the diagnostics required by the standard, you should also specify -pedantic (or -pedantic-errors if you want them to be errors rather than warnings).

OTHER TIPS

Compile with the right options and gcc and clang will refuse to do the translation:

$ gcc -Wall -Werror -std=c11 -pedantic tst.c
tst.c: In function ‘main’:
tst.c:6:9: error: backslash and newline separated by space [-Werror]
cc1: all warnings being treated as errors
$

By default, gcc (before version 5) compiles in C89 mode with GNU extensions enabled (-std=gnu89) and is pretty indulgent.

A warning is generated when this happens. Don't write code that depends on this behavior; it is provided because trailing whitespace is significant (almost) nowhere else, and we get much better error recovery if we treat them as line continuations. Especially don't write code that depends on being able to put a comment after a line continuation, that is an accident of the implementation and will change in the future.

If you want to disable extensions, like the other answer says, enable warnings and use -pedantic-errors. Otherwise, feel free to use whatever features the compiler gives you as long as you understand the implications and what benefits they provide. If you want C99 mode, make sure to add -std=c99 to the command line.
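To check which mode a given command line actually selects, a small sketch like the following can help. It relies on the standard __STDC_VERSION__ macro and on __STRICT_ANSI__, which gcc and clang define for -ansi and -std=cXX but not for -std=gnuXX. The function names c_standard_name and strict_iso_mode are invented for this example.

```c
/* Name of the C standard revision the compiler was asked to follow,
   derived from the standard __STDC_VERSION__ macro. */
const char *c_standard_name(void)
{
#if !defined(__STDC_VERSION__)
    return "C89/C90";              /* the macro did not exist yet */
#elif __STDC_VERSION__ >= 202311L
    return "C23";
#elif __STDC_VERSION__ >= 201710L
    return "C17";                  /* also matches pre-release C23 drafts */
#elif __STDC_VERSION__ >= 201112L
    return "C11";
#elif __STDC_VERSION__ >= 199901L
    return "C99";
#else
    return "C90 + Amendment 1";    /* 199409L */
#endif
}

/* Returns 1 when a strict ISO mode (-ansi or -std=cXX) was requested and
   0 in a GNU mode (-std=gnuXX or the compiler default).
   Assumption: __STRICT_ANSI__ is a gcc/clang convention, not standard C. */
int strict_iso_mode(void)
{
#ifdef __STRICT_ANSI__
    return 1;
#else
    return 0;
#endif
}
```

Note that a strict ISO mode on its own does not turn the backslash-space warning into an error; that still takes -pedantic-errors or -Werror, as described above.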

You quoted text from the standard that says what happens when a backslash is followed by a newline.

That text does not tell us what the compiler (or C implementation generally) must do when there is a backslash followed by a space.
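For reference, here is a minimal sketch of the case the quoted text does cover: a backslash immediately followed by a new-line, with nothing in between. The names SUM, spliced_value and check_splice are made up for this illustration and do not come from the question.

```c
#include <assert.h>

/* The backslash is immediately followed by the new-line, so translation
   phase 2 deletes both and the macro body continues on the next line. */
#define SUM(a, b) \
    ((a) + (b))

/* Splicing happens before tokenization, so it can even split an identifier.
   After phase 2 the declaration below reads: int value = SUM(4, 6); */
int spliced_value(void)
{
    int val\
ue = SUM(4, 6);
    return value;
}

/* Hypothetical self-check: the spliced declaration behaves like a normal one. */
void check_splice(void)
{
    assert(spliced_value() == 10);
}
```

Because no whitespace separates either backslash from its new-line, this should compile without the "separated by space" diagnostic, even with -pedantic-errors.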

I do not see any particular rule for that in the C 2011 standard. A backslash outside of its proper place simply does not conform to the general syntax and grammar of C, so it becomes some sort of violation of those, depending on where it appears.

C 2011 5.1.1.3 says:

A conforming implementation shall produce at least one diagnostic message (identified in an implementation-defined manner) if a preprocessing translation unit or translation unit contains a violation of any syntax rule or constraint,…

GCC did that; it produces a diagnostic message. (A warning is a diagnostic.) So it conformed to C 2011 in this regard.

Licensed under: CC-BY-SA with attribution