Question

Consider the function call (calling int sum(int, int))

printf("%d", sum(a,b));

How does the compiler decide that the , used in the function call sum(int, int) is not a comma operator?

NOTE: I didn't want to actually use the comma operator in the function call. I just wanted to know how the compiler knows that it is not a comma operator.


Solution

Look at the grammar for the C language. It is listed in full in Annex A of the standard. The parser steps through the tokens of a C program one at a time and matches each against the next item in the grammar. At each step only a limited number of options are valid, so the interpretation of any given token depends on the context in which it appears. Within each grammar rule, each line gives one valid alternative for the program to match.

Specifically, if you look for argument-expression-list (the rule for the arguments of a function call), you will see that it contains an explicit comma. Therefore, whenever the compiler's C parser is matching that rule, any comma it finds is understood as an argument separator, not as a comma operator. The same is true for parentheses (which can also occur in expressions).

This works because the argument-expression-list rule is careful to use assignment-expression, rather than the plain expression rule. An expression can contain commas at its top level, whereas an assignment-expression cannot. If this were not the case, the grammar would be ambiguous, and the compiler would not know what to do when it encountered a comma inside an argument list.

However, an opening parenthesis that is not part of a function definition or call, or of an if, while, or for statement, is interpreted as the start of a parenthesized expression (there is no other option, and it applies only where an expression may begin at that point). Inside those parentheses the expression rule applies, and that rule allows comma operators.
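As a rough illustration of where the expression rule applies, here is a small C sketch. The function names last_of_pair and count_steps are made up for this example. In both functions every comma is a comma operator, because it sits either inside a parenthesized expression or inside the expression clauses of a for statement.

```c
/* Parenthesized expression: the comma inside the parentheses is the
   comma operator. It evaluates t = 4 first, then yields t * 2. */
int last_of_pair(void)
{
    int t = 0;
    int x = (t = 4, t * 2);
    return x;
}

/* for statement: the first and third clauses are plain expressions here,
   so the commas in them are comma operators as well. */
int count_steps(void)
{
    int i, j, steps = 0;
    for (i = 0, j = 10; i < j; i++, j--)
        steps++;
    return steps;
}
```

Calling last_of_pair() gives 8, and count_steps() gives 5, since i and j meet after five iterations.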

Other tips

From C99 6.5.17:

As indicated by the syntax, the comma operator (as described in this subclause) cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers). On the other hand, it can be used within a parenthesized expression or within the second expression of a conditional operator in such contexts. In the function call

f(a, (t=3, t+2), c)

the function has three arguments, the second of which has the value 5.
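The standard's example can be checked with a short C sketch. The helper second_of_three and the values given to a and c are assumptions made for the demonstration; only the shape of the call comes from the quoted text.

```c
/* Hypothetical helper (not from the standard): returns its second argument
   so the value that was actually passed can be observed. */
static int second_of_three(int first, int second, int third)
{
    (void)first;
    (void)third;
    return second;
}

/* Mirrors the call f(a, (t=3, t+2), c) from the quoted passage.
   The two outer commas separate arguments; the inner one is the
   comma operator, protected by the parentheses. */
int second_argument_value(void)
{
    int a = 1, c = 9, t = 0;
    return second_of_three(a, (t = 3, t + 2), c);
}
```

second_argument_value() returns 5, matching the value the standard gives for the second argument.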

Another similar example is the initializer list of arrays or structs:

int array[5] = {1, 2};
struct Foo bar = {1, 2};
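A small sketch of the same distinction in initializer lists, written for C99 or later. The layout of struct Foo and the function init_demo are assumptions, since the snippet above does not define them. The parenthesized initializer shows the comma operator producing a single value.

```c
/* Hypothetical layout: the original snippet does not define struct Foo. */
struct Foo {
    int x;
    int y;
};

/* Fills out[] with selected elements and returns bar.x + bar.y. */
int init_demo(int out[4])
{
    int t = 0;
    int array[5] = {1, 2};              /* comma separates two initializers */
    int wrapped[5] = {(t = 1, t + 1)};  /* one initializer: comma operator yields 2 */
    struct Foo bar = {1, 2};            /* comma separates member initializers */

    out[0] = array[0];
    out[1] = array[1];
    out[2] = wrapped[0];
    out[3] = wrapped[1];                /* not initialized explicitly, so it is 0 */
    return bar.x + bar.y;
}
```

After the call, out holds 1, 2, 2, 0 and the return value is 3.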

To use a comma operator inside a function argument, wrap it in parentheses:

sum((a,b))

This won't compile, of course, because sum expects two arguments and now receives only one.

The reason is the C grammar. While everyone else seems to like to cite the example, the real deal is the phrase structure grammar for function calls in the Standard (C99). Yes, a function call consists of the () operator applied to a postfix expression (for example, an identifier):

 6.5.2 postfix-expression:
       ...
       postfix-expression ( argument-expression-list_opt )

together with

argument-expression-list:
       assignment-expression
       argument-expression-list , assignment-expression    <-- arglist comma

expression:
       assignment-expression
       expression , assignment-expression                  <-- comma operator

The comma operator can only occur in an expression, i.e. further down in the grammar. So the compiler treats a comma in a function argument list as one separating assignment-expressions, not as a comma operator inside a single expression.
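To make the two productions concrete, here is a toy recursive-descent parser in C. It is only a sketch under heavy simplifications and not how any real compiler is written: operands are runs of letters and digits, the only operators are + and = (treated alike), and all names (struct parser, classify_commas, and so on) are invented for this example. Each comma is counted by whichever rule consumes it.

```c
#include <ctype.h>

/* Hypothetical toy parser state. */
struct parser {
    const char *p;     /* next unread character */
    int separators;    /* commas consumed by argument-expression-list */
    int operators;     /* commas consumed by expression */
    int ok;            /* cleared on a syntax error */
};

static void parse_expression(struct parser *ps);

static void skip_spaces(struct parser *ps)
{
    while (*ps->p == ' ')
        ps->p++;
}

/* Consume character c if it comes next; report whether it did. */
static int match_char(struct parser *ps, char c)
{
    skip_spaces(ps);
    if (*ps->p != c)
        return 0;
    ps->p++;
    return 1;
}

static void parse_postfix(struct parser *ps);

/* assignment-expression (simplified): operand { (+ | =) operand }.
   Note that this rule never consumes a comma. */
static void parse_assignment(struct parser *ps)
{
    parse_postfix(ps);
    while (ps->ok && (match_char(ps, '+') || match_char(ps, '=')))
        parse_postfix(ps);
}

/* argument-expression-list: assignment-expression { , assignment-expression } */
static void parse_argument_list(struct parser *ps)
{
    parse_assignment(ps);
    while (ps->ok && match_char(ps, ',')) {
        ps->separators++;              /* arglist comma */
        parse_assignment(ps);
    }
}

/* operand: name, name(arguments), number, or ( expression ) */
static void parse_postfix(struct parser *ps)
{
    if (match_char(ps, '(')) {
        parse_expression(ps);          /* full expression allowed here */
        if (!match_char(ps, ')'))
            ps->ok = 0;
    } else if (isalnum((unsigned char)*ps->p)) {
        while (isalnum((unsigned char)*ps->p))
            ps->p++;
        if (match_char(ps, '(') && !match_char(ps, ')')) {
            parse_argument_list(ps);   /* function call with arguments */
            if (!match_char(ps, ')'))
                ps->ok = 0;
        }
    } else {
        ps->ok = 0;
    }
}

/* expression: assignment-expression { , assignment-expression } */
static void parse_expression(struct parser *ps)
{
    parse_assignment(ps);
    while (ps->ok && match_char(ps, ',')) {
        ps->operators++;               /* comma operator */
        parse_assignment(ps);
    }
}

/* Parses src as an expression. Returns 1 and fills both counts on success,
   0 on a syntax error or trailing input. */
int classify_commas(const char *src, int *separators, int *operators)
{
    struct parser ps = { src, 0, 0, 1 };
    parse_expression(&ps);
    skip_spaces(&ps);
    if (!ps.ok || *ps.p != '\0')
        return 0;
    *separators = ps.separators;
    *operators = ps.operators;
    return 1;
}
```

With this sketch, classify_commas("f(a, (t=3, t+2), c)", &s, &o) reports two separators and one operator, while "sum((a,b))" reports no separators and one operator. The parser never has to guess: once it is inside parse_argument_list, the only rule that can see a top-level comma is the list rule itself.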

Existing answers say "because the C language spec says it's a list separator, and not an operator".

However, your question is asking "how does the compiler know...", and that's altogether different: It's really no different from how the compiler knows that the comma in printf("Hello, world\n"); isn't a comma operator: The compiler 'knows' because of the context where the comma appears - basically, what's gone before.

The C 'language' can be described in Backus-Naur Form (BNF) - essentially, a set of rules that the compiler's parser uses to scan your input file. The BNF for C will distinguish between these different possible occurrences of commas in the language.

There are lots of good resources on how compilers work, and how to write one.

The draft C99 standard says:

As indicated by the syntax, the comma operator (as described in this subclause) cannot appear in contexts where a comma is used to separate items in a list (such as arguments to functions or lists of initializers). On the other hand, it can be used within a parenthesized expression or within the second expression of a conditional operator in such contexts. In the function call f(a, (t=3, t+2), c) the function has three arguments, the second of which has the value 5.

In other words, "because".

There are multiple facets to this question. One part is that the definition says so. Well, how does the compiler know what context this comma is in? That's the parser's job. For C in particular, the language can be parsed by an LR(1) parser (http://en.wikipedia.org/wiki/Canonical_LR_parser).

The way this works is that a parser generator produces a set of tables that describe the possible states of the parser. Only a certain set of symbols is valid in each state, and the same symbol may have a different meaning in different states. The parser knows that it is parsing a function call because of the preceding symbols. Thus, it knows that the possible states do not include the comma operator.

I am being very general here, but you can read all about the details in the linked Wikipedia article.

License: CC-BY-SA with attribution
Not affiliated with StackOverflow