Question

I'm studying the C++ standard on the exact behaviour of the preprocessor (I need to implement some sort of C++ preprocessor). From what I understand, the example below, which I made up to aid my understanding, should be valid:

#define dds(x) f(x,
#define f(a,b) a+b
dds(eoe)
su)

I expect the first function-like macro invocation, dds(eoe), to be replaced by f(eoe, (note the comma at the end of the replacement list), which is then treated as f(eoe,su) when the input is rescanned.

But a test with VC++2010 gave me this (I told VC++ to output the preprocessed file):

eoe+et_leoe+et_l
su)

This is counter-intuitive and obviously incorrect. Is it a bug in VC++2010 or a misunderstanding of the C++ standard on my part? In particular, is it incorrect to put a comma at the end of the replacement list like I did? My understanding of the C++ standard grammar is that any preprocessing tokens are allowed there.

EDIT:

I don't have GCC or other versions of VC++. Could someone help me verify this with those compilers?


Solution 2

To the best of my understanding, nothing in the [cpp.subst] or [cpp.rescan] portions of the standard makes what you do illegal. clang and gcc are right to expand it as eoe+su, and the MSC (Visual C++) behaviour should be reported as a bug.

I could not make your original form work with MSC, but I did find an ugly workaround using variadic macros. You may or may not find it helpful, but in any event here it is:

#define f(a,b) (a+b
#define dds(...) f(__VA_ARGS__)

It is expanded as:

(eoe+
su)

Of course, this won't work with gcc and clang.

OTHER TIPS

My answer is valid for the C preprocessor, but according to "Is a C++ preprocessor identical to a C preprocessor?", the differences are not relevant for this case.

From C, A Reference Manual, 5th edition:

When a functionlike macro call is encountered, the entire macro call is replaced, after parameter processing, by a copy of the body. Parameter processing proceeds as follows. Actual argument token strings are associated with the corresponding formal parameter names. A copy of the body is then made in which every occurrence of a formal parameter name is replaced by a copy of the actual parameter token sequence associated with it. This copy of the body then replaces the macro call. [...] Once a macro call has been expanded, the scan for macro calls resumes at the beginning of the expansion so that names of macros may be recognized within the expansion for the purpose of further macro replacement.

Note the words "within the expansion". Read literally, that is what would make your example invalid (UPDATE: read comments below). Now, combine it with this:

[...] The macro is invoked by writing its name, a left parenthesis, then one actual argument token sequence for each formal parameter, then a right parenthesis. The actual argument token sequences are separated by commas.

Basically, it all boils down to whether the preprocessor will rescan for further macro invocations only within the previous expansion, or if it will keep reading tokens that show up even after the expansion.

This may be hard to think about, but I believe that what should happen with your example is that the macro name f is recognized during rescanning, and since subsequent token processing reveals a macro invocation for f(), your example is correct and should output what you expect. GCC and clang give the correct output, and according to this reasoning, this would also be valid (and yield equivalent outputs):

#define dds f
#define f(a,b) a+b

dds(eoe,su)

And indeed, the preprocessing output is the same in both examples. As for the output you get with VC++, I'd say you found a bug.
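To check this claim on whatever compiler is at hand, the two forms can be compiled side by side. This is only a sketch: the parentheses added to the body of f, the name dds_obj for the object-like variant, and the helper functions are assumptions introduced for illustration and are not part of the question. On a preprocessor that rescans the way GCC and clang are reported to above, both functions should return the same sum. On one that produces the VC++2010 output shown in the question, the first function would presumably fail to compile.

```cpp
#include <cassert>

// Same shape as the macros in the question; parentheses around the sum are
// an addition so that the expansion is safe inside larger expressions.
#define f(a, b) ((a) + (b))
#define dds(x) f(x,      // replacement list ends with an unfinished call to f
#define dds_obj f        // hypothetical object-like variant from this answer

// The expansion of dds(eoe) is "f(eoe," and the remaining "su)" is taken
// from the tokens that follow the invocation, here on the next line.
int sum_via_function_like(int eoe, int su) {
    return dds(eoe)
    su);
}

// dds_obj expands to the bare name f, which then finds "(eoe, su)" in the
// subsequent tokens.
int sum_via_object_like(int eoe, int su) {
    return dds_obj(eoe, su);
}

// Both routes should end up as the same call f(eoe, su).
void check_equivalence() {
    assert(sum_via_function_like(2, 40) == sum_via_object_like(2, 40));
}

// Keep the short macro names from leaking into later code.
#undef f
#undef dds
#undef dds_obj
```

Calling check_equivalence() from a test or from main is enough. If the program builds and the assertion holds, the preprocessor treated both forms as f(eoe, su).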

This is consistent with C99 section 6.10.3.4, as well as C++ standard section 16.3.4, Rescanning and further replacement:

After all parameters in the replacement list have been substituted and # and ## processing has taken place, all placemarker preprocessing tokens are removed. Then, the resulting preprocessing token sequence is rescanned, along with all subsequent preprocessing tokens of the source file, for more macro names to replace.
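For someone implementing a preprocessor, the phrase "along with all subsequent preprocessing tokens" suggests one simple structure: push the replacement back in front of the unread input and keep scanning. The following is a toy sketch under heavy assumptions, not how any real compiler does it. Tokens are pre-split strings, only function-like macros with at least one parameter are handled, and argument pre-expansion, # and ##, and the rule that stops a macro from re-expanding itself are all left out. The names Tokens, Macro, expand and expand_question_example are hypothetical.

```cpp
#include <algorithm>
#include <deque>
#include <map>
#include <stdexcept>
#include <string>
#include <vector>

using Tokens = std::vector<std::string>;  // hypothetical: one string per token

struct Macro {
    Tokens params;  // formal parameter names
    Tokens body;    // replacement list
};

// Toy expander. Omitted on purpose: object-like macros, # and ##, argument
// pre-expansion, and recursion prevention (a self-referential macro would
// loop forever here).
Tokens expand(const Tokens& input, const std::map<std::string, Macro>& macros) {
    std::deque<std::string> pending(input.begin(), input.end());
    Tokens output;
    while (!pending.empty()) {
        std::string tok = pending.front();
        pending.pop_front();
        auto it = macros.find(tok);
        // A function-like macro name is an invocation only if "(" follows.
        if (it == macros.end() || pending.empty() || pending.front() != "(") {
            output.push_back(tok);
            continue;
        }
        pending.pop_front();  // consume "("
        // Collect arguments up to the matching ")", splitting on commas that
        // are not nested inside inner parentheses.
        std::vector<Tokens> args(1);
        int depth = 0;
        bool closed = false;
        while (!pending.empty() && !closed) {
            std::string t = pending.front();
            pending.pop_front();
            if (depth == 0 && t == ")") {
                closed = true;
            } else if (depth == 0 && t == ",") {
                args.emplace_back();
            } else {
                if (t == "(") ++depth;
                if (t == ")") --depth;
                args.back().push_back(t);
            }
        }
        const Macro& m = it->second;
        if (!closed || args.size() != m.params.size())
            throw std::runtime_error("malformed invocation of " + tok);
        // Substitute each parameter name in the body with its argument tokens.
        Tokens replaced;
        for (const std::string& b : m.body) {
            auto p = std::find(m.params.begin(), m.params.end(), b);
            if (p == m.params.end()) {
                replaced.push_back(b);
            } else {
                const Tokens& arg = args[p - m.params.begin()];
                replaced.insert(replaced.end(), arg.begin(), arg.end());
            }
        }
        // Rescan: the replacement goes back in FRONT of the unread tokens, so
        // the next iteration sees it together with whatever follows it.
        pending.insert(pending.begin(), replaced.begin(), replaced.end());
    }
    return output;
}

// The question's macros and input: dds(eoe) su)
Tokens expand_question_example() {
    std::map<std::string, Macro> macros;
    macros["dds"] = Macro{{"x"}, {"f", "(", "x", ","}};
    macros["f"] = Macro{{"a", "b"}, {"a", "+", "b"}};
    return expand({"dds", "(", "eoe", ")", "su", ")"}, macros);
}
```

In this sketch, expand_question_example() yields the three tokens eoe, +, su. The f( eoe , left behind by dds is pushed back in front of su ) before scanning continues, so f finds both of its arguments.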

Well, the problem I see is that the preprocessor does the following:

dds(x) becomes f(x,

However, f is defined as well (even though it is defined as f(a,b)), so f(x, expands to x + garbage.

So dds(x) finally transforms into x + garbage (because you defined f(something, ).

Your dds(eoe) actually expands into a+b, where a is eoe and b is et_l. And it does that twice, for whatever reason :).

The scenario you constructed is compiler-specific; it depends on how the preprocessor chooses to handle the expansion of the defines.

Licensed under: CC-BY-SA with attribution