Multiple Include Optimization

Question 1

At what point when parsing header2.h does the preprocessor skip this file?

The file is not skipped.

My understanding is that it skips this file immediately after the #if directive on line 1, i.e. it does not have to wait for the matching #endif. Is this correct?

Yes and no. Some compilers record the sentry macro when they parse a header for the first time; if the same header is included again while that macro is still defined, they skip the file immediately. Other compilers parse the header again, looking for the matching #endif.

What can I add to the example above to demonstrate how this works?

Add a print message inside and outside the sentry macro:

#ifndef  _HEADER_INCLUDED
#define  _HEADER_INCLUDED
...
#pragma message ("inside sentry in " __FILE__ "\n")
#endif // #ifndef _HEADER_INCLUDED

#pragma message ("outside sentry in " __FILE__ "\n")

Relevant material:

You can use #pragma once instead of the sentry macro. It is non-standard but widely supported. Compilation can be faster, since very little of the file is parsed on repeated inclusion, and there are no worries about macro name collisions.
You can also wrap each #include in a check of that header's sentry macro, so the header file isn't loaded again at all. This is usually done in library headers that include the same headers many times. It can significantly speed up compilation, at the expense of ugly code:

#ifndef __LIST_H_

#include "list.h"

#endif

Question 2

At what point when parsing header2.h does the preprocessor skip this file?

As @Sean says, header2.h will never be skipped, but the content between the #ifndef ... #endif will be ignored in this case.

What can I add to the example above to demonstrate how this works?

Add something (for example, a #define B 123) after the #endif in header2.h. Now try to access it in main. It will be accessible.

Now try adding it before the #endif instead. You'll see that it's not accessible in main.

Question 3

The pre-processor will never skip header2.h. It will always include it, and when expanding it will ignore the stuff in the #ifndef block.

In your example A will be 32, as the #define in header2.h will never be reached. If it were reached, you'd get some sort of "macro redefinition" diagnostic, as you'd have multiple #defines for "A". To fix this you'd need to #undef A first.

Most compilers support the #pragma once directive these days to save you having to write include guards in header files.

Question 4

Once a #if, #ifdef or #ifndef evaluates to false, the preprocessor stops passing the input that follows it on to the subsequent compilation steps.

It does, however, continue reading that input, so that it can keep track of the nesting depth of the conditional-compilation directives.

When it reaches the #endif that matches the directive where blocking started, it simply stops blocking.

Question 5

If I understand this correctly, this means that any header file is read only once, even if it is included multiple times in a given compile process. And so additional include guards in application code or header files provide no benefit.

No. GCC only performs this optimization for files that it knows to be safe, according to the following rules:

There must be no tokens outside the controlling #if-#endif pair, but whitespace and comments are permitted.
There must be no directives outside the controlling directive pair, but the null directive (a line containing nothing other than a single ‘#’ and possibly whitespace) is permitted.

The opening directive must be of the form

  #ifndef FOO

or

  #if !defined FOO     [equivalently, #if !defined(FOO)]