Historically, typedefs were a relatively late addition to C. Before they were added to the language, type names consisted of keywords (int, char, double, struct, etc.) and punctuation characters (*, [], ()), and so were easy to recognize unambiguously. An identifier could never be a type name, so an identifier in parentheses followed by an expression could not be a cast expression.
Typedefs made it possible for a user-defined identifier to be a type name, which rather seriously messed up the grammar.
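To make the problem concrete, here is a small sketch of my own (the function names are made up for illustration). The token sequence (T) - y appears in both functions, but it parses as a subtraction when T is an object and as a cast applied to a unary minus when T is a typedef name:

```c
#include <assert.h>

/* T is an ordinary object here, so (T) - y is a binary subtraction. */
int as_expression(void)
{
    int T = 5;
    int y = 3;
    return (T) - y;     /* parsed as T - y */
}

/* T is a typedef name here, so (T) - y is a cast applied to -y. */
int as_cast(void)
{
    typedef int T;
    int y = 3;
    return (T) - y;     /* parsed as (int)(-y) */
}

/* Hypothetical helper: checks that the two parses give different values. */
void check_parse_difference(void)
{
    assert(as_expression() == 2);
    assert(as_cast() == -3);
}
```

The parser cannot pick between the two readings from the tokens alone; it has to know what T was declared as.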
Take a look at the syntax of type-specifier in the C standard (I'll use the C90 version since it's slightly simpler):
type-specifier:
    void
    char
    short
    int
    long
    float
    double
    signed
    unsigned
    struct-or-union-specifier
    enum-specifier
    typedef-name
All but the last can be easily recognized because they either are keywords, or start with a keyword. But a typedef-name is just an identifier.
When a C compiler processes a typedef declaration, it needs to, in effect, introduce the typedef name as a new keyword. This means that, unlike in a language with a context-free grammar, there needs to be feedback from the symbol table to the parser.
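One common way to get that feedback is to let the tokenizer ask a table of known typedef names how to classify each identifier. The following is only a minimal sketch of that idea, not how any particular compiler does it; the token kinds, the fixed-size table, and all function names are assumptions for illustration:

```c
#include <assert.h>
#include <string.h>

/* Hypothetical token kinds: the grammar treats these two differently. */
enum token_kind { TOK_IDENTIFIER, TOK_TYPE_NAME };

#define MAX_TYPEDEFS 64

static const char *typedef_names[MAX_TYPEDEFS];
static int typedef_count = 0;

/* Called by the parser once it has finished a typedef declaration.
   Returns 1 on success, 0 if the table is full. */
int register_typedef(const char *name)
{
    if (typedef_count >= MAX_TYPEDEFS)
        return 0;
    typedef_names[typedef_count++] = name;
    return 1;
}

/* Called by the tokenizer for every identifier it scans. */
enum token_kind classify_identifier(const char *name)
{
    int i;
    for (i = 0; i < typedef_count; i++) {
        if (strcmp(typedef_names[i], name) == 0)
            return TOK_TYPE_NAME;
    }
    return TOK_IDENTIFIER;
}

/* Hypothetical demo: the same spelling changes token kind after a typedef. */
void demo_feedback(void)
{
    assert(classify_identifier("foo") == TOK_IDENTIFIER);
    register_typedef("foo");
    assert(classify_identifier("foo") == TOK_TYPE_NAME);
}
```

This flat table ignores scopes entirely, so it is only a starting point.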
And even that's a bit of an oversimplification. A typedef name can still be redefined, either as another typedef or as something else, in an inner scope:
    {
        typedef int foo; /* foo is a typedef name */
        {
            int foo; /* foo is now an ordinary identifier, an object name */
        }
        /* And now foo is a typedef name again */
    }
So a typedef name is effectively a user-defined keyword if it's used in a context where a type name is valid, but is still an ordinary identifier if it's redeclared.
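To handle redeclaration in inner scopes, the table the parser consults has to be scope-aware. Here is a rough sketch under my own assumptions (a fixed-size stack of symbols, made-up function names, no error recovery), where the innermost declaration of a name decides whether it currently counts as a type name:

```c
#include <assert.h>
#include <string.h>

#define MAX_SYMBOLS 128
#define MAX_DEPTH   32

struct symbol {
    const char *name;
    int is_typedef;     /* 1 for a typedef name, 0 for anything else */
};

static struct symbol symbols[MAX_SYMBOLS];
static int symbol_count = 0;
static int scope_start[MAX_DEPTH];  /* symbol_count at each scope entry */
static int scope_depth = 0;

/* Called on '{'. Returns 1 on success, 0 if nesting is too deep. */
int enter_scope(void)
{
    if (scope_depth >= MAX_DEPTH)
        return 0;
    scope_start[scope_depth++] = symbol_count;
    return 1;
}

/* Called on '}'. Discards every declaration made in the closed scope. */
void leave_scope(void)
{
    if (scope_depth > 0)
        symbol_count = scope_start[--scope_depth];
}

/* Records a declaration in the current scope. Returns 0 if the table is full. */
int declare(const char *name, int is_typedef)
{
    if (symbol_count >= MAX_SYMBOLS)
        return 0;
    symbols[symbol_count].name = name;
    symbols[symbol_count].is_typedef = is_typedef;
    symbol_count++;
    return 1;
}

/* Searches from the innermost declaration outward. */
int is_type_name(const char *name)
{
    int i;
    for (i = symbol_count - 1; i >= 0; i--) {
        if (strcmp(symbols[i].name, name) == 0)
            return symbols[i].is_typedef;
    }
    return 0;
}

/* Hypothetical demo that mirrors the nested-block example. */
void demo_scopes(void)
{
    enter_scope();
    declare("foo", 1);              /* typedef int foo; */
    assert(is_type_name("foo") == 1);
    enter_scope();
    declare("foo", 0);              /* int foo; */
    assert(is_type_name("foo") == 0);
    leave_scope();
    assert(is_type_name("foo") == 1);
    leave_scope();
    assert(is_type_name("foo") == 0);
}
```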
TL;DR: Parsing C is hard.