Question

A few years ago, before standardization of C, it was allowed to use struct selectors on addresses. For example, the following code was allowed and frequently used.

#define PTR 0xAA000
struct {  int integ; };

func() {
   int i;
   i = PTR->integ;    /* here, c is set to the first int at PTR */
   return c;
}

Maybe it wasn't very neat, but I like it. In my opinion, the power and versatility of this language also rely on its lack of constraints. Nowadays, compilers just report an error. I'd like to know if it is possible to remove this restriction in the GNU C compiler.

PS: similar code was used on the UNIX kernel by the inventors of C. (in V6, some dummy structures have been declared in param.h)


Solution

'A few years ago' is actually a very, very long time ago. AFAICR, the C in 7th Edition UNIX™ (1979, a decade before the C89 standard was defined) didn't support that notation any more (but see below).

The code shown in the question only worked when all structure members of all structures shared the same name space. That meant that structure.integ or pointer->integ always referred to an int at the start of a structure because there was only one possible structure member integ across the entire program.

Note that in 'modern' C (1978 onwards), you cannot refer to the structure type declared in the question: it has neither a structure tag nor a typedef, so the type is useless. The original code also references an undefined variable c.

To make it work, you'd need something like:

#define PTR 0xAA000
struct integ {  int integ; };

int func(void)
{
   struct integ *ptr = (struct integ *)PTR;
   return ptr->integ;
}
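
The fixed address 0xAA000 only means something on the original hardware; on a hosted system, dereferencing it will most likely crash. As a minimal sketch, the same cast can be wrapped in a function that takes the address as a parameter, so it can be tried against a real object. The name read_integ is hypothetical and not part of the original code.

```c
#include <stdint.h>

struct integ { int integ; };

/* Hypothetical helper: treat addr as the address of a struct integ
   and return its first (and only) member. The caller must make sure
   addr really points at a suitably aligned int. */
static int read_integ(uintptr_t addr)
{
    const struct integ *ptr = (const struct integ *)addr;
    return ptr->integ;
}
```

Calling read_integ((uintptr_t)&obj) on an ordinary struct integ object returns obj.integ; on the original machine, read_integ(0xAA000) would correspond to the PTR->integ of the question.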

C for 7th Edition UNIX

I suggested that the C with 7th Edition UNIX supported separate namespaces for separate structure types. However, the C Reference Manual published with the UNIX Programmer's Manual Vol 2 mentions in §8.5 Structures:

The names of structure members and structure tags may be the same as ordinary variables, since a distinction can be made by context. However, names of tags and members must be distinct. The same member name can appear in different structures only if the two members are of the same type and if their origin with respect to their structure is the same; thus separate structures can share a common initial segment.
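
The 'common initial segment' idea survives in standard C in a narrower form: when several structures that start with the same sequence of members are placed in a union, the shared leading members may be inspected through any of them. A small sketch of that rule follows; the type and function names (header, circle, rect, shape, shape_type) are made up for illustration.

```c
/* Three structures that share a common initial segment: int type. */
struct header { int type; };
struct circle { int type; int radius; };
struct rect   { int type; int width; int height; };

/* Putting them in a union is what makes the cross-structure
   access to the shared leading member well defined in ISO C. */
union shape {
    struct header h;
    struct circle c;
    struct rect   r;
};

/* Read the shared member without knowing which variant is stored. */
static int shape_type(const union shape *s)
{
    return s->h.type;
}
```

Unlike the 7th Edition rule, the member names here do not have to be unique across the program; only the types and order of the leading members matter.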

However, that same manual also mentions the notations (see also What does =+ mean in C):

§7.14.2 lvalue =+ expression
§7.14.3 lvalue =- expression
§7.14.4 lvalue =* expression
§7.14.5 lvalue =/ expression
§7.14.6 lvalue =% expression
§7.14.7 lvalue =>> expression
§7.14.8 lvalue =<< expression
§7.14.9 lvalue =& expression
§7.14.10 lvalue =^ expression
§7.14.11 lvalue =| expression

The behavior of an expression of the form "E1 =op E2" may be inferred by taking it as equivalent to "E1 = E1 op E2"; however, E1 is evaluated only once. Moreover, expressions like "i =+ p" in which a pointer is added to an integer, are forbidden.
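
The 'evaluated only once' clause carries over unchanged to the modern op= spelling. A rough sketch that makes the difference observable, using a counter that is bumped every time the lvalue expression is evaluated (slot, calls, compound_calls and expanded_calls are illustrative names, not anything from the manual):

```c
static int calls;   /* how many times the lvalue expression was evaluated */

/* Returns its argument, but records that it was called. */
static int *slot(int *p)
{
    calls++;
    return p;
}

/* E1 op= E2: the lvalue expression *slot(&x) is evaluated once. */
static int compound_calls(void)
{
    int x = 1;
    calls = 0;
    *slot(&x) += 5;
    return calls;
}

/* E1 = E1 op E2 written out by hand: slot() runs twice. */
static int expanded_calls(void)
{
    int x = 1;
    calls = 0;
    *slot(&x) = *slot(&x) + 5;
    return calls;
}
```

Both functions leave x equal to 6; they differ only in how many times slot is called, which matters as soon as the lvalue expression has side effects.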

AFAICR, the =op notation was not supported in the first C compilers I used (1983; I'm ancient, but not quite that ancient); only the modern += notations were allowed. In other words, I don't think the C described by that reference manual was fully current when the product was released. (I've not checked my 1st Edition of K&R. Does anyone have one on hand to check?) You can find the UNIX 7th Edition manuals online at http://cm.bell-labs.com/7thEdMan/.

OTHER TIPS

By giving the structure a type name and adjusting your macro slightly you can achieve the same effect in your code:

typedef struct { int integ; } PTR_t;
#define PTR ((PTR_t*)0xAA000)

"I'd like to know if it is possible to remove this restraint in the GNU C compiler."

I'm reasonably sure the answer is no -- that is, unless you rewrite gcc to support the older version of the language.

The gcc manual documents the -traditional command-line option:

'-traditional' '-traditional-cpp'

Formerly, these options caused GCC to attempt to emulate a pre-standard C compiler. They are now only supported with the `-E' switch. The preprocessor continues to support a pre-standard mode. See the GNU CPP manual for details.

This implies that modern gcc (the quote is from the 4.8.0 manual) no longer supports pre-ANSI C.

The particular feature you're referring to isn't just pre-ANSI, it's very pre-ANSI. The ANSI standard was published in 1989. The first edition of K&R was published in 1978, and as I recall the language it described didn't support the feature you're looking for. The initial release of gcc was in 1987, so it's very likely that no version of gcc has ever supported that feature.

Furthermore, enabling such a feature would break existing code which may depend on the ability to use the same member name in different structures. (Traces of the old rules survive in the standard C library, where for example the members of type struct tm all have names starting with tm_; in modern C that would not be necessary.)
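
To illustrate what existing code relies on: in modern C each structure has its own member namespace, so the same name can appear in unrelated structures with different types and offsets. A short sketch, with invented structure names (point, label), that would be rejected under the old shared-namespace rule:

```c
#include <stddef.h>

/* Two unrelated structures that both have a member called x.
   The types and the offsets differ, which modern C allows. */
struct point { int x; int y; };
struct label { char text[8]; double x; };

/* Offset of x within each structure, to show they really differ. */
static size_t point_x_offset(void)
{
    return offsetof(struct point, x);
}

static size_t label_x_offset(void)
{
    return offsetof(struct label, x);
}
```

Because x no longer identifies a single type and offset, an expression such as PTR->x applied to a bare integer would be ambiguous, which is one reason the old notation cannot simply be switched back on.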

You might be able to find sources for an ancient C compiler that works the way you want. The late Dennis Ritchie's home page would be a good starting point for that. It's not at all obvious that you'd be able to get such a compiler working on any modern system without a great deal of work. And the result would be a compiler that doesn't support a number of newer features of C that you might find useful, such as the long, signed, and unsigned keywords, the ability to pass structures by value, function prototypes, and diagnostics for attempts to mix pointers and integers.

C is better now than it was then. There are a few dangerous things that are slightly more difficult than they were, but I'm not aware that any actual expressive power has been lost.

Licensed under: CC-BY-SA with attribution