Question

I can't find the relevant bits in the standard, but gcc and clang both allow it, so I'm wondering whether it's a compiler extension or part of the language. Provide a link if you can.

This can arise with things such as this:

extern char arr[];

func(arr[7]); /*No error.*/

LATE EDIT: I never did get a clear understanding of this, even though I had moved on, so I'm starting a bounty, which I will award to the first person who gives me a clear, concise reference (or references) in the C89 standard explaining why this is allowed. C99 is acceptable if nobody can find the answer in C89, but you need to look in the C89 standard first.


Solution

The following statement

extern char arr[];

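is a declaration with external linkage. It says that arr has the type "array of char" with no size given, so arr has an incomplete type in this translation unit. That is permitted, because only identifiers with no linkage are required to have a complete type.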

According to "6.7 Declarations" (n1570):

7 If an identifier for an object is declared with no linkage, the type for the object shall be complete by the end of its declarator, or by the end of its init-declarator if it has an initializer; in the case of function parameters (including in prototypes), it is the adjusted type (see 6.7.6.3) that is required to be complete.

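Since arr has external linkage, this requirement does not apply to it. arr[7] is equivalent to *(arr + 7), and for the subscript operator one of the operands must have the type "pointer to complete object type". In this expression arr is converted from "array of char" to "pointer to char", and char is a complete object type, so the constraint is satisfied.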

According to "6.3.2.1 Lvalues, arrays, and function designators" (n1570):

3 Except when it is the operand of the sizeof operator, the _Alignof operator, or the unary & operator, or is a string literal used to initialize an array, an expression that has type ‘‘array of type’’ is converted to an expression with type ‘‘pointer to type’’ that points to the initial element of the array object and is not an lvalue.
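
A minimal single-file sketch of the two paragraphs quoted above. The names peek7 and demo_peek7 and the array contents are made up for illustration. In a real program the definition of arr would normally live in a different translation unit.

```c
#include <assert.h>

extern char arr[];          /* external linkage, incomplete type: size unknown here */

/* arr is converted to char *, so arr[7] is *(arr + 7).
   The size of arr is not needed to compile this function. */
char peek7(void)
{
    return arr[7];
}

/* The definition completes the type. Normally this sits in another .c file. */
char arr[16] = "0123456789";

/* Illustrative check: element 7 of "0123456789" is the character '7'. */
void demo_peek7(void)
{
    assert(peek7() == '7');
}
```

Calling demo_peek7() from main should pass silently, since the linked definition has more than 8 elements.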

OTHER TIPS

"A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2)))"

From ISO/IEC 9899:201x Committee Draft — April 12, 2011

So arr[7] is perfectly legal, as is 7[arr]. It being a legal expression does not mean it is referring to memory locations your process has permission to access, or the memory locations you intend.

Citing from WG14/N1124 Committee Draft May 6, 2005 ISO/IEC 9899:TC2

6.2.5 Types

[22] An array type of unknown size is an incomplete type. It is completed, for an identifier of that type, by specifying the size in a later declaration (with internal or external linkage).

extern char arr[];

therefore declares arr with an incomplete type.

6.5.2.1 Array subscripting

[2] A postfix expression followed by an expression in square brackets [] is a subscripted designation of an element of an array object. The definition of the subscript operator [] is that E1[E2] is identical to (*((E1)+(E2))). Because of the conversion rules that apply to the binary + operator, if E1 is an array object (equivalently, a pointer to the initial element of an array object) and E2 is an integer, E1[E2] designates the E2-th element of E1 (counting from zero).

func(arr[7]); /* or func(7[arr]); */ 
is identical to

func( *(arr + 7) ); /* storage for arr is defined in another module */
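
As a rough illustration of that equivalence, the sketch below compares the addresses produced by the three spellings. The function names and the contents of arr are assumptions for the example, and the definition is placed in the same file only so the sketch is self-contained.

```c
#include <assert.h>

extern char arr[];          /* incomplete type at this point */

/* Returns 1 when arr[7], 7[arr] and *(arr + 7) all name the same element. */
int same_element(void)
{
    return &arr[7] == &7[arr] && &arr[7] == arr + 7;
}

/* Seven characters plus the terminating NUL: 8 elements, so index 7 is valid. */
char arr[8] = "abcdefg";

void demo_same_element(void)
{
    assert(same_element() == 1);
    assert(arr[7] == '\0');   /* element 7 is the terminating NUL */
}
```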

6.7.5.2 Array Declarators

[4] If the size is not present, the array type is an incomplete type.
[8] EXAMPLE 2 Note the distinction between the declarations
extern int *x;
extern int y[];

The first declares x to be a pointer to int; the second declares y to be an array of int of unspecified size (an incomplete type), the storage for which is defined elsewhere.

x has a complete type because sizeof x is known. y has an incomplete type because its size is unknown while this translation unit is being compiled.

extern char arr[];

is not identical to

extern char *arr;
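
A small sketch of the difference, reusing x and y from the standard's example. The helper names and the initial values are assumptions, and the definitions appear in the same file only to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

extern int *x;      /* complete type: pointer to int */
extern int y[];     /* incomplete type: array of int of unknown size */

/* Fine: the size of a pointer is always known. */
size_t size_of_x(void) { return sizeof x; }

/* "sizeof y" at this point would be rejected, because y is still incomplete. */

int y[4] = {10, 20, 30, 40};    /* this definition completes the type of y */
int *x = y;                     /* x is a separate object holding an address */

/* Fine now: y has the complete type int[4]. */
size_t size_of_y(void) { return sizeof y; }

void demo_sizes(void)
{
    assert(size_of_x() == sizeof(int *));
    assert(size_of_y() == 4 * sizeof(int));
    assert(x[2] == 30 && y[2] == 30);   /* subscripting works for both */
}
```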

Footnote[92]

If prior invalid pointer operations (such as accesses outside array bounds) produced undefined behavior, subsequent comparisons also produce undefined behavior.

Annex J.2 Undefined behavior

- An array subscript is out of range, even if an object is apparently accessible with the given subscript (as in the lvalue expression a[1][7] given the declaration int a[4][5]) (6.5.6).
- Addition or subtraction of a pointer into, or just beyond, an array object and an integer type produces a result that points just beyond the array object and is used as the operand of a unary * operator that is evaluated (6.5.6).

The compiler is expected to generate code that accesses element 7 of arr (counting from zero). If the definition of arr does not contain at least 7+1, i.e. 8, elements, the behavior is undefined. The code still compiles, though, and its behavior is well defined as long as the arr that gets linked in is large enough.

With an incomplete array type the compiler cannot check bounds for you, so you have to do your own bookkeeping and know for yourself whether arr[7] is a valid location.

Because of this, accessing indexed locations within the array is essentially the only way to use an array of incomplete type; anything that needs its size, such as sizeof, is not available.

For example, you can't initialise or assign an incomplete array arr[] as a whole from another array, as in arr = arr2, even if you know that the storage behind arr is large enough to hold arr2. You can only memcpy or iterate through each slot.
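
One way to do that bookkeeping, sketched under the assumption that the defining module also exports the element count. arr_len, arr_get and arr_fill are hypothetical names, not anything from the standard, and the definitions are in the same file only so the sketch is self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

extern char arr[];              /* incomplete type: no size available here */
extern const size_t arr_len;    /* hypothetical companion: number of elements */

/* Bounds-checked read: returns 0 on success, -1 if i is out of range. */
int arr_get(size_t i, char *out)
{
    if (i >= arr_len)
        return -1;
    *out = arr[i];
    return 0;
}

/* Arrays cannot be assigned, so copy the bytes instead.
   Returns -1 if the source does not fit. */
int arr_fill(const char *src, size_t n)
{
    if (n > arr_len)
        return -1;
    memcpy(arr, src, n);
    return 0;
}

/* Definitions, normally in the other module. */
char arr[8];
const size_t arr_len = sizeof arr;

void demo_arr_access(void)
{
    char c = 0;
    assert(arr_fill("abcdefgh", 8) == 0);
    assert(arr_get(7, &c) == 0 && c == 'h');
    assert(arr_get(8, &c) == -1);   /* one past the end is rejected */
}
```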

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow