Question

To my astonishment, this compiles:

const char* c_str()
{
    static const char nullchar = '\0';
    return nullchar;
}

and it introduced a bug in my code. Thankfully, I caught it.

Is this intentional by C++, or a compiler bug? Is there a reason why the data type is actively ignored?
It worked in Visual C++ 2010 and GCC, but I don't understand why it should work, given the obvious data type mismatch. (The static isn't necessary, either.)

Solution

As you've defined it, nullchar is an integer constant expression with the value 0.

The C++03 standard defines a null pointer constant as: "A null pointer constant is an integral constant expression (5.19) rvalue of integer type that evaluates to zero." To make a long story short, your nullchar is a null pointer constant, meaning it can be implicitly converted and assigned to essentially any pointer.

Note that all those elements are required for that implicit conversion to work though. For example, if you had used '\1' instead of '\0', or if you had not specified the const qualifier for nullchar, you wouldn't get the implicit conversion -- your assignment would have failed.

Inclusion of this conversion is intentional but widely known as undesirable. 0 as a null pointer constant was inherited from C. I'm fairly sure Bjarne and most of the rest of the C++ standard committee (and most of the C++ community in general) would dearly love to remove this particular implicit conversion, but doing so would destroy compatibility with a lot of C code (probably close to all of it).
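
If the function in the question was meant to return an empty string rather than a null pointer (an assumption; the question does not say what the bug was), taking the address of nullchar would do that. A minimal sketch, with the hypothetical names c_str_fixed and c_str_fixed_demo:

```cpp
#include <cassert>
#include <cstring>  // std::strlen

// Hypothetical variant of the question's function; the name is an assumption.
const char* c_str_fixed()
{
    static const char nullchar = '\0';
    return &nullchar;  // address of the terminator: a non-null pointer to an empty string
}

void c_str_fixed_demo()
{
    assert(c_str_fixed() != nullptr);         // not the null pointer
    assert(std::strlen(c_str_fixed()) == 0);  // points at an empty C string
}
```

Here the static matters: without it the function would return the address of a local variable that no longer exists after the call.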

OTHER TIPS

This is old history: it goes back to C.

There is no null keyword in C. A null pointer constant in C is either:

  • an integral constant expression with value 0, like 0, 0L, '\0' (remember that char is an integral type), (2-4/2)
  • such expression cast to void*, like (void*)0, (void*)0L, (void*)'\0', (void*)(2-4/2)

The NULL macro (not a keyword!) expands to such null pointer constant.

In the first C++ design, only the integral constant expression was allowed as a null pointer constant. C++11 later added the nullptr keyword, whose type is std::nullptr_t.
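
As a rough illustration of the spellings mentioned so far, the sketch below initializes a pointer from the literal 0, from the NULL macro, and from nullptr (which needs a C++11 compiler). The function name null_spellings_demo is made up for this example.

```cpp
#include <cassert>
#include <cstddef>  // NULL

// Hypothetical demo function; the name is not from the original answer.
void null_spellings_demo()
{
    const char* from_literal = 0;        // integer literal 0
    const char* from_macro   = NULL;     // macro that expands to a null pointer constant
    const char* from_keyword = nullptr;  // C++11 keyword, a prvalue of type std::nullptr_t

    // All three initializers produce the null pointer value of type const char*.
    assert(from_literal == from_macro);
    assert(from_macro == from_keyword);
    assert(from_keyword == 0);
}
```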

In C++, but not in C, a const variable of integral type initialized with an integral constant expression is an integral constant expression:

const int c = 3;
int i;

switch(i) {
case c: // valid C++
// but invalid C!
}
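
A slightly fuller sketch of the same point, assuming a C++ compiler: the const variable is used both as an array bound and as a case label, two places that require a constant expression. The names table and matches_c are hypothetical.

```cpp
#include <cassert>

const int c = 3;  // const integral variable with a constant initializer

int table[c];     // array bound: requires a constant expression

// Hypothetical helper: returns true when i matches the case label c.
bool matches_c(int i)
{
    switch (i) {
    case c:       // case label: also requires a constant expression
        return true;
    default:
        return false;
    }
}

void case_label_demo()
{
    assert(matches_c(3));
    assert(!matches_c(4));
    assert(sizeof(table) / sizeof(table[0]) == 3);
}
```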

So a const char initialized with the expression '\0' is a null pointer constant:

int zero() { return 0; }

void foo() {
    const char k0 = '\0',
               k1 = 1,
               c = zero();
    int *pi;

    pi = k0; // OK (constant expression, value 0)
    pi = k1; // error (value 1)
    pi = c; // error (not a constant expression)
}
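
The difference between these three variables can also be observed without any pointer conversion. The sketch below assumes a C++11 compiler and uses static_assert, which only accepts constant expressions; the function name constant_expression_demo is hypothetical.

```cpp
#include <cassert>

int zero() { return 0; }

// Hypothetical demo function; static_assert requires C++11.
void constant_expression_demo()
{
    const char k0 = '\0';    // constant initializer: usable in constant expressions
    const char k1 = 1;       // constant initializer, but the value is not zero
    const char c  = zero();  // run-time initializer: not a constant expression

    static_assert(k0 == 0, "k0 is a constant expression with value 0");
    static_assert(k1 == 1, "k1 is a constant expression with value 1");
    // static_assert(c == 0, "");  // would not compile: c is not a constant expression

    assert(c == 0);          // c can only be checked at run time
}
```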

And you think this is not sound language design?


Updated to include relevant parts of the C99 standard... According to §6.6, paragraph 6...

An integer constant expression shall have integer type and shall only have operands that are integer constants, enumeration constants, character constants, sizeof expressions whose results are integer constants, and floating constants that are the immediate operands of casts. Cast operators in an integer constant expression shall only convert arithmetic types to integer types, except as part of an operand to the sizeof operator.

Some clarifications for C++-only programmers:

  • C uses the term "constant" for what C++ programmers know as a "literal".
  • In C++, sizeof is always a compile time constant; but C has variable length arrays, so sizeof is sometimes not a compile time constant.

Then, we see that §6.3.2.3, paragraph 3 states...

An integer constant expression with the value 0, or such an expression cast to type void *, is called a null pointer constant. If a null pointer constant is converted to a pointer type, the resulting pointer, called a null pointer, is guaranteed to compare unequal to a pointer to any object or function.


nullchar is a (compile-time-)constant expression, with value 0. So it's fair game for implicit conversion to a null pointer.

In more detail: I'm quoting from a 1996 draft standard here.

char is an integral type. nullchar is const, so it is a (compile-time) integral constant expression, as per section 5.19.1:

5.19 Constant expressions [expr.const]

1 In several places, C++ requires expressions that evaluate to an integral or enumeration constant ... An integral constant-expression can involve ... const variables ...

Moreover, nullchar evaluates to 0, allowing it to be implicitly converted to a pointer, as per section 4.10.1:

4.10 Pointer conversions [conv.ptr]

1 An integral constant expression (expr.const) rvalue of integer type that evaluates to zero (called a null pointer constant) can be converted to a pointer type.

Perhaps an intuitive reason "why" this might be allowed (just off the top of my head) is that pointer width isn't specified, and so conversion from any size integral constant expression to a null pointer is allowed.


Updated with the relevant parts of the (newer) C++03 standard... According to §5.19.1...

An integral constant-expression can involve only literals (2.13), enumerators, const variables or static data members of integral or enumeration types initialized with constant expressions (8.5), non-type template parameters of integral or enumeration types, and sizeof expressions.

Then, we look to §4.10.1...

A null pointer constant is an integral constant expression (5.19) rvalue of integer type that evaluates to zero. A null pointer constant can be converted to a pointer type; the result is the null pointer value of that type and is distinguishable from every other value of pointer to object or pointer to function type. Two null pointer values of the same type shall compare equal.

It compiles for the very same reason this compiles:

const char *p = 0; // OK

const int i = 0;
double *q = i; // OK

const short s = 0;
long *r = s; // OK

The expressions on the right have types int and short, while the object being initialized is a pointer. Does this surprise you?

In C++ language (as well as in C) integral constant expressions (ICEs) with value 0 have special status (although ICEs are defined differently in C and C++). They qualify as null-pointer constants. When they are used in pointer contexts, they are implicitly converted to null pointers of the appropriate type.

Type char is an integral type, not much different from int in this context, so a const char object initialized by 0 is also a null-pointer constant in C++ (but not in C).

BTW, type bool in C++ is also an integral type, which means that a const bool object initialized by false is also a null-pointer constant:

const bool b = false;
float *t = b; // OK

A later defect report against C++11 changed the definition of a null-pointer constant. After the correction, a null pointer constant can only be "an integer literal with value zero or a prvalue of type std::nullptr_t". The above pointer initializations that use const objects are no longer well-formed in C++11 after the correction.
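
A minimal sketch of what remains allowed under the corrected rule, assuming a C++11 compiler. The function name null_pointer_constants_demo is hypothetical, and the rejected initialization is left in a comment because a conforming compiler would refuse it.

```cpp
#include <cassert>
#include <cstddef>      // std::nullptr_t
#include <type_traits>  // std::is_convertible

// std::nullptr_t converts implicitly to any pointer type...
static_assert(std::is_convertible<std::nullptr_t, float*>::value,
              "nullptr_t converts to float*");
// ...but, unlike the literal 0, it does not convert implicitly to an integer.
static_assert(!std::is_convertible<std::nullptr_t, int>::value,
              "nullptr_t does not convert to int");

// Hypothetical demo function; the name is not from the original answer.
void null_pointer_constants_demo()
{
    float* from_literal = 0;        // still allowed: integer literal with value zero
    float* from_nullptr = nullptr;  // still allowed: prvalue of type std::nullptr_t

    // const bool b = false;
    // float* t = b;                // rejected once the corrected rule is implemented

    assert(from_literal == from_nullptr);
}
```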

It is not ignoring the data type. It's not a bug. It's taking advantage of the const you put in there and seeing that its value is actually an integer 0 (char is an integer type).

Integer 0 is a valid (by definition) null pointer constant, which can be converted to a pointer type (becomes the null pointer).

The reason you'd want a null pointer is to have a pointer value that "points to nowhere" and that can be checked for (i.e. you can compare a null pointer to the integer constant 0, and you will get true in return).
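
For instance, a function can use that comparison to tell "no string at all" apart from a real string. This is only a sketch; first_char_or_nul is a hypothetical helper, not something from the question.

```cpp
#include <cassert>

// Hypothetical helper: returns the first character, or '\0' when s is the null pointer.
char first_char_or_nul(const char* s)
{
    if (s == 0) {      // comparing against the literal 0 tests for the null pointer
        return '\0';
    }
    return *s;         // safe to dereference: s points at a character
}

void null_check_demo()
{
    assert(first_char_or_nul(0) == '\0');     // the literal 0 converts to a null const char*
    assert(first_char_or_nul("abc") == 'a');
}
```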

If you drop the const, you will get an error. If you use double instead (or most other non-integer types; I guess the only exceptions are types that can be converted to const char* through an overloaded conversion operator), you will get an error with or without the const. And so forth.

The whole point is that, in this case, your implementation sees that you're returning a null pointer constant, which can be converted to a pointer type.

It seems that a lot of the real answer to this question has ended up in the comments. To summarize:

  • The C++ standard allows const variables of integral type to be considered "integral constant expressions." Why? Quite possibly to get around the fact that C only allows macros and enums to stand in for an integral constant expression.

  • Going (at least) as far back as C89, an integral constant expression with value 0 is implicitly convertible to (any type of) null pointer. And this is used often in C code, where NULL is quite often #define'd as (void*)0.

  • Going back to K&R, the literal value 0 has been used to represent null pointers. This convention is used all over the place, with such code as:

    if ((ptr = malloc(...))) {...} else {/* error */}
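
Written out in full, that idiom might look like the sketch below. The helper name try_allocate is hypothetical, and the cast is only needed because this version is compiled as C++ rather than C.

```cpp
#include <cassert>
#include <cstdlib>  // std::malloc, std::free, std::size_t

// Hypothetical helper: returns true if n bytes could be allocated (and frees them again).
bool try_allocate(std::size_t n)
{
    void* ptr;
    if ((ptr = std::malloc(n))) {  // a non-null pointer tests as true
        std::free(ptr);
        return true;
    } else {
        return false;              // malloc returned the null pointer
    }
}

void malloc_idiom_demo()
{
    assert(try_allocate(16));      // a small allocation is expected to succeed
}
```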
    

There is an automatic conversion. If you run this program:

#include <stdio.h>
const char* c_str()
{
    static const char nullchar = '\0';
    return nullchar;
}

int main()
{
    printf("%d", static_cast<int>(sizeof(c_str())));  /* sizeof yields size_t, so cast before printing with %d */
    return 0;
}

the output will be 4 on my computer, which is the size of a pointer.

The compiler converts automatically. Note that at least GCC gives a warning (I don't know about VS).
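
The same observation can be checked without printing anything. In the sketch below, null_c_str is a hypothetical stand-in for the question's function that returns nullptr explicitly, so that it also compiles with C++11 and later compilers.

```cpp
#include <cassert>
#include <cstddef>  // std::size_t

// Hypothetical stand-in for the question's function; it returns a null pointer explicitly.
const char* null_c_str()
{
    return nullptr;
}

// sizeof inspects only the type of the call expression; null_c_str() is never called here.
std::size_t size_of_result()
{
    return sizeof(null_c_str());
}

void sizeof_demo()
{
    assert(size_of_result() == sizeof(const char*));  // the size of a pointer, not of a char
}
```

The actual number depends on the platform: 4 is typical for 32-bit targets, 8 for 64-bit ones.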

I think it comes down to the value zero being common to both types. By returning the null character, you are setting a null pointer. This would fail with any other character, because you are passing the value of the character to the pointer, not its address. Zero is both a valid pointer value and a valid character value, so a constant null character can be used to set a pointer.

In short, a constant zero can be used to set an empty value, whether the target is a pointer or an ordinary variable.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow