From a language point of view the only thing that matters is that it is Undefined Behavior. Anything goes. But that does not mean that you cannot explain the behavior you are getting.
The first thing to note is that the key difference is not that you provided a user destructor, but that the destructor (implicitly defined or user-defined) is not trivial. Basically, a destructor is trivial if it does nothing at all (a user-provided destructor is never considered trivial, even if its body is empty). You can end up with a non-trivial destructor without providing one yourself: it is enough that any subobject (member or base) has a non-trivial destructor.
#include <string>

struct NonTrivialDestructor {
    std::string s;   // std::string has a non-trivial destructor
};
struct NotTrivialEither : NonTrivialDestructor {};
Now, with that said, the main difference between the case where you experience the crash and the case where you don't is that in the latter the compiler knows the destructors won't do anything, and thus that it makes no difference whether they are called or not.
The importance of this statement is that when destructors do (or may do) something, the compiler must ensure that the generated code calls as many destructors as needed. Since the pointer returned by new[] does not encode how many objects were allocated (and thus need to be destroyed), that information has to be tracked somewhere else. The most common approach is for the compiler to generate code that allocates one additional size_t object, sets its value to the number of elements, and then constructs the array objects after it. The layout of the memory would be:
v
+------+---+---+---+--
| size | T | T | T |...
+------+---+---+---+--
       ^
The pointer returned by new[] needs to hold the address of the first object, which means that even though the underlying allocation function (operator new[]) returns the pointer v, the new[] expression will return ^ in the drawing above.
The additional information allows delete[] to know how many destructors to call, basically doing something like:

pseudo-delete[](T *p) {
    size_t *size = reinterpret_cast<size_t*>(p) - 1;
    for (size_t i = 0; i < *size; ++i)
        (p + i)->~T();
    deallocate(size);
}
That is, delete[] reads the count from the preceding memory location, uses it to drive how many destructors to run, and then releases the memory that was originally obtained from the system (pointer v in the drawing).
On the other hand, delete is meant to handle deletion of a single object. No additional count is needed (the count must be 1), so no extra information is tracked. The pseudo code for plain delete would just call p->~T() and then deallocate(p) directly. This is where you get the crash.
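Written out in the same pseudo-code style as pseudo-delete[] above:

```
pseudo-delete(T *p) {
    p->~T();        // exactly one destructor, no count needed
    deallocate(p);  // assumes p is the address the allocator returned
}
```

The second assumption is exactly what the mismatch violates: p is ^, not v.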
The memory allocated by the underlying mechanism was at address v, but you attempted to release ^, causing the library to fail and crash.
In the case of new[] with types that have trivial destructors, the compiler knows that calling the destructors will not do anything, so it skips the operation completely; in the pseudo code above, the for loop is gone. This in turn removes the need for the additional size_t, and the memory returned by a call to new[] has the following layout:
v
+---+---+---+--
| T | T | T |...
+---+---+---+--
^
In this case the pointer obtained from the memory allocator and the pointer returned by new[] are the same, which means that when delete is evaluated, the pointer it passes to the deallocation function is the same one that was originally allocated.
It is important to note that all of this is about implementation details. This is undefined behavior per the standard: some implementations might opt to always track the size, or use a different mechanism to track the number of elements, and everything discussed here would be bogus. You cannot depend on this behavior, but I hope it helps people understand why the standard requires matching new/delete and new[]/delete[].