Question

So this may seem like a widely-answered question, but I'm interested more in the internals of what exactly happens differently between the two.

Other than the fact that the second example creates not only the memory, but a pointer to the memory, what happens in memory when the following happens:

char a[5];
char* b = new char[5];

And more directly related to why I asked this question, how come I can do

const int len = 5;
char* c = new char[len];

but not

const int len = 5;
char d[len]; // Compiler error

EDIT: Should have mentioned I'm getting this compiler error on VC++ (go figure...)

1>.\input.cpp(138) : error C2057: expected constant expression
1>.\input.cpp(138) : error C2466: cannot allocate an array of constant size 0
1>.\input.cpp(138) : error C2133: 'd' : unknown size

EDIT 2: Should have posted the exact code I was working with. The error is produced when the const length for the stack-allocated array is calculated from run-time values.

Assuming random(a,b) returns an int between a and b,

const int len1 = random(1,5);
char a[len1]; // Errors, since the value
              // is not known at compile time (thanks to answers)

whereas

const int len2 = 5;
char b[len2]; // Compiles just fine

Solution

The difference is the lifetime of the array. If you write:

char a[5];

then the array has a lifetime of the block it's defined in (if it's defined in block scope), of the class object which contains it (if it's defined in class scope) or static lifetime (if it's defined at namespace scope). If you write:

char* b = new char[5];

then the array has any lifetime you care to give it; you must explicitly terminate its lifetime with:

delete [] b;
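
For example (a minimal sketch, inside some function), the heap array can outlive the block that created it, as long as you keep the pointer and eventually delete it:

char* p = nullptr;
{
    p = new char[5];   // the array is created inside this block...
}                      // ...but it is still alive here
p[0] = 'x';            // fine: the allocation persists until it is deleted
delete [] p;           // explicit end of the array's lifetime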

And with regards to your last question:

int const len = 5;
char d[len];

is perfectly legal, and should compile. Where there is a difference:

int len = 5;    //  _not_ const
char d[len];    //  illegal
char* e = new char[len];    //  legal

The reason for the difference is mostly one of compiler technology and history: in the very early days, the compiler had to know the length in order to create the array as a local variable.
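One way to see that the constant matters is that the length becomes part of the array's type, which the compiler can use. A small illustrative sketch:

int const len = 5;
char d[len];                 // d's type is char[5]; the size is known to the compiler
char* e = new char[len];     // e's type is just char*; the 5 is not part of it

// sizeof(d) == 5, but sizeof(e) == sizeof(char*)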

OTHER TIPS

what happens in memory when the following happens:

char a[5]; 
char *b = new char[5];

Assuming a typical but somewhat simplified C++ implementation, and that the above code appears in a function:

char a[5];

The stack pointer is moved by 5 bytes, to make a 5-byte space. The name a now refers to that block of 5 bytes of memory.

char *b = new char[5];

The stack pointer is moved by sizeof(char*), to make space for b. A function is called that goes away and allocates 5 bytes from a thing called the "free store"; basically it carves 5 or more bytes off a big block of memory obtained from the OS, and does some book-keeping to ensure that when you free those bytes with delete[], they will be made available for future allocations to re-use. It returns the address of that allocated block of 5 bytes, which is stored into the space on the stack for b.

The reason that the second is more work than the first is that objects allocated with new can be deleted in any order. Local variables (aka "objects on the stack") are always destroyed in reverse order of being created, so less book-keeping is needed. In the case of trivially-destructible types, the implementation can just move the stack pointer by the same distance in the opposite direction.

To remove some of the simplifications I made: the stack pointer isn't really moved once for each variable, possibly it's only moved once on function entry for all variables in the function, in this case the space required is at least sizeof(char*) + 5. There may be alignment requirements on the stack pointer or the individual variables which mean it's not moved by the size required, but rather by some rounded-up amount. The implementation (usually the optimizer) can eliminate unused variables, or use registers for them instead of stack space. Probably some other things I haven't thought of.
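
A rough way to observe the difference (purely illustrative; the actual addresses and their layout are implementation-specific):

#include <iostream>

int main() {
    char a[5];                 // automatic storage ("the stack")
    char* b = new char[5];     // dynamic storage ("the free store")

    std::cout << "a lives at  " << static_cast<void*>(a) << '\n';
    std::cout << "b points to " << static_cast<void*>(b) << '\n';

    delete [] b;               // must be freed explicitly
}                              // a is released automatically here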

const int len1 = random(1,5);

The language rule is reasonably simple: the size of an array must be a constant expression. If a const int variable has an initializer in the same TU, and the initializer is a constant expression, then the variable name can be used in constant expressions. random(1,5) is not a constant expression, hence len1 cannot be used in constant expressions. 5 is a constant expression, so len2 is fine.

What the language rule is there for, is to ensure that array sizes are known at compile time. So to move the stack, the compiler can emit an instruction equivalent to stack_pointer -= 5 (where stack_pointer will be esp, or r13, or whatever). After doing that, it still "knows" exactly what offsets every variable has from the new value of the stack pointer -- 5 different from the old stack pointer. Variable stack allocations create a greater burden on the implementation.
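
Spelled out in code (a sketch that assumes the same random(a,b) function from the question):

const int len1 = random(1,5);   // value known only at run time
const int len2 = 5;             // value is a constant expression

// char a[len1];                // ill-formed: len1 is not usable in constant expressions
char  b[len2];                  // fine: len2 is a constant expression
char* c = new char[len1];       // fine: new[] accepts a run-time size
delete [] c;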

what happens in memory when the following happens:

char a[5];
char* b = new char[5];

char a[5] allocates 5 chars on the stack memory.
new char[5] allocates 5 chars on the heap memory.


And more directly related to why I asked this question, how come I can do:

const int len = 5;
char* c = new char[len];

but not

const int len = 5;
char d[len]; // Compiler error

Both compile successfully for me.

In C++ you can't have variable-length arrays on the stack. C99 has this feature, but C++ does not.

When you declare char d[ len ] you are allocating space on the stack. When you do char *c = new char[ len ] you allocate space on the heap.

The heap has its own manager and can allocate variable amounts of memory. In C++, stack arrays must be sized by constant expressions, which gives the compiler room for lots of optimizations: it knows how much space a given context will use and can lay out stack frames ahead of time. With variable-length arrays that wouldn't be possible, so the language designers decided to forbid them (standard C++ still has no variable-length arrays).

The third pair of lines should work; that should not be a compiler error. There must be something else going on there.

The difference between the first two examples is that the memory for char a[5]; will automatically be freed, while char* b = new char[5]; allocates memory that will not be freed until you expressly free it. An array that you allocate the first way cannot be used once that particular variable goes out of scope, because it is automatically destroyed and the memory is free to be overwritten. For an array created using new, you may pass the pointer around and use it freely outside the scope of the original variable, and even outside of the function in which it was created, until you delete it.
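
For instance (a minimal sketch; the function names are just for illustration), returning a pointer to a local array leaves the caller with a dangling pointer, whereas returning a pointer obtained from new[] is fine as long as the caller eventually calls delete []:

char* make_buffer_bad() {
    char local[5];
    return local;          // dangling: local's lifetime ends when the function returns
}

char* make_buffer_good() {
    return new char[5];    // stays alive until the caller does delete [] on it
}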

Something that you cannot do is:

int a = 5;       // not const
int b[a];        // error: the size is not a constant expression

For an array on the stack, the size must be known at compile time; for new int[a], a run-time size is fine.

Your a array is allocated on the stack; that means that, once the program is compiled, it knows it will have to reserve 5 bytes to store the chars of a. By contrast, b is just declared as a pointer, and its contents will be allocated at runtime on the heap, which can fail if memory is too scarce. Finally, as b has been new'ed, it must be deleted at some point, or you will be leaking memory.

When you're using new, you are allocating memory from the free-store/heap and you need to take care of releasing it yourself. Also, locating the free memory might actually take some time, as will freeing it.

When you are not using new, your memory gets reserved on the stack and is implicitly allocated and freed. I.e. when you enter a function, the call stack will just expand by the size of all your local variables (at least conceptually - for example, some variables can exist entirely in registers) and it will just be decremented when you leave the function.

When you allocate a variable with a dynamic size on the stack, as in your last example, it means that you need some additional information when entering the function scope. Specifically, the amount of space that needs to be reserved varies depending on the function inputs. Now if the sizes can be determined at the beginning of the function, everything is well - which is presumably why this is allowed in C99 - but if you have a size variable whose value you only know mid-function, you end up adding "fake" function calls. Together with C++'s scoping rules, this can get quite hairy, so it's conceptually a lot easier to just let C++ scoping take care of this via std::vector.
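
A sketch of the std::vector alternative (the function name and sizes here are just illustrative):

#include <vector>

void use_buffer(int len) {        // len is known only at run time (assume len > 0)
    std::vector<char> buf(len);   // the elements live on the heap
    buf[0] = 'x';                 // use it like an array
}                                 // the memory is released automatically here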

char a[5] allocates 5 * sizeof(char) bytes in stack memory, whereas new char[5] allocates those bytes in heap memory. Bytes allocated in stack memory are also guaranteed to be freed when the scope ends, unlike heap memory, which you must free explicitly.

char d[len] should be allowed, since len is declared const and initialized with a constant expression, so the compiler can easily generate the code to allocate those bytes in stack memory.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow