What's the point of VLA anyway?

Question 1

For reasons that are not entirely clear to me, almost every time the topic of C99 VLA pops up in a discussion, people start talking predominantly about the possibility of declaring run-time-sized arrays as local objects (i.e. creating them "on the stack"). This is rather surprising and misleading, since this facet of VLA functionality - support for local array declarations - happens to be a rather auxiliary, secondary capability provided by VLA. It does not really play any significant role in what VLA can do. Most of the time, the matter of local VLA declarations and their accompanying potential pitfalls is forced into the foreground by VLA critics, who use it as a "straw man" intended to derail the discussion and bog it down among barely relevant details.

The essence of VLA support in C is, first and foremost, a revolutionary qualitative extension of the language's concept of type. It involves the introduction of such fundamentally new kind of types as variably modified types. Virtually every important implementation detail associated with VLA is actually attached to its type, not to the VLA object per se. It is the very introduction of variably modified types into the language that makes up the bulk of the proverbial VLA cake, while the ability to declare objects of such types in local memory is nothing more than a insignificant and fairly inconsequential icing on that cake.

Consider this: every time one declares something like this in one's code

/* Block scope */
int n = 10;
...
typedef int A[n];
...
n = 5; /* <- Does not affect `A` */

size-related characteristics of the variably modified type A (e.g. the value of n) are finalized at the exact moment when the control passes over the above typedef-declaration. Any changes in the value of n made further down the line (below this declaration of A) don't affect the size of A. Stop for a second and think about what it means. It means that the implementation is supposed to associate with A a hidden internal variable, which will store the size of the array type. This hidden internal variable is initialized from n at run time when the control passes over the declaration of A.

This gives the above typedef-declaration a rather interesting and unusual property, something we haven't seen before: this typedef-declaration generates executable code (!). Moreover, it doesn't just generate executable code, it generates critically important executable code. If we somehow forget to initialize the internal variable associated with such typedef-declaration, we'll end up with a "broken"/uninitialized typedef alias. The importance of that internal code is the reason why the language imposes some unusual restrictions on such variably modified declarations: the language prohibits passing control into their scope from outside of their scope

/* Block scope */
int n = 10;
goto skip; /* Error: invalid goto */

typedef int A[n];

skip:;

Note once again that the above code does not define any VLA arrays. It simply declares a seemingly innocent alias for a variably modified type. Yet, it is illegal to jump over such typedef-declaration. (We are already familiar with such jump-related restrictions in C++, albeit in other contexts).

A code-generating typedef, a typedef that requires run-time initialization is a significant departure from what typedef is in the "classic" language. (It also happens to pose a significant hurdle of the way of adoption of VLA in C++.)

When one declares an actual VLA object, in addition to allocating the actual array memory the compiler also creates one or more hidden internal variables, which hold the size(s) of the array in question. One has to understand that these hidden variables are associated not with the array itself, but rather with its variably modified type.

One important and remarkable consequence of this approach is as follows: the additional information about array size, associated with a VLA, is not built directly into the object representation of the VLA. It is actually stored besides the array, as "sidecar" data. This means that object representation of a (possibly multidimensional) VLA is fully compatible with object representation of an ordinary classic compile-time-sized array of the same dimensionality and the same sizes. For example

void foo(unsigned n, unsigned m, unsigned k, int a[n][m][k]) {}
void bar(int a[5][5][5]) {}

int main(void)
{
  unsigned n = 5;
  int vla_a[n][n][n];
  bar(a);

  int classic_a[5][6][7];
  foo(5, 6, 7, classic_a); 
}

Both function calls in the above code are perfectly valid and their behavior is fully defined by the language, despite the fact that we pass a VLA where a "classic" array is expected, and vice versa. Granted, the compiler cannot control the type compatibility in such calls (since at least one of the involved types is run-time-sized). However, if desired, the compiler (or the user) has everything necessary to perform the run-time check in debug version of code.

(Note: As usual, parameters of array type are always implicitly adjusted into parameters of pointer type. This applies to VLA parameter declarations exactly as it applies to "classic" array parameter declarations. This means that in the above example parameter a actually has type int (*)[m][k]. This type is unaffected by the value of n. I intentionally added a few extra dimensions to the array to maintain its dependence on run-time values.)

Compatibility between VLA and "classic" arrays as function parameters is also supported by the fact that the compiler does not have to accompany a variably modified parameter with any additional hidden information about its size. Instead, the language syntax forces the user to pass this extra information in the open. In the above example the user was forced to first include parameters n, m and k into function parameter list. Without declaring n, m and k first, the user would not have been able to declare a (see also the above note about n). These parameters, explicitly passed into the function by the user, will bring over the information about the actual sizes of a.

For another example, by taking advantage of VLA support we can write the following code

#include <stdio.h>
#include <stdlib.h>

void init(unsigned n, unsigned m, int a[n][m])
{
  for (unsigned i = 0; i < n; ++i)
    for (unsigned j = 0; j < m; ++j)
      a[i][j] = rand() % 100;
}

void display(unsigned n, unsigned m, int a[n][m])
{
  for (unsigned i = 0; i < n; ++i)
    for (unsigned j = 0; j < m; ++j)
      printf("%2d%s", a[i][j], j + 1 < m ? " " : "\n");
  printf("\n");
}

int main(void) 
{
  int a1[5][5] = { 42 }; 
  display(5, 5, a1);
  init(5, 5, a1);
  display(5, 5, a1);

  unsigned n = rand() % 10 + 5, m = rand() % 10 + 5;
  int (*a2)[n][m] = malloc(sizeof *a2);
  init(n, m, *a2);
  display(n, m, *a2);
  free(a2);
}

This code is intended to draw your attention to the following fact: this code makes heavy use of valuable properties of variably modified types. It is impossible to implement elegantly without VLA. This is the primary reason why these properties are desperately needed in C to replace the ugly hacks that were used in their place previously. Yet at the same time, not even a single VLA is created in local memory in the above program, meaning that this popular vector of VLA criticism is not applicable to this code at all.

Basically, the two last examples above is a concise illustration of what the point of VLA support is.

Question 2

Looking at the comments and the answers, it seems to me that VLAs are useful when you know that normally your input is not too big (similar to knowing your recursion is probably not too deep), but you don't actually have an upper bound, and you would generally ignore the possible stack overflow (similar to ignoring them with recursion) hoping they don't happen.

It may actually be not an issue altogether either, for example if you have unlimited stack size.

That said, here's another use for them I have found which doesn't actually allocate memory on stack, but makes working with dynamic multi-dimensional arrays easier. I'll demonstrate by a simple example:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    size_t n, m;

    scanf("%zu %zu", &n, &m);

    int (*array)[n][m] = malloc(sizeof *array);

    for (size_t i = 0; i < n; ++i)
        for (size_t j = 0; j < m; ++j)
            (*array)[i][j] = i + j;

    free(array);
    return 0;
}

Question 3

Despite of all the points you mentioned about VLA, the best part of VLA is that the compiler automatically handles the storage management and the complexities of index calculations of arrays whose bounds are not compile-time constants.
If you want local dynamic memory allocation then the only option is VLA.

I think this could be the reason that VLA is adopted in C99 (optional on C11).

One thing I want to clear that is there are some remarkable differences between alloca and VLA. This post points out the differences:

The memory alloca() returns is valid as long as the current function persists. The lifetime of the memory occupied by a VLA is valid as long as the VLA's identifier remains in scope.

You can alloca() memory in a loop for example and use the memory outside the loop, a VLA would be gone because the identifier goes out of scope when the loop terminates.

Question 4

Your argument seems to be that since one has to bound check the size of the VLA, why not just allocate the maximum size and be done with the runtime allocation.

That argument overlooks the fact that memory is a limited resource in the system, shared between many processes. Memory wastefully allocated in one process is not available to any other (or perhaps it is, but at the expense of swapping to disk).

By the same argument we would not need to malloc an array at run time when we could statically allocate the maximum size that could be needed. In the end heap exhaustion is only slightly preferable to stack overflow.

Question 5

Stack allocation (a so VLA allocation) is VERY fast, just requires a quick modification to the stack pointer (typically a single CPU instuction). No need for expensive heap allocation/deallocation.

But, why not just use a constant size array instead?

Let suppose you are writing a high performance code, and you need a variable size buffer, let say between 8 and 512 elements. You can just declare a 512 elements array, but if most of the times you only require 8 elements then overallocating can affect the performance due to affecting the cache locality in the stack memory. Now imagine this function has to be called millions of times.

Another example, imagine your function (with a local VLA) is recursive, you know beforehand that in any moment the total size of all recursively allocated VLAs is limited (i.e. the arrays have variable size, but the sum of all sizes is bounded). In this case, if you use the maximun possible size as fixed local array size you may allocate much more memory than otherwise required, making your code slower (due to cache misses) and even causing stack overflows.

Question 6

VLAs do not have to allocate any memory or only stack memory. They are very handy in many aspects of programming.

Some examples

Used as function parameters.

int foo(size_t cols, int (*array)[cols])
{
    //access as normal 2D array
    prinf("%d", array[5][6]);
    /* ... */
}

Allocate 2D(or more) array dynamically

inr foo(size_t rows, size_t cols)
{
    int (*array)[cols] = malloc(rows * sizeof(*array));
    /* ... */
    //access as normal 2D array
    prinf("%d", array[5][6]);
    /* ... */