How does GCC implement variable-length arrays?

Question 1

Here's the allocation code (x86 - the x64 code is similar) for the following example line taken from some GCC docs for VLA support:

char str[strlen (s1) + strlen (s2) + 1];

where the calculation for strlen (s1) + strlen (s2) + 1 is in eax (GCC MinGW 4.8.1 - no optimizations):

mov edx, eax
sub edx, 1
mov DWORD PTR [ebp-12], edx
mov edx, 16
sub edx, 1
add eax, edx
mov ecx, 16
mov edx, 0
div ecx
imul    eax, eax, 16
call    ___chkstk_ms
sub esp, eax
lea eax, [esp+8]
add eax, 0
mov DWORD PTR [ebp-16], eax

So it looks to be essentially alloca().

Question 2

Well, these are just a few wild stabs in the dark, based on the restrictions around VLA's, but anyway:

VLA's can't be:

extern
struct members
static
declared with unspecified bounds (save for function prototype)

All this points to VLA's being allocated on the stack, rather than the heap. So yes, VLA's probably are the last chunks of stack memory allocated whenever a new block is allocated (block as in block scope, these are loops, functions, branches or whatever).
That's also why VLA's increase the risk of Stack overflow, in some cases significantly (word of warning: don't even think about using VLA's in combination with recursive function calls, for example!).
This is also why out-of-bounds access is very likely to cause issues: once the block ends, anything pointing to what Was VLA memory, is pointing to invalid memory.
But on the plus side: this is also why these arrays are thread safe, though (owing to threads having their own stack), and why they're faster compared to heap memory.

The size of a VLA can't be:

an extern value
zero or negative

the extern restriction is pretty self evident, as is the non-zero, non-negative one... however: if the variable that specifies the size of a VLA is a signed int, for example, the compiler won't produce an error: the evaluation, and thus allocation, of a VLA is done during runtime, not compile-time. Hence The size of a VLA can't, and needn't be a given during compile-time.
As MichaelBurr rightly pointed out, VLA's are very similar to alloca memory, with one, IMHO, crucial distinction: memory allocated by alloca is valid from the point of allocation, and throughout the rest of the function. VLA's are block scoped, so the memory is freed once you exit the block in which a VLA is used:

void alloca_diff( void )
{
    char *alloca_c, *vla_c;
    for (int i=1;i<10;++i)
    {
        char *alloca_mem = alloca(i*sizeof(*alloca_mem));
        alloca_c = alloca_mem;//valid
        char vla_arr[i];
        vla_c = vla_arr;//invalid
    }//end of scope, VLA memory is freed
    printf("alloca: %c\n", *alloca_c);//fine
    printf("vla: %c\n\", *vla_c);//undefined behaviour... avoid!
}//end of function alloca memory is freed, irrespective of block scope