it would seem better to flip things around and put the upward growing part (text, global data, and heap) on top?
No. Cache lines are typically only 32 bytes to 256 bytes. It's rare for a program to use less than a few megabytes of data, so the sharing is basically irrelevant. (Even if you're not using it, the standard library does a lot on your behalf.)
What is relevant is making sure that data that is used together stays close together in memory (and, in some cases, is aligned with a cache line).
In a scripting language, each element of an array is likely to end up on its own cache line. In C, you can keep things close together using arrays or structs. For number crunching, rewriting in C (or using an efficient library like NumPy) can easily make the code 100x faster.
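To make the "close together" point concrete, here is a minimal sketch in C. The `particle` struct, the function names, and the 64-byte line size are all assumptions made for illustration (64 bytes is common on current x86-64 parts, but check your own hardware), and it assumes a 4-byte `float`.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Real sizes vary by CPU. */
#define CACHE_LINE_BYTES 64

/* Hypothetical record: fields that are used together sit side by side. */
struct particle {
    float x, y, z;  /* position */
    float mass;
};

/* How many whole records fit in one (assumed) cache line. */
size_t particles_per_line(void)
{
    return CACHE_LINE_BYTES / sizeof(struct particle);
}

/* Walk the array front to back. Consecutive elements are adjacent in
 * memory, so one cache-line fill brings in several records at once. */
float total_mass(const struct particle *p, size_t n)
{
    assert(p != NULL || n == 0);
    float sum = 0.0f;
    for (size_t i = 0; i < n; i++)
        sum += p[i].mass;
    return sum;
}
```

With a 16-byte struct, four records share one 64-byte line, so a sequential pass misses the cache roughly once per four elements. A list of separately allocated objects gives no such guarantee.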
Most processors have separate instruction and data caches too.
Does it matter at all if the data segment is in a completely different part of memory from the stack?
Again, if your stack is likely to be more than a few hundred bytes deep (it will be!), then it's not even relevant. In fact, your stack will probably use many cache lines in a non-trivial program.
If you want to know more, I recommend reading "What Every Programmer Should Know About Memory" by Ulrich Drepper. It's quite a tome, but even if you only skim it, you can pick up some neat info (like making a program run 20x faster just by switching your loop indices, or the fact that RAM is no more "random access" than your hard drive is).
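The loop-index trick mentioned above can be sketched roughly like this. The function names are made up for illustration, and the matrix is assumed to be a square `n` x `n` array stored row by row in one flat buffer. Both functions compute the same sum; only the order of memory accesses differs, and how large the speed gap is depends on the matrix size and the machine.

```c
#include <assert.h>
#include <stddef.h>

/* Inner loop walks along a row: consecutive accesses are adjacent
 * in memory, so each cache line that is fetched gets fully used. */
long sum_rows_first(const int *m, size_t n)
{
    assert(m != NULL);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            sum += m[i * n + j];
    return sum;
}

/* Same sum with the loops swapped: consecutive accesses are n ints
 * apart, so for large n nearly every access touches a new cache line. */
long sum_cols_first(const int *m, size_t n)
{
    assert(m != NULL);
    long sum = 0;
    for (size_t j = 0; j < n; j++)
        for (size_t i = 0; i < n; i++)
            sum += m[i * n + j];
    return sum;
}
```

To see the effect, time both functions on a matrix that is much larger than your last-level cache; on small matrices that fit in cache the difference mostly disappears.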