Are you aware that std::endl is flushing your ofstreams constantly? You could push out '\n' characters instead and just flush once at the end for each stream.
I/O might still end up being the biggest piece of your time spent, but you are working against the buffering and do not seem in this example at least to be gaining anything from it. Going one array/file at a time might give you infinitesimally faster performance due to better memory locality (unless you're using so much memory you're swapping pages to disk), but the difference is probably getting completely lost in the huge cost of file writing.