Your approach will probably work OK for a moderate amount of data, but you've made one rank the central point of communication, and a single-rank bottleneck like that isn't going to scale well.
You're on the right track with your part 2: a parallel write using MPI-IO sounds like a good approach to me. Here's how that might go:
- Continue to have your T processes read their inputs.
- I'm going to assume that 'id' is densely allocated. What I mean is: in this collection of files, if a process sees data with id 4, can it know that other processes hold ids 1, 2, 3, and 5? If so, then every process knows where its data has to go.
- Let's also assume each 'data' item is a fixed size. The approach is only a little more complicated if that's not the case.
If you don't know the max ID and the max timestep, you'll have to do a bit of communication (an MPI_Allreduce with MPI_MAX as the operation) to find them.
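For instance, a minimal sketch of that reduction, where local_max_id and local_max_step are hypothetical names for the largest id and timestep each rank has read:

    int local_max[2] = { local_max_id, local_max_step };
    int global_max[2];
    /* every rank learns the global maxima in one collective call */
    MPI_Allreduce(local_max, global_max, 2, MPI_INT, MPI_MAX, MPI_COMM_WORLD);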
With these preliminaries in place, you can describe each process's piece of the file with an MPI-IO "file view", probably built with MPI_Type_indexed.
On rank 0 this gets a bit more complicated, because you also need to add the timestep markers to your list of data. Alternatively, you can define a file format with an index of timesteps, and store that index in a header or footer.
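If you go the header route, one hedged sketch of rank 0's part (HEADER_BYTES, nsteps, and step_offsets are hypothetical names, and every rank would then set its file view at a displacement of HEADER_BYTES instead of 0):

    if (rank == 0) {
        /* rank 0 alone records where each timestep begins */
        MPI_File_write_at(fh, 0, step_offsets, nsteps,
                          MPI_LONG_LONG, MPI_STATUS_IGNORE);
    }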
The code would look roughly like this:
    MPI_Datatype filetype;
    for (i = 0; i < nitems; i++) {
        datalen[i] = sizeof(item);
        /* e.g. index_of_item = timestep * max_id + id for a dense layout */
        offsets[i] = sizeof(item) * index_of_item;
    }
    MPI_Type_indexed(nitems, datalen, offsets, MPI_BYTE, &filetype);
    MPI_Type_commit(&filetype);
    MPI_File_set_view(fh, 0, MPI_BYTE, filetype, "native", MPI_INFO_NULL);
    MPI_File_write_all(fh, buffer, nitems * sizeof(item), MPI_BYTE, &status);
    MPI_Type_free(&filetype);
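For context, the surrounding collective open and close would look something like this (the filename here is just a placeholder):

    MPI_File fh;
    MPI_Status status;
    MPI_File_open(MPI_COMM_WORLD, "output.dat",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    /* ... build the filetype, set the view, and write as above ... */
    MPI_File_close(&fh);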
The _all here is important: each MPI process is going to generate a highly noncontiguous, irregular access pattern. Issuing the write collectively gives the MPI-IO library a chance to optimize that request.
Also note that the displacements in an MPI-IO file view must be monotonically non-decreasing, so you'll have to sort your items locally before writing the data out collectively. Local memory operations cost next to nothing compared to an I/O operation, so this usually isn't a big deal.
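The local sort can be as simple as a qsort keyed on whatever determines the file offset; here 'struct item' and its 'id' field are hypothetical stand-ins for your record type:

    #include <stdlib.h>

    struct item { int id; /* ... payload ... */ };

    /* ascending id order == ascending file offset */
    static int compare_by_id(const void *a, const void *b)
    {
        const struct item *x = a, *y = b;
        return (x->id > y->id) - (x->id < y->id);
    }

    /* ... */
    qsort(items, nitems, sizeof(struct item), compare_by_id);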