You can write your own audio mixer; it isn't that hard. Decode each audio file to PCM data, sum the corresponding samples of all sources frame by frame (clamping the sum so it stays within the sample range), and write the result to a single SourceDataLine.
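As a rough illustration of the summing step, here is a minimal sketch. It assumes every source has already been decoded to 16-bit signed mono PCM at the same sample rate. The class name `SimpleMixer` and its methods are hypothetical names made up for this example, not part of any library. A real mixer would stream small buffers in a loop rather than hold whole tracks in memory.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

// Hypothetical helper for illustration only.
final class SimpleMixer {

    /** Sums the tracks sample by sample, clamping each sum to the 16-bit range. */
    static short[] mix(short[]... tracks) {
        int length = 0;
        for (short[] track : tracks) {
            length = Math.max(length, track.length);
        }
        short[] out = new short[length];
        for (int i = 0; i < length; i++) {
            int sum = 0; // accumulate in int so the sum cannot overflow a short
            for (short[] track : tracks) {
                if (i < track.length) {
                    sum += track[i];
                }
            }
            // Hard clipping: keep the result inside [-32768, 32767].
            out[i] = (short) Math.max(Short.MIN_VALUE, Math.min(Short.MAX_VALUE, sum));
        }
        return out;
    }

    /** Packs 16-bit samples into little-endian bytes for the output line. */
    static byte[] toLittleEndianBytes(short[] samples) {
        byte[] bytes = new byte[samples.length * 2];
        for (int i = 0; i < samples.length; i++) {
            bytes[2 * i] = (byte) (samples[i] & 0xFF);
            bytes[2 * i + 1] = (byte) ((samples[i] >> 8) & 0xFF);
        }
        return bytes;
    }

    /** Writes the mixed samples to a single SourceDataLine (assumed: mono, 16-bit signed). */
    static void play(short[] mixed, float sampleRate) throws LineUnavailableException {
        AudioFormat format = new AudioFormat(sampleRate, 16, 1, true, false);
        SourceDataLine line = AudioSystem.getSourceDataLine(format);
        line.open(format);
        line.start();
        byte[] data = toLittleEndianBytes(mixed);
        line.write(data, 0, data.length);
        line.drain(); // block until everything queued has been played
        line.close();
    }
}
```

Hard clipping is the simplest way to handle sums that exceed the sample range; scaling each source down before summing is another option if clipping is audible.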
Or you could use TinySound (code available on GitHub). It is a nice, clean implementation of exactly this sort of mixing.
One benefit of mixing to a single output: some systems (some Linux setups, for example) do not support multiple simultaneous outputs.