Question

I am using javax.sound.sampled and JLayer to play an MP3 file. I am trying to analyze the audio input stream to determine when the song starts and when it ends, based on the audio levels at the beginning and end of the MP3. A 4-minute song may contain only 3 minutes and 55 seconds of actual music, with the rest being silence, which is why I want to determine this.

I thought I could determine this information by finding the first and last non-zero bytes in the stream.

Problem: when I change the buffer size, the position of the first non-zero byte changes. Why is this? Shouldn't it stay constant regardless of the buffer size?

E.g. with a buffer size of 16, startFrame corresponds to the 17th byte. With a buffer size of 64, startFrame corresponds to the 65th byte.

Here is the code:

        byte[] buffer;
        int pos = 0;
        short silenceThreshold = 1;

        startFrame = 0;
        endFrame = -1;

        boolean startFrameSet = false;

        buffer = new byte[16];
        byte prevVal = 0;
        for (int n = 0; n != -1; n = audioInputStream.read(buffer, 0,
                buffer.length)) {

            for (int i = 0; i < buffer.length; i++) {
                if (buffer[i] >= silenceThreshold || buffer[i] <= -silenceThreshold) {
                    // Is not silent
                    if (!startFrameSet) {
                        startFrame = (pos * buffer.length) + i;
                        startFrameSet = true;
                    }
                } else {
                    // Silence
                    // If the previous value is > 0 or < 0, set endFrame
                    if (prevVal >= silenceThreshold || prevVal <= silenceThreshold) {
                        endFrame = (pos * buffer.length) + i;
                    }
                }
                prevVal = buffer[i];
            }

            pos++;
        }

        //If last byte is not within silence threshold (song doesn't end in silence).
        if (prevVal >= silenceThreshold || prevVal <= silenceThreshold) {
            // last frame is not silent
            endFrame = -1;
        }

I suspect I have misunderstood how the audio input stream, and audio in general, works.


Solution

Your outer for loop does not read from the audio input stream on its first pass, because the update expression of a for loop runs only after each execution of the body. The loop

 for (int n = 0; n != -1; n = audioInputStream.read(buffer, 0,
            buffer.length)) {

is equivalent to:

int n = 0;
while (n != -1) {
    // Inner loop

    n = audioInputStream.read(buffer, 0, buffer.length);
}

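So on the first pass the buffer is still the zero-initialized array from new byte[16], and the inner loop scans buffer.length zero bytes before any audio data has been read. Every position is therefore shifted by exactly one buffer length, which is why the first non-zero byte appears to be the 17th byte with a 16-byte buffer and the 65th byte with a 64-byte buffer.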
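To see the shift in isolation, here is a minimal sketch that keeps the loop shape from the question but reads from an in-memory stream instead of an MP3. The class name, the method name, and the test data are made up for illustration; the very first byte of the data is non-zero, so a correct scan would report index 0.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class StaleFirstPassDemo {

    // Same loop shape as in the question: the body runs once before any read().
    // Returns the 0-based index it believes holds the first non-zero byte.
    static int firstNonZeroWithQuestionLoop(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int pos = 0;
        for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
            for (int i = 0; i < buffer.length; i++) {
                if (buffer[i] != 0) {
                    return pos * buffer.length + i;
                }
            }
            pos++; // the first pass counts a buffer of zeros that was never read
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[256]; // illustrative data, not real audio
        data[0] = 1;                 // the stream starts with a non-zero byte
        for (int size : new int[] {16, 64}) {
            int index = firstNonZeroWithQuestionLoop(new ByteArrayInputStream(data), size);
            System.out.println("buffer " + size + " -> index " + index);
        }
    }
}
```

This prints index 16 for a 16-byte buffer and index 64 for a 64-byte buffer (the 17th and 65th bytes), even though the data is identical in both runs.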

You should also not assume that read fills the whole buffer; use the value it returns as the number of valid bytes.
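Putting both points together, one possible shape for the scan is sketched below. It is not the asker's final code: the class and method names are hypothetical, it takes a plain InputStream (which AudioInputStream extends), it keeps the question's byte-by-byte comparison, and it reports the index of the last non-silent byte rather than the question's endFrame.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

final class SilenceScan {

    // Returns {firstNonSilentIndex, lastNonSilentIndex}, or {-1, -1} if every byte is silent.
    // Assumes bufferSize > 0 and threshold > 0.
    static long[] nonSilentRange(InputStream in, int bufferSize, int threshold) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long offset = 0; // number of bytes consumed before the current buffer
        long first = -1;
        long last = -1;
        int n;
        // Read first, then inspect: the body never sees a buffer that was not filled by read().
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            for (int i = 0; i < n; i++) { // only the n bytes actually read are valid
                if (Math.abs(buffer[i]) >= threshold) {
                    if (first < 0) {
                        first = offset + i;
                    }
                    last = offset + i;
                }
            }
            offset += n; // advance by the real count, not by buffer.length
        }
        return new long[] {first, last};
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[100]; // illustrative data, not real audio
        data[5] = 3;
        data[90] = -4;
        for (int size : new int[] {16, 64}) {
            long[] range = nonSilentRange(new ByteArrayInputStream(data), size, 1);
            System.out.println("buffer " + size + " -> " + range[0] + ".." + range[1]);
        }
    }
}
```

Both buffer sizes report the range 5..90 for this data. Separately, the question's `prevVal <= silenceThreshold` checks were presumably meant to be `prevVal <= -silenceThreshold`; as written, the condition is true for every value, so the final check always resets endFrame to -1.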

Licensed under: CC-BY-SA with attribution