Question

I am using javax.sound.sampled and JLayer to play an MP3 file. I am trying to analyze the audio input stream to determine when the song starts and when it ends (based on the audio levels at the beginning and end of the MP3). A 4-minute song may only have 3 minutes and 55 seconds of actual music while the rest is silence, which is why I want to determine this.

I thought I could determine this information by finding the first and last non-zero bytes in the stream.

Problem: The issue is that when I adjust the buffer size, the position of the first non-zero byte changes. Why is this, and shouldn't it remain constant no matter the buffer size?

E.g. At a buffer size of 16, the startFrame correlates to the 17th byte. With a buffer size of 64, the startFrame correlates to the 65th byte.

Here is the code:

        byte[] buffer;
        int pos = 0;
        short silenceThreshold = 1;

        startFrame = 0;
        endFrame = -1;

        boolean startFrameSet = false;

        buffer = new byte[16];
        byte prevVal = 0;
        for (int n = 0; n != -1; n = audioInputStream.read(buffer, 0,
                buffer.length)) {

            for (int i = 0; i < buffer.length; i++) {
                if (buffer[i] >= silenceThreshold || buffer[i] <= -silenceThreshold) {
                    // Is not silent
                    if (!startFrameSet) {
                        startFrame = (pos * buffer.length) + i;
                        startFrameSet = true;
                    }
                } else {
                    // Silence
                    // If the previous value is > 0 or < 0, set endFrame
                    if (prevVal >= silenceThreshold || prevVal <= silenceThreshold) {
                        endFrame = (pos * buffer.length) + i;
                    }
                }
                prevVal = buffer[i];
            }

            pos++;
        }

        //If last byte is not within silence threshold (song doesn't end in silence).
        if (prevVal >= silenceThreshold || prevVal <= silenceThreshold) {
            // last frame is not silent
            endFrame = -1;
        }

I figure I have misunderstood how the audio input stream, and audio in general, work.


Solution

Your outer for loop does not read from the audio input stream on the first pass through the loop. The update expression of a for statement only runs after the body has executed, so this:

 for (int n = 0; n != -1; n = audioInputStream.read(buffer, 0,
            buffer.length)) {

is equivalent to:

int n = 0;
while (n != -1) {
    // Inner loop

    n = audioInputStream.read(buffer, 0, buffer.length);
}

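So on the first pass the buffer is still the zero-initialized array from new byte[16], and the inner loop scans it before any audio data has been read. Because pos is incremented at the end of that pass, every position computed afterwards is shifted by one buffer length. That is why startFrame lands on the 17th byte with a buffer size of 16 and on the 65th byte with a buffer size of 64.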
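The shift can be reproduced without any audio file. The following is a minimal sketch, assuming a plain ByteArrayInputStream stands in for the audio input stream; the class name FirstPassDemo and the method firstNonZero are hypothetical names used only for illustration. The loop has the same shape as the one in the question.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

class FirstPassDemo {
    // Same loop shape as in the question: the body runs once before any read happens.
    static int firstNonZero(InputStream in, int bufferSize) throws IOException {
        byte[] buffer = new byte[bufferSize];
        int pos = 0;
        for (int n = 0; n != -1; n = in.read(buffer, 0, buffer.length)) {
            for (int i = 0; i < buffer.length; i++) {
                if (buffer[i] != 0) {
                    return pos * buffer.length + i;
                }
            }
            pos++; // already 1 by the time the first real data is scanned
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        byte[] data = new byte[200];
        Arrays.fill(data, (byte) 9); // non-zero from the very first byte (true offset 0)
        System.out.println(firstNonZero(new ByteArrayInputStream(data), 16)); // prints 16
        System.out.println(firstNonZero(new ByteArrayInputStream(data), 64)); // prints 64
    }
}
```

The true offset of the first non-zero byte is 0 in both runs, yet the reported offset equals the buffer size each time.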

You should also not assume that read fills the whole buffer; use the value it returns as the number of valid bytes.
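Both points can be combined into a loop that reads before it scans and only looks at the bytes that read reported. This is a sketch under stated assumptions rather than a drop-in replacement: SilenceBounds and find are hypothetical names, the stream is any InputStream (an AudioInputStream is one), and the end value is defined here as the offset just past the last non-silent byte, which differs from the question's endFrame convention. It keeps the question's per-byte threshold test as is.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class SilenceBounds {
    /**
     * Returns {start, end}: start is the offset of the first non-silent byte,
     * end is the offset just past the last non-silent byte; both are -1 if
     * every byte is below the threshold. (This convention is an assumption.)
     */
    static long[] find(InputStream in, int bufferSize, int threshold) throws IOException {
        byte[] buffer = new byte[bufferSize];
        long offset = 0; // absolute stream position of buffer[0]
        long start = -1;
        long end = -1;
        int n;
        // Read first, then scan: the buffer is never inspected before it holds data.
        while ((n = in.read(buffer, 0, buffer.length)) != -1) {
            // Only the first n bytes are valid; read may return fewer than requested.
            for (int i = 0; i < n; i++) {
                if (Math.abs(buffer[i]) >= threshold) {
                    if (start < 0) {
                        start = offset + i;
                    }
                    end = offset + i + 1;
                }
            }
            offset += n; // advance by what was actually read, not by buffer.length
        }
        return new long[] { start, end };
    }

    public static void main(String[] args) throws IOException {
        byte[] data = {0, 0, 0, 5, -7, 2, 0, 0};
        for (int size : new int[] {1, 3, 16, 64}) {
            long[] b = find(new ByteArrayInputStream(data), size, 1);
            System.out.println(size + ": " + b[0] + " " + b[1]); // "3 6" for every size
        }
    }
}
```

Tracking the offset with the byte counts returned by read keeps the result independent of the buffer size. It also sidesteps the question's prevVal <= silenceThreshold comparison, which looks like it was meant to be -silenceThreshold.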

Licensed under: CC-BY-SA with attribution