I've come up with a solution, but I'm still interested in any feedback or alternative solutions. I'm not sure the way I've done it is OK, but it seems to produce the desired results, even if it feels a little too convoluted...
1) Whenever I write data, I now write at either the buffer's Write Cursor or the StoppingPoint, whichever is later. This avoids stopping and then later writing into that "untouchable" space between the Play Cursor and the Write Cursor, whose data has already been committed to playback.
DWORD writeCursorOffset = 0;
buffer->GetCurrentPosition(NULL, &writeCursorOffset);
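//If the write cursor has passed the stopping point, we need to write from the write cursor instead,
//so update m_StoppingPoint to reflect the new writing position.
//Caveat: this plain comparison ignores wraparound in the circular buffer, so it can pick
//the wrong offset when one of the two has wrapped and the other has not.
m_StoppingPoint = m_StoppingPoint > writeCursorOffset ? m_StoppingPoint : writeCursorOffset;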
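
On the wraparound issue: in a circular buffer, a numerically smaller offset can still be the later one. Here is a minimal sketch of a wrap-aware "whichever is later" test that measures both offsets as forward distances from the Play Cursor. `ForwardDistance` and `LaterOffset` are hypothetical helper names, `std::uint32_t` stands in for `DWORD`, and the sketch assumes both offsets lie ahead of the Play Cursor with less than one full buffer of pending data.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes you must move forward from `from` to reach `to`
// in a circular buffer of `size` bytes.
inline std::uint32_t ForwardDistance(std::uint32_t from, std::uint32_t to,
                                     std::uint32_t size)
{
    assert(size > 0 && from < size && to < size);
    return to >= from ? to - from : size - from + to;
}

// Returns whichever of `a` and `b` is further ahead of the play cursor.
// Assumption: both offsets are ahead of the play cursor (neither has been
// overtaken by it), otherwise the forward distance is misleading.
inline std::uint32_t LaterOffset(std::uint32_t playCursor, std::uint32_t a,
                                 std::uint32_t b, std::uint32_t size)
{
    return ForwardDistance(playCursor, a, size) >=
                   ForwardDistance(playCursor, b, size)
               ? a
               : b;
}
```

To use something like this, the Play Cursor would also have to be fetched, by passing a pointer as the first argument of `GetCurrentPosition` instead of `NULL`. For example, with a 1000-byte buffer, the Play Cursor at 900, the Write Cursor at 950 and the StoppingPoint at 50 (already wrapped), the plain comparison picks 950, while `LaterOffset(900, 50, 950, 1000)` picks 50.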
2) I added some silence after every single write, but I left StoppingPoint pointing at the end of the actual voice data. E.g.:
|==============================|*********|---------------------|
               ^               ^      ^             ^
               |               |      |             |
          Voice data       Stopping Silence    Old/Garbage
                             Point                Data
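
To make step 2 concrete, here is a minimal sketch of writing voice data followed by a silence pad into a circular buffer, remembering where the voice data ends. A `std::vector` stands in for the locked DirectSound buffer memory; `WriteVoiceThenSilence` and `WriteResult` are hypothetical names, and the silence value of 0 assumes 16-bit PCM (8-bit PCM would use 128).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct WriteResult {
    std::size_t stoppingPoint; // offset just past the last voice byte
    std::size_t silenceEnd;    // offset just past the last silence byte
};

// Writes `voice` at `writeOffset`, then `silenceBytes` of silence,
// wrapping around the end of the ring as needed.
WriteResult WriteVoiceThenSilence(std::vector<std::uint8_t>& ring,
                                  std::size_t writeOffset,
                                  const std::vector<std::uint8_t>& voice,
                                  std::size_t silenceBytes,
                                  std::uint8_t silenceValue = 0)
{
    assert(!ring.empty());
    assert(voice.size() + silenceBytes <= ring.size());

    std::size_t pos = writeOffset % ring.size();
    for (std::uint8_t sample : voice) {
        ring[pos] = sample;
        pos = (pos + 1) % ring.size();
    }
    const std::size_t stoppingPoint = pos; // end of real voice data

    for (std::size_t i = 0; i < silenceBytes; ++i) {
        ring[pos] = silenceValue;
        pos = (pos + 1) % ring.size();
    }
    return {stoppingPoint, pos};
}
```

With the real API the copy would go through `Lock`/`Unlock`, which can hand back two regions when the write wraps; the byte-by-byte loop here is only meant to show where the two offsets end up.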
3) Once the buffer's Play Cursor has passed the StoppingPoint, I stop playing the buffer. Even if the Play Cursor overshoots here, all it will play is silence.
//error checking removed for demonstration purposes
buffer->Stop();
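
The "has the Play Cursor passed the StoppingPoint" test is not shown above, and it also has to cope with wraparound. One possible approach, sketched here under the assumption that the Play Cursor is polled regularly, is to remember the previous Play Cursor and check whether the StoppingPoint lies in the stretch just played. `ForwardDistance` and `CrossedStoppingPoint` are hypothetical helper names, and `std::uint32_t` stands in for `DWORD`.

```cpp
#include <cassert>
#include <cstdint>

// Number of bytes you must move forward from `from` to reach `to`
// in a circular buffer of `size` bytes.
inline std::uint32_t ForwardDistance(std::uint32_t from, std::uint32_t to,
                                     std::uint32_t size)
{
    assert(size > 0 && from < size && to < size);
    return to >= from ? to - from : size - from + to;
}

// True if the play cursor reached or passed `stoppingPoint` while moving
// from `prevPlay` to `currPlay`.
// Assumptions: the stopping point was at or ahead of `prevPlay`, and the
// cursor advanced by less than one full buffer between the two polls.
inline bool CrossedStoppingPoint(std::uint32_t prevPlay,
                                 std::uint32_t currPlay,
                                 std::uint32_t stoppingPoint,
                                 std::uint32_t size)
{
    const std::uint32_t advanced = ForwardDistance(prevPlay, currPlay, size);
    const std::uint32_t toStop = ForwardDistance(prevPlay, stoppingPoint, size);
    return toStop <= advanced;
}
```

The silence pad from step 2 sets how often this needs to be polled: as long as the polling interval is shorter than the silence, an overshoot only ever plays silence.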
4) Immediately after stopping, I update StoppingPoint to be equal to the end of the silence. This ensures that when more speech data comes in, the buffer will not play any silence first.
//don't forget that modulo at the end - circular buffer!
m_StoppingPoint = (m_StoppingPoint + SILENCE_BUFFER_SIZE) % BufferSize;
|==============================|*********|-------------------|
                                         ^
                                         |
                                   Move Stopping
                                    Point here
Again, if I've done anything glaringly evil here, please let me know!