Three points to check:
Do you feed the buffer fast enough? If not, the increased gap comes from your data-providing code, not from any internal behaviour of the processor. (Use a toggle pin to find out.)
Is it possible that, because of the increased speed, the buffer runs empty after each byte, so your code switches off the data register empty interrupt every single time it transmits data? Instead of using put_char to fill your ring buffer, you could use a put_string(array, length) that fills it with multiple chars at once (using memcpy, for example; remember to split it into two memcpy operations when the data has to wrap at the end of the buffer). (Again, use a toggle pin to find out.)
Reduce the code wrapped in cli() and sei() to a minimum. Swap the order of filling the buffer and the flag check, and move the buffer fill out of the cli()/sei() section.
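To make the second and third points concrete, here is a rough sketch of what such a put_string could look like. It is not your code: the buffer size, the names tx_buf, tx_head, tx_tail and tx_pop, and the ENTER_CRITICAL/EXIT_CRITICAL macros are all made up for illustration. On an AVR the macros would map to cli() and sei(); here they are no-ops so the sketch also builds on a PC. It assumes a single producer and 8-bit indices, so each index read is atomic on an 8-bit AVR.

```c
#include <stdint.h>
#include <string.h>

#define TX_BUF_SIZE 16u              /* hypothetical buffer size */

/* Assumption: on an AVR these would be cli() and sei() from <avr/interrupt.h>.
   They are no-ops here so the sketch compiles on a host. */
#ifndef ENTER_CRITICAL
#define ENTER_CRITICAL() ((void)0)
#define EXIT_CRITICAL()  ((void)0)
#endif

static uint8_t tx_buf[TX_BUF_SIZE];
static volatile uint8_t tx_head;     /* next write position, changed only by put_string */
static volatile uint8_t tx_tail;     /* next read position, changed only by the ISR */

/* Queue up to `length` bytes; returns how many were actually stored. */
uint8_t put_string(const uint8_t *data, uint8_t length)
{
    uint8_t head = tx_head;
    uint8_t tail = tx_tail;
    /* One slot stays unused so that head == tail always means "empty". */
    uint8_t free_space = (uint8_t)((tail + TX_BUF_SIZE - head - 1u) % TX_BUF_SIZE);

    if (length > free_space)
        length = free_space;
    if (length == 0u)
        return 0u;

    /* The copy happens outside the critical section: the ISR never reads
       these slots before tx_head has been advanced. */
    uint8_t first = (uint8_t)(TX_BUF_SIZE - head);   /* room up to the end of the array */
    if (first > length)
        first = length;
    memcpy(&tx_buf[head], data, first);                        /* part before the wrap */
    memcpy(&tx_buf[0], data + first, (size_t)(length - first)); /* part after the wrap */

    /* Only publishing the new head and re-enabling the interrupt is protected. */
    ENTER_CRITICAL();
    tx_head = (uint8_t)((head + length) % TX_BUF_SIZE);
    /* On the target, enable the data register empty interrupt here,
       e.g. UCSR0B |= (1 << UDRIE0); on an ATmega328P. */
    EXIT_CRITICAL();

    return length;
}

/* Host-side stand-in for the transmit ISR: fetches one byte, returns 0 if empty. */
int tx_pop(uint8_t *out)
{
    uint8_t tail = tx_tail;
    if (tail == tx_head)
        return 0;
    *out = tx_buf[tail];
    tx_tail = (uint8_t)((tail + 1u) % TX_BUF_SIZE);
    return 1;
}
```

With this layout the interrupt is switched back on once per string instead of once per character, and the time spent with interrupts disabled no longer depends on the string length.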
Good luck!