Question

I am looking into using a VBO instead of immediate mode for performance reasons. I am creating a 2D orthographic scene filled with sprites. I do not want to draw sprites that are off-screen. I do this by checking their position against the screen size and position of the camera.

In immediate mode this is simple; there is a draw method for each sprite. Using a VBO this seems non-trivial; I render an entire section of a VBO at one time. There would be no way for me (that I can think of) to opt out of rendering sprites that are off-screen.

Solution

I'll just assume that you do indeed animate the sprites on the CPU, because that's the only thing that makes sense in light of your question (otherwise, how would you draw them in immediate mode initially, and how would you skip drawing some?).

AGP/PCIe behaves much like a hard disk from a performance point of view. Bandwidth is huge, but access time is quite noticeable. In other words, doing a transfer at all is painful, but once you do it, a few kilobytes more don't really make any difference. Uploading 500 sprites costs practically the same as uploading 1000 sprites.

Since you animate the sprites on the CPU, you must already do one transfer (glBufferSubData or glMapBuffer/glUnmapBuffer) every frame; there is no other way.

Be sure to use a "fresh" buffer, e.g. by applying the glBufferData(null) idiom: call glBufferData with a null data pointer before writing the new data. This avoids pipeline stalls by allowing OpenGL to continue using (drawing from) the old buffer while giving you a different buffer (without you knowing) at the same time. Later, when it is done drawing, it just secretly flips buffers and throws the old one away. That way, you achieve good parallelism (which is key to performance and much more important than culling a few thousand vertices).

Also, graphics cards are reasonably good at culling geometry (this includes discarding entire triangles that are off-screen before fragments are generated). Hundreds? Thousands? Hundreds of thousands? No issue. Let the graphics card do it.

Unless you have a million sprites of which one half is visible at a time and the other half isn't, it is quite likely that writing the entire buffer continuously and without branches is not only just as fast, but even faster due to cache and pipeline effects.
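
To make the "write everything, no branches" idea concrete, here is a minimal sketch of the CPU-side fill loop. The Sprite and SpriteVertex structs, the 4-vertices-per-sprite layout and both function names are my own assumptions for illustration; nothing here comes from your code, and your vertex format may well differ.

```c
#include <stddef.h>

/* Hypothetical vertex layout: position (x, y) plus texture coords (u, v). */
typedef struct { float x, y, u, v; } SpriteVertex;

/* Hypothetical CPU-side sprite: centre position and half extents. */
typedef struct { float cx, cy, half_w, half_h; } Sprite;

enum { VERTS_PER_SPRITE = 4 };

/* Number of bytes the vertex data for sprite_count sprites occupies. */
size_t sprite_buffer_bytes(size_t sprite_count)
{
    return sprite_count * VERTS_PER_SPRITE * sizeof(SpriteVertex);
}

/*
 * Writes the four corners of every sprite into out, in order.
 * There is deliberately no visibility test: off-screen sprites are
 * written too, and clipping them is left to the graphics card.
 * out must have room for sprite_count * VERTS_PER_SPRITE vertices.
 * Returns the number of vertices written.
 */
size_t fill_sprite_vertices(const Sprite *sprites, size_t sprite_count,
                            SpriteVertex *out)
{
    for (size_t i = 0; i < sprite_count; ++i) {
        const Sprite *s = &sprites[i];
        SpriteVertex *v = &out[i * VERTS_PER_SPRITE];

        v[0] = (SpriteVertex){ s->cx - s->half_w, s->cy - s->half_h, 0.0f, 0.0f };
        v[1] = (SpriteVertex){ s->cx + s->half_w, s->cy - s->half_h, 1.0f, 0.0f };
        v[2] = (SpriteVertex){ s->cx + s->half_w, s->cy + s->half_h, 1.0f, 1.0f };
        v[3] = (SpriteVertex){ s->cx - s->half_w, s->cy + s->half_h, 0.0f, 1.0f };
    }
    return sprite_count * VERTS_PER_SPRITE;
}
```

Each frame you would fill the array like this, then hand it to OpenGL in one go: glBufferData with a null pointer and sprite_buffer_bytes(n) as the size to get a fresh buffer, followed by glBufferSubData (or a map/unmap pair) with the filled array. With this assumed layout and 4-byte floats, a sprite is 64 bytes, so 1000 sprites come to roughly 64 kB per frame.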

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow