Batching arbitrary vertex data in OpenGL batch renderer

Question

Alright, so we need to minimize the number of state changes and draw calls. I'm assuming you're using modern OpenGL, including Vertex Buffer Objects and shaders.

One approach is to ensure that all vertex data has the same format. For example, each vertex has a position, color and texture coordinate (xyz, rgba, uv). If we interleave our vertex data in a VBO, we only need to set up the vertex attributes once before rendering (one glVertexAttribPointer and one glEnableVertexAttribArray call per attribute), instead of reconfiguring them for every object.
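As a minimal sketch of such an interleaved layout, assuming every component is a 32-bit float: the struct name, the constant names and the attribute locations 0, 1 and 2 are my own choices for illustration. The GL calls are shown only as comments, since they need a live context and a bound VBO.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Hypothetical interleaved vertex: position, color, texture coordinate.
struct Vertex {
    float x, y, z;     // position (xyz)
    float r, g, b, a;  // color (rgba)
    float u, v;        // texture coordinate (uv)
};

// Byte stride between consecutive vertices and byte offset of each attribute.
constexpr std::size_t kStride         = sizeof(Vertex);
constexpr std::size_t kPositionOffset = offsetof(Vertex, x);
constexpr std::size_t kColorOffset    = offsetof(Vertex, r);
constexpr std::size_t kTexCoordOffset = offsetof(Vertex, u);

// The layout is only usable as-is if the compiler adds no padding.
static_assert(kStride == 9 * sizeof(float), "Vertex must be tightly packed");

// One-time setup with the VBO bound (locations 0, 1, 2 are assumptions):
//   glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, (GLsizei)kStride, (void*)kPositionOffset);
//   glVertexAttribPointer(1, 4, GL_FLOAT, GL_FALSE, (GLsizei)kStride, (void*)kColorOffset);
//   glVertexAttribPointer(2, 2, GL_FLOAT, GL_FALSE, (GLsizei)kStride, (void*)kTexCoordOffset);
//   glEnableVertexAttribArray(0);
//   glEnableVertexAttribArray(1);
//   glEnableVertexAttribArray(2);
```

Deriving the stride and offsets from the struct keeps the attribute setup in sync if you later reorder or add fields.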

This means some redundant data for untextured objects, but we get to cram everything into a single batch, which is nice.

To handle the untextured objects, you could either bind a blank white texture and treat them as textured objects, or have a uniform variable (a float between 0 and 1) in your fragment shader and blend between texture color and vertex color using the mix function.
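A rough sketch of the uniform option could look like the following. The shader is embedded as a C++ string, and every variable name in it (vColor, vTexCoord, uTexture, uTextureBlend) is an assumption of mine, as is the choice of GLSL 330. The small mixScalar helper only mirrors what GLSL's mix computes, so you can reason about the blend on the CPU.

```cpp
#include <string>

// Hypothetical fragment shader; all variable names are assumptions.
const std::string kFragmentShader = R"(#version 330 core
in vec4 vColor;
in vec2 vTexCoord;
uniform sampler2D uTexture;
uniform float uTextureBlend;  // 0.0 = vertex color only, 1.0 = texture only
out vec4 fragColor;

void main() {
    vec4 texel = texture(uTexture, vTexCoord);
    fragColor = mix(vColor, texel, uTextureBlend);
}
)";

// CPU reference for GLSL mix(x, y, a), which computes x * (1 - a) + y * a.
float mixScalar(float x, float y, float a) {
    return x * (1.0f - a) + y * a;
}
```

With the white-texture option you would instead multiply texel by vColor and skip the uniform, which avoids having to flush when the blend value changes.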

To batch sprites and shapes we should first handle the transformations on the CPU, so that we always upload "world"-coordinates to the GPU. This saves us from having to set a transformation uniform for each sprite, which would require an individual draw call per sprite.
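To illustrate the CPU-side transformation, here is a minimal 2D sketch that turns a sprite's center, size and rotation into four world-space corners. The types Vec2 and SpriteTransform and the function worldCorners are hypothetical names, and a real renderer might well use a full matrix library instead.

```cpp
#include <array>
#include <cmath>

struct Vec2 { float x, y; };

// Hypothetical sprite description: center, size and rotation in world space.
struct SpriteTransform {
    Vec2  center;
    Vec2  size;
    float rotationRadians;
};

// Returns the four corners in world coordinates, counter-clockwise,
// starting at the bottom-left corner of the unrotated sprite.
std::array<Vec2, 4> worldCorners(const SpriteTransform& s) {
    const float hw = s.size.x * 0.5f;
    const float hh = s.size.y * 0.5f;
    const float c  = std::cos(s.rotationRadians);
    const float sn = std::sin(s.rotationRadians);
    const Vec2 local[4] = {{-hw, -hh}, {hw, -hh}, {hw, hh}, {-hw, hh}};

    std::array<Vec2, 4> out{};
    for (int i = 0; i < 4; ++i) {
        // Rotate the local corner, then translate it to the sprite center.
        out[i].x = s.center.x + local[i].x * c - local[i].y * sn;
        out[i].y = s.center.y + local[i].x * sn + local[i].y * c;
    }
    return out;
}
```

The resulting corners go straight into the vertex buffer, so the vertex shader only needs the shared view/projection transform.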

Furthermore, we need to sort by texture whenever possible, as texture bindings are among the more expensive operations you can do.

Our approach basically boils down to the following:

Maintain a single Vertex- and Index Buffer Object to store the data
Keep all vertex data in a single format and interleave the data in the VBO
Sort by texture
Flush the data (draw elements/arrays) in the buffers whenever we change texture (or set the texture-blend uniform, if we go with that option)
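The sort-and-flush part of the list above can be sketched roughly as follows. Sprite, Batch and buildBatches are hypothetical names, and the vertex data itself is left out; each resulting Batch would correspond to one texture bind followed by one draw call.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

struct Sprite { unsigned textureId; /* vertex data omitted */ };

// One batch = one texture bind plus one draw call over [first, first + count).
struct Batch {
    unsigned    textureId;
    std::size_t first;
    std::size_t count;
};

// Sorts the sprites by texture and groups consecutive sprites that share one.
std::vector<Batch> buildBatches(std::vector<Sprite>& sprites) {
    // stable_sort keeps the submission order among sprites with the same texture.
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) {
                         return a.textureId < b.textureId;
                     });

    std::vector<Batch> batches;
    for (std::size_t i = 0; i < sprites.size(); ++i) {
        // A texture change starts a new batch, i.e. forces a flush.
        if (batches.empty() || batches.back().textureId != sprites[i].textureId) {
            batches.push_back({sprites[i].textureId, i, 0});
        }
        ++batches.back().count;
    }
    return batches;
}
```

Keep in mind that sorting changes the draw order, so this only applies where order doesn't matter (for example, opaque geometry with depth testing).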

Getting the data from CPU to GPU memory can be done in different ways. For example, you can first allocate a large enough, empty buffer on the GPU, and then use glBufferSubData to upload a subset of the vertex/index data whenever you do one of your Render calls.
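Here is a sketch of the bookkeeping for that approach. To keep it runnable without a GL context, the upload is a callback standing in for glBufferSubData; the class name StreamRegion and the UploadFn type are my own inventions. The empty buffer itself would be allocated once with something like glBufferData(GL_ARRAY_BUFFER, capacityBytes, nullptr, GL_DYNAMIC_DRAW).

```cpp
#include <cstddef>
#include <functional>
#include <utility>

// Stand-in for glBufferSubData(target, offset, size, data). In a real renderer
// the callback would forward to that call with the buffer bound.
using UploadFn =
    std::function<void(std::size_t offset, std::size_t size, const void* data)>;

// Hypothetical helper that tracks how much of a preallocated buffer is in use.
class StreamRegion {
public:
    StreamRegion(std::size_t capacityBytes, UploadFn upload)
        : capacity_(capacityBytes), upload_(std::move(upload)) {}

    // Uploads the block right behind the data already in the buffer.
    // Returns false (and uploads nothing) if the block does not fit.
    bool append(const void* data, std::size_t size) {
        if (size > capacity_ - used_) return false;
        upload_(used_, size, data);
        used_ += size;
        return true;
    }

    // Call after drawing to start filling the buffer from the beginning again.
    void reset() { used_ = 0; }

    std::size_t used() const { return used_; }

private:
    std::size_t capacity_;
    std::size_t used_ = 0;
    UploadFn    upload_;
};
```

When append returns false, the renderer would issue its draw call for what has been uploaded so far, call reset, and try again.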

Remember to profile!

It is very important to profile when doing this kind of work, for example to compare the performance of batching against individual draw calls, or of glDrawArrays against glDrawElements. I recommend using gDEBugger, which is a free and very good OpenGL profiler.

Also note that a VBO that is too large can hurt your performance. So keep it to a reasonable size, and flush it with a draw call whenever it fills up.
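A minimal sketch of the "flush whenever it fills up" rule, again without real GL calls: BoundedBatch is a hypothetical class, and the flush callback is where an actual renderer would upload the pending vertices and issue the draw call. What counts as a reasonable capacity is something you would have to find by profiling.

```cpp
#include <cstddef>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical CPU-side batch with a fixed capacity, measured in floats.
class BoundedBatch {
public:
    using FlushFn = std::function<void(const std::vector<float>&)>;

    BoundedBatch(std::size_t maxFloats, FlushFn onFlush)
        : maxFloats_(maxFloats), onFlush_(std::move(onFlush)) {
        pending_.reserve(maxFloats);
    }

    // Adds vertex data, flushing first if it would overflow the batch.
    // Assumes count <= maxFloats, i.e. a single push always fits an empty batch.
    void push(const float* data, std::size_t count) {
        if (pending_.size() + count > maxFloats_) flush();
        pending_.insert(pending_.end(), data, data + count);
    }

    // Hands the pending data to the callback (upload + draw) and empties the batch.
    void flush() {
        if (pending_.empty()) return;
        onFlush_(pending_);
        pending_.clear();
    }

private:
    std::size_t        maxFloats_;
    std::vector<float> pending_;
    FlushFn            onFlush_;
};
```

You would also call flush on a texture change and once at the end of the frame, so nothing stays behind in the batch.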