Adding detail on what we started in the comments above, plus a couple more things.
As was previously suggested, add glEnable(GL_CULL_FACE). I doubt that it addresses your bottleneck, but it can't hurt.
One other thing you should generally do is store your positions and texture coordinates interleaved in a single buffer, instead of storing them in separate buffers. Again, I think you're limited elsewhere right now, so treat this as a general recommendation.
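To make the interleaving idea concrete, here is a minimal sketch. The Vertex struct, the interleave helper, and the interleavedBuffer name in the comments are made up for illustration; they are not from your code. It assumes 3 position floats and 2 texture coordinate floats per vertex, matching the pointer setup you already have.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical vertex layout: position followed by texture coordinates.
struct Vertex {
    float position[3];
    float texCoord[2];
};

// Merge separate position (3 floats per vertex) and texture coordinate
// (2 floats per vertex) arrays into a single interleaved array.
std::vector<Vertex> interleave(const std::vector<float>& positions,
                               const std::vector<float>& texCoords)
{
    // Only use as many vertices as both input arrays can supply.
    const std::size_t count = std::min(positions.size() / 3, texCoords.size() / 2);
    std::vector<Vertex> out(count);
    for (std::size_t i = 0; i < count; ++i) {
        for (std::size_t k = 0; k < 3; ++k) out[i].position[k] = positions[3 * i + k];
        for (std::size_t k = 0; k < 2; ++k) out[i].texCoord[k] = texCoords[2 * i + k];
    }
    return out;
}

// After uploading the result to one buffer, both pointers share the same
// stride and differ only in their byte offset (names here are assumptions):
//   glBindBuffer(GL_ARRAY_BUFFER, interleavedBuffer);
//   glVertexPointer(3, GL_FLOAT, sizeof(Vertex), (const void*)offsetof(Vertex, position));
//   glTexCoordPointer(2, GL_FLOAT, sizeof(Vertex), (const void*)offsetof(Vertex, texCoord));
```

The point of this layout is that all attributes of one vertex sit next to each other in memory, which tends to be friendlier to the vertex fetch hardware than reading from two separate buffers.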
To illustrate what I suggested in the comments about avoiding redundant state changes inside your loop: right now, your structure in pseudocode looks like this:
loop over x, y, z
    VBOrender(x, y, z)
end loop
Instead, I would structure it like this:
VBObind()
loop over x, y, z
    VBOrender(x, y, z)
end loop
VBOunbind()
Then split up the code in your current VBOrender function like this:
// Set up all state that is identical for every cube. Call once before the loop.
void VBObind()
{
    glBindTexture(GL_TEXTURE_2D, Texture::textures[0]);
    glEnableClientState(GL_VERTEX_ARRAY);
    glEnableClientState(GL_TEXTURE_COORD_ARRAY);
    glBindBuffer(GL_ARRAY_BUFFER, m_vertexBuffer);
    glVertexPointer(3, GL_FLOAT, 0, 0);
    glBindBuffer(GL_ARRAY_BUFFER, m_textureBuffer);
    glTexCoordPointer(2, GL_FLOAT, 0, OFFSET_BUFFER(0));
}

// Restore the client state. Call once after the loop.
void VBOunbind()
{
    glDisableClientState(GL_VERTEX_ARRAY);
    glDisableClientState(GL_TEXTURE_COORD_ARRAY);
}

// Per-cube work only: move to the cube, draw it, and undo the translation.
void VBOrender(int x, int y, int z)
{
    glTranslatef(x, y, z);
    glDrawArrays(GL_TRIANGLES, 0, 36);
    glTranslatef(-x, -y, -z);
}
I expect that this will give you a significant performance improvement. To get massively better performance, you would need something more dramatic, like packing all the cubes in a single draw call. This looks slightly tricky, though, because from what I can see in your code on pastebin, the rendering of each cube is conditional.
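One possible way to handle the conditional rendering while still ending up with a single draw call is to rebuild the vertex data on the CPU: apply your visibility condition first, then append a translated copy of the cube for every cube that passes. This is only a rough sketch. CubePosition and buildBatch are names I made up, and it only handles positions; the texture coordinates would need to be repeated once per cube in the same way.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type holding the grid position of one visible cube.
struct CubePosition {
    int x, y, z;
};

// Build one large position array containing a translated copy of the cube
// template for every visible cube. cubeTemplate holds the positions of a
// single cube at the origin, 3 floats per vertex.
std::vector<float> buildBatch(const std::vector<float>& cubeTemplate,
                              const std::vector<CubePosition>& visibleCubes)
{
    std::vector<float> batch;
    batch.reserve(cubeTemplate.size() * visibleCubes.size());
    for (const CubePosition& cube : visibleCubes) {
        for (std::size_t i = 0; i + 2 < cubeTemplate.size(); i += 3) {
            // Bake the translation into the vertex position.
            batch.push_back(cubeTemplate[i] + static_cast<float>(cube.x));
            batch.push_back(cubeTemplate[i + 1] + static_cast<float>(cube.y));
            batch.push_back(cubeTemplate[i + 2] + static_cast<float>(cube.z));
        }
    }
    return batch;
}

// Upload the result with glBufferData whenever the set of visible cubes
// changes, then draw everything at once:
//   glDrawArrays(GL_TRIANGLES, 0, batch.size() / 3);
```

The trade-off is that the buffer has to be rebuilt and re-uploaded when the set of visible cubes changes, so this pays off mostly when that happens much less often than once per frame.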
If you're willing to write your own shaders, you can make the translation an attribute, which would be much faster to update than the fixed function transformation matrix.