Question

I've built a small video player that grabs frames (as a string/byte array) from a movie using GStreamer and then renders each frame to an OpenGL texture. This works fine up to 30 fps 1080p movies, but when I try a 60 fps movie it can't keep up and the video lags behind the audio stream! When I play the video with "gst-launch playbin2" it works perfectly, so the video is being decoded with sufficient speed.

I've done a bit of measuring and it appears that the problem lies with either the updating of the texture with a new frame or the actual drawing of the frame to the screen. I'm using the old-fashioned immediate-mode glBegin(GL_QUADS)/glEnd() method (because I don't know any better) to do the drawing, but could this be the bottleneck? I thought the newer alternatives (GL_TRIANGLE_STRIP, VBOs/FBOs, or glDrawArrays/glDrawElements) are only beneficial when working with a large number of textures/polygons and not so much for what I am trying to do, or am I wrong about this?

Does anyone have any tips on how to improve the rendering speed in this specific situation?


Update: Thanks to some good advice given here (use glTexSubImage2D instead of glTexImage2D; use display lists), my code now looks as below. There has been some performance improvement, but the movie still runs just a little too slow to reach 60 fps; only a little more optimization is required.

The output of the time measurement for the rendering of a couple of frames now is as follows:

Texture updated in 16.2160497479 ms
Frame drawn in 0.540967225085 ms
Texture updated in 14.7260598703 ms
Frame drawn in 0.606612686107 ms
Texture updated in 17.0613363633 ms
Frame drawn in 0.743171453788 ms
Texture updated in 12.6152746452 ms
Frame drawn in 2.45603172378 ms
Texture updated in 13.3847853272 ms
Frame drawn in 3.0869575436 ms
Texture updated in 17.7117126901 ms
Frame drawn in 0.572979517806 ms
Texture updated in 13.8203956395 ms
Frame drawn in 1.15892604026 ms
Texture updated in 16.0600404733 ms
Frame drawn in 0.563659483216 ms
Texture updated in 13.0213039782 ms
Frame drawn in 3.70653723435 ms

The largest share of time is taken by the updating of the texture, even with glTexSubImage2D, which seems plausible as this involves transferring data from system memory to the GPU. Still, I will try to improve performance further by using VBOs/vertex arrays instead of drawing in immediate mode with glBegin/glEnd.
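One thing worth checking before going the VBO route (an assumption on my part, since the exact layout of the GStreamer buffer isn't shown): PyOpenGL hands the data to glTexSubImage2D without an extra copy only when it is already a tightly packed, C-contiguous block of the requested format; anything else gets converted first, which adds to the upload time. A quick numpy sketch of the distinction:

```python
import numpy as np

# Hypothetical 1080p RGB frame; numpy is row-major, so shape is (height, width, 3)
frame = np.zeros((1080, 1920, 3), dtype=np.uint8)

# A C-contiguous uint8 array can be passed to glTexSubImage2D as-is
assert frame.flags['C_CONTIGUOUS']

# Slicing (e.g. cropping columns) breaks contiguity and would force a conversion copy
cropped = frame[:, :960]
needs_copy = not cropped.flags['C_CONTIGUOUS']

# np.ascontiguousarray restores a tightly packed layout before upload
packed = np.ascontiguousarray(cropped)
assert packed.flags['C_CONTIGUOUS']
```

If the buffer coming out of appsink is already packed RGB at the right size, this is a non-issue; if not, doing the repack once up front is cheaper than letting it happen implicitly on every upload.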

...

def __textureSetup(self):
    # Setup texture in OpenGL to render video to
    glEnable(GL_TEXTURE_2D)
    glMatrixMode(GL_MODELVIEW)
    self.textureNo = glGenTextures(1)
    glBindTexture(GL_TEXTURE_2D, self.textureNo)
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR)
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR)

    # Fill the texture with black to begin with. Note: numpy arrays are
    # row-major, so the shape must be (height, width, channels).
    img = np.zeros([self.vidsize[1], self.vidsize[0], 3], dtype=np.uint8)
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, self.vidsize[0], self.vidsize[1], 0, GL_RGB, GL_UNSIGNED_BYTE, img)

    # Create display list which draws to the quad to which the texture is rendered
    (x,y) = self.vidPos
    (w,h) = self.destsize

    self.frameQuad = glGenLists(1)
    glNewList(self.frameQuad, GL_COMPILE)
    glBegin(GL_QUADS)
    glTexCoord2f(0.0, 0.0); glVertex3i(x, y, 0)
    glTexCoord2f(1.0, 0.0); glVertex3i(x+w, y, 0)
    glTexCoord2f(1.0, 1.0); glVertex3i(x+w, y+h, 0)
    glTexCoord2f(0.0, 1.0); glVertex3i(x, y+h, 0)
    glEnd()
    glEndList()

    # Clear The Screen And The Depth Buffer
    glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT)

def __texUpdate(self, appsink):
    """ Callback for GStreamer """
    # Retrieve buffer from videosink
    self.buffer = appsink.emit('pull-buffer')
    self.texUpdated = True

def drawFrame(self):
    glCallList(self.frameQuad)
    # Flip the buffer to show frame to screen
    pygame.display.flip()

def play(self):
    # Start gst loop (which listens for events from the player)
    thread.start_new_thread(self.gst_loop.run, ())

    # Signal player to start video playback
    self.player.set_state(gst.STATE_PLAYING)
    self.paused = False

    # While video is playing, render frames
    while self.gst_loop.is_running():
        # Only draw a frame when a new texture has been received
        if self.texUpdated:
            self.texUpdated = False  # Reset the flag so we only redraw on new frames
            t1 = time.clock()
            # Update texture with the new frame data
            glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, self.vidsize[0], self.vidsize[1], GL_RGB, GL_UNSIGNED_BYTE, self.buffer.data)
            t2 = time.clock()
            print "Texture updated in {0} ms".format((t2 - t1) * 1000)
            self.drawFrame()
            print "Frame drawn in {0} ms".format((time.clock() - t2) * 1000)

        for e in pygame.event.get():
            if e.type == pygame.QUIT:
                self.stop()
            if e.type == pygame.KEYDOWN: 
                if e.key == pygame.K_ESCAPE:
                    self.stop()
                if e.key == pygame.K_SPACE:
                    self.pause()

        pygame.event.pump()   # Prevent freezing of screen while dragging window

def stop(self):
    self.gst_loop.quit()
    self.player.set_state(gst.STATE_NULL)

...

Solution

glTexImage2D goes through a full texture object reinitialization each time it is called. You should use glTexSubImage2D instead, which only uploads new pixel data and keeps the existing texture object around.

The other issue is that you might be hitting the swap interval barrier (V-Sync). The only way to deal with that is to finish rendering and call SwapBuffers before the V-Sync deadline imposed by your monitor. Reducing the texture upload latency by avoiding texture object reinitialization will help you get there.
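To put numbers on that deadline: at a 60 Hz refresh rate each frame must be uploaded, drawn, and swapped within roughly 16.7 ms, and the upload times measured in the question already consume most of that. A back-of-the-envelope check (upload times rounded from the measurements above):

```python
# Per-frame budget at a 60 Hz refresh rate, in milliseconds
budget_ms = 1000.0 / 60.0  # ~16.67 ms

# Texture upload times measured in the question, in milliseconds (rounded)
uploads = [16.22, 14.73, 17.06, 12.62, 13.38, 17.71, 13.82, 16.06, 13.02]
avg_upload = sum(uploads) / len(uploads)   # ~15 ms average

# Headroom left for drawing and the buffer swap, and frames already over budget
headroom = budget_ms - avg_upload
over_budget = sum(1 for t in uploads if t > budget_ms)
```

With under 2 ms of average headroom, and two of the nine sampled uploads already exceeding the budget on their own, any frame that runs long misses the V-Sync deadline and waits a full extra refresh interval, which would explain the video falling behind the audio.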

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow