Question

I'm working with OpenGL ES 2.0 on an OMAP3530 development board running Windows CE 7.

My task is to load a 24-bit image file, rotate it by an angle about the z-axis, and export the resulting image buffer.

For this task I've created an FBO for off-screen rendering, loaded the image file as a texture using glTexImage2D(), applied the texture to a quad, rotated the quad using the PVRTMat4::RotationZ() API, and read the result back using the glReadPixels() API. Since it is a single-frame process, I run the loop only once.

Here are the problems I'm facing now:

1) Every API call takes a different amount of processing time on each run, i.e. the timings are not repeatable from one run of the application to the next.

2) glDrawArrays() takes too much time (~50 ms - 80 ms).

3) glReadPixels() also takes too much time: ~95 ms for an 800x600 image.

4) Loading a 32-bit image is much faster than loading a 24-bit image, so a conversion is needed.
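
One way to do the conversion in point 4 on the CPU is to expand each 3-byte pixel to 4 bytes before calling glTexImage2D(). The following is only a minimal sketch, assuming tightly packed 8-bit RGB input with no row padding; expandRgbToRgba is a hypothetical helper name, not part of any SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: expands tightly packed 24-bit RGB pixels to 32-bit RGBA.
// Assumption: rows carry no padding. Formats such as BMP pad each row to a
// multiple of 4 bytes, and that padding would have to be stripped first.
std::vector<unsigned char> expandRgbToRgba(const std::vector<unsigned char>& rgb,
                                           std::size_t width,
                                           std::size_t height,
                                           unsigned char alpha = 255)
{
    const std::size_t pixelCount = width * height;
    assert(rgb.size() == pixelCount * 3); // input must hold exactly width*height RGB pixels

    std::vector<unsigned char> rgba(pixelCount * 4);
    for (std::size_t i = 0; i < pixelCount; ++i) {
        rgba[i * 4 + 0] = rgb[i * 3 + 0]; // R
        rgba[i * 4 + 1] = rgb[i * 3 + 1]; // G
        rgba[i * 4 + 2] = rgb[i * 3 + 2]; // B
        rgba[i * 4 + 3] = alpha;          // opaque by default
    }
    return rgba;
}
```

The address of the first element of the returned vector could then be passed as the pixel pointer to glTexImage2D() with GL_RGBA and GL_UNSIGNED_BYTE.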

If anybody has faced or solved a similar problem, please share any suggestions.

Here is the code snippet from my application.

[code]
[i]
void BindTexture()
{
    glGenTextures(1, &m_uiTexture);
    glBindTexture(GL_TEXTURE_2D, m_uiTexture);
    glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, ImageWidth, ImageHeight, 0, GL_RGBA, GL_UNSIGNED_BYTE, pTexData);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);
    glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER, GL_LINEAR);
}

int WINAPI WinMain(HINSTANCE hInstance, HINSTANCE hPrevInstance, TCHAR *lpCmdLine, int nCmdShow)
{
    // Fragment and vertex shader code
    const char* pszFragShader = "..."; // same as in the RenderToTexture sample
    const char* pszVertShader = "..."; // same as in the RenderToTexture sample

    CreateWindow(ImageWidth, ImageHeight); // based on the OGLES2HelloTriangle_Windows.cpp example
    LoadImageBuffers();
    BindTexture();

    // Generate and bind the framebuffer and the depth renderbuffer
    glGenFramebuffers(1, &m_auiFbo);
    glBindFramebuffer(GL_FRAMEBUFFER, m_auiFbo);
    glGenRenderbuffers(1, &m_auiDepthBuffer);
    glBindRenderbuffer(GL_RENDERBUFFER, m_auiDepthBuffer);

    // The fourth argument must be a texture name, not the FBO name
    glFramebufferTexture2D(GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0, GL_TEXTURE_2D, m_uiTexture, 0);
    glRenderbufferStorage(GL_RENDERBUFFER, GL_DEPTH_COMPONENT16, ImageWidth, ImageHeight);
    glFramebufferRenderbuffer(GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT, GL_RENDERBUFFER, m_auiDepthBuffer);

    // Creates a second texture holding the image; it is sampled while rendering into the FBO
    BindTexture();

    GLfloat Angle = 0.02f;
    GLfloat afVertices[] = { /* vertices to draw a quad */ };

    glGenBuffers(1, &ui32Vbo);
    LoadVBOs(); // API calls that load the VBO

    // Draw the quad for one frame
    while (g_bDemoDone == false)
    {
        glBindFramebuffer(GL_FRAMEBUFFER, m_auiFbo);
        glClear(GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT);

        PVRTMat4 mRot, mTrans, mMVP;
        mTrans = PVRTMat4::Translation(0, 0, 0);
        mRot = PVRTMat4::RotationZ(Angle);

        glBindBuffer(GL_ARRAY_BUFFER, ui32Vbo);
        glDisable(GL_CULL_FACE);

        int i32Location = glGetUniformLocation(uiProgramObject, "myPMVMatrix");
        mMVP = mTrans * mRot;

        glUniformMatrix4fv(i32Location, 1, GL_FALSE, mMVP.ptr());

        // Pass the vertex data
        glEnableVertexAttribArray(VERTEX_ARRAY);
        glVertexAttribPointer(VERTEX_ARRAY, 3, GL_FLOAT, GL_FALSE, m_ui32VertexStride, 0);

        // Pass the texture coordinate data
        glEnableVertexAttribArray(TEXCOORD_ARRAY);
        glVertexAttribPointer(TEXCOORD_ARRAY, 2, GL_FLOAT, GL_FALSE, m_ui32VertexStride, (void*) (3 * sizeof(GLfloat)));

        glDrawArrays(GL_TRIANGLE_STRIP, 0, 4);

        // Read the rendered image back while the FBO is still bound
        glReadPixels(0, 0, ImageWidth, ImageHeight, GL_RGBA, GL_UNSIGNED_BYTE, pOutTexData);

        glBindBuffer(GL_ARRAY_BUFFER, 0);
        glBindFramebuffer(GL_FRAMEBUFFER, 0);

        eglSwapBuffers(eglDisplay, eglSurface);
    }

    DeInitAll();
    return 0;
}
[/i][/code]
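
For reference, the transform applied to the quad above is a plain rotation about the z-axis. The sketch below builds such a matrix by hand in column-major order, which is the layout glUniformMatrix4fv() expects when its transpose argument is GL_FALSE. Mat4, makeRotationZ and transformXY are hypothetical names used only for illustration; the angle is assumed to be in radians and counter-clockwise positive, and the sign convention of PVRTMat4::RotationZ() may differ from this one.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical 4x4 matrix type, column-major: element (row, col) is f[col * 4 + row].
struct Mat4 { float f[16]; };

// Builds a counter-clockwise rotation about the z-axis (angle in radians).
Mat4 makeRotationZ(float radians)
{
    const float c = std::cos(radians);
    const float s = std::sin(radians);
    // Sanity check: a finite angle gives cos^2 + sin^2 == 1 (up to rounding).
    assert(std::fabs(c * c + s * s - 1.0f) < 1e-4f);

    Mat4 m = {{    c,    s, 0.0f, 0.0f,   // column 0
                  -s,    c, 0.0f, 0.0f,   // column 1
                0.0f, 0.0f, 1.0f, 0.0f,   // column 2
                0.0f, 0.0f, 0.0f, 1.0f }}; // column 3
    return m;
}

// Applies m to the point (x, y, 0, 1) and returns the transformed x and y.
void transformXY(const Mat4& m, float x, float y, float& outX, float& outY)
{
    outX = m.f[0] * x + m.f[4] * y + m.f[12];
    outY = m.f[1] * x + m.f[5] * y + m.f[13];
}
```

With this convention, rotating the point (1, 0) by a quarter turn moves it to approximately (0, 1), so a small angle such as 0.02 only tilts the quad slightly.
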

Solution 2

I improved the performance of rotating an image buffer by using multiple FBOs and PBOs. Here is a pseudocode snippet of my application.

InitGL()
GenerateShaders();
Generate3Textures(); // Generate 3 empty (null) textures
Generate3FBO();      // Generate 3 FBOs and attach one texture to each FBO
Generate3PBO();      // Generate 3 PBOs to read back from the FBOs

DrawGL()
{
 BindFBO1;
 BindTexture1;
 UploadtoTexture1;
 Do Some Processing & Draw it in FBO1;

 BindFBO2;
 BindTexture2;
 UploadtoTexture2;
 Do Some Processing & Draw it in FBO2;

 BindFBO3;
 BindTexture3;
 UploadtoTexture3;
 Do Some Processing & Draw it in FBO3;


 BindFBO1;
 ReadPixelfromFBO1;
 UnpackToPBO1;

 BindFBO2;
 ReadPixelfromFBO2;
 UnpackToPBO2;

 BindFBO3;
 ReadPixelfromFBO3;
 UnpackToPBO3;
}


DeinitGL();
DeallocateALL();

This way I achieved a 50% performance improvement in overall processing.

OTHER TIPS

The PowerVR architecture cannot render a single frame and let the ARM read it back quickly. It is simply not designed to work fast that way: it is a tile-based deferred rendering architecture. The execution times you are seeing are to be expected, and using an FBO is not going to make it any faster. Also, beware that the OpenGL ES drivers on OMAP for Windows CE are of really poor quality. Consider yourself lucky if they work at all.

A better design would be to display the OpenGL ES rendering directly through the DSS (the OMAP display subsystem) and avoid using glReadPixels() and the FBO completely.
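
When checking timings like the ones in the question, keep in mind that GL commands may be queued and executed later, so the cost of rendering can show up in whichever call forces the queued work to complete rather than in the call that issued it. The following is only a sketch of a timing wrapper, assuming a C++11 compiler is available; timeMs is a hypothetical helper name.

```cpp
#include <cassert>
#include <chrono>

// Hypothetical helper: runs `work` once and returns the elapsed wall-clock
// time in milliseconds, measured with a monotonic clock.
template <typename Work>
double timeMs(Work work)
{
    const std::chrono::steady_clock::time_point start = std::chrono::steady_clock::now();
    work();
    const std::chrono::steady_clock::time_point stop = std::chrono::steady_clock::now();

    const double elapsedMs =
        std::chrono::duration<double, std::milli>(stop - start).count();
    assert(elapsedMs >= 0.0); // steady_clock never goes backwards
    return elapsedMs;
}
```

To measure a draw including its deferred GPU work, the callable could issue the draw and then call glFinish(), for example `timeMs([&] { glDrawArrays(GL_TRIANGLE_STRIP, 0, 4); glFinish(); })`. Without the glFinish(), the number mostly reflects how long it took to queue the command.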

Licensed under: CC-BY-SA with attribution