
glTexSubImage2D extremely slow on Intel video card

My video card is a Mobile Intel 4 Series. I'm updating a texture with changing data every frame. Here's my main loop:

for(;;) {
    Timer timer;  // measures one full iteration

    glBindTexture(GL_TEXTURE_2D, tex);
    glBegin(GL_QUADS); ... /* draw textured quad */ ... glEnd();
    // upload the new pixel data into the texture that was just drawn
    glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 512, 512,
        GL_BGRA, GL_UNSIGNED_INT_8_8_8_8_REV, data);
    swapBuffers();

    cout << timer.Elapsed();
}

Every iteration takes 120ms. However, inserting glFlush before glTexSubImage2D brings the iteration time to 2ms.

The issue is not in the pixel format. I've tried the pixel formats BGRA, RGBA and ABGR_EXT together with the pixel types UNSIGNED_BYTE, BYTE, UNSIGNED_INT_8_8_8_8 and UNSIGNED_INT_8_8_8_8_EXT. The texture's internal pixel format is RGBA.

The order of calls matters. Moving the texture upload before the quad drawing, for example, fixes the slowness.

I also tried this on a GeForce GT 420M card, and it runs fast there. My real app does have performance problems on non-Intel cards that are fixed by glFlush calls, but I haven't distilled those to a test case yet.

Any ideas on how to debug this?

Stefan Monov asked Oct 18 '11 13:10


1 Answer

One issue is that glTexImage2D performs a full reinitialization of the texture object. If only the data changes but the format stays the same, use glTexSubImage2D to speed things up (just a reminder).

The other issue is that, despite its name, immediate mode (i.e. glBegin(…) … glEnd()) is not synchronous: the drawing calls return long before the GPU is done drawing. Adding a glFinish() forces synchronization, but so does any call that modifies data still required by queued operations. So in your case glTexImage2D (and glTexSubImage2D) must wait for the drawing to finish.

Usually it's best to do all volatile resource uploads either at the beginning of the drawing function, or from a separate thread through buffer objects while SwapBuffers blocks. Buffer objects were introduced for that very reason: to allow for asynchronous, yet tight, operation.
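To make the buffer-object idea a bit more concrete, here is a minimal sketch of the bookkeeping behind a two-buffer ("ping-pong") streaming scheme. It only models the CPU side with plain vectors so that it runs without a GL context. The names PingPongStaging and simulateFrames are hypothetical and not from the original post. With real pixel buffer objects you would presumably replace each vector with a buffer bound to GL_PIXEL_UNPACK_BUFFER via glBindBuffer, fill it through glMapBuffer/glUnmapBuffer, and pass a zero offset instead of a pointer to glTexSubImage2D.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: two staging buffers that swap roles every frame,
// mirroring a two-PBO setup. One buffer is the upload source for the
// current frame while the other is being filled for the next frame.
class PingPongStaging {
public:
    PingPongStaging(std::size_t width, std::size_t height) {
        assert(width > 0 && height > 0);
        for (auto& buffer : buffers_) {
            buffer.assign(width * height, 0u);  // one 32-bit BGRA pixel each
        }
    }

    // Buffer the CPU writes the NEXT frame's pixels into.
    std::vector<std::uint32_t>& fillTarget() { return buffers_[fill_]; }

    // Buffer holding the pixels to upload THIS frame.
    const std::vector<std::uint32_t>& uploadSource() const {
        return buffers_[1 - fill_];
    }

    // Call once per frame, after the upload has been issued.
    void swapRoles() { fill_ = 1 - fill_; }

private:
    std::vector<std::uint32_t> buffers_[2];
    int fill_ = 0;
};

// Runs `frames` iterations and records the pixel value that each frame
// would upload. Frame f fills its target buffer with the value f + 1.
std::vector<std::uint32_t> simulateFrames(PingPongStaging& staging,
                                          std::uint32_t frames) {
    std::vector<std::uint32_t> uploaded;
    for (std::uint32_t f = 0; f < frames; ++f) {
        // 1. Upload first, from the buffer filled during the previous frame
        //    (a real app would call glTexSubImage2D here, before drawing).
        uploaded.push_back(staging.uploadSource().front());

        // 2. Drawing would be issued here.

        // 3. Fill the other buffer; it is not referenced by any queued work.
        auto& target = staging.fillTarget();
        std::fill(target.begin(), target.end(), f + 1);

        staging.swapRoles();
    }
    return uploaded;
}
```

For example, simulateFrames on a 512x512 staging object shows that each frame uploads what was written one frame earlier. That one-frame latency is the price of never writing into a buffer the upload is still reading from.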

datenwolf answered Sep 21 '22 09:09