I am creating an app for android using openGL ES. I am trying to draw, in 2D, lots of moving sprites which bounce around the screen.
Let's consider I have a ball at coordinates 100,100. The ball graphic is 10px wide, therefore I can create the vertices boundingBox = {100,110,0, 110,110,0, 100,100,0, 110,100,0}
and perform the following on each loop of onDrawFrame()
with the ball texture loaded.
//for each ball object
FloatBuffer ballVertexBuffer = byteBuffer.asFloatBuffer();
ballVertexBuffer.put(ball.boundingBox);
ballVertexBuffer.position(0);
gl.glVertexPointer(3, GL10.GL_FLOAT, 0, ballVertexBuffer);
gl.glDrawArrays(GL10.GL_TRIANGLE_STRIP, 0,4);
I would then update the boundingBox
array to move the balls around the screen.
Alternatively, I could not alter the bounding box
at all and instead translatef()
the ball before drawing the verticies
gl.glVertexPointer(3, GL10.GL_FLOAT, 0, ballVertexBuffer);
gl.glPushMatrix();
gl.glTranslatef(ball.posX, ball.posY, 0);
gl.glDrawArrays(GL10.GL_TRIANGLE_STRIP, 0,4);
gl.glPopMatrix();
What would be the best thing to do in the case in terms of efficient and best practices.
OpenGL ES (as of 2.0) does not support instancing, unluckily. If it did, I would recommend drawing a 2-triangle sprite instanced N times, reading the x/y offsets of the center point, and possibly a scale value if you need differently sized sprites, from a vertex texture (which ES supports just fine). This would limit the amount of data you must push per frame to a minimum.
Assuming you can't do the simulation directly on the GPU (thus avoiding uploading the vertex data each frame) ... this basically leaves you only with only one efficient option:
Generate 2 VBOs, map one and fill it, while the other is used as the source of the draw call. You can also do this seemingly with a single buffer if you glBufferData(... 0)
in between, which tells OpenGL to generate a new buffer and throw the old one away as soon as it's done reading from it.
Streaming vertices in every frame may not be super fast, but this does not matter as long as the latency can be well-hidden (e.g. by drawing from one buffer while filling another). Few draw calls, few state changes, and ideally no stalls should still make this fast.
Drawing calls are much more expensive than altering the data. Also glTranslate is not nearly as efficient as just adding a few numbers, after all it has to go through a full 4×4 matrix multiplication, which is 64 scalar multiplies and 16 scalar additions.
Of course the best method is using some form of instancing.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With