OpenGL performance for 10,000 static cubes

Question

I'm running the following Scala code. It compiles a single display list of 10,000 cubes. Then it displays them in the display loop with an animator that runs as fast as it can. But the FPS is only around 20. I had thought that using display lists would be able to handle this very quickly. I have a situation where I need to be able to display 10k-100k's of objects. Is there a better way to do so? In the display loop, pretty much all it does is call gluLookAt and glCallList (it's the last method).

I'm using JOGL 2.0-rc5 from jogamp.org which says it supports "OpenGL 1.3 - 3.0, 3.1 - 3.3, ≥ 4.0, ES 1.x and ES 2.x + nearly all vendor extensions"

class LotsOfCubes extends GLEventListener {
  def show() = {
    val glp = GLProfile.getDefault();
    val caps = new GLCapabilities(glp);
    val canvas = new GLCanvas(caps);
    canvas.addGLEventListener(this);

    val frame = new JFrame("AWT Window Test");
    frame.setSize(300, 300);
    frame.add(canvas);
    frame.setVisible(true);
  }

  override def init(drawable: GLAutoDrawable) {
    val gl = drawable.getGL().getGL2()
    gl.glEnable(GL.GL_DEPTH_TEST)

    gl.glNewList(21, GL2.GL_COMPILE)
    var i = -10.0f
    var j = -10.0f
    while (i < 10.0f) {
      while (j < 10.0f) {
        drawItem(gl, i, j, 0.0f, 0.08f)
        j += 0.1f
      }
      i += 0.1f
      j = -10f
    }
    gl.glEndList()

    val an = new Animator(drawable);
    drawable.setAnimator(an);
    an.setUpdateFPSFrames(100, System.out)
    an.start();
  }

  override def dispose(drawable: GLAutoDrawable) {
  }

  override def reshape(drawable: GLAutoDrawable, x: Int, y: Int, width: Int, height: Int) {
    val gl = drawable.getGL().getGL2();
    val glu = new GLU
    gl.glMatrixMode(GLMatrixFunc.GL_PROJECTION);
    gl.glLoadIdentity();
    glu.gluPerspective(10, 1, -1, 100);
    gl.glViewport(0, 0, width, height);
    gl.glMatrixMode(GLMatrixFunc.GL_MODELVIEW);
  }

  def drawBox(gl: GL2, size: Float) {
    import Global._
    gl.glBegin(GL2.GL_QUADS);
    for (i <- 5 until -1 by -1) {
      gl.glNormal3fv(boxNormals(i), 0);
      val c = colors(i);
      gl.glColor3f(c(0), c(1), c(2))
      var vt: Array[Float] = boxVertices(boxFaces(i)(0))
      gl.glVertex3f(vt(0) * size, vt(1) * size, vt(2) * size);
      vt = boxVertices(boxFaces(i)(1));
      gl.glVertex3f(vt(0) * size, vt(1) * size, vt(2) * size);
      vt = boxVertices(boxFaces(i)(2));
      gl.glVertex3f(vt(0) * size, vt(1) * size, vt(2) * size);
      vt = boxVertices(boxFaces(i)(3));
      gl.glVertex3f(vt(0) * size, vt(1) * size, vt(2) * size);
    }
    gl.glEnd();
  }

  def drawItem(gl: GL2, x: Float, y: Float, z: Float, size: Float) {
    gl.glPushMatrix()
    gl.glTranslatef(x, y, z);
    gl.glRotatef(0.0f, 0.0f, 1.0f, 0.0f); // Rotate The cube around the Y axis
    gl.glRotatef(0.0f, 1.0f, 1.0f, 1.0f);
    drawBox(gl, size);
    gl.glPopMatrix()
  }

  override def display(drawable: GLAutoDrawable) {
    val gl = drawable.getGL().getGL2()
    val glu = new GLU
    gl.glClear(GL.GL_COLOR_BUFFER_BIT | GL.GL_DEPTH_BUFFER_BIT)
    gl.glLoadIdentity()
    glu.gluLookAt(0.0, 0.0, -100.0f,
      0.0f, 0.0f, 0.0f,
      0.0f, 1.0f, 0.0f)
    gl.glCallList(21)
  }
}

prelic · Accepted Answer

You may want to think about using a Vertex Buffer, which is a way to store drawing information for faster rendering.

See here for an overview:

http://www.opengl.org/wiki/Vertex_Buffer_Object

newprogrammer · Answer

If you store the vertex information in a vertex buffer object, then upload it to OpenGL, you will probably see a great increase in performance, particularly if you are drawing static objects. This is because the vertex data stays on the graphics card, rather than fetching it from the CPU every time.

realytic · Answer

You create a display list in which you call drawItem for each cube. Inside drawItem for each cube you push and pop the current transformation matrix and inbetween rotate and scale the cube to place it correctly. In principle that could be performant since the transformations on the cube coordinates could be precomputed and hence optimized by the driver. When I tried to do the same (display lots of cubes like in minecraft) but without rotation, i.e. I only used glPush/glPopMatrix() and glTranslate3f() , I realized that actually these optimizations, i.e. getting rid of the unneccessary matrix pushes/pops and applications, were NOT done by my driver. So for about 10-20K cubes I only got around 40fps and for 200K cubes only about 6-7 fps. Then, I tried to do the translations manually, i.e. I added the respective offset vectors to the vertices of my cubes directly, i.e. inside the display list there was no matrix push/pop and no glTranslatef anymore, I got a huge speed up, so my code ran about 70 times as fast.

OpenGL performance for 10,000 static cubes

Tags:

performance

scala

opengl

jogl

mentics

3 Answers

prelic

newprogrammer

realytic

Recent Activity

Donate For Us

OpenGL performance for 10,000 static cubes

Tags:

performance

scala

opengl

jogl

mentics

3 Answers

prelic

newprogrammer

realytic

Related questions

Recent Activity

Donate For Us