
How to reduce the number of draw calls when drawing many tiles from a large texture?

Tags: c++, opengl

I'm trying to develop a map for a 2D tile-based game. My approach is to store the map images in one large texture (a tileset) and draw only the desired tiles on screen, updating their positions through the vertex shader. However, a 10x10 map requires 100 glDrawArrays calls. According to the task manager, this consumes 5% of the CPU and 4 to 5% of the GPU. Imagine a complete game with many more such calls. Is there a way to optimize this, such as preparing the whole scene and drawing it all at once with a single draw call, or some other approach?

void GameMap::draw() {
  m_shader->use();
  m_texture->bind();

  glBindVertexArray(m_quadVAO);

  // One drawTile() call, and therefore one glDrawArrays call, per tile.
  for (size_t r = 0; r < 10; r++) {
    for (size_t c = 0; c < 10; c++) {
      m_tileCoord->setX(c * m_tileHeight);
      m_tileCoord->setY(r * m_tileHeight);
      m_tileCoord->convert2DToIso();

      drawTile(0);
    }
  }

  glBindVertexArray(0);
}

void GameMap::drawTile(GLint index) {
  glm::mat4 position_coord = glm::mat4(1.0f);
  glm::mat4 texture_coord = glm::mat4(1.0f);

  m_srcX = index * m_tileWidth;

  // Offset of the tile inside the tileset, normalized to texture space.
  GLfloat clipX = m_srcX / m_texture->m_width;
  GLfloat clipY = m_srcY / m_texture->m_height;

  texture_coord = glm::translate(texture_coord, glm::vec3(glm::vec2(clipX, clipY), 0.0f));
  position_coord = glm::translate(position_coord, glm::vec3(glm::vec2(m_tileCoord->getX(), m_tileCoord->getY()), 0.0f));
  position_coord = glm::scale(position_coord, glm::vec3(glm::vec2(m_tileWidth, m_tileHeight), 1.0f));

  // Two uniform uploads per tile, then a draw call for a single quad.
  m_shader->setMatrix4("texture_coord", texture_coord);
  m_shader->setMatrix4("position_coord", position_coord);

  glDrawArrays(GL_TRIANGLES, 0, 6);
}

-- Vertex Shader

#version 330 core
layout (location = 0) in vec4 vertex; // <vec2 position, vec2 texCoords>

out vec4 TexCoords;

uniform mat4 texture_coord;
uniform mat4 position_coord;
uniform mat4 projection;

void main()
{
    TexCoords =    texture_coord * vec4(vertex.z, vertex.w, 1.0, 1.0);
    gl_Position =   projection * position_coord * vec4(vertex.xy, 0.0, 1.0);
}

-- Fragment Shader
#version 330 core
out vec4 FragColor;

in vec4 TexCoords;

uniform sampler2D image;
uniform vec4 spriteColor;

void main()
{
    FragColor = vec4(spriteColor) * texture(image, vec2(TexCoords.x, TexCoords.y));
}
William asked Jan 10 '19 11:01


2 Answers

The Basic Technique

The first thing to do is set up a vertex buffer for your 10x10 grid. Each square in the grid is actually two triangles, and every triangle needs its own vertices: adjacent tiles share XY coordinates along their common edge, but their UV coordinates differ. This way each triangle can sample whatever area of the texture atlas it needs, and neighbouring tiles don't have to be contiguous in UV space.

Here's how the vertices of two adjacent quads in the grid will be set up:

[image: vertex layout of two adjacent quads]

1:  xy=(0,0) uv=(Left0 ,Top0)
2:  xy=(1,0) uv=(Right0,Top0)
3:  xy=(1,1) uv=(Right0,Bottom0)
4:  xy=(1,1) uv=(Right0,Bottom0)
5:  xy=(0,1) uv=(Left0 ,Bottom0)
6:  xy=(0,0) uv=(Left0 ,Top0)
7:  xy=(1,0) uv=(Left1 ,Top1)
8:  xy=(2,0) uv=(Right1,Top1)
9:  xy=(2,1) uv=(Right1,Bottom1)
10: xy=(2,1) uv=(Right1,Bottom1)
11: xy=(1,1) uv=(Left1 ,Bottom1)
12: xy=(1,0) uv=(Left1 ,Top1)

These 12 vertices define 4 triangles. The Top, Left, Bottom, Right UV coordinates of the first square can be completely different from those of the second square, so each square can be textured from a different area of the texture atlas. See below for how the UV coordinates of each triangle map to a tile in the texture atlas.

[image: UV coordinates of each triangle mapped to a tile in the texture atlas]

In your case, a 10x10 grid has 100 quads, or 200 triangles. At 3 vertices per triangle, that is 600 vertices to define, but it's a single draw call of 200 triangles (600 vertices). Each vertex has its own x, y, u, v coordinates. To change which tile a quad shows, you update the uv coordinates of that quad's 6 vertices in your vertex buffer.
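As a rough illustration, the buffer described above could be built on the CPU along these lines. This is only a sketch: the `Vertex` and `Atlas` structs, the `appendQuad` and `buildGrid` helpers, and the assumption that the atlas is a regular grid of equally sized tiles are all hypothetical and not taken from the question's code.

```cpp
#include <cstddef>
#include <vector>

// One vertex in the layout the question's shader expects: <vec2 position, vec2 texCoords>.
struct Vertex {
    float x, y;  // grid position, in tile units
    float u, v;  // texture coordinates in the atlas, 0..1
};

// Hypothetical atlas description: equally sized tiles arranged in a grid.
struct Atlas {
    int columns;
    int rows;
};

// Append the six vertices (two triangles) of the quad at grid cell (col, row),
// textured with tile number `tile` from the atlas.
void appendQuad(std::vector<Vertex>& out, int col, int row, int tile, const Atlas& atlas) {
    const float tileW = 1.0f / atlas.columns;
    const float tileH = 1.0f / atlas.rows;
    const float left = (tile % atlas.columns) * tileW;
    const float top = (tile / atlas.columns) * tileH;
    const float right = left + tileW;
    const float bottom = top + tileH;

    const float x0 = float(col), x1 = float(col + 1);
    const float y0 = float(row), y1 = float(row + 1);

    // Same corner order as vertices 1 to 6 in the list above.
    out.push_back({x0, y0, left, top});
    out.push_back({x1, y0, right, top});
    out.push_back({x1, y1, right, bottom});
    out.push_back({x1, y1, right, bottom});
    out.push_back({x0, y1, left, bottom});
    out.push_back({x0, y0, left, top});
}

// Build the whole grid, row by row. `tiles` holds one tile number per cell.
std::vector<Vertex> buildGrid(int cols, int rows, const std::vector<int>& tiles, const Atlas& atlas) {
    std::vector<Vertex> vertices;
    vertices.reserve(std::size_t(cols) * rows * 6);
    for (int r = 0; r < rows; ++r)
        for (int c = 0; c < cols; ++c)
            appendQuad(vertices, c, r, tiles[std::size_t(r) * cols + c], atlas);
    return vertices;
}
```

The result would be uploaded once with `glBufferData(GL_ARRAY_BUFFER, vertices.size() * sizeof(Vertex), vertices.data(), GL_STATIC_DRAW)` and drawn with a single `glDrawArrays(GL_TRIANGLES, 0, 600)` for a 10x10 map. Positions here are in tile units, so the scale by tile size and any isometric conversion would still have to happen somewhere, for example in the projection matrix or when filling in `x` and `y`.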

You will likely find that this is the most convenient and efficient approach.
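Changing a single tile then only touches six vertices. A minimal sketch of that update, assuming quads are stored row by row with six vertices each and the atlas is a regular grid (the `setTile` helper and the corner tables are hypothetical names):

```cpp
#include <cstddef>
#include <vector>

struct Vertex {
    float x, y;  // position
    float u, v;  // texture coordinates
};

// Which corner each of a quad's six vertices sits on, in the order
// top-left, top-right, bottom-right, bottom-right, bottom-left, top-left.
constexpr int kCornerX[6] = {0, 1, 1, 1, 0, 0};
constexpr int kCornerY[6] = {0, 0, 1, 1, 1, 0};

// Rewrite the UVs of the quad at (col, row) so it shows tile number `tile`.
// Returns the byte offset of the quad's first vertex inside the buffer,
// which is what a partial GPU upload needs.
std::size_t setTile(std::vector<Vertex>& vertices, int gridCols, int col, int row,
                    int tile, int atlasCols, int atlasRows) {
    const float tileW = 1.0f / atlasCols;
    const float tileH = 1.0f / atlasRows;
    const float left = (tile % atlasCols) * tileW;
    const float top = (tile / atlasCols) * tileH;

    const std::size_t first = (std::size_t(row) * gridCols + col) * 6;
    for (int i = 0; i < 6; ++i) {
        vertices[first + i].u = left + kCornerX[i] * tileW;
        vertices[first + i].v = top + kCornerY[i] * tileH;
    }
    return first * sizeof(Vertex);
}
```

With the vertex buffer bound, the changed range could then be re-uploaded with `glBufferSubData(GL_ARRAY_BUFFER, offset, 6 * sizeof(Vertex), &vertices[offset / sizeof(Vertex)])`, leaving the other 594 vertices untouched.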

Advanced Approaches

There are more memory efficient or convenient ways of setting this up with multiple streams to reduce duplication of vertices and leverage shaders to do the work of setting it up if you're willing to trade off computation time for memory or convenience. Find the balance that is right for you. But you should grasp the basic technique first before trying to optimize.

In the multiple-stream approach, you could specify all the xy vertices separately from all the uv vertices to avoid duplication.

You could also specify a second set of texture coordinates holding just the top-left corner of the tile in the atlas, let the uv coordinates go from 0,0 (top left) to 1,1 (bottom right) for each quad, and let your shader scale and offset them to arrive at the final texture coordinates.

You could also specify a single uv coordinate, the top-left corner of the source area, for each primitive and let a geometry shader complete the squares.

Smarter still, you could specify only the x,y coordinates (omitting the uv coordinates entirely) and sample, in your vertex shader, a texture that contains the "tile number" of each quad. You would sample this texture at coordinates based on the x,y values in the grid, and then turn the value you read into uv coordinates in the atlas. To change a tile in this system, you just change one pixel in the tile map texture.

Finally, you could skip generating the primitives on the CPU entirely and derive them from a single list of points sent through the pipeline, letting the geometry shader (which runs after the vertex shader) generate the x,y coordinates of the grid and complete the triangle geometry and uv coordinates. This is the most memory efficient option, but it relies on the GPU to compute the setup at runtime.
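The arithmetic a shader would perform in the "top-left corner" and "tile number" variants is small. It is sketched here in C++ purely for illustration (in practice it would live in GLSL); the `Vec2` type, the function names, and the assumption of a regular atlas grid are hypothetical.

```cpp
struct Vec2 {
    float x, y;
};

// Top-left corner, in 0..1 texture space, of tile number `index`
// in an atlas laid out as a grid of atlasCols x atlasRows tiles.
Vec2 tileOrigin(int index, int atlasCols, int atlasRows) {
    return {float(index % atlasCols) / atlasCols,
            float(index / atlasCols) / atlasRows};
}

// Map a per-quad uv in [0,1] x [0,1] into the tile's area of the atlas:
// scale by the size of one tile, then offset by the tile's top-left corner.
Vec2 atlasUV(Vec2 local, Vec2 origin, int atlasCols, int atlasRows) {
    return {origin.x + local.x / atlasCols,
            origin.y + local.y / atlasRows};
}
```

For example, in a 4x4 atlas, tile 5 has its origin at (0.25, 0.25), and the quad's bottom-right corner (1, 1) maps to (0.5, 0.5). With this in the shader, the per-tile data shrinks to one origin or one tile number instead of six full uv pairs.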

With a static 6-vertices-per-quad setup, you free up GPU processing at the cost of a little extra memory. Depending on what you need for performance, you may find that using up more memory to get higher fps is desirable. Vertex buffers are tiny compared to textures anyway.

So, as I said, start with the basic technique: it's likely also the best solution for performance, especially if your map doesn't change very often.

Wyck answered Nov 12 '22 11:11


You can upload all per-tile parameters to GPU memory and draw everything with a single draw call. That way you no longer need to update vertex shader uniforms for each tile, and the per-frame CPU cost of drawing the map should drop to almost nothing.

It's been 3 years since I used OpenGL, so I can only point you in the right direction. Start by reading some material, for instance:

https://ferransole.wordpress.com/2014/07/09/multidrawindirect/

https://www.khronos.org/registry/OpenGL-Refpages/gl4/html/glDrawArraysIndirect.xhtml

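Also, keep in mind that this is GL 4.x functionality (glDrawArraysIndirect requires OpenGL 4.0 and glMultiDrawArraysIndirect requires 4.3), so check which GL version your target platform (software and hardware) supports.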
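To give an idea of what gets uploaded, here is a sketch of the command structure that `glDrawArraysIndirect` reads from the buffer bound to `GL_DRAW_INDIRECT_BUFFER`. The field layout follows the OpenGL reference page linked above; the `tileGridCommand` helper, and the choice to draw each tile as one instance of a 6-vertex quad with per-instance attributes, are assumptions made for illustration.

```cpp
#include <cstdint>

// Layout of one indirect draw command: four tightly packed 32-bit unsigned integers.
struct DrawArraysIndirectCommand {
    std::uint32_t count;          // vertices per instance
    std::uint32_t instanceCount;  // number of instances to draw
    std::uint32_t first;          // index of the first vertex
    std::uint32_t baseInstance;   // first instance; must be 0 before OpenGL 4.2
};

static_assert(sizeof(DrawArraysIndirectCommand) == 16,
              "command must be four packed 32-bit values");

// Hypothetical helper: one command that draws every tile of the map
// as an instance of a single 6-vertex quad.
DrawArraysIndirectCommand tileGridCommand(std::uint32_t cols, std::uint32_t rows) {
    return {6, cols * rows, 0, 0};
}
```

The command would be stored in a buffer object bound to `GL_DRAW_INDIRECT_BUFFER` and executed with `glDrawArraysIndirect(GL_TRIANGLES, nullptr)`, where the second argument is the byte offset into that buffer. For a single static map, this is equivalent to calling `glDrawArraysInstanced(GL_TRIANGLES, 0, 6, 100)`, which is available from OpenGL 3.1; the indirect path mainly pays off when the GPU itself generates or culls the commands.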

Frederik De Ruyck answered Nov 12 '22 12:11