I'm aware that the following is a vague question, but I'm hitting performance problems that I did not anticipate in XNA.
I have a low-poly model (18 faces and 14 vertices) that I'm trying to draw to the screen a (high!) number of times. I get over 60 FPS (on a decent machine) until I draw this model 5000+ times. Am I asking too much here? I'd very much like to double or triple that number (10-15k) at least.
My code for actually drawing the models is given below. I have tried to eliminate as much computation from the draw cycle as possible. Is there more I can squeeze from it, or are there better alternatives altogether?
Note: tile.Offset is computed once during initialisation, not every cycle.
    foreach (var tile in Tiles)
    {
        var myModel = tile.Model;
        Matrix[] transforms = new Matrix[myModel.Bones.Count];
        myModel.CopyAbsoluteBoneTransformsTo(transforms);
        foreach (ModelMesh mesh in myModel.Meshes)
        {
            foreach (BasicEffect effect in mesh.Effects)
            {
                // effect.EnableDefaultLighting();
                effect.World = transforms[mesh.ParentBone.Index]
                    * Matrix.CreateTranslation(tile.Offset);
                effect.View = CameraManager.ViewMatrix;
                effect.Projection = CameraManager.ProjectionMatrix;
            }
            mesh.Draw();
        }
    }
You're quite clearly hitting the batch limit. See this presentation and this answer and this answer for details. Put simply: there is a limit to how many draw calls you can submit to the GPU each second.
The batch limit is a CPU-based limit, so you'll probably see that your CPU gets pegged once you get to your 5000+ models. Worse still, when your game is doing other calculations, it will reduce the CPU time available to submit those batches.
(And it's important to note that, conversely, you are almost certainly not hitting GPU limits. No need to worry about mesh complexity yet.)
There are a number of ways to reduce your batch count. Frustum culling is one. Probably the best one to pursue in your case is geometry instancing, which lets you draw multiple models in a single batch. Here is an XNA sample that does this.
Better still, if it's static geometry, can you simply bake it all into one or a few big meshes?
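Of the options above, frustum culling is the smallest change to the existing draw loop, so here is a minimal sketch of it. It assumes XNA 4.0 and reuses the `Tiles`, `tile.Offset` and `CameraManager` names from the question; the `TileCulling` helper is hypothetical. Note that it only reduces the batch count when a good share of the tiles is off screen.

```csharp
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Hypothetical helper, not part of XNA.
static class TileCulling
{
    // Returns true when the mesh, placed with the given world matrix,
    // touches the view frustum and is therefore worth a draw call.
    public static bool IsVisible(BoundingFrustum frustum, ModelMesh mesh, Matrix world)
    {
        // mesh.BoundingSphere is in the mesh's local space, so move it
        // into world space before testing it against the frustum.
        BoundingSphere worldSphere = mesh.BoundingSphere.Transform(world);
        return frustum.Intersects(worldSphere);
    }
}

// Usage inside the draw loop from the question (sketch):
//
//   // Build the frustum once per frame, not once per tile.
//   var frustum = new BoundingFrustum(
//       CameraManager.ViewMatrix * CameraManager.ProjectionMatrix);
//
//   Matrix world = transforms[mesh.ParentBone.Index]
//       * Matrix.CreateTranslation(tile.Offset);
//   if (!TileCulling.IsVisible(frustum, mesh, world))
//       continue; // skip mesh.Draw() for tiles outside the view
```

The sphere test is conservative: a tile whose sphere only grazes the frustum is still drawn, which costs a batch but never makes visible geometry disappear.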
As with any performance problem, there are limits within which a particular approach works. You need to measure and see where the problems are. The best option is to use a profiler, but even basic measurements like looking at CPU load may show what bottlenecks you have.
As a first investigation step, I'd recommend removing all computations (like matrix multiplications) and seeing whether you get improvements. If you do, the CPU is still doing more work than the GPU.
Make sure you are not taking measurements on a debug build: it can make the application significantly slower if it is CPU-bound.
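As a rough illustration of such a basic measurement, the sketch below times the CPU side of the draw loop with a `Stopwatch` and shows a running average in the window title. It assumes XNA 4.0 on Windows; `ProfiledGame` and `DrawTiles` are hypothetical names, with `DrawTiles` standing in for the loop from the question. The result goes to the window title rather than `Debug.WriteLine`, because `Debug` calls are compiled out of release builds.

```csharp
using System.Diagnostics;
using Microsoft.Xna.Framework;

// Hypothetical game class, for illustration only.
public class ProfiledGame : Game
{
    private readonly GraphicsDeviceManager graphics;
    private readonly Stopwatch drawTimer = new Stopwatch();
    private double accumulatedMs;
    private int frameCount;

    public ProfiledGame()
    {
        graphics = new GraphicsDeviceManager(this);
    }

    protected override void Draw(GameTime gameTime)
    {
        GraphicsDevice.Clear(Color.Black);

        // Time only the CPU cost of submitting the draw calls.
        drawTimer.Reset();
        drawTimer.Start();
        DrawTiles();
        drawTimer.Stop();

        accumulatedMs += drawTimer.Elapsed.TotalMilliseconds;
        frameCount++;

        // Report an average once every 60 frames to keep overhead low.
        if (frameCount == 60)
        {
            Window.Title = string.Format(
                "Draw loop: {0:F2} ms/frame (CPU)", accumulatedMs / frameCount);
            accumulatedMs = 0;
            frameCount = 0;
        }

        base.Draw(gameTime);
    }

    // Hypothetical placeholder for the foreach loop from the question.
    private void DrawTiles()
    {
    }
}
```

This measures only the time the CPU spends issuing draw calls, since the GPU works asynchronously. If the figure approaches the frame budget (about 16.7 ms at 60 FPS), the loop is CPU-bound and reducing the number of draw calls is where to look.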
Side note: a GPU works best when you send it large operations relatively infrequently. Your code does more or less the opposite: it sends a huge number of very small drawing requests. You should be able to batch your primitives and get better performance. There are samples showing how to render large numbers of simple objects (including ones in the DirectX SDK); searching for "gpu rendering crowds" can give you a starting point.
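For tiles that never move, one way to batch the primitives is to merge every copy of the model into a single vertex and index buffer at load time, then draw it all with one call. The following is a minimal sketch, assuming XNA 4.0, a model whose vertices are `VertexPositionNormalTexture`, and source vertices that already have the bone transform applied. `TileBaker` and its parameters are hypothetical names, and extracting the source arrays from the `Model` is left out.

```csharp
using System.Collections.Generic;
using Microsoft.Xna.Framework;
using Microsoft.Xna.Framework.Graphics;

// Hypothetical helper, not part of XNA.
static class TileBaker
{
    // Copies the source mesh once per offset into one big vertex/index pair.
    public static void Bake(
        VertexPositionNormalTexture[] sourceVertices,
        int[] sourceIndices,
        IList<Vector3> offsets,
        out VertexPositionNormalTexture[] bakedVertices,
        out int[] bakedIndices)
    {
        bakedVertices = new VertexPositionNormalTexture[sourceVertices.Length * offsets.Count];
        bakedIndices = new int[sourceIndices.Length * offsets.Count];

        for (int t = 0; t < offsets.Count; t++)
        {
            int vertexBase = t * sourceVertices.Length;
            int indexBase = t * sourceIndices.Length;

            for (int v = 0; v < sourceVertices.Length; v++)
            {
                // Struct copy; a pure translation leaves normals unchanged.
                VertexPositionNormalTexture vertex = sourceVertices[v];
                vertex.Position += offsets[t];
                bakedVertices[vertexBase + v] = vertex;
            }

            // Re-point each index at this copy's block of vertices.
            for (int i = 0; i < sourceIndices.Length; i++)
                bakedIndices[indexBase + i] = sourceIndices[i] + vertexBase;
        }
    }

    // Uploads the baked arrays to the GPU once, at load time.
    public static void CreateBuffers(
        GraphicsDevice device,
        VertexPositionNormalTexture[] bakedVertices,
        int[] bakedIndices,
        out VertexBuffer vertexBuffer,
        out IndexBuffer indexBuffer)
    {
        vertexBuffer = new VertexBuffer(device, typeof(VertexPositionNormalTexture),
            bakedVertices.Length, BufferUsage.WriteOnly);
        vertexBuffer.SetData(bakedVertices);

        // 32-bit indices: 5000 tiles x 14 vertices exceeds the 16-bit range.
        indexBuffer = new IndexBuffer(device, IndexElementSize.ThirtyTwoBits,
            bakedIndices.Length, BufferUsage.WriteOnly);
        indexBuffer.SetData(bakedIndices);
    }
}
```

Each frame then needs only `device.SetVertexBuffer(vertexBuffer)`, `device.Indices = indexBuffer`, and one `device.DrawIndexedPrimitives(PrimitiveType.TriangleList, 0, 0, vertexCount, 0, indexCount / 3)` per effect pass, with `effect.World` set to `Matrix.Identity` because the offsets are already baked into the vertices. As far as I know, 32-bit index buffers require the HiDef profile; on Reach, the tiles would have to be split into chunks of at most 65,535 vertices each.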