I've written a simple physics modeler that allows me to bounce balls around the screen. You can click and drag to launch a ball, or you can generate balls hundreds at a time and watch them interact with each other.
[Link to larger version]
It has been a fun little program to work on and I want to take it further if I can. I know they say pre-mature optimization is the root of all evil but I'm starting to hit actual performance barriers and I want to know if anyone who is experienced in game/simulator development can help me out.
Problem:
Right now, my program chokes if you add too many balls (it can't seem to handle more than 800 on my machine). If you do, the simulation is no longer realistic and all the balls overlap each other on the bottom.
The problem is in collision detection. In the most naive case, collision detection is an O(N^2) problem. Each ball checks every other ball. This gets poor performance pretty quickly (after even 100 balls you'll be doing 10k collision checks per loop cycle).
If you look here, you can see a screenshot of where I have added several hundred balls. The simulator can't keep up and they begin to overlap each other.
[Link to larger version]
Currently, I detect collisions by looking for overlapping balls. If I find two balls that are overlapping, I separate them by their minimum translation distance (MTD), or push them apart. I then use a simple physics equation to adjust their impulse vectors and off they go in different directions after the collision.
It works great except if there are too many balls the minimum translation distance becomes noticeable. They begin overlapping by large amounts and constantly jostle each other on the bottom. It gets worse the more I increase "gravity". The pressure on them is increased and the amount they are compressed/overlap each other increases.
Again, I have no issues until I hit a sizable N number of balls.
Current Optimization Method:
Collision Detection -
Sweep and prune - (aka. Sort and Sweep)
I use an insertion sort on my balls each loop cycle along their x axis. Because of the nature of Insertion Sort I can exploit the temporal coherence of my simulator. Frame to frame, the balls positions change only slightly so the sort doesn't have much work to do. This brings Linear Sorts amortized runtime to O(N) or linear rather than its average runtime of O(N^2).
Since the balls are sorted, I do a couple pre-checks in my second loop before checking collisions. Now I only have to check the balls near each other (because they are sorted along the x-axis), and I break out of the second loop anytime I check a ball against another ball whose xmin is greater than the xmax of the current ball. So it skips thousands of checks.
When I implemented this, it brought a huge speed improvement. However, I would still like to be able to handle more than 600-800 balls. I've read about Physics Engines that easily handle collision detection between 10k objects simultaneously, so I would like to think I could reach 1-2k with a little work.
After running a profiler it came out that Collision Detection was eating up about 55% of my time while rendering was eating up about 45%. So, these are my two most expensive costs.
Question:
Can you think of any better algorithms or techniques that would allow my simulator to be able to handle more balls?
Relevant Code:
Entire Project:
svn checkout http://simucal-projects.googlecode.com/svn/ballbounce/trunk/
or, click here to browse the files manually in your browser.
Sections of Interest:
It starts off by simulating an external force such as gravity. At each new time step, it detects any collisions, then calculates the velocity and position of an object. And finally, sends the location data to the GPU. This continuous loop creates the illusion that an object is falling due to gravity.
The simple way is don't test Object vs Object collisions, populate an array with the center point of each ball. Then, from each center scan a a square of size 4*radius centered on that point (you can optimize this a bit by not using a square, at the expense of making the code more complex). If there's another center in this square, only then do you check to see if they're within 2*radius of each other (and therefore colliding). You can further optimize this by reducing the resolution, and rounding the ball position, reducing the number of checks you need to do.
Another way, is to divide your space into a grid, and store the objects in grid regions. You only need to check for collisions between objects in adjacent grids.
Keep track of nearby balls -
Just like the insertion sort is optimal due to the minimal change per frame, you can use the same property to keep track ball 'neighbors'
Each ball can only interact with a possible 6 other balls at the most. You can probably run an algorithm every 10 frames or so that figures out which neighbors each ball has, and then use that information for the next ten frames to calculate the actual collisions.
You can also follow the vectors of each ball, draw virtual lines, and see which lines cross in the next 3-5 frames and only calculate collisions that could possibly happen (even though few will happen due to time).
Just as you sorted by the x axis you can 'group' balls in subdivisions inside the main window. When a ball is inside a subdivision by at least one ball diameter, it only need to look at the balls in the same subdivision. If it's closer to a border or two you need to look at one or two other subdivisions, but it should be fewer calculations.
Further, those subdivisions can be dynamically located so the average subdivision has only 100 balls - there's not need to have subdivisions of the same size with different numbers of balls.
Depending on how crowded a subdivision is (balls per square unit) you might try different algorithms - a sparsely filled box only needs vector calculations to predict collisions, whereas a densely packed subdivision might only need some sort of optimized hexagraphic calculation. Sparse boxes may not need collision detection for many frames (since overlap won't be noticed as much as in densely populated boxes).
The energy density of a given box can be determined - a very sparse box with low energy balls needs fewer collision calculations than a sparse box with high energy.
-Adam
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With