Performance questions re: Capsules & GeneralConvexPairTester

ecosky
Posts: 69
Joined: Fri Nov 04, 2011 7:13 am

Post by ecosky »

Hello again,

I've been working through some performance issues with CompoundBody objects composed of between 3 and 7 capsules that interact with the terrain. The context is a character or other object that has been destroyed, with various parts hitting the world as they bounce around. Performance hasn't been good enough, so I tried simply creating spheres where the capsules were supposed to be, using each capsule's radius. This virtually eliminated the performance problem, but of course the behavior was no longer that of capsules hitting the world. For example, the frame time without any rigid bodies is about 5 ms; with 7 compound rigid bodies using spheres instead of capsules it is about 6 ms; and with 7 compound rigid bodies using capsules it is around 45 ms. That is obviously a pretty steep cost for capsules. Until I looked at this closely in the profiler I was pretty sure I was doing something to make the framerate drop so badly, but after the sphere test and seeing the numbers come out of GeneralConvexPairTester, I'm starting to realize there's not much I can do about the performance problem so long as I use capsules. I'm inclined to use a row of spheres to represent each capsule to see how well that performs. I would appreciate any suggestions on what else I can do to improve performance here.

I'm curious whether there are any plans to create optimized testing functions for capsules. My immediate need is capsule to box, but given that capsules are one of the more generally useful and efficient primitives, it would be great to have optimized support for capsules against sphere/box/triangle/convex. I'd expect contact generation costs for capsules to be around 2 or 3x that of a sphere if these were available (i.e. orders of magnitude less than what it is now). I know that would be a lot of work and I don't really expect it to happen any time soon, if ever, so I was looking at Bullet to see if maybe I could port something over myself. I found this, but oddly it doesn't seem to have made it into the trunk there; not sure why. I'm only passingly familiar with Bullet's code and conventions, and after a closer look I decided porting those functions is more than I want to take on right now, but maybe this will plant the seed for someone more industrious than myself :)

Thanks again for Bepu, this is a great API and I've enjoyed learning more about it lately.
ecosky
Posts: 69
Joined: Fri Nov 04, 2011 7:13 am

Re: Performance questions re: Capsules & GeneralConvexPairTe

Post by ecosky »

I wound up breaking the capsules up into rows of spheres, and it helped quite a lot on the PC but, to my surprise, made things worse on the Xbox; not sure why yet. Also, the simple test of using spheres in place of capsules didn't help on the Xbox at all. I am thinking maybe the performance cost is a byproduct of the compound bodies, but I need to do more testing to find out for sure.
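
A minimal sketch of the sphere-row idea, for reference. It is plain arithmetic and does not call into Bepu; the class name, method name, and the `maxGap` spacing rule are assumptions made up for illustration. It also assumes `length` means the length of the capsule's inner segment, so the first and last spheres coincide with the end caps.

```csharp
using System;

// Hypothetical helper, not part of BEPUphysics: approximates a capsule with a row of spheres.
public static class CapsuleToSpheres
{
    // Returns the offsets (along the capsule's axis, measured from its center) at which
    // spheres of the capsule's radius would be placed. 'length' is assumed to be the
    // length of the capsule's inner segment. 'maxGap' is the largest allowed distance
    // between neighboring sphere centers.
    public static float[] SphereOffsets(float length, float maxGap)
    {
        if (length < 0) throw new ArgumentOutOfRangeException("length");
        if (maxGap <= 0) throw new ArgumentOutOfRangeException("maxGap");

        // A zero-length capsule is just a sphere.
        if (length == 0) return new[] { 0f };

        // Number of equal intervals needed so that no interval exceeds maxGap.
        int intervals = (int)Math.Ceiling(length / maxGap);
        var offsets = new float[intervals + 1];
        for (int i = 0; i <= intervals; i++)
        {
            offsets[i] = -length / 2 + length * i / intervals;
        }
        return offsets;
    }
}
```

For a capsule of length 2 and radius 1 with `maxGap` set to 1, this yields three offsets (-1, 0, 1), i.e. three spheres of radius 1. A smaller `maxGap` gives a smoother surface, but every extra sphere is another shape to test against the world.
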
Norbo
Site Admin
Posts: 4929
Joined: Tue Jul 04, 2006 4:45 am

Re: Performance questions re: Capsules & GeneralConvexPairTe

Post by Norbo »

Something doesn't seem right. While a capsule special case would be faster, the general convex case should not be that much slower than the sphere-sphere case. I have avoided adding the capsule special cases because the performance was good enough for most purposes.

I modified the TerrainDemo to test the performance:

Code:

            //x and y, in terms of heightmaps, refer to their local x and y coordinates.  In world space, they correspond to x and z.
            //Setup the heights of the terrain.
            int xLength = 256;
            int zLength = 256;

            float xSpacing = 8f;
            float zSpacing = 8f;
            var heights = new float[xLength, zLength];
            for (int i = 0; i < xLength; i++)
            {
                for (int j = 0; j < zLength; j++)
                {
                    float x = i - xLength / 2;
                    float z = j - zLength / 2;
                    heights[i, j] = (float)(10 * (Math.Sin(x / 8) + Math.Sin(z / 8)));
                }
            }
            //Create the terrain.
            var terrain = new Terrain(heights, new AffineTransform(
                    new Vector3(xSpacing, 1, zSpacing),
                    Quaternion.Identity,
                    new Vector3(-xLength * xSpacing / 2, 0, -zLength * zSpacing / 2)));

            Space.Add(terrain);


            for (int i = 0; i < 20; i++)
            {
                for (int j = 0; j < 20; j++)
                {
                    CompoundBody cb = new CompoundBody(new CompoundShapeEntry[] 
                        {
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(0, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(2, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(4, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(6, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(8, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(10, 0,0)),
                            new CompoundShapeEntry(new CapsuleShape(2, 1), new Vector3(12, 0,0)),
                        }, 10);
                    cb.Position = new Vector3(i * 15, 20, j * 4);

                    //CompoundBody cb = new CompoundBody(new CompoundShapeEntry[] 
                    //{
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(0, 0,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(2, -2,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(4, 0,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(6, -2,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(8, 0,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(10, -2,0)),
                    //    new CompoundShapeEntry(new SphereShape(1), new Vector3(12, 0,0)),
                    //}, 10);
                    //cb.Position = new Vector3(i * 15, 20, j * 4);

                    cb.ActivityInformation.IsAlwaysActive = true;

                    Space.Add(cb);
                }
            }

The capsules are organized such that when they settle, the maximum number of contacts is being created between them and the terrain (side by side and lying flat). The spheres are offset to prevent rolling.

For 400 such compounds on my (old) computer:
-Capsule version takes ~22 milliseconds per time step.
-Sphere version takes ~12 milliseconds.
(The objects are close enough that a significant number of compound-compound collisions also occur. In both cases, the total number of top-level collision pairs is similar (~750).)

For 49 such compounds on my Samsung Focus:
-Capsule version takes ~100 milliseconds per time step.
-Sphere version takes ~43 milliseconds.

For 7 such compounds on my Samsung Focus:
-Capsule version takes ~13 milliseconds per time step.
-Sphere version takes ~6 milliseconds.

These results can be explained relatively easily by three factors in no particular order:
-Sphere shapes can only generate one contact with an individual triangle within a mesh or with other spheres. Capsules generate more and need more. More contacts means more contact constraints, which means longer solving times.
-Sphere shapes have a special narrow phase case for both meshes and other spheres which is faster than the general case.
-The capsules are quite a bit larger (length 2, radius 1) than the spheres (radius 1), so they will be tested against more triangles.

To accentuate testing against many triangles, the triangle density is increased by 64x (1x1 triangles instead of 8x8 triangles). For 7 compounds on my Samsung Focus:
-Capsule version takes ~37 milliseconds.
-Sphere version takes ~9.2 milliseconds.

A sphere might be within the bounding box of 18 triangles at once with this density while a capsule may be within the bounding box of up to 50 at once (though 30-40 is more common in this test).
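
The 18 and 50 figures can be roughly reproduced with a back-of-the-envelope count. The sketch below is not BEPUphysics code, and every name in it is made up for illustration. It assumes flat terrain, two triangles per grid cell, that a triangle is a candidate whenever the shape's axis-aligned bounding box reaches into its cell, and that the capsule's length is that of its inner segment.

```csharp
using System;

// Rough estimate, not BEPUphysics code: how many heightmap triangles can a shape's
// axis-aligned bounding box overlap on a regular grid with two triangles per cell?
public static class TriangleOverlapEstimate
{
    // Worst-case number of grid cells touched along one axis by an interval of the
    // given extent: an interval spanning k cell widths can straddle k + 1 cells.
    static int MaxCells(double extent, double spacing)
    {
        return (int)Math.Ceiling(extent / spacing) + 1;
    }

    // Sphere: the bounding box is 2 * radius wide on both horizontal axes.
    public static int Sphere(double radius, double spacing)
    {
        int cells = MaxCells(2 * radius, spacing);
        return 2 * cells * cells;
    }

    // Capsule lying flat, rotated by 'angle' radians about the vertical axis.
    // 'length' is assumed to be the inner segment length, as in CapsuleShape(length, radius).
    public static int FlatCapsule(double length, double radius, double angle, double spacing)
    {
        double extentX = length * Math.Abs(Math.Cos(angle)) + 2 * radius;
        double extentZ = length * Math.Abs(Math.Sin(angle)) + 2 * radius;
        return 2 * MaxCells(extentX, spacing) * MaxCells(extentZ, spacing);
    }
}
```

With 1x1 cells, `Sphere(1, 1)` gives 18, `FlatCapsule(2, 1, 0, 1)` gives 30 for a capsule aligned with the grid, and `FlatCapsule(2, 1, Math.PI / 4, 1)` gives 50 for one lying diagonally, which lines up with the 30-40 typical and 50 worst case quoted above.
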

To isolate this effect, degenerate capsules (length 0, radius 1) are tested with the 1x1 triangles. For 7 compounds on my Samsung Focus:
-LongCapsule version takes ~37 milliseconds.
-SphereyCapsule version takes ~17 milliseconds.
-Sphere version takes ~9.2 milliseconds.

Quite an improvement! Now, back to 8x8 triangles again. For 7 compounds on my Samsung Focus:
-LongCapsule version takes ~13 milliseconds.
-SphereyCapsule version takes ~7.6 milliseconds.
-Sphere version takes ~6 milliseconds.

One final test, back on the PC. With 8x8 triangles and 400 compounds on my (old) computer:
-LongCapsule version takes ~22 milliseconds.
-SphereyCapsule version takes ~14 milliseconds.
-Sphere version takes ~12 milliseconds.

Comparing against the earlier long capsules on 8x8 triangles, it appears that the performance difference is largely (sometimes mostly) caused by overlapping more triangles as opposed to the narrow phase algorithms themselves.

So, from this, the general case collision detection system isn't significantly slower than the sphere special cases overall. Somewhere around a factor of 2 overall frame time, at the high end, for comparable shapes. (This matches expectations from earlier tests: the box-box special case is not significantly faster than the general case. In fact, with some tuning in some situations, the general case can sometimes beat the box-box special case by a tiny amount. However, box-box is very robust, well-behaved in numerical extremes, and handles sliding contacts better, so I picked it anyway.)

Capsule special cases would be slower than sphere special cases, so the benefit would be lower than the factor of 2 or less found above.
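
For a quick look at where the "factor of 2" comes from, the timings above can be turned into ratios. This is only arithmetic on the numbers already listed; the class and method names are illustrative.

```csharp
using System;

// Timings copied from the measurements above (milliseconds per time step).
public static class TimingRatios
{
    public static double Ratio(double capsuleMs, double sphereMs)
    {
        return capsuleMs / sphereMs;
    }

    public static void Print()
    {
        // Long = capsule (length 2, radius 1), Spherey = capsule (length 0, radius 1).
        var rows = new[]
        {
            new { Name = "PC, 400 compounds, 8x8 triangles", Long = 22.0, Spherey = 14.0, Sphere = 12.0 },
            new { Name = "Samsung Focus, 7 compounds, 8x8 triangles", Long = 13.0, Spherey = 7.6, Sphere = 6.0 },
            new { Name = "Samsung Focus, 7 compounds, 1x1 triangles", Long = 37.0, Spherey = 17.0, Sphere = 9.2 },
        };
        foreach (var r in rows)
        {
            Console.WriteLine("{0}: long/sphere = {1:F2}, spherey/sphere = {2:F2}",
                r.Name, Ratio(r.Long, r.Sphere), Ratio(r.Spherey, r.Sphere));
        }
    }
}
```

The sphere-like capsule is the "comparable shape" here: its ratio to the sphere stays below 2 in all three rows, while the long capsule reaches roughly 4x only on the dense 1x1 terrain.
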

My Xbox isn't currently easily accessible for testing, but it should produce consistent results.

Long story short: 40 ms for capsules versus 1ms for spheres smells like shenanigans :)

ecosky wrote: "I am thinking maybe the performance cost is a byproduct of the compound bodies but I need to do more testing to find out for sure."
If you can reproduce the performance loss in isolation (like in the BEPUphysicsDemos), I could take a look. I don't have any good theories at the moment, though :)
ecosky
Posts: 69
Joined: Fri Nov 04, 2011 7:13 am

Re: Performance questions re: Capsules & GeneralConvexPairTe

Post by ecosky »

This wouldn't be the first time I've been fairly accused of shenanigans!

Seriously though, thanks for taking such a close look at this. I am glad to see your results show that the cost of using capsules is closer to 2x that of using spheres; that's about the range I'd have expected based on past experience, and your tests indicate I've probably done something wrong. There is one difference between your tests and mine, which is that my terrain is actually using box shapes, not triangles; perhaps this is significant. I meant to explain that in the first post, but I now see after rereading it that I only mentioned boxes in passing.

I will take the time to carefully review what you've explained and will try to reproduce the problematic situation in a tester. I don't want to take any more of your time until I get some solid evidence of a problem; I feel bad you spent as much effort as you did so I'll do what I can to identify what is really going on here. All I really know right now for sure is that when I activate a relatively small collection of these CompoundBody objects, performance takes a pretty big hit. I'll get more solid information and get back to you.

Thanks again for your time & efforts,
Norbo
Site Admin
Posts: 4929
Joined: Tue Jul 04, 2006 4:45 am

Re: Performance questions re: Capsules & GeneralConvexPairTe

Post by Norbo »

ecosky wrote: "There is one difference between your tests and mine which is that my terrain is actually using box shapes, not triangles; perhaps this is significant."
It is indeed significant, and I did know you had a box terrain, but the significance just slipped my mind when writing the previous post :) I'll have to admit to a rather significant oopsy. Convex-triangle is a partial special case. Face-convex collisions can get away without running the traditional general case at all. I forgot about this completely while doing the benchmarks. So, most of the above tests aren't actually helpful for this particular issue!

The super-dense triangles do offer some insight, though, since the system reverts to general case collision detection for nonspheres in edge collisions. This helps explain the oddity where capsules of any size suffered disproportionately with triangle density increase.

ecosky wrote: "I feel bad you spent as much effort as you did so I'll do what I can to identify what is really going on here."
Don't feel too bad, I did it in large part because I didn't actually have performance data for capsules specifically. I had always assumed they would be fast enough, but I didn't have numbers.

For the same reason, I'm going to have to ask you not to feel bad about the following tests, either :)

These tests use the same compounds as above, but fall onto a flat 'terrain' composed of big blocks.

For 400 compounds composed of 7 shapes each:
-LongCapsules take 16 ms.
-SphereyCapsules take 12 ms.
-Spheres take 8.5 ms.

Fortunately, it appears the results are fairly consistent.

There is still a performance loss for using capsules over spheres thanks to the box-sphere special case. The fundamental difference seems to be under 50% still.

Longer capsules suffer a bit, likely because they are generating more contacts due to lying down. The persistent manifold prevents additional redundant points from being created in the spherical capsule case. The extra (necessary) contacts in the long capsule case take longer to handle.
ecosky
Posts: 69
Joined: Fri Nov 04, 2011 7:13 am

Re: Performance questions re: Capsules & GeneralConvexPairTe

Post by ecosky »

Well, it turns out I had a bug in my content pipeline that was causing far too many capsules to be created for the asset. I didn't notice it before because when I checked out the CompoundBody objects I only looked at the first few of them in detail, expecting the rest to be configured as exported from Softimage. In fact, I only noticed after I wrote code in my engine to emit C# code that reproduces the exact data I was using, so I could recreate the exact CompoundBody in a Bepu tester. Of the 7 compound bodies, 4 had over 50 capsules each! I actually had a hint of this bug earlier but didn't see the connection at the time; I had noticed a huge number of contacts being processed, but still being new to Bepu I wasn't sure if it was a sign of trouble or expected behavior with CompoundBody objects. This error explains the frame time spiking so badly when all these capsules became active, but not the frame time staying flat with spheres. I haven't been able to reproduce that behavior since last night, so I think it's clear that I did something wrong with my earlier measurements. I must have had the memory profiler active when I took those slow numbers; that's the only thing I can think of. Either way, I'm sure it was an error on my part.
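
A sketch of what such an emitter could look like; this is not the code actually used here. `CapsuleInfo`, `CompoundEmitter`, and the `maxExpected` warning threshold are hypothetical names invented for the example. The generated text uses the same `CompoundBody`, `CompoundShapeEntry`, and `CapsuleShape` constructors as the test code earlier in the thread.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical description of one exported capsule; field names are illustrative only.
public struct CapsuleInfo
{
    public float Length, Radius, X, Y, Z;
}

public static class CompoundEmitter
{
    // Formats a float as a C# literal, using the invariant culture so the decimal
    // separator is always '.'.
    static string F(float value)
    {
        return value.ToString("R", CultureInfo.InvariantCulture) + "f";
    }

    // Writes C# source that rebuilds the compound. A comment is emitted at the top
    // when the capsule count exceeds 'maxExpected', so oversized compounds stand out.
    public static string Emit(string name, IList<CapsuleInfo> capsules, float mass, int maxExpected)
    {
        var sb = new StringBuilder();
        if (capsules.Count > maxExpected)
        {
            sb.AppendLine("// WARNING: " + capsules.Count + " capsules, expected at most " + maxExpected);
        }
        sb.AppendLine("var " + name + " = new CompoundBody(new CompoundShapeEntry[]");
        sb.AppendLine("{");
        foreach (var c in capsules)
        {
            sb.AppendLine("    new CompoundShapeEntry(new CapsuleShape(" + F(c.Length) + ", " + F(c.Radius) +
                          "), new Vector3(" + F(c.X) + ", " + F(c.Y) + ", " + F(c.Z) + ")),");
        }
        sb.AppendLine("}, " + F(mass) + ");");
        return sb.ToString();
    }
}
```

Dumping every compound this way, rather than inspecting the first few in a debugger, makes a count like "over 50 capsules" visible at a glance.
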

Fixing the pipeline to produce the correct set of capsules has fixed the original performance problem. Thanks for taking the time to measure the capsule performance; despite knowing the original problem was a bug in my code, I am glad to know the relative costs of capsules because I had intended (and still intend) to make heavy use of them in the near future.

Thanks again!