Sphere-terrain optimization.

olhovsky · Post by **olhovsky** » Sun May 22, 2011 1:55 pm

Hi Norbo, you dropped by on this thread the other day:
http://www.olhovsky.com/2011/05/physics ... sion-done/
You seemed to suggest that 400 spheres takes 25ms to update on the XBOX.

So I did some testing to see how performance works out on the XBOX 360 in my own simulator, and in BEPU.

I documented some of my testing here:
http://www.olhovsky.com/2011/05/my-own-physics-or-bepu/

In particular, I modified the terrain demo to add spheres with the following code, and removed the adding of boxes.

Code: Select all

Random rand = new Random();

for (int i = 0; i < 7; i++)
{
    for (int j = 0; j < 7; j++)
    {
        for (int k = 0; k < 8; k++)
        {
            Vector3 position =
                new Vector3((float) rand.NextDouble()*10 + i*128,
                    400 + -j*2,
                    (float) rand.NextDouble()*10 + k*128);
            float radius = (float) rand.NextDouble() + 1;
            Space.Add(new Sphere(position,
                radius,
                radius * radius * radius));
        }
    }
}

I found that adding 392 spheres this way caused physics updates to take about 80ms once the spheres hit the terrain. It takes a couple of minutes before this speeds up very much (and it gets worse first).

Can you help me figure out what is costing the most time, or if perhaps I am doing something wrong?

I don't care very much about rotation, would adding a special case for spheres that don't rotate help performance significantly? Is the sphere-terrain collision expensive -- maybe I could start optimizing there?

I'm willing to give up some detection precision. Mostly these spheres will not be moving too quickly. Maybe I should cap the number of sphere-cast steps. I care very little about time of impact precision. Maybe TOI sorting is costing a lot?

I'm not familiar with your broadphase technique. It seems that about 80 active spheres takes about 20ms to update on the XBOX (obviously this depends greatly on their positioning). I might be able to get away with that if I can simultaneously have ~200-300 inactive spheres at the same time. I'm off to go test that next.

EDIT:
I tested BEPU with a much flatter terrain, and updates took 30ms with 392 spheres. Holy crap, that's awesome! So BEPU clearly wins in the fight against my simulator. I'll start transitioning everything over to BEPU in the next day or two.

Post by **Norbo** » Sun May 22, 2011 10:52 pm

I documented some of my testing here:
http://www.olhovsky.com/2011/05/my-own-physics-or-bepu/

I promise I'm not an alien preparing the invasion of Earth, and that removing part of the first letter of my name was not just a ploy to make you feel comfortable

I don't care very much about rotation, would adding a special case for spheres that don't rotate help performance significantly?

Very little, most likely. You can set an entity's LocalInertiaTensorInverse to a zero matrix and that will disallow rotation. While the math involved in integration is performed, having infinite angular inertia can simplify the solving process a little.

You could also try just setting the solver's iteration limit to something lower (Space.Solver.IterationLimit). It will help a little; if its a bunch of isolated spheres on the ground, its already early-outing really fast, but you may be able to save an iteration or two here and there.

Is the sphere-terrain collision expensive -- maybe I could start optimizing there?

It's not particularly expensive. The primary consideration is the number of triangles that are possibly colliding with a single object. If a single sphere overlaps 40 triangles, it's obviously going to be slower than if it's only overlapping 2. Secondarily, the configuration of collisions matter. If the plane of the triangle is found to be the colliding feature, the collision test goes extremely fast. If it has to resort to boundary tests, it may slow down a bit. It still should go pretty fast due to the simplicity of the involved shapes. The worst case for this sort of issue is a terrain composed of a bunch of spikes.

The convex-triangle special case is fast enough in general that there is not currently a sphere-triangle special case, though there may be in the future. The difference wouldn't be gigantic.

With spheres, the issue of triangle boundary bumps is mostly eliminated. Depending on your game, you may be able to turn off the terrain.ImproveBoundaryBehavior. Having that enabled allows boxes and other shapes to slide smoothly over edges/vertex based on their connectivity with adjacent triangles. It won't be a huge performance win, but it is worth trying out.

I'm willing to give up some detection precision. Mostly these spheres will not be moving too quickly. Maybe I should cap the number of sphere-cast steps. I care very little about time of impact precision. Maybe TOI sorting is costing a lot?

No TOI sorting is used. CCD should have extremely low overhead in most cases since it only activates when the relative velocities put objects in danger of tunnelling.

I'm not familiar with your broadphase technique. It seems that about 80 active spheres takes about 20ms to update on the XBOX (obviously this depends greatly on their positioning). I might be able to get away with that if I can simultaneously have ~200-300 inactive spheres at the same time. I'm off to go test that next.

With only 80 objects, it's extremely unlikely that the broad phase is the bottleneck. In addition, the broad phase will still do almost all of its work even if objects are inactive because the broad phase is responsible for triggering the sequence which can result in the activation of entities.

If you were interested, there are a few new broad phases currently in testing.
-One is an extremely simple traditional 1D SAP (SortAndSweep1D) which can beat the current DBVH by quite a bit in certain configurations (primarily when object density is sparse). It does not support any accelerated queries, though.
-Another is a hybridized 2D grid + SAP (Grid2DSortAndSweep) which has a slightly higher baseline overhead, but eliminates most of the 1D SAP's worst-case configuration issues. It beats DBVH quite handily most of the time and supports easy parallelization. It conceptually supports queries, though they are not implemented yet (and when they are, they will be slower than the DBVH's).
-The DBVH is scheduled for a rewrite. The current version is a bit messy and a memory hog.

All of the new stuff can be found in the development fork (though beware bugs):
http://bepuphysics.codeplex.com/SourceC ... evelopment

By the way, I don't know how your whole architecture is set up, but you can get some pretty significant gains from using internal multithreading. A fork-and-join type approach where the physics engine has full access to the Xbox's 3 cores will take the per-update time down a lot, though other compute-heavy systems couldn't be running on other threads simultaneously. It also simplifies the access patterns a lot since you don't have to worry about corrupt data mid-update.

EDIT:
I tested BEPU with a much flatter terrain, and updates took 30ms with 392 spheres. Holy crap, that's awesome! So BEPU clearly wins in the fight against my simulator. I'll start transitioning everything over to BEPU in the next day or two.

Could I ask what the original terrain test looked like? I may be able to use it to do additional optimizing in the future.

olhovsky · Post by **olhovsky** » Mon May 23, 2011 4:17 am

Norbo wrote:Could I ask what the original terrain test looked like? I may be able to use it to do additional optimizing in the future.

Here's a copy of my TerrainDemo.cs, in the worst case this reaches ~14ms on my E5300 3.0Ghz single threaded.

Code: Select all

using System;
using BEPUphysics.Collidables;
using BEPUphysics.Entities.Prefabs;
using BEPUphysics.EntityStateManagement;
using BEPUphysics.MathExtensions;
using Microsoft.Xna.Framework;

namespace BEPUphysicsDemos.Demos
{
    /// <summary>
    /// Boxes fall onto a large terrain.  Try driving around on it!
    /// </summary>
    public class TerrainDemo : StandardDemo
    {
        /// <summary>
        /// Constructs a new demo.
        /// </summary>
        /// <param name="game">Game owning this demo.</param>
        public TerrainDemo(DemosGame game)
            : base(game)
        {
            //x and y, in terms of heightmaps, refer to their local x and y coordinates.  In world space, they correspond to x and z.
            //Setup the heights of the terrain.
            int xLength = 256;
            int zLength = 256;

            float xSpacing = 1f;
            float zSpacing = 1f;
            var heights = new float[xLength, zLength];
            for (int i = 0; i < xLength; i++)
            {
                for (int j = 0; j < zLength; j++)
                {
                    float x = i - xLength / 2;
                    float z = j - zLength / 2;
                    //heights[i,j] = (float)(x * y / 1000f);
                    heights[i, j] = (float)(5 * (Math.Sin(x / 4) + Math.Sin(z / 4)));
                    //heights[i,j] = 3 * (float)Math.Sin(x * y / 100f);
                    //heights[i,j] = (x * x * x * y - y * y * y * x) / 1000f;
                }
            }
            //Create the terrain.
            var terrain = new Terrain(heights, new AffineTransform(
                    new Vector3(xSpacing, 1, zSpacing),
                    Quaternion.Identity,
                    new Vector3(-xLength * xSpacing / 2, 0, -zLength * zSpacing / 2)));

            Space.Add(terrain);

            Random rand = new Random();

            for (int i = 0; i < 7; i++)
            {
                for (int j = 0; j < 7; j++)
                {
                    for (int k = 0; k < 8; k++)
                    {
                        Vector3 position =
                            new Vector3((float) rand.NextDouble()*2 - 50f + i*20,
                                        100 + -j*2,
                                        (float) rand.NextDouble()*2 - 50f + k*20);
                        float radius = (float) rand.NextDouble()*0.25f + 0.25f;
                        Space.Add(new Sphere(position, 
                            radius,
                            radius * radius * radius));
                    }
                }
            }




            game.ModelDrawer.Add(terrain);

            game.Camera.Position = new Vector3(0, 30, 20);

        }

        /// <summary>
        /// Gets the name of the simulation.
        /// </summary>
        public override string Name
        {
            get { return "Terrain"; }
        }
    }
}

Here's another version with a different scale of objects, which reaches ~16ms on my E5300 3.0Ghz, single threaded:

Code: Select all

using System;
using BEPUphysics.Collidables;
using BEPUphysics.Entities.Prefabs;
using BEPUphysics.EntityStateManagement;
using BEPUphysics.MathExtensions;
using Microsoft.Xna.Framework;

namespace BEPUphysicsDemos.Demos
{
    /// <summary>
    /// Boxes fall onto a large terrain.  Try driving around on it!
    /// </summary>
    public class TerrainDemo : StandardDemo
    {
        /// <summary>
        /// Constructs a new demo.
        /// </summary>
        /// <param name="game">Game owning this demo.</param>
        public TerrainDemo(DemosGame game)
            : base(game)
        {
            //x and y, in terms of heightmaps, refer to their local x and y coordinates.  In world space, they correspond to x and z.
            //Setup the heights of the terrain.
            int xLength = 256;
            int zLength = 256;

            float xSpacing = 8f;
            float zSpacing = 8f;
            var heights = new float[xLength, zLength];
            for (int i = 0; i < xLength; i++)
            {
                for (int j = 0; j < zLength; j++)
                {
                    float x = i - xLength / 2;
                    float z = j - zLength / 2;
                    //heights[i,j] = (float)(x * y / 1000f);
                    heights[i, j] = (float)(50 * (Math.Sin(x / 4) + Math.Sin(z / 4)));
                    //heights[i,j] = 3 * (float)Math.Sin(x * y / 100f);
                    //heights[i,j] = (x * x * x * y - y * y * y * x) / 1000f;
                }
            }
            //Create the terrain.
            var terrain = new Terrain(heights, new AffineTransform(
                    new Vector3(xSpacing, 1, zSpacing),
                    Quaternion.Identity,
                    new Vector3(-xLength * xSpacing / 2, 0, -zLength * zSpacing / 2)));

            Space.Add(terrain);

            Random rand = new Random();

            for (int i = 0; i < 7; i++)
            {
                for (int j = 0; j < 7; j++)
                {
                    for (int k = 0; k < 8; k++)
                    {
                        Vector3 position =
                            new Vector3((float) rand.NextDouble()*20 - 200f + i*128,
                                        100 + -j*2,
                                        (float) rand.NextDouble()*20 - 200f + k*128);
                        float radius = (float) rand.NextDouble() + 1f;
                        Space.Add(new Sphere(position, 
                            radius,
                            radius * radius * radius));
                    }
                }
            }




            game.ModelDrawer.Add(terrain);

            game.Camera.Position = new Vector3(0, 30, 20);

        }

        /// <summary>
        /// Gets the name of the simulation.
        /// </summary>
        public override string Name
        {
            get { return "Terrain"; }
        }
    }
}

EDIT:

This version takes 30ms+ per update on the PC in the worst case, ouch!

Code: Select all

using System;
using BEPUphysics.Collidables;
using BEPUphysics.Entities.Prefabs;
using BEPUphysics.MathExtensions;
using Microsoft.Xna.Framework;
using BEPUphysics.Settings;
using BEPUphysics.CollisionTests.CollisionAlgorithms;
using BEPUphysics.CollisionRuleManagement;
using BEPUphysics.NarrowPhaseSystems.Pairs;
using BEPUphysics.CollisionTests.Manifolds;
using System.Diagnostics;

namespace BEPUphysicsDemos.Demos
{
    /// <summary>
    /// Boxes fall onto a large terrain.  Try driving around on it!
    /// </summary>
    public class TerrainDemo : StandardDemo
    {
        /// <summary>
        /// Constructs a new demo.
        /// </summary>
        /// <param name="game">Game owning this demo.</param>
        public TerrainDemo(DemosGame game)
            : base(game)
        {
            //x and y, in terms of heightmaps, refer to their local x and y coordinates.  In world space, they correspond to x and z.
            //Setup the heights of the terrain.
            int xLength = 256;
            int zLength = 256;

            float xSpacing = 1f;
            float zSpacing = 1f;
            var heights = new float[xLength, zLength];
            for (int i = 0; i < xLength; i++)
            {
                for (int j = 0; j < zLength; j++)
                {
                    float x = i - xLength / 2;
                    float z = j - zLength / 2;
                    //heights[i,j] = (float)(x * y / 1000f);
                    heights[i, j] = (float)(2 * (Math.Sin(x / 8) + Math.Sin(z / 8)));
                    //heights[i,j] = 3 * (float)Math.Sin(x * y / 100f);
                    //heights[i,j] = (x * x * x * y - y * y * y * x) / 1000f;
                }
            }
            //Create the terrain.
            var terrain = new Terrain(heights, new AffineTransform(
                    new Vector3(xSpacing, 1, zSpacing),
                    Quaternion.Identity,
                    new Vector3(-xLength * xSpacing / 2, 0, -zLength * zSpacing / 2)));

            //MotionSettings.DefaultPositionUpdateMode = BEPUphysics.PositionUpdating.PositionUpdateMode.Continuous;

            Space.Add(terrain);
            for (int i = 0; i < 7; i++)
            {
                for (int j = 0; j < 7; j++)
                {
                    for (int k = 0; k < 8; k++)
                    {
                        Space.Add(new Sphere(new Vector3(0 + i * 8, 100 + -j * 10, 0 + k * 8), 0.5f, 1));
                        //Space.Add(new Box(
                        //    new Vector3(0 + i * 4, 1000 + -j * 10, 0 + k * 4),
                        //    2 + i * j * k,
                        //    2 + i * j * k,
                        //    2 + i * j * k,
                        //    4 + 20 * i * j * k));
                    }
                }
            }


            //Random rand = new Random();


            //for (int i = 0; i < 7; i++)
            //{
            //    for (int j = 0; j < 7; j++)
            //    {
            //        for (int k = 0; k < 8; k++)
            //        {
            //            Vector3 position =
            //                new Vector3((float)rand.NextDouble() * 10 + i * 128,
            //                    400 + -j * 2,
            //                    (float)rand.NextDouble() * 10 + k * 128);
            //            float radius = (float)rand.NextDouble() + 1;
            //            Space.Add(new Sphere(position,
            //                radius,
            //                radius * radius * radius));
            //        }
            //    }
            //}



            game.ModelDrawer.Add(terrain);

            game.Camera.Position = new Vector3(0, 30, 20);

        }

        public override void Update(float dt)
        {
            if (Game.KeyboardInput.IsKeyDown(Microsoft.Xna.Framework.Input.Keys.P))
                Debug.WriteLine("Break.");
            base.Update(dt);

        }



        /// <summary>
        /// Gets the name of the simulation.
        /// </summary>
        public override string Name
        {
            get { return "Terrain"; }
        }
    }
}

olhovsky · Post by **olhovsky** » Mon May 23, 2011 4:37 am

Norbo wrote:By the way, I don't know how your whole architecture is set up, but you can get some pretty significant gains from using internal multithreading. A fork-and-join type approach where the physics engine has full access to the Xbox's 3 cores will take the per-update time down a lot, though other compute-heavy systems couldn't be running on other threads simultaneously. It also simplifies the access patterns a lot since you don't have to worry about corrupt data mid-update.

Core 0 is maxed on draw calls.
Core 1 is dedicated to AI, and queries to AI vary greatly, from 1ms - 200ms (the 200ms worst case occurs relatively rarely, and occurs when bad things happen in a pathfinding search).
This really only leaves me with one core left. I guess I can run two threads on a single core, which probably gives me a small speed increase due to better handling of cache misses.

Norbo wrote:If you were interested, there are a few new broad phases currently in testing.
-One is an extremely simple traditional 1D SAP (SortAndSweep1D) which can beat the current DBVH by quite a bit in certain configurations (primarily when object density is sparse). It does not support any accelerated queries, though.
-Another is a hybridized 2D grid + SAP (Grid2DSortAndSweep) which has a slightly higher baseline overhead, but eliminates most of the 1D SAP's worst-case configuration issues. It beats DBVH quite handily most of the time and supports easy parallelization. It conceptually supports queries, though they are not implemented yet (and when they are, they will be slower than the DBVH's).
-The DBVH is scheduled for a rewrite. The current version is a bit messy and a memory hog.

All of the new stuff can be found in the development fork (though beware bugs):
http://bepuphysics.codeplex.com/SourceC ... evelopment

In my testing of 400 spheres on my terrain, I used a 2D sweep and prune (aka sort and sweep) that uses matrix tables and bit sets.

I don't know if it helps, but here's my SAP code:

Code: Select all

using System;
using System.Collections.Generic;
using Game3.Development;
using Microsoft.Xna.Framework;

namespace Game3.Physics.Collision.Detection
{
    /// <summary>
    /// Maintains sorted AABBs of objects for quick collision testing.
    /// </summary>
    public class SweepAndPrune
    {
        private const int DefaultCapacity = 256;
        private const int DefaultMatrixGrowFactor = 2;

        private List<RigidBody> _list;

        private List<RigidBody>[] _boxes;

        private uint[][][] _boxMatricies;
        private int _newItems;
        private int _rows;
        private int _columns;
        private int _growFactor = DefaultMatrixGrowFactor;

        private Comparer<RigidBody>[] _boxCompare;

        public SweepAndPrune()
        {
            SetCapacity(DefaultCapacity);

            _list = new List<RigidBody>();
            
            _boxes = new List<RigidBody>[2];
            for (int axis = 0; axis < _boxes.Length; axis++)
            {
                _boxes[axis] = new List<RigidBody>();
            }

            _boxCompare = new Comparer<RigidBody>[2];
            _boxCompare[0] = new AxisXComparer();
            _boxCompare[1] = new AxisZComparer();
        }

        public int Capacity 
        { 
            get { return _rows; }
            set { SetCapacity(value); } 
        }

        public int GrowFactor
        {
            get { return _growFactor; } 
            set { _growFactor = value; }
        }

        public void Add(RigidBody c)
        {
            c.Flags = (uint)_list.Count;
            _list.Add(c);
            _boxes[0].Add(c);
            _boxes[1].Add(c);
            _newItems++;
        }

        public void Remove(RigidBody c)
        {
            int pos = _list.IndexOf(c);
            if (pos < 0)
                return;
            if (_list.Count < 2)
                _list.RemoveAt(pos);
            else
            {
                _list[pos] = _list[_list.Count - 1];
                _list.RemoveAt(_list.Count - 1);
            }
            _boxes[0].Remove(c);
            _boxes[1].Remove(c);
        }

        public void Clear()
        {
            _list.Clear();
            _boxes[0].Clear();
            _boxes[1].Clear();
        }

        public void Execute(ref ContactPair[] pairs, ref int pairCount)
        {
            while (_boxes[0].Count > _rows)
            {
                SetCapacity(_rows * _growFactor);
            }

            DebugSystem.Instance.TimeRuler.BeginMark("SortAndSweep", Color.PaleGreen);

            SortAndSweep();

            DebugSystem.Instance.TimeRuler.EndMark("SortAndSweep");

            _newItems = 0;


            DebugSystem.Instance.TimeRuler.BeginMark("GetClosestPoints", Color.MediumAquamarine);

            for (int axis = 0; axis < _boxMatricies.Length; axis++)
            {
                for (int i = 0; i < _rows; i++)
                {
                    for (int j = 0; j < _columns; j++)
                    {
                        // Note: only two axis are assumed here for speed.
                        // Bitwise AND the two axis.
                        uint block = 
                            _boxMatricies[0][i][j] & _boxMatricies[1][i][j];

                        if (block == 0)
                        {
                            continue;
                        }

                        uint mask = 1;
                        for (int k = 0; k < sizeof(uint); k++)
                        {
                            if ((mask & block) > 0)
                            {
                                #region Generate contact pair for these two bodies.

                                RigidBody a = _list[i];
                                RigidBody b = _list[j*sizeof (uint) + k];

                                if (a.InverseMass == 0 && b.InverseMass == 0)
                                {
                                    // Both objects are immovable.
                                    continue;
                                }

                                // Ensure that there will be enough room 
                                // for contacts generated.
                                while (pairCount + 
                                    PhysicsSystem.MaxPairsGenerated >= 
                                    pairs.Length)
                                {
                                    ContactPair.GrowContactPairArray(ref pairs);
                                }

                                a.GetClosestPoints(b, ref pairs, ref pairCount);

                                #endregion
                            }

                            mask <<= 1;
                        }
                    }
                }
            }

            DebugSystem.Instance.TimeRuler.EndMark("GetClosestPoints");
        }

        #region SortAndSweep methods for X, Y, and Z

        private void SortAndSweep()
        {
            // Zero out box matricies.
            for (int axis = 0; axis < _boxMatricies.Length; axis++)
            {
                for (int i = 0; i < _boxMatricies[axis].Length; i++)
                {
                    Array.Clear(_boxMatricies[axis][i], 0, 
                        _boxMatricies[axis][i].Length);
                }
            }

            const int quickSortThreshold = 2;
            if (_newItems > quickSortThreshold)
            {
                // Use quicksort.
                for (int axis = 0; axis < _boxes.Length; axis++)
                {
                    _boxes[axis].Sort(_boxCompare[axis]);
                }
            }
            else
            {
                // Use insertion sort.
                DebugSystem.Instance.TimeRuler.BeginMark("Physics InsertionSort",
                                                         Color.Thistle);

                InsertionSort();

                DebugSystem.Instance.TimeRuler.EndMark("Physics InsertionSort");
            }

            for (int axis = 0; axis < _boxes.Length; axis++)
            {
                for (int i = 0; i < _boxes[axis].Count; i++)
                {
                    for (int j = i + 1; j < _boxes[axis].Count; j++)
                    {
                        RigidBody a = _boxes[axis][i];
                        RigidBody b = _boxes[axis][j];

                        if (a.AABB2DMax[axis] < b.AABB2DMin[axis])
                        {
                            break;
                        }

                        uint row = (a.Flags < b.Flags ? a.Flags : b.Flags);
                        int col = (int)(a.Flags < b.Flags ? b.Flags : a.Flags);
                        _boxMatricies[axis][row][col / sizeof(uint)] |=
                            (uint)(1 << (col % sizeof(uint)));
                    }
                }
            }
        }

        #endregion

        private class AxisXComparer : Comparer<RigidBody>
        {
            public override int Compare(RigidBody a, RigidBody b)
            {
                float d = a.AABB2DMin[0] - b.AABB2DMin[0];
                return d == 0f ? 0 : (d < 0f ? -1 : 1);
            }
        }

        private class AxisZComparer : Comparer<RigidBody>
        {
            public override int Compare(RigidBody a, RigidBody b)
            {
                float d = a.AABB2DMin[1] - b.AABB2DMin[1];
                return d == 0f ? 0 : (d < 0f ? -1 : 1);
            }
        }

        private void InsertionSort()
        {
            for (int axis = 0; axis < _boxes.Length; axis++)
            {
                for (int i = 0; i < _boxes[axis].Count; i++)
                {
                    var val = _boxes[axis][i];
                    int j = i - 1;
                    while (j >= 0 && _boxes[axis][j].AABB2DMin[axis] > val.AABB2DMin[axis])
                    {
                        _boxes[axis][j + 1] = _boxes[axis][j];
                        --j;
                    }
                    _boxes[axis][j + 1] = val;
                }
            }
        }

        private void SetCapacity(int size)
        {
            _rows = size;
            _columns = _rows / sizeof(uint) + 1;
            _boxMatricies = new uint[2][][];
            for (int axis = 0; axis < _boxMatricies.Length; axis++)
            {
                _boxMatricies[axis] = new uint[_rows][];
                for(int row = 0; row < _boxMatricies[axis].Length; row++)
                {
                    _boxMatricies[axis][row] = new uint[_columns];
                }
            }
        }
    }
}

Post by **Norbo** » Mon May 23, 2011 5:51 am

Interesting- something odd appears to be going on. Are you using v0.15.2 or the most recent development version? I don't recall making any massive speed improvements in v0.16.0 yet, but I suppose it's best to ensure we're on the same page regardless

The first two examples you provided maxed out at around 9.5ms and 8.5ms respectively on a single thread. The last one (30ms+ on your system) took around 12ms in the worst case (around 1100+ collision pairs) on mine. Judging by some anandtech benchmarks, one of a E5300's cores (especially when overclocked) should be able to beat one core of my Q6600 at 2.4ghz.

This really only leaves me with one core left. I guess I can run two threads on a single core, which probably gives me a small speed increase due to better handling of cache misses.

I suspect the overhead from managing the extra threads would wipe out any stall-filling benefit. Giving it a shot can't hurt I suppose.

olhovsky · Post by **olhovsky** » Mon May 23, 2011 8:09 am

Part of the problem I think was that I was running those tests in Debug mode.

In release mode, using the latest development version this time (I was using 0.15.2 before), with nothing changed except for TerrainDemo.cs:

Code: Select all

using System;
using BEPUphysics.Collidables;
using BEPUphysics.Entities.Prefabs;
using BEPUphysics.MathExtensions;
using Microsoft.Xna.Framework;
using BEPUphysics.Settings;
using BEPUphysics.CollisionTests.CollisionAlgorithms;
using BEPUphysics.CollisionRuleManagement;
using BEPUphysics.NarrowPhaseSystems.Pairs;
using BEPUphysics.CollisionTests.Manifolds;
using System.Diagnostics;

namespace BEPUphysicsDemos.Demos
{
    /// <summary>
    /// Boxes fall onto a large terrain.  Try driving around on it!
    /// </summary>
    public class TerrainDemo : StandardDemo
    {
        /// <summary>
        /// Constructs a new demo.
        /// </summary>
        /// <param name="game">Game owning this demo.</param>
        public TerrainDemo(DemosGame game)
            : base(game)
        {
            //x and y, in terms of heightmaps, refer to their local x and y coordinates.  In world space, they correspond to x and z.
            //Setup the heights of the terrain.
            int xLength = 256;
            int zLength = 256;

            float xSpacing = 1f;
            float zSpacing = 1f;
            var heights = new float[xLength, zLength];
            for (int i = 0; i < xLength; i++)
            {
                for (int j = 0; j < zLength; j++)
                {
                    float x = i - xLength / 2;
                    float z = j - zLength / 2;
                    //heights[i,j] = (float)(x * y / 1000f);
                    heights[i, j] = (float)(2 * (Math.Sin(x / 8) + Math.Sin(z / 8)));
                    //heights[i,j] = 3 * (float)Math.Sin(x * y / 100f);
                    //heights[i,j] = (x * x * x * y - y * y * y * x) / 1000f;
                }
            }
            //Create the terrain.
            var terrain = new Terrain(heights, new AffineTransform(
                    new Vector3(xSpacing, 1, zSpacing),
                    Quaternion.Identity,
                    new Vector3(-xLength * xSpacing / 2, 0, -zLength * zSpacing / 2)));
            terrain.Thickness = 50;

            //MotionSettings.DefaultPositionUpdateMode = BEPUphysics.PositionUpdating.PositionUpdateMode.Continuous;

            Space.Add(terrain);
            for (int i = 0; i < 7; i++)
            {
                for (int j = 0; j < 7; j++)
                {
                    for (int k = 0; k < 8; k++)
                    {
                        Space.Add(new Sphere(new Vector3(0 + i * 8, 100 + -j * 10, 0 + k * 8), 0.5f, 1));
                        //Space.Add(new Box(
                        //    new Vector3(0 + i * 4, 1000 + -j * 10, 0 + k * 4),
                        //    2 + i * j * k,
                        //    2 + i * j * k,
                        //    2 + i * j * k,
                        //    4 + 20 * i * j * k));
                    }
                }
            }


            //Random rand = new Random();


            //for (int i = 0; i < 7; i++)
            //{
            //    for (int j = 0; j < 7; j++)
            //    {
            //        for (int k = 0; k < 8; k++)
            //        {
            //            Vector3 position =
            //                new Vector3((float)rand.NextDouble() * 10 + i * 128,
            //                    400 + -j * 2,
            //                    (float)rand.NextDouble() * 10 + k * 128);
            //            float radius = (float)rand.NextDouble() + 1;
            //            Space.Add(new Sphere(position,
            //                radius,
            //                radius * radius * radius));
            //        }
            //    }
            //}



            game.ModelDrawer.Add(terrain);

            game.Camera.Position = new Vector3(0, 30, 20);

        }

        public override void Update(float dt)
        {
            if (Game.KeyboardInput.IsKeyDown(Microsoft.Xna.Framework.Input.Keys.P))
                Debug.WriteLine("Break.");
            base.Update(dt);

        }



        /// <summary>
        /// Gets the name of the simulation.
        /// </summary>
        public override string Name
        {
            get { return "Terrain"; }
        }
    }
}

This maxes out at 17ms in Release mode for me, again with no changes made to BEPU.

This runs at over 110ms max on the XBOX. Even in release mode the debugger is probably attached, are you running it without the debugger?

I got this version of BEPU by clicking Download under Latest Release here http://bepuphysics.codeplex.com/SourceC ... evelopment

Post by **Norbo** » Mon May 23, 2011 9:11 am

Even in release mode the debugger is probably attached, are you running it without the debugger?

Correct.

I'm seeing around 95 max for the xbox on that simulation. Tuning can get it below 90, but when there's a giant pile of objects on a fairly fine-grained mesh and no multithreading, it is going to hurt a bit. Turning the triangle density down to a spacing of 8 cuts the time from ~90 into the 30's.

I'm looking into a sphere-mesh special case, though it will should only improve performance by 20-50% depending on the bottleneck.

Post by **Norbo** » Tue May 24, 2011 12:59 am

I've implemented the first version of the sphere-terrain special case. The other mesh types will come soon, in addition to some more testing. You can grab it on the development fork: http://bepuphysics.codeplex.com/SourceC ... evelopment

The simulation that used to take 95ms is down to 72ms max. Some tuning ("super speedy settings" and a max of 1 solver iteration) can get it down to 62ms max.

By the way, if you'd like things to go to sleep faster, there are a few things you can do. First, locking the rotation will cause them to use their sliding friction at all times. The friction will remove a ton of energy from the system and get it into a resting state pretty quick. Another option is linear/angular damping. This would let your objects continue to roll around, but a bit of their energy goes away each frame.

By killing off rotation (LocalInertiaTensorInverse = new Matrix3X3), the max time for that simulation dropped to around 35ms. The main benefit of fixed rotation is the fact that the objects don't keep rolling around indefinitely and are able to go to sleep relatively quickly.

olhovsky · Post by **olhovsky** » Wed May 25, 2011 1:28 pm

Norbo wrote:I've implemented the first version of the sphere-terrain special case. The other mesh types will come soon, in addition to some more testing. You can grab it on the development fork: http://bepuphysics.codeplex.com/SourceC ... evelopment

The simulation that used to take 95ms is down to 72ms max. Some tuning ("super speedy settings" and a max of 1 solver iteration) can get it down to 62ms max.

By the way, if you'd like things to go to sleep faster, there are a few things you can do. First, locking the rotation will cause them to use their sliding friction at all times. The friction will remove a ton of energy from the system and get it into a resting state pretty quick. Another option is linear/angular damping. This would let your objects continue to roll around, but a bit of their energy goes away each frame.

By killing off rotation (LocalInertiaTensorInverse = new Matrix3X3), the max time for that simulation dropped to around 35ms. The main benefit of fixed rotation is the fact that the objects don't keep rolling around indefinitely and are able to go to sleep relatively quickly.

That's awesome Norbo, thank you!

I'm in the middle of separating out my game drawing logic into an engine project, which will take a few days, after which point I'll return to physics. I'm excited to try this out.

BEPUphysics

Sphere-terrain optimization.

Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.

Re: Sphere-terrain optimization.