Abuse-Case Physics Timing

mcmonkey
Posts: 92
Joined: Fri Apr 17, 2015 11:42 pm

Abuse-Case Physics Timing

Post by mcmonkey »

This is not my use case; this is my abuse case: I'm testing the limits of some features while working on related things.

I spawned 10,000 sphere-shape entities into a BEPU World that, aside from the spheres, consists of a /humongous/ static box for them to all land on.

The stats for how that went confuse me:
Pre-spawn: frame time is 15 ms, due to assorted things running, including BEPU itself
Spawn: world freezes for about a second; frame time spikes momentarily
After the freeze ends: frame times of about 30-50 ms
landing on the box: another freeze, shorter, about 500ms frame time
Idling afterwards: ... about 30-50 ms times


Do you see my question here?

How is it that 10k free-falling spheres have approximately the same frame time as fully idle spheres?

I checked earlier to confirm that they were all deactivating properly (a check that itself takes at least 10ms), which they were, quite rapidly, after I applied settings like "DeactivationManager.MaximumDeactivationAttemptsPerFrame = 1000;" and some other adjustments.

So I'm wondering, why is a deactivated entity as heavy as an entity moving in freefall?
Surely the falling ones have to do some amount of motion tracing and calculations, and the deactivated ones can do nothing at all (potentially even be removed from the list of things that need ticking?)
Norbo
Site Admin
Posts: 4929
Joined: Tue Jul 04, 2006 4:45 am

Re: Abuse-Case Physics Timing

Post by Norbo »

Physics time is dominated by collision detection and resolution. Integrating positions is usually very cheap in comparison. While falling, the majority of the time is probably being spent in the broad phase of collision detection, since there are few or no active collision pairs. When objects go to sleep, the broad phase is still forced to consider them, because new collision pairs are how objects get activated.

Any reduction in cost from not doing position updates is also probably going to be outweighed by the more stressful configuration in the broad phase when everything is in collision: it will tend to traverse all the way to the leaf nodes more often.

As usual, v2 changes things here. One of the major design goals of v2 is to be able to handle worlds with extreme numbers of static and inactive objects without choking, because it's a pretty common pattern for large open world games. The overhead of additional statics or deactivated objects in v2 should be similar to the overhead of having an additional triangle in a StaticMesh in v1 (that is, logarithmic and basically irrelevant).

(As a sidenote, 30-50ms is really high for the described simulation. On my 3770K with multithreading, I see physics times of ~4ms in freefall, ~5ms while inactive. If you're running on a device or configuration that you expect to be 5-10 times slower than a 3770K using multithreading, it might be normal, but otherwise there could be some fixable performance problem or the cost is somewhere else.)

Quote: "landing on the box: another freeze, shorter, about 500ms frame time"
A lot of this is probably creating enough collision pairs for all the new collisions. If you ever run into this in a 'real' application, you could probably reduce the hiccup by preallocating collision pairs at launch by doing something like:

Code: Select all

NarrowPhaseHelper.Factories.BoxSphere.EnsureCount(16384);
NarrowPhaseHelper.Factories.SphereSphere.EnsureCount(16384);
mcmonkey
Posts: 92
Joined: Fri Apr 17, 2015 11:42 pm

Re: Abuse-Case Physics Timing

Post by mcmonkey »

I'm testing on an i7-4710HQ, so I would imagine there is indeed something going wrong in my simulation setup...

This isn't my primary project by the way; I'm doing a side-project using Unity with a friend.

Unity reports 15ms empty, 30-50ms idle, and throughout it all, render time nearly nonexistent because it's just rendering whichever low-poly spheres are directly in view.

Actually, now that I think about it... my times are /way/ too high compared to yours. Because: you probably simulated "correctly", i.e. assuming a 60 FPS physics rate (default config)

I, on the other hand, chose to test with:

Code: Select all

            PhysicsWorld.TimeStepSettings.TimeStepDuration = UnityEngine.Time.deltaTime;
            PhysicsWorld.Update(UnityEngine.Time.deltaTime);
When I disable this bit of genius (comment out the first line), my times go up to 300-400ms/frame. (200 ms/frame if I disable CCD)

I can say with 80% certainty that the lag is in the physics calculations somewhere... (I can simulate this many objects in view with BEPU disabled for them, and it's nearly lagless)

(Also, unrelated, Unity Editor seems to crash an awful lot after installing BEPU into it - I think something is disposing improperly, but I'm not quite sure how to go about diagnosing that.) (It first crashed regularly from having an accidental second space free-floating, after getting rid of that it still crashes once in a while for reasons I haven't figured out yet - possibly the main space needs some disposal process I'm not remembering/aware of?)
EDIT: Disregard that editor crashing, I had mistranslated a line of code.
ParallelLoopWorker.cs -> "getToWork.Dispose();" needed to become "getToWork.SafeWaitHandle.Dispose();", I didn't realize that and it was the source of the flaw!

(also: Eagerly awaiting early passes of general functionality in BEPUv2)


I wrapped a stopwatch around the physics update method:
25-30ms/frame times from that exact execution.
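
A wrapper along these lines would give that kind of measurement. This is only a minimal sketch, not the exact code from the project: `TimingSketch` and `TimeMs` are names made up for illustration, and the BEPU call appears only in a comment so the snippet compiles without the library.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class TimingSketch
{
    // Runs 'work' once and returns the elapsed wall-clock time in milliseconds.
    public static double TimeMs(Action work)
    {
        Stopwatch sw = Stopwatch.StartNew();
        work();
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }

    static void Main()
    {
        // Stand-in workload. In the real project the call would be something like
        // TimeMs(() => PhysicsWorld.Update(dt)).
        double ms = TimeMs(() => Thread.Sleep(20));
        Console.WriteLine("Update took " + ms.ToString("F2") + " ms");
    }
}
```

Timing only the update call keeps rendering and other per-frame work out of the number.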


Some environment if it helps:
ParallelLooper is provided with 8 threads (to match Environment.ProcessorCount)
AllowedPenetration is 0.005
Deactivation stuff:

Code: Select all

            PhysicsWorld.DeactivationManager.MaximumDeactivationAttemptsPerFrame = 1000;
            PhysicsWorld.DeactivationManager.LowVelocityTimeMinimum = 0.1f;
            PhysicsWorld.DeactivationManager.VelocityLowerLimit = 0.1f;
^ those settings help ensure objects get deactivated properly and in rapid order.

I added the Factories bit you mentioned, and that reduces collision-with-box time to about 100ms or less, which is fair enough.

Most of the rest of the setup is BEPU's default configuration.

Any idea where I might look or what I might do to diagnose things to better decipher where it's going wrong? I'm not quite sure how to attach a profiler to Unity...
Norbo
Site Admin
Posts: 4929
Joined: Tue Jul 04, 2006 4:45 am

Re: Abuse-Case Physics Timing

Post by Norbo »

Jumping up to 400ms frame times from 30-50ms doesn't make a lot of sense unless the Space.TimeStepSettings.MaximumTimeStepsPerFrame has been increased above 3 and the deltatime being provided is abnormally large (e.g. measured in milliseconds). What happens if you just use a time step duration of 1/60f and call Space.Update() without a parameter to trigger a single time step?
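
To illustrate why the step cap and the size of the delta time matter, here is a rough sketch of a generic fixed-step accumulator. It shows the general pattern only and is not BEPU's actual source; `AccumulatorSketch` and `StepsForFrame` are hypothetical names, and the frame durations are made-up inputs.

```csharp
using System;

static class AccumulatorSketch
{
    // Returns how many fixed-size steps a single update call would run.
    // 'accumulated' carries leftover time from one call to the next.
    public static int StepsForFrame(ref double accumulated, double dt,
                                    double stepDuration, int maxSteps)
    {
        accumulated += dt;
        int steps = 0;
        while (steps < maxSteps && accumulated >= stepDuration)
        {
            accumulated -= stepDuration;
            steps++; // a real engine would run one full physics time step here
        }
        return steps;
    }

    static void Main()
    {
        double step = 1.0 / 60.0;

        double acc = 0.0;
        // 110 ms frame, cap of 3 steps per frame
        Console.WriteLine(StepsForFrame(ref acc, 0.110, step, 3));

        acc = 0.0;
        // same frame, cap raised to 10
        Console.WriteLine(StepsForFrame(ref acc, 0.110, step, 10));

        acc = 0.0;
        // delta time mistakenly passed in milliseconds
        Console.WriteLine(StepsForFrame(ref acc, 110.0, step, 10));
    }
}
```

This prints 3, 6, and 10. Each step costs a full physics update, so a raised cap combined with a large delta time multiplies the per-frame cost, and slower frames then ask for even more steps on the next call.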

It could be caused by something in the environment. Unity uses a different runtime than the usual desktop framework (or used to, at least- it's moving to newer runtimes and IL2CPP). Also, is it running with full optimizations without a debugger attached? Even debug mode shouldn't slow it down that much, but it does have an impact.

One option would be to replicate the simulation portion in a regular desktop application, isolated from everything else. If the performance is significantly better there, you would at least know it's not related directly to the simulation. If it is still super slow, I could take a look.
nug700
Posts: 9
Joined: Sat Dec 03, 2016 2:34 am

Re: Abuse-Case Physics Timing

Post by nug700 »

Didn't read everything so not sure this is 100% relevant, but make sure you are using Release configuration when building BEPU as Debug slows it down A LOT (those frame times seem similar to mine when using debug with a lot of objects).
mcmonkey
Posts: 92
Joined: Fri Apr 17, 2015 11:42 pm

Re: Abuse-Case Physics Timing

Post by mcmonkey »

Hahaha hwhoops.

So, a few things.

I'm not good with Unity and didn't realize the editor was in incredibly slow debug config - running a release build improved the timing to 14-16ms per frame.

I also created a perfect replica C# console app that gets the same result in standard Visual Studio debug mode (16 ms/frame).

In release mode, it gets 9ms/frame.

I feel like there's still room to improve this though.

You mentioned your roughly equivalent (I think? about the same range) processor getting 4ms in a similar simulation.

Perhaps you could try my version of the simulation, see how it runs for a perf comparison, then see if there are any adjustments that could improve it to run at that golden 4ms/frame?

The code is a bit odd; it's designed to simulate the Unity setup as closely as possible, e.g. I add 15 ms to the time for physics to simulate, in part because that's about the time added by general Unity stuff.

Also as you can see below, yes, I definitely increased the timestepsperframe and ignored it forever and didn't pay attention to it when simulating with a static delta time.

(Also, if you or anyone knows how to make Unity run with the same performance as a non-Unity test app, that'd be really helpful - Unity tested in full build with development mode off gets almost double the frame time usage for simple calcs...)

(Also, we found that it does still crash after the crash fix, just much much less often - no idea why that'd happen)

Here's an entire single-file megatest console application!
Open in VS, attach BEPU as a reference, compile, run, watch the console readouts - the final average is probably the important output, and it generally aligns with the final frame time too.

Code: Select all

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Diagnostics;
using BEPUphysics;
using BEPUphysics.CollisionShapes.ConvexShapes;
using BEPUphysics.Entities;
using BEPUutilities;
using BEPUutilities.Threading;
using BEPUphysics.Settings;
using BEPUphysics.NarrowPhaseSystems;

namespace PhysTimer
{
    class Program
    {
        static void Main(string[] args)
        {
            ParallelLooper Looper = new ParallelLooper();
            for (int i = 0; i < Environment.ProcessorCount; i++)
            {
                Looper.AddThread();
            }
            Space PhysicsWorld = new Space(Looper);
            CollisionDetectionSettings.AllowedPenetration = 0.005f;
            PhysicsWorld.TimeStepSettings.MaximumTimeStepsPerFrame = 10;
            PhysicsWorld.ForceUpdater.Gravity = new Vector3(0, -9.8f, 0);
            PhysicsWorld.DeactivationManager.MaximumDeactivationAttemptsPerFrame = 1000;
            PhysicsWorld.DeactivationManager.LowVelocityTimeMinimum = 0.1f;
            PhysicsWorld.DeactivationManager.VelocityLowerLimit = 0.1f;
            NarrowPhaseHelper.Factories.BoxSphere.EnsureCount(16384);
            NarrowPhaseHelper.Factories.SphereSphere.EnsureCount(16384);
            BoxShape box = new BoxShape(500, 12, 500);
            Entity boxent = new Entity(box, 0);
            boxent.Position = new Vector3(0, -6, 0);
            PhysicsWorld.Add(boxent);
            Random rand = new Random();
            for (int i = 0; i < 10000; i++)
            {
                float val = (float)rand.NextDouble() * 0.5f + 0.2f;
                SphereShape shape = new SphereShape(val * 0.5f);
                Entity sphere_ent = new Entity(shape, val * 10f);
                sphere_ent.Position = new Vector3((float)rand.NextDouble() * 500f - 250f, 10f, (float)rand.NextDouble() * 500f - 250f);
                sphere_ent.PositionUpdateMode = BEPUphysics.PositionUpdating.PositionUpdateMode.Continuous;
                PhysicsWorld.Add(sphere_ent);
            }
            Stopwatch sw = new Stopwatch();
            const int COUNT = 600;
            double total = 0.0;
            sw.Start();
            for (int i = 1; i <= COUNT; i++)
            {
                sw.Stop();
                double time = sw.ElapsedTicks / (double)Stopwatch.Frequency;
                total += time;
                Console.WriteLine(i + ") Time: " + time + ", average: " + (total / i));
                sw.Reset();
                sw.Start();
                time += 0.015;
                PhysicsWorld.TimeStepSettings.TimeStepDuration = (float)time;
                PhysicsWorld.Update((float)time);
            }
            Console.WriteLine("Test complete!");
            Console.ReadLine();
        }
    }
}

Norbo
Site Admin
Posts: 4929
Joined: Tue Jul 04, 2006 4:45 am

Re: Abuse-Case Physics Timing

Post by Norbo »

On that simulation, my 3770K averages 7.27ms. There are a few differences that probably explain the difference with my earlier test:
1) There are a few more collisions in this due to the random horizontal positions.
2) Due to the randomization, there are still active objects at the end of test. Spheres are still rolling around a bit.
3) CCD is on. That isn't super expensive, but it's something.
4) The timestep is variable length and tends to be longer. Longer time step durations tend to be more expensive per time step since more stuff happens in each one. In this simulation this is a pretty minor effect. The main reason I bring it up is that, unless you have a very specific goal in mind that makes it valid, that method of computing the time to simulate is likely not right. If you expect the time between frames to be 15ms in real time, then the simulated update duration should be 15ms, not 15+calculation time- that would be double counting. Also, variable time steps can worsen stability; be careful.
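
As a rough illustration of the double counting in point 4, the sketch below compares total simulated time under the two ways of choosing the update duration. The 15 ms frame interval comes from the test program; the 9 ms compute cost is just an assumed example value, and `DoubleCountSketch` and `SimulatedSeconds` are hypothetical names.

```csharp
using System;

static class DoubleCountSketch
{
    // Total simulated seconds after 'frames' updates.
    // If addComputeTime is true, each update is fed interval + compute time.
    public static double SimulatedSeconds(int frames, double frameInterval,
                                          double computeTime, bool addComputeTime)
    {
        double simulated = 0.0;
        for (int i = 0; i < frames; i++)
        {
            simulated += addComputeTime ? frameInterval + computeTime : frameInterval;
        }
        return simulated;
    }

    static void Main()
    {
        const int frames = 600;
        const double frameInterval = 0.015; // expected real seconds between frames
        const double computeTime = 0.009;   // assumed physics cost per frame

        double real = frames * frameInterval;
        double matched = SimulatedSeconds(frames, frameInterval, computeTime, false);
        double doubled = SimulatedSeconds(frames, frameInterval, computeTime, true);

        Console.WriteLine("Real time:         " + real.ToString("F2") + " s");
        Console.WriteLine("dt = interval:     " + matched.ToString("F2") + " s simulated");
        Console.WriteLine("dt = interval + c: " + doubled.ToString("F2") + " s simulated");
    }
}
```

With these numbers the simulation advances about 14.4 s over 9 s of real time, so it runs roughly 1.6 times too fast.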

As for the remaining ~24% difference on this simulation, the i7-4710HQ is a bit slower than the 3770K judging by the ark specs. IVB->Haswell was only about an 11% IPC improvement, and the 4710's turbo clock is only 3.5 GHz. Assuming the 4710 runs at a 3.25 GHz turbo with all threads active, and noting that my 3770K is overclocked to 4.5 GHz, that's (4.5 GHz / (3.25 GHz * 1.112)) * 7.27 ms = 9.05 ms, which is about right.
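
The same back-of-the-envelope estimate as a small sketch. It assumes frame time scales inversely with clock speed times relative IPC, which is a simplification; `CpuScalingSketch` and `EstimateMs` are names invented for this example, and the inputs are the figures quoted above.

```csharp
using System;

static class CpuScalingSketch
{
    // Scales a reference frame time to another CPU, assuming time is
    // inversely proportional to (clock speed * relative IPC).
    public static double EstimateMs(double referenceMs, double referenceGhz,
                                    double targetGhz, double targetIpcFactor)
    {
        return referenceMs * referenceGhz / (targetGhz * targetIpcFactor);
    }

    static void Main()
    {
        // 7.27 ms on a 3770K at 4.5 GHz; 4710HQ assumed at 3.25 GHz with ~11.2% better IPC.
        double estimate = EstimateMs(7.27, 4.5, 3.25, 1.112);
        Console.WriteLine(estimate.ToString("F2") + " ms");
    }
}
```

This gives roughly 9.05 ms, in line with the 9ms/frame measured in release mode.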