First of all, thanks for creating BEPUphysics, I love it. Having a well designed, performant physics engine in pure C# is a godsend (it saves the pain of updating wrappers for a native engine and recompiling for each target platform, for starters, plus it's fun to play with!). I'll be throwing some money your way soon.
Glad you like it!
Performing a ConvexCast on a very high-poly instanced mesh is very expensive when the cast encompasses a large chunk of triangles. I only need to detect whether there's an intersection or not; I don't need any hit details or the number of intersected triangles. What is the best way to do this efficiently?
Two things will likely give you significant improvements:
1) Make your own boolean-only version of the convex cast. It would be almost identical to the existing one, but instead of using MPRToolbox.Sweep, it would use MPRToolbox.AreSweptShapesIntersecting. The regular sweep actually uses this as a prepass, so this only saves time when the convex cast is intersecting. (See the first sketch after this list.)
2) Make the candidate selection less dumb. I implemented convex casts very lazily under the assumption that they would always be very, very short. Their primary purpose is in CCD, so they mainly need to span the distance between an object's previous-frame and current-frame positions. So, rather than having a dedicated tree traversal, they expand the bounding box to contain the entire sweep... which means that for a long diagonal sweep, the vast majority of the volume is wasted. Adding a dedicated swept-AABB traversal to the trees would help a lot. (See the second sketch after this list.)
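For (1), here is a minimal sketch of the per-triangle core. The method name is illustrative, and the MPRToolbox/MinkowskiToolbox signatures are reconstructed from memory of the v1 source, so verify them against your copy; the mesh-level cast would gather candidate triangles from the tree exactly as the existing ConvexCast does and return true on the first hit.

```csharp
// Boolean-only swept test between the cast shape and one world-space triangle.
// No time of impact, normal, or contact data is computed.
static bool SweptShapeHitsTriangle(ConvexShape castShape, ref RigidTransform startingTransform,
                                   ref Vector3 sweep, ref Vector3 a, ref Vector3 b, ref Vector3 c)
{
    // TriangleShape recenters its vertices; the returned center becomes the
    // triangle's world position.
    Vector3 center;
    var triangle = new TriangleShape(a, b, c, out center);
    var triangleTransform = new RigidTransform(center);

    // MPR works in the pair's local space, so express the triangle's transform
    // relative to the cast shape.
    RigidTransform localTransformB;
    MinkowskiToolbox.GetLocalTransform(ref startingTransform, ref triangleTransform, out localTransformB);

    // The boolean prepass that MPRToolbox.Sweep runs internally, used directly.
    // The mesh is static, so 'sweep' is also the relative sweep.
    Vector3 position;
    return MPRToolbox.AreSweptShapesIntersecting(castShape, triangle, ref sweep, ref localTransformB, out position);
}
```

For (2), the traversal predicate itself is pure math and independent of the library internals: expand each node's bounds by the cast's half extents (a Minkowski sum) and run a segment slab test against the expanded box, descending into the node only on overlap. A sketch:

```csharp
// True if a box with the given half extents, starting at 'start' and translating
// by 'sweep', touches 'node' at any t in [0, 1]. Standard slab test against the
// node's bounds expanded by the half extents.
static bool SweptBoxOverlapsNode(ref Vector3 start, ref Vector3 halfExtents,
                                 ref Vector3 sweep, ref BoundingBox node)
{
    float tMin = 0, tMax = 1;
    for (int axis = 0; axis < 3; ++axis)
    {
        float s = axis == 0 ? start.X : axis == 1 ? start.Y : start.Z;
        float d = axis == 0 ? sweep.X : axis == 1 ? sweep.Y : sweep.Z;
        float h = axis == 0 ? halfExtents.X : axis == 1 ? halfExtents.Y : halfExtents.Z;
        float min = (axis == 0 ? node.Min.X : axis == 1 ? node.Min.Y : node.Min.Z) - h;
        float max = (axis == 0 ? node.Max.X : axis == 1 ? node.Max.Y : node.Max.Z) + h;
        if (Math.Abs(d) < 1e-9f)
        {
            // Parallel to this slab; reject if outside it.
            if (s < min || s > max)
                return false;
        }
        else
        {
            float t0 = (min - s) / d, t1 = (max - s) / d;
            if (t0 > t1) { float tmp = t0; t0 = t1; t1 = tmp; }
            tMin = Math.Max(tMin, t0);
            tMax = Math.Min(tMax, t1);
            if (tMin > tMax)
                return false;
        }
    }
    return true;
}
```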
On the hacky side of things, there's the option of spraying a bunch of rays to approximate the convex cast, as sketched below.
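A minimal sketch of that hack, assuming a handful of parallel rays spread over the cast shape's cross-section is acceptable (the method name, ray pattern, count, and 'radius' are arbitrary tuning choices; InstancedMesh exposes RayCast through its BroadPhaseEntry base):

```csharp
// Approximate a boolean convex cast with parallel rays. Purely a heuristic:
// thin features that pass between rays will be missed, so tune the pattern
// to the mesh and the cast shape's size.
static bool ApproximateConvexCast(InstancedMesh mesh, Vector3 start, Vector3 sweep, float radius)
{
    float length = sweep.Length();
    Vector3 direction = sweep / length;

    // Two axes perpendicular to the sweep direction for the offset pattern.
    Vector3 up = Math.Abs(direction.Y) < 0.9f ? Vector3.Up : Vector3.Right;
    Vector3 right = Vector3.Normalize(Vector3.Cross(direction, up));
    up = Vector3.Cross(right, direction);

    Vector3[] offsets =
    {
        Vector3.Zero,
        right * radius, -right * radius,
        up * radius, -up * radius
    };
    foreach (var offset in offsets)
    {
        RayHit hit;
        if (mesh.RayCast(new Ray(start + offset, direction), length, out hit))
            return true;
    }
    return false;
}
```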
(Future note: v2 will absolutely have a proper traversal. It wouldn't even take long to add to v1.4.X's tree, but I'm trying to stop myself from getting distracted.)
When I use an InstancedMesh collider with a high-density mesh, raycasts against the collider ignore triangles smaller than a certain threshold, while a convex body sweep works fine. I found that this is caused by a (relatively speaking) too-large epsilon in the FindRayTriangleIntersection function, where it checks for a degenerate triangle. Decreasing the epsilon or skipping the check seems to work; I went with the former, so that actual degenerate triangles are still filtered. Could a smaller epsilon cause inefficiencies (or other problems) in other scenarios? Most of my other objects have "regular" sizes.
Decreasing epsilon across the board could have some sneaky side effects considering all the places it is used; it's related to world scale. (I'd like to make these scale tuning factors less pernicious in v2...)
If you know for a fact that all meshes being raytraced won't have degenerate triangles, then eliminating the check in FindRayTriangleIntersection will be fine. You could also just use a much more aggressive epsilon for that function specifically; realistically, it could go down a few orders of magnitude without causing problems.
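For illustration, the check in question looks roughly like this in a local copy of Toolbox.FindRayTriangleIntersection (reconstructed from memory of the v1 source, so verify against your copy; the 1e-6f scale factor is just an arbitrary example of "a few orders of magnitude"):

```csharp
// Degenerate-triangle rejection inside FindRayTriangleIntersection. The cross
// product's squared length is (2 * triangle area)^2, so a fixed threshold also
// rejects very small but perfectly valid triangles. Scaling it down keeps the
// degeneracy filter while letting small triangles through.
Vector3 ab, ac;
Vector3.Subtract(ref b, ref a, out ab);
Vector3.Subtract(ref c, ref a, out ac);
Vector3.Cross(ref ab, ref ac, out hit.Normal);
if (hit.Normal.LengthSquared() < Toolbox.Epsilon * 1e-6f) // was: < Toolbox.Epsilon
    return false;
```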
Could having a larger bounding box cause some other problems, though? Would it be okay/worth it if I ran an efficient bounding box calculation on a background thread once the object stops moving, and then just assigned the result once it's done? (I'm not using asynchronous updating at this point.)
Unnecessarily large bounding boxes just make the object's presence in the broad phase larger. If the larger bounding box puts it in testing range of more objects, it will increase collision detection cost; if there's nothing in the vicinity, it's no problem at all.
If the collision detection overhead caused by conservative bounds is an issue, asynchronously computing tight bounds could be useful. In that case, however, it may be easier to use something like the MobileMeshShape's bounding hull vertices. By default, it computes the convex hull of the input mesh and uses that hull to compute the bounding box. In the worst case, the convex hull operation doesn't help at all (e.g., a highly tessellated sphere), but most of the time it eliminates quite a few vertices. The MobileMeshShape can also be constructed with arbitrary bounding hull vertices, so a simplified/approximate set could be provided.
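If you do try the asynchronous route, the bounds computation itself is pure math and safe to run off the main thread. A minimal sketch, assuming 'vertices' and 'worldTransform' were captured from the mesh and that you have some way to apply the result afterwards (stock v1 updates bounding boxes internally, so the final assignment may itself need a small accessor added):

```csharp
// Compute a tight world-space AABB from the mesh's vertices in the background.
Task<BoundingBox> tightBoundsTask = Task.Run(() =>
{
    var min = new Vector3(float.MaxValue, float.MaxValue, float.MaxValue);
    var max = new Vector3(float.MinValue, float.MinValue, float.MinValue);
    for (int i = 0; i < vertices.Length; ++i)
    {
        Vector3 v;
        AffineTransform.Transform(ref vertices[i], ref worldTransform, out v);
        Vector3.Min(ref min, ref v, out min);
        Vector3.Max(ref max, ref v, out max);
    }
    return new BoundingBox(min, max);
});
// Later, on the main thread, once the object has stopped moving:
// ApplyTightBounds(mesh, tightBoundsTask.Result); // hypothetical hook into the collidable
```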
Additionally, is there a nicer way to do this that wouldn't require me to use a modified version of the library? (It's not a big problem, but I'd like to avoid modifying the library itself if possible.)
There may be ways of eliminating library modifications if the problem can be changed a bit. As it is, the library just wasn't built with this use case in mind, so it needs some hand-holding, as you've found. My usual advice for a game-like application is to massively simplify the physics representation of the mesh, preferably into simple convex bits. Ideally, a single box or sphere would suffice, but sometimes complex concavity is needed.
For complex concavity, I would recommend running offline mesh simplification and convex decomposition, like V-HACD. Note that convex hull collision detection costs are roughly linear in the number of vertices in the hull, so the simplification step can be important. BEPUphysics doesn't have any of these offline processes built in, but something like Blender could handle them.
This sort of decomposition will be almost mandatory if you expect a nontrivial number of collisions, each spanning many triangles of the mesh. In that situation, the speedup could be one to three orders of magnitude, and most likely no library changes would be required.
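For completeness, a minimal sketch of wiring the decomposed pieces back up, assuming 'hullVertexLists' holds the vertex sets produced by V-HACD (the constructor overloads are from the v1 API as I remember it; for a dynamic object, pass a mass to CompoundBody):

```csharp
// Build one physics object from a set of convex pieces produced offline.
var shapes = new List<CompoundShapeEntry>();
foreach (List<Vector3> hullVertices in hullVertexLists)
{
    // ConvexHullShape recenters its vertices; the entry's position puts the piece back.
    Vector3 center;
    var hullShape = new ConvexHullShape(hullVertices, out center);
    shapes.Add(new CompoundShapeEntry(hullShape, center));
}
// No mass specified: a kinematic compound, suitable for static scenery.
var compound = new CompoundBody(shapes);
space.Add(compound);
```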