What exactly is that? What's it do? How is it built?
It's a structure designed to accelerate queries by pruning out all of the irrelevant objects.
Consider a mesh with 100,000 triangles. Without any acceleration structure, you'd have to test each of the 100,000 triangles in sequence. That would be very, very slow.
Instead, those 100,000 triangles could be put into a bounding volume hierarchy. If a ray hits the root node, test the ray against the root's children. If it hits root.ChildA but not root.ChildB, intersect root.ChildA and ignore root.ChildB. Now, do it again for the children of root.ChildA, and so on. Each step has a high chance of eliminating a huge number of potential intersections. So, instead of doing 100,000 tests, it only takes on the order of log(n) ray-bounding box tests (probably fewer than 100, and sometimes as few as 15-20).
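To make the pruning concrete, here's a minimal sketch of that traversal in Python. The node layout and names (BVHNode, aabb, triangles) are illustrative, not the engine's actual types:

```python
class BVHNode:
    def __init__(self, aabb, left=None, right=None, triangles=None):
        self.aabb = aabb                  # ((min_x, min_y, min_z), (max_x, max_y, max_z))
        self.left = left
        self.right = right
        self.triangles = triangles or []  # nonempty only at leaf nodes

def ray_hits_aabb(origin, inv_dir, aabb):
    """Slab test: does a ray starting at t=0 intersect the box?"""
    lo, hi = aabb
    tmin, tmax = 0.0, float('inf')
    for axis in range(3):
        t1 = (lo[axis] - origin[axis]) * inv_dir[axis]
        t2 = (hi[axis] - origin[axis]) * inv_dir[axis]
        tmin = max(tmin, min(t1, t2))
        tmax = min(tmax, max(t1, t2))
    return tmin <= tmax

def query(node, origin, inv_dir, hits):
    if node is None or not ray_hits_aabb(origin, inv_dir, node.aabb):
        return  # miss: this entire subtree is pruned
    if node.triangles:
        hits.extend(node.triangles)  # leaf: candidates for exact triangle tests
    query(node.left, origin, inv_dir, hits)
    query(node.right, origin, inv_dir, hits)
```

A ray that misses a node's box never looks at anything beneath that node, which is where the log(n)-ish behavior comes from.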
That sort of tree is what the StaticMesh uses. The construction process is extremely simple at the moment: it just takes each object and greedily inserts it into the tree. That insertion process traverses the tree, taking the path down to the leaf node that fits the new object best. This sort of construction, being simple and O(n log(n)) in non-pathological cases, tends to be very quick relative to heavier/'smarter' constructions like SAH optimization.
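A simplified sketch of that greedy insertion idea (again, illustrative names only, not the engine's actual code): each new object walks down toward whichever child's box would grow the least, expanding boxes along the way, and splits a leaf when it arrives.

```python
def merge(a, b):
    (alo, ahi), (blo, bhi) = a, b
    return (tuple(map(min, alo, blo)), tuple(map(max, ahi, bhi)))

def volume(box):
    lo, hi = box
    d = [h - l for l, h in zip(lo, hi)]
    return d[0] * d[1] * d[2]

class Node:
    def __init__(self, box, item=None):
        self.box, self.item = box, item
        self.left = self.right = None

def insert(root, box, item):
    """Greedy insert: descend toward the child whose box grows the least."""
    if root is None:
        return Node(box, item)
    if root.item is not None:            # hit a leaf: split it into an internal node
        node = Node(merge(root.box, box))
        node.left, node.right = root, Node(box, item)
        return node
    root.box = merge(root.box, box)      # expand bounds on the way down
    grow_l = volume(merge(root.left.box, box)) - volume(root.left.box)
    grow_r = volume(merge(root.right.box, box)) - volume(root.right.box)
    if grow_l <= grow_r:
        root.left = insert(root.left, box, item)
    else:
        root.right = insert(root.right, box, item)
    return root
```

Each insertion only touches one root-to-leaf path, so with a reasonably balanced tree the whole build is around n inserts times log(n) steps each.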
Binary bounding box hierarchies are certainly not the only kind of acceleration structure, but they're pretty common. Other options include sort and sweep/sweep and prune, uniform grids, quad/octrees, k-d trees, simd-friendly N-ary trees, and a whole lot more. Each has its own strengths, weaknesses, and peculiarities- for example, sweep and prune is often used by physics engines for the broadphase thanks to good collision pair detection performance, but it doesn't provide great ray query acceleration and is vulnerable to a few corner cases.
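For a taste of how different these structures are, here's a toy one-shot "sort and sweep" on one axis. Real sweep-and-prune implementations keep persistently sorted endpoint lists and update them incrementally frame to frame, which is where the good pair-detection performance comes from; this sketch just shows the sorting-based pruning itself:

```python
def sweep_and_prune(intervals):
    """intervals: list of (min, max) on one axis; returns overlapping index pairs."""
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    active, pairs = [], []
    for i in order:
        lo, hi = intervals[i]
        # drop intervals that ended before this one starts
        active = [j for j in active if intervals[j][1] >= lo]
        pairs.extend((min(i, j), max(i, j)) for j in active)
        active.append(i)
    return pairs
```

Only objects whose projections overlap on the axis ever get compared, which avoids the n^2 all-pairs test. You can see the ray-query weakness too: a sorted list of endpoints says nothing useful about an arbitrary ray direction.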
How would I input a custom one? (I have a /ton/ of information pre-available about the object I'm building - e.g. its general AABB limits and so on... perhaps I could generate this hierarchy more quickly than reading through the full vertex list would allow?)
If you wanted to make your own acceleration hierarchy from scratch, you would basically be making your own version of the StaticMesh. It's possible, but difficult as mentioned before. It requires a lot of familiarity with the internals of the engine. If you're curious, you can look up a few of the data structures I mentioned up above.
I'm curious /why/ that caused lag.
The heuristics used to build the acceleration tree have some weaknesses. Specifically, it's possible to make the tree highly unbalanced. Rather than having O(log(n)) accesses to reach a leaf node, it's possible to end up with a tree that's more like a long line where it may take O(n) accesses. The existing tree construction is insertion-based, so the longer the path to the desired position is, the longer the construction takes. In the pathological case of one long line, that would take O(n^2) instead of O(n log(n)). Most likely, this is what you encountered.
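A quick back-of-the-envelope comparison shows why the degenerate case hurts so much: in a "long line" tree the k-th insertion walks the whole existing chain (roughly 1 + 2 + ... + n-1 steps total), while in a balanced tree each insertion only walks ~log2(k) steps.

```python
import math

def degenerate_cost(n):
    # each insert traverses the full chain built so far: ~n^2 / 2 node visits
    return sum(range(n))

def balanced_cost(n):
    # each insert traverses one root-to-leaf path: ~n * log2(n) node visits
    return sum(math.log2(k) for k in range(2, n + 1))

for n in (1000, 10000):
    print(n, degenerate_cost(n), round(balanced_cost(n)))
```

At 10,000 objects the degenerate build does thousands of times more work than the balanced one, which is more than enough to turn a construction that should be instant into a visible hitch.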
A better heuristic wouldn't have this sort of weird corner case. I'll probably fix it in the rewrite.
If I wanted to spawn an object in a space, would I need to lock on anything, or could I safely do that from any given thread? What about building the object (new StaticMesh() type lines)?
An object can be constructed on any thread at any time.
Only one thread at a time should make any changes to the Space (which includes adds/removes), and no changes should occur while the Space is updating. Further, be careful about reading from Space-handled data while the Space is updating. (This is technically stricter than it has to be, but it's a good rule of thumb.)
viewtopic.php?t=1028 <-- this post implies that there is no multithreading by default, and has to be enabled manually with a small bit of code (written out in the link) - is this correct?
That's a bit old. All you have to do now is pass the Space constructor an IParallelLooper that has a ThreadCount of 2 or more.
- Does the threading code merely do something along the lines of / similar to "Parallel.ForEach" inside the update method - i.e., running all the updates in parallel instead of one-by-one, or is there more to it than that? (Does it accelerate ConvexCast? ...? )
The engine's Update is essentially a series of for loops, and parallelizing it is indeed running those for loops in parallel. Of course, there's a lot of trickiness involved to make that work efficiently.
The multithreading only affects the Space.Update's stages. So things like the BroadPhase, NarrowPhase, and Solver are all multithreaded, but an individual ray cast or convex cast is not. There's no real way to efficiently multithread something so tiny due to overhead. Of course, if there are a bunch of casts to perform, a parallel loop across all the casts would provide a great speedup.
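That "parallel loop across all the casts" pattern is easy to sketch with a thread pool. Here ray_cast is a stand-in for a real engine query (in a real engine it would need to be a read-only, thread-safe call, and you'd use something like Parallel.ForEach in C#; note also that in CPython, threads only pay off if the work releases the GIL, so treat this as conceptual):

```python
from concurrent.futures import ThreadPoolExecutor

def ray_cast(ray):
    # placeholder for a real Space.RayCast-style query; assumed read-only
    return sum(ray)  # dummy "result"

def cast_all(rays, workers=4):
    # one parallel loop across the whole batch; each individual cast stays serial
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(ray_cast, rays))
```

The point is the granularity: parallelizing inside a single tiny cast loses to overhead, but parallelizing across hundreds of independent casts amortizes that overhead away.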