You're not using interpolated states in your own game? How do you deal with variable frame times then?
I just built a game-specific system. I already had a semi-dynamic smoothing system for graphics, built to handle network updates without visual snapping. Every object that could undergo correction had a current position/velocity state and a target position/velocity state, and the current state chases the target like a damped, springy motor. I run a little implicit integrator on those states. It's kind of like a very simple physics simulation, except with only one thing involved at any time. So, when a network update comes in with new position/velocity information, I immediately update the 'real' gameplay state and then just set the target state and let the integrator smooth it out visually.
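To make the "damped springy motor" idea concrete, here is a minimal 1D sketch of one smoothing step using backward (implicit) Euler. It is not the actual game code: the names `SmoothedState` and `step_smoothing`, the `stiffness` and `damping` values, and the choice to let the target drift along its own velocity during the step are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SmoothedState:
    position: float          # what gets drawn
    velocity: float
    target_position: float   # set from the 'real' gameplay state
    target_velocity: float


def step_smoothing(s: SmoothedState, dt: float,
                   stiffness: float = 200.0,   # hypothetical tuning value
                   damping: float = 28.0) -> SmoothedState:
    """Advance the visual state toward its target with backward Euler."""
    # Assumption: the target keeps moving at its own velocity during the step.
    new_target = s.target_position + s.target_velocity * dt

    # Work with the error relative to the moving target.
    e = s.position - s.target_position
    ev = s.velocity - s.target_velocity

    # Backward Euler for e'' = -stiffness * e - damping * e':
    #   ev1 = ev0 + dt * (-stiffness * e1 - damping * ev1),  e1 = e0 + dt * ev1
    # Solving for ev1 gives the closed form below, which stays stable for any dt.
    ev = (ev - dt * stiffness * e) / (1.0 + dt * damping + dt * dt * stiffness)
    e = e + dt * ev

    return SmoothedState(new_target + e, s.target_velocity + ev,
                         new_target, s.target_velocity)
```

Because the update is implicit, a long frame just produces a bigger (still bounded) move toward the target instead of an overshoot that blows up, which is what makes a variable timestep tolerable here.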
From there, it's pretty easy to make the jump to handling variable frame times. Conceptually, the smoothing simulation can run independently from the game simulation and with a variable timestep, thanks to the simplicity of the simulation and the implicit integrator. From its perspective, it's just receiving occasional new target states; it doesn't matter how they came about. So, if a graphics frame happens between physics updates, all it has to do is run the graphical simulation starting from the last physical state, up to the current graphical time.
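Roughly, the frame loop could look like the sketch below. This is only an illustration under assumptions: the physics step is a placeholder (gravity on a 1D point), `PHYSICS_DT`, `physics_step`, `visual_position`, and `run_frames` are made-up names, and plain linear extrapolation stands in for the spring-based smoother so the example stays short.

```python
from dataclasses import dataclass


@dataclass
class PhysicsState:
    position: float
    velocity: float


PHYSICS_DT = 1.0 / 60.0  # assumed fixed physics timestep


def physics_step(state: PhysicsState, dt: float) -> PhysicsState:
    # Placeholder gameplay simulation: a point falling under gravity.
    velocity = state.velocity - 9.81 * dt
    return PhysicsState(state.position + velocity * dt, velocity)


def visual_position(last_physics: PhysicsState, time_since_physics: float) -> float:
    # Start from the last physical state and simulate forward to "now".
    # Linear extrapolation is a stand-in for the real visual simulation.
    return last_physics.position + last_physics.velocity * time_since_physics


def run_frames(frame_times, state: PhysicsState):
    """Fixed-step physics, variable-rate rendering; returns drawn positions."""
    accumulator = 0.0
    drawn = []
    for frame_dt in frame_times:
        accumulator += frame_dt
        # Run as many whole physics steps as the elapsed time allows.
        while accumulator >= PHYSICS_DT:
            state = physics_step(state, PHYSICS_DT)
            accumulator -= PHYSICS_DT
        # The leftover time is how far the graphics clock is ahead of physics.
        drawn.append(visual_position(state, accumulator))
    return state, drawn
```

Calling `run_frames([0.010, 0.021, 0.013], PhysicsState(0.0, 1.0))` draws one position per graphics frame no matter how the frame times line up with the 60 Hz physics steps.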
The main motivation here is a reduction in latency. When interpolating between the last two completed frames, like the BEPUphysics implementation does, you're always seeing results from the past. Extrapolation gives you a guess at what things should look like at any time with zero latency. That guess can be wrong, but when the simulation is running at 60 to 120 Hz, there isn't a lot of time for severe errors to accumulate. The divergence induced by network corrections tends to be much larger, and it doesn't cause any significant problems.
The devil is in the details, of course. Things don't stay simple once you think about extrapolating every single bone in every single character's animation, and also needing to extrapolate IK targets for feet on the ground, and the fact that every object could potentially require custom simulation logic for optimal extrapolation, and so on. But it works pretty well!