11 January 2018

Softbodies in Vulkan

I've been writing a 3D game engine in Rust.

A lot of what makes NMG entertaining is its physics. To that end, I will be bootstrapping a physics engine from scratch for the game in order to control the effect I'm going for as well as ensure performance.

The original prototype used rigidbody physics and was heavily dependent on constrained joints and spring-damper systems. This required a lot of tuning and didn't scale very well. It also made advanced effects such as mesh deformation difficult to achieve.

For the new engine, I've decided to build a softbody physics engine centered around Verlet integration. If you're unfamiliar with the concept, I highly recommend this (very readable) paper, which offers a good introduction to the subject despite being 15 years old.

Verlet integration is a well-known method and can be used either for simulating rigidbody dynamics or modeling a large number of particles with defined constraints. Given that NMG utilizes a low-poly visual style, mapping mesh vertices to particles one-to-one is not out of the question.
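
To make the particle/constraint idea concrete, here is a minimal sketch of position Verlet with a distance constraint. It is not the engine's actual code: the `Particle` and `DistanceConstraint` types, the function names, and the choice to move both endpoints equally are all assumptions made for illustration.

```rust
/// A particle stores its current and previous position; velocity is implicit.
#[derive(Clone, Copy, Debug)]
struct Particle {
    pos: [f32; 3],
    prev: [f32; 3],
}

/// Hypothetical constraint: keep particles `a` and `b` at `rest_length` apart.
struct DistanceConstraint {
    a: usize,
    b: usize,
    rest_length: f32,
}

/// Position Verlet: x_next = 2x - x_prev + a * dt^2.
fn integrate(particles: &mut [Particle], accel: [f32; 3], dt: f32) {
    for p in particles.iter_mut() {
        for i in 0..3 {
            let next = 2.0 * p.pos[i] - p.prev[i] + accel[i] * dt * dt;
            p.prev[i] = p.pos[i];
            p.pos[i] = next;
        }
    }
}

/// Relaxation: repeatedly move both endpoints half of the length error
/// along the line connecting them.
fn satisfy(particles: &mut [Particle], constraints: &[DistanceConstraint], iterations: usize) {
    for _ in 0..iterations {
        for c in constraints {
            let (pa, pb) = (particles[c.a].pos, particles[c.b].pos);
            let delta = [pb[0] - pa[0], pb[1] - pa[1], pb[2] - pa[2]];
            let len = (delta[0] * delta[0] + delta[1] * delta[1] + delta[2] * delta[2]).sqrt();
            if len == 0.0 {
                continue; // coincident particles: direction is undefined, skip
            }
            let correction = 0.5 * (len - c.rest_length) / len;
            for i in 0..3 {
                particles[c.a].pos[i] += delta[i] * correction;
                particles[c.b].pos[i] -= delta[i] * correction;
            }
        }
    }
}
```

A simulation step would call `integrate` once and then `satisfy` with a handful of iterations; more iterations make the constraints stiffer at the cost of more work.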

All in all, such a system is exciting for NMG for two reasons:
1. The mechs in NMG require a robust and dynamic joint system, and the particle/constraint approach makes modeling complex constraints straightforward.
2. NMG involves a lot of things satisfyingly hitting other things, and a general softbody system enables me to make essentially everything deformable (or jiggly!). This gives the game a unique aesthetic.

So how does such a system make its way into the engine?

First off, we need a place to crunch the numbers. I structured the engine using an Entity-Component-System (ECS) architecture, coupling the component data and the system behavior. This data-oriented approach was inspired by Niklas Gray and should give our softbodies the performance they need.

I added a softbody component manager, which can be queried externally and simulates the system once every fixed update. At a high level, the manager contains a number of instances; each instance contains a number of particles. A simulation step iterates through every instance in the system and updates the positions of its particles using the Verlet integrator and any constraints specified by the developer. These calculations are highly parallelizable, and my preliminary array-of-structures (AoS) approach is probably slower than it should be. Later they may get moved to a Vulkan compute shader.
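
The shape of such a manager might look roughly like the sketch below. Everything here is hypothetical (`SoftbodyManager`, `Instance`, the handle-as-index scheme, the accumulator-based fixed update), and constraints are left out to keep it short; it only illustrates the "manager owns instances, instances own particles, step runs at a fixed rate" structure described above.

```rust
#[derive(Clone, Copy, Debug)]
struct Particle {
    pos: [f32; 3],
    prev: [f32; 3],
}

/// One softbody: an array of particle structs (the AoS layout).
struct Instance {
    particles: Vec<Particle>,
}

/// Hypothetical component manager: owns the data and runs the simulation.
struct SoftbodyManager {
    instances: Vec<Instance>,
    gravity: [f32; 3],
    fixed_dt: f32,
    accumulator: f32,
}

impl SoftbodyManager {
    fn new(fixed_dt: f32, gravity: [f32; 3]) -> Self {
        SoftbodyManager { instances: Vec::new(), gravity, fixed_dt, accumulator: 0.0 }
    }

    /// Creates an instance at rest and returns a handle (here, just an index).
    fn add_instance(&mut self, positions: &[[f32; 3]]) -> usize {
        let particles = positions.iter().map(|&p| Particle { pos: p, prev: p }).collect();
        self.instances.push(Instance { particles });
        self.instances.len() - 1
    }

    /// External query, e.g. for the renderer.
    fn positions(&self, handle: usize) -> Vec<[f32; 3]> {
        self.instances[handle].particles.iter().map(|p| p.pos).collect()
    }

    /// Consumes frame time in fixed-size steps; returns how many steps ran.
    fn update(&mut self, frame_dt: f32) -> u32 {
        self.accumulator += frame_dt;
        let mut steps = 0;
        while self.accumulator >= self.fixed_dt {
            self.step();
            self.accumulator -= self.fixed_dt;
            steps += 1;
        }
        steps
    }

    /// One fixed update: Verlet-integrate every particle of every instance.
    fn step(&mut self) {
        let g = self.gravity;
        let dt2 = self.fixed_dt * self.fixed_dt;
        for instance in &mut self.instances {
            for p in &mut instance.particles {
                for i in 0..3 {
                    let next = 2.0 * p.pos[i] - p.prev[i] + g[i] * dt2;
                    p.prev[i] = p.pos[i];
                    p.pos[i] = next;
                }
            }
        }
    }
}
```

Since each particle's integration is independent of the others, the inner loops are the part that could later be moved to separate per-component arrays or a compute shader.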

Finally, we need a way of seeing our particles in action! The Vulkan renderer I wrote can draw arbitrary instances of models loaded at initialization time. On startup, the renderer loads the models specified and packs their data into a couple of static buffers on the GPU. Later, per-frame data (e.g. the MVP matrices) is passed to the vertex shader using a pre-allocated dynamic uniform buffer. Command buffers are re-recorded every frame, since the renderer doesn't necessarily know how many instances you want to draw.

However, for softbody physics, we need to be able to deform the mesh at runtime! Modifying the vertex buffer is a bad idea: not only is it slow, but it would affect other instances of the same model, which we don't want. We might take a page from skinned mesh renderers and use vertex blending, but I decided to go with a simpler approach more in line with the goals of the game.

GPUs specify a minimum supported offset alignment for uniform buffers (in Vulkan, the minUniformBufferOffsetAlignment device limit). On my lame integrated graphics processor, this alignment is 16 bytes. On nicer cards, you see numbers as high as 256 bytes.
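
As a small sketch of what this alignment means in practice, the helpers below round a per-instance block up to the device's alignment and compute the byte offset of each instance in a dynamic uniform buffer. The function names are made up for illustration; the alignment value would come from the device limits at runtime. Vulkan guarantees the limit is a power of two, which is what the bit trick relies on.

```rust
/// Rounds `size` up to the next multiple of `alignment`.
/// `alignment` must be a power of two.
fn align_up(size: u64, alignment: u64) -> u64 {
    debug_assert!(alignment.is_power_of_two());
    (size + alignment - 1) & !(alignment - 1)
}

/// Byte offset of each instance's block in the dynamic uniform buffer.
/// Dynamic offsets are passed to Vulkan as 32-bit values.
fn dynamic_offsets(instance_count: u32, block_size: u64, min_alignment: u64) -> Vec<u32> {
    let stride = align_up(block_size, min_alignment);
    (0..instance_count as u64).map(|i| (i * stride) as u32).collect()
}
```

With a 64-byte block, a 16-byte alignment packs instances back to back, while a 256-byte alignment leaves a gap after every block.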

Why is this important? Currently I'm only shipping one matrix per instance in my dynamic uniform buffer. One matrix is 64 bytes (with 32-bit floats), so with a 256-byte alignment that's 192 bytes being wasted per instance! I decided to fill that space with the vertex information spat out from my physics engine and use it to offset the mesh vertices in the vertex shader. It's possible to pack 16 three-float offsets into that space (192 / 12), but the GLSL/SPIR-V compiler doesn't like this, probably because that kind of indivisible packing is murderous to the GPU. Padding our vectors with an extra float gives us 12 offsets (192 / 16) to work with.
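
The arithmetic can be checked with a CPU-side mirror of the uniform block. This is a sketch under assumptions: the `InstanceBlock` struct, its field names, and `pack_offsets` are hypothetical, and a 256-byte stride is assumed. For what it's worth, the std140 uniform layout rounds the stride of array elements up to 16 bytes, so an array of three-float vectors ends up occupying four floats per element anyway; that may be what the compiler was objecting to.

```rust
use std::mem::size_of;

const STRIDE: usize = 256; // assumed alignment of the dynamic uniform buffer
const MATRIX_BYTES: usize = size_of::<[[f32; 4]; 4]>(); // 64
const SPARE_BYTES: usize = STRIDE - MATRIX_BYTES; // 192
const TIGHT_OFFSETS: usize = SPARE_BYTES / size_of::<[f32; 3]>(); // 3 floats each
const PADDED_OFFSETS: usize = SPARE_BYTES / size_of::<[f32; 4]>(); // 4 floats each

/// Hypothetical per-instance block: one matrix plus padded vertex offsets.
#[repr(C)]
#[derive(Clone, Copy)]
struct InstanceBlock {
    mvp: [[f32; 4]; 4],
    offsets: [[f32; 4]; PADDED_OFFSETS],
}

/// Pads three-float physics offsets to four floats each. Extra offsets are
/// dropped and unused slots stay zero (zero means "leave the vertex alone").
fn pack_offsets(offsets: &[[f32; 3]]) -> [[f32; 4]; PADDED_OFFSETS] {
    let mut packed = [[0.0f32; 4]; PADDED_OFFSETS];
    for (slot, o) in packed.iter_mut().zip(offsets) {
        *slot = [o[0], o[1], o[2], 0.0];
    }
    packed
}
```

With these numbers the block fills the 256-byte stride exactly, so the padded layout costs four offsets compared to tight packing but wastes no buffer space.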

Because I pass in offsets to the vertex shader instead of positions, the shader can choose to ignore the data if it wants to. Otherwise, the offsets are applied to each vertex. 12 may seem like a low number, but in the future they will probably be used as proxies for deformation of a higher-poly model. If worst comes to worst, I can "sacrifice" some performance by aligning the dynamic uniform buffer to 512 bytes.

And that's all! If you've cloned the engine repository, you can run the softbody demo using cargo run --example softbody. In debug mode the demo renders in wireframe, with color dynamically denoting particle constraints. In release mode it renders a solid mesh. I may post a video here later for the lazy.