Voxel Renderer: Parallelism is key

I’ve implemented a stable version of the Amanatides and Woo algorithm for fast traversal of voxel grids, and added some optimisations plus support for non-square voxels.
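For reference, the core of that traversal can be sketched as follows. This is my own minimal reconstruction of the Amanatides & Woo stepping loop, with a hypothetical `Grid` struct carrying per-axis cell sizes for the non-square voxels; it assumes the ray origin already lies inside the grid:

```cpp
#include <cmath>
#include <vector>
#include <array>

// Hypothetical grid description: axis-aligned, anchored at the origin,
// with per-axis cell sizes (so voxels need not be cubes).
struct Grid {
    int nx, ny, nz;     // voxel counts per axis
    double cx, cy, cz;  // cell sizes per axis (non-square voxels)
};

// Collect the voxels pierced by a ray, in order, up to maxSteps.
std::vector<std::array<int,3>> traverse(const Grid& g,
                                        double ox, double oy, double oz,
                                        double dx, double dy, double dz,
                                        int maxSteps)
{
    double cell[3] = {g.cx, g.cy, g.cz};
    double o[3] = {ox, oy, oz}, d[3] = {dx, dy, dz};
    int n[3] = {g.nx, g.ny, g.nz};

    int idx[3], step[3];
    double tMax[3], tDelta[3];
    for (int a = 0; a < 3; ++a) {
        idx[a] = (int)std::floor(o[a] / cell[a]);
        if (d[a] > 0) {           // moving towards higher indices
            step[a] = 1;
            tMax[a] = ((idx[a] + 1) * cell[a] - o[a]) / d[a];
            tDelta[a] = cell[a] / d[a];
        } else if (d[a] < 0) {    // moving towards lower indices
            step[a] = -1;
            tMax[a] = (idx[a] * cell[a] - o[a]) / d[a];
            tDelta[a] = -cell[a] / d[a];
        } else {                  // ray is parallel to this axis
            step[a] = 0;
            tMax[a] = tDelta[a] = 1e30;
        }
    }

    std::vector<std::array<int,3>> out;
    for (int i = 0; i < maxSteps; ++i) {
        out.push_back({idx[0], idx[1], idx[2]});
        // Advance along whichever axis hits its next voxel boundary first.
        int a = (tMax[0] < tMax[1])
                  ? (tMax[0] < tMax[2] ? 0 : 2)
                  : (tMax[1] < tMax[2] ? 1 : 2);
        idx[a] += step[a];
        if (idx[a] < 0 || idx[a] >= n[a]) break; // left the grid
        tMax[a] += tDelta[a];
    }
    return out;
}
```

Each iteration steps across exactly one voxel boundary, so the voxels come out in precisely the order the ray pierces them.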

After reading up on OpenMP, I couldn’t stand having a renderer that ran an inherently parallel algorithm on just one CPU. Every ray we shoot (rays ‘pierce’ the data grid) is independent: the only thing the threads need is read access to the data structure that holds the grid information. Note that I’ve only scratched the surface of parallel programming – lots of other optimisations are possible, more efficient data structures, and so on. I’m sure my colleague Ares Lagae can advise me on that, since he’s a real C++ mastermind.
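Because each ray only reads the grid, the pixel loop parallelises almost for free. A minimal sketch of what that looks like with OpenMP (the `render` function and the constant placeholder shading are mine, standing in for the real ray-casting code; compile with `-fopenmp`):

```cpp
#include <vector>

// Each iteration of the outer loop writes a distinct row of pixels,
// so the rows can be handed out to threads with no locking at all:
// the grid is shared read-only, the image writes never overlap.
std::vector<float> render(int width, int height)
{
    std::vector<float> image(width * height);

    #pragma omp parallel for schedule(dynamic)
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            image[y * width + x] = 0.5f; // castRay(x, y) in the real renderer

    return image;
}
```

`schedule(dynamic)` keeps threads busy even when some rays terminate early (e.g. on the first non-empty voxel) and others traverse the whole grid.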

In the following tests I rendered images on a “first found” basis: we render the first non-empty voxel the ray encounters. The scene is a field of 256x256x256 randomly colored, slightly rectangular voxels – that’s a bit more than 16 million data points. It’s pretty impressive how a software-mode ray caster burns through all of that at a decent speed, even at a high resolution (2000×2000). If you only work with dense objects, at a more modest resolution (800×600) the frame rates can be considered real time – I can’t wait for my GPU implementation!

Now that I have Amanatides/Woo (and thus a list of voxels the ray pierces), the need for ray marching is gone. Other interesting problems arise, however:

  • Our Amanatides/Woo algorithm should be incremental. We don’t want to build a huge list of all the voxels a ray pierces, only to find out that we needed just the first dozen of them.
  • Empty space (sparseness) in the data set still has the biggest impact on performance. This is to be expected, since we don’t have a hierarchical data structure yet: every empty region is made up of thousands of individual empty voxels.
  • The Amanatides/Woo algorithm also gives us the intersection points: we have to use these, together with trilinear interpolation, to compute a correctly weighted value at each sample point. There’s lots of manual tweaking here, depending on the data set.
  • Which brings me to my next point: I should get to work on a drag/rotate UI.

The next step is to write an importer that reads obj files and samples them into a voxel field. Let’s get rid of these stupid cubes!

Oh, and since I love publishing bloopers, this is what happens when you mess up your inner loop variables with OpenMP:
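For the record, the pitfall behind that blooper looks roughly like this (my own reconstruction, not the actual renderer code):

```cpp
#include <vector>

// The bug: with the inner loop counter declared OUTSIDE the parallel
// region, OpenMP treats it as shared, so every thread scribbles over
// the same `x` and pixels get written to the wrong places.
std::vector<int> renderBuggy(int width, int height)
{
    std::vector<int> image(width * height, -1);
    int x; // shared between threads -> data race under OpenMP
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
        for (x = 0; x < width; ++x)
            image[y * width + x] = x;
    return image;
}

// The fix: declare the counter inside the loop so each thread gets its
// own copy (adding private(x) to the pragma works too).
std::vector<int> renderFixed(int width, int height)
{
    std::vector<int> image(width * height, -1);
    #pragma omp parallel for
    for (int y = 0; y < height; ++y)
        for (int x = 0; x < width; ++x)
            image[y * width + x] = x;
    return image;
}
```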