One effect artists asked me for a lot for demos was “fluids”. Fluid dynamics make for great visuals – they look complicated, with minute details, yet they make sense to the human eye on a larger scale thanks to their physical basis. Unfortunately they are rather difficult to do. But if it were easy it wouldn’t be fun, right? Fluid dynamics and rendering have become an area of research I’ve kept coming back to over the years, and my journey started at the end of 2006. I immediately discounted simple fakes and decided to try and do something pretty “real” for our demos in 2007.
The first problem is that fluids take some serious computation power to calculate. To do it “properly”, what you have to do is simulate the flow of velocities and densities through space using a set of equations – the “Navier-Stokes” equations. I’m no mathematician, but fortunately these equations don’t look half as nasty in code as they do on paper. The rough point of them is that the velocities/densities at a certain point interact with those at other points nearby in space in the right way.
The basic approach follows one of two paths, depending on how you want to model space – you can use particles (SPH), or a grid. Grids are bounded in the area they cover, and their memory requirements depend on their size, but the equations are simpler and tend to be faster to solve – with grids, you automatically know what points are nearby. The downside is that densities/velocities tend to dissipate and you lose details. Grids suit rendering gas-like fluids well, as the volume is typically visualised using a ray march – which suits transparent media. Particles can suit liquids better, and are unbounded – they can go wherever they want – and you don’t lose details in the same way as with grids. However, you need a lot of particles, and the equations are a bit uglier and slower to solve because you have to work out which particles are near each other – it actually has a lot of similarities with flocking behaviours. Rendering particle fluids usually involves some sort of implicit surface formed from the particles and visualised using a raytracer or polygonised with marching cubes.
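To make the neighbour-lookup difference concrete, here is a small Python sketch. It is only an illustration: the function names are made up for this example, and the particle search is the naive brute-force version rather than anything a real SPH solver would ship with.

```python
def grid_neighbours(x, y, width, height):
    """On a grid the neighbours are just index offsets, no searching needed."""
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return [(nx, ny) for nx, ny in candidates
            if 0 <= nx < width and 0 <= ny < height]


def particle_neighbours(particles, index, radius):
    """With particles you have to search: this naive version tests every
    other particle against the radius, which is O(n) per particle."""
    px, py = particles[index]
    radius_sq = radius * radius
    return [i for i, (qx, qy) in enumerate(particles)
            if i != index and (qx - px) ** 2 + (qy - py) ** 2 <= radius_sq]
```

In practice the brute-force search is usually replaced by some spatial structure (a hash grid, for instance), which is part of why the particle route is more work.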
Modern offline fluid solvers tend to support one or both methods of fluid calculation. Fluids are big business – movies, adverts, all sorts of media use them. Offline renderers support hundreds of thousands of particles or huge grids, and can take hours to compute a few seconds of animation. So it’s going to be hard to compete in realtime.
I started off with grid solvers. They’re easier to understand and there are good examples and tutorials/papers out there that explain how to do it – and they aren’t packed full of magic numbers. The Navier-Stokes equations break down into the following steps:
Update velocity grid:
– take the previous frame’s velocity grid
– perform a diffusion step – a bit like a blur
– perform a projection step – this makes it mass-conserving, and it’s what gives the effect that swirly quality. In reality that means a linear solver with 10-20 steps, meaning looping over the whole grid 10-20 times. It’s the slow bit.
– perform an advect step, which is like doing “position = position + velocity” in reverse – to pull the velocities from the previous frame’s grid into the new grid.
Update density/colour grid:
– perform a diffusion step
– advect using the velocity grid.
You’ll soon realise you can cut out the diffusion steps – you can lose them without hurting the final result.
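The steps above can be sketched roughly as follows, in 2D and with the diffusion steps already dropped. This is a minimal Python illustration, not the demo code: it loosely follows the shape of Jos Stam's published solver, works in grid units (cell size 1), treats the border cells as fixed, and all function names are hypothetical.

```python
def make_grid(width, height, value=0.0):
    return [[value] * width for _ in range(height)]


def bilinear(grid, x, y):
    """Sample grid[y][x] at a fractional position, clamped to the grid."""
    height, width = len(grid), len(grid[0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    x0 = min(int(x), width - 2)
    y0 = min(int(y), height - 2)
    fx, fy = x - x0, y - y0
    top = grid[y0][x0] * (1.0 - fx) + grid[y0][x0 + 1] * fx
    bottom = grid[y0 + 1][x0] * (1.0 - fx) + grid[y0 + 1][x0 + 1] * fx
    return top * (1.0 - fy) + bottom * fy


def advect(field, u, v, dt):
    """'position = position + velocity' in reverse: each cell looks backwards
    along the velocity and pulls its new value from the old grid."""
    height, width = len(field), len(field[0])
    out = make_grid(width, height)
    for y in range(height):
        for x in range(width):
            out[y][x] = bilinear(field, x - dt * u[y][x], y - dt * v[y][x])
    return out


def project(u, v, iterations=20):
    """Approximately remove divergence from (u, v) in place, using a fixed
    number of Gauss-Seidel relaxation passes over the whole grid."""
    height, width = len(u), len(u[0])
    div = make_grid(width, height)
    p = make_grid(width, height)
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            div[y][x] = -0.5 * (u[y][x + 1] - u[y][x - 1]
                                + v[y + 1][x] - v[y - 1][x])
    for _ in range(iterations):  # the slow bit
        for y in range(1, height - 1):
            for x in range(1, width - 1):
                p[y][x] = (div[y][x] + p[y][x - 1] + p[y][x + 1]
                           + p[y - 1][x] + p[y + 1][x]) / 4.0
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            u[y][x] -= 0.5 * (p[y][x + 1] - p[y][x - 1])
            v[y][x] -= 0.5 * (p[y + 1][x] - p[y - 1][x])


def step(density, u, v, dt):
    """One frame: project and advect the velocities, then advect density."""
    project(u, v)
    new_u = advect(u, u, v, dt)
    new_v = advect(v, u, v, dt)
    new_density = advect(density, new_u, new_v, dt)
    return new_density, new_u, new_v
```

With a uniform velocity of one cell per step to the right, `step` simply shifts a density blob one cell along, which makes a handy sanity check before trying anything swirlier.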
The first attempt I made was to do a 2D grid solver. That’s pretty easy – there’s a good example released by Jos Stam (probably the “father” of realtime fluid dynamics) some years ago which is easy to follow, although it needs optimising. The nice thing about 2D grid fluids is that they map very simply to the GPU – even back in the days of shaders 2.0. The grid goes into a 2D texture, and the algorithm becomes a multi-pass process on that texture. That worked out great – a few days of work and it was usable in a demo. It made it into halfsome in 2007. The resolution was good – we could have one fluid solver running at 512×512 well, or several at 256×256 – which proved to be ample. It was quite easy to control, too – we could simply initialise the density grid with an image, the velocity grid with random blobs, and the image was pushed around by the fluid.
2D fluids are nice, but 3D is better. But 3D brings a whole new set of problems. It was quite easy to extend a 2D solver to 3D on both CPU and GPU. Sadly at the time and on DirectX9, GPUs could not render to volume textures – so the problem could not be extended simply by changing the texture from 2D to 3D. Instead I had to lay out the “volume” as a series of slices on a 2D texture. That was hard to get right, but apart from that it extended easily. The problem was it was rather slow. The GPU at the time (GF6800 was the card du jour) just didn’t have the performance to handle it, once the extra overhead from fixing up and sampling from the 2D slice texture was taken into account. So the next option was to go CPU – I spent quite a long time hand optimising the code in SSE2 intrinsics, and then wrote out a volume texture in the end for rendering. Unfortunately the algorithm is in parts heavily memory/cache limited – the advect stage in the equation jumps around in memory almost randomly. In fact, in some parts of the solver the GPU was far faster thanks to more efficient memory access, and in other parts the CPU won out thanks to raw calculation performance and being able to re-use previous cell values. (Note, this is back in 2006-2007, Core Duo vs a GF6800.)
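For a feel of what laying a volume out as slices on a 2D texture involves, here is a small Python sketch of the index mapping. The tile arrangement (slices placed left to right, then top to bottom, `tiles_x` per row) and all names are assumptions for illustration; the original layout isn't described in that much detail.

```python
def voxel_to_atlas(x, y, z, width, height, tiles_x):
    """Map a voxel (x, y, z) to a texel in a 2D atlas of z-slices."""
    tile_col = z % tiles_x
    tile_row = z // tiles_x
    return tile_col * width + x, tile_row * height + y


def atlas_to_voxel(px, py, width, height, tiles_x):
    """Inverse mapping: which voxel does atlas texel (px, py) belong to?"""
    tile_col, x = divmod(px, width)
    tile_row, y = divmod(py, height)
    return x, y, tile_row * tiles_x + tile_col


def sample_between_slices(atlas, x, y, z, width, height, depth, tiles_x):
    """Sampling at a fractional z needs two lookups and a manual blend,
    because the hardware can't filter across slices for you."""
    z = min(max(z, 0.0), depth - 1.0)
    z0 = int(z)
    z1 = min(z0 + 1, depth - 1)
    f = z - z0
    ax0, ay0 = voxel_to_atlas(x, y, z0, width, height, tiles_x)
    ax1, ay1 = voxel_to_atlas(x, y, z1, width, height, tiles_x)
    return atlas[ay0][ax0] * (1.0 - f) + atlas[ay1][ax1] * f
```

With these assumptions a 64x32x32 volume and `tiles_x = 8` gives a 512x128 atlas. The double lookup in `sample_between_slices` hints at where the extra sampling overhead comes from.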
Finally I had a solver. Now, how to render the results? Raymarching the “volume texture” was quite slow – rendering the slices as a series of quads worked out better. Now the fun stuff started. I realised the key to a good-looking fluid was lighting – and for a semi-transparent thing like smoke/gas, that means shadows with absorption along the shadow ray. Ray marching in the shader or on CPU was out of the question, but fortunately there was an easy fake – assume a light that shone only from the top down and do a “ray march” which added the value of the current grid cell to a rolling sum, and wrote the current sum back to the grid cell as a shadow value, then moved to the next cell immediately below it. That could be made even easier on my CPU solver by flipping the grid around so that the Y axis of the fluid was the “x axis” of the grid – and the sum could be rolled into the output stage of the copy to volume texture – so the whole shadow “ray march” was almost completely free.
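The top-down shadow fake is simple enough to sketch in a few lines of Python. The grid layout (`density[z][y][x]` with `y = 0` at the top) and the names are assumptions for this example, and the exponential falloff in `transmittance` is just one plausible way to turn the sum into a light value, not necessarily what the demo did.

```python
import math


def top_down_shadow(density):
    """For every vertical column, walk from the top cell downwards keeping a
    rolling sum of density, and store the sum so far as that cell's shadow."""
    depth, height, width = len(density), len(density[0]), len(density[0][0])
    shadow = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for z in range(depth):
        for x in range(width):
            total = 0.0
            for y in range(height):  # y = 0 is the top of the volume
                total += density[z][y][x]
                shadow[z][y][x] = total
    return shadow


def transmittance(shadow_sum, absorption=1.0):
    """Assumed falloff: light surviving after passing through shadow_sum."""
    return math.exp(-absorption * shadow_sum)
```

Each cell is visited exactly once, so the cost is a single pass over the grid, which is why it can be folded into a copy that touches every cell anyway.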
After some time experimenting I discovered that the largest grid resolution I could get away with performance-wise was 64x32x32. Unfortunately it looks pretty rough at that. I tried a few things with octrees to avoid empty space but it just didn’t work – with my small grid, the whole space got filled quickly. In the end I simply doubled the res of the density grid and interpolated the velocities – so the slow part of the equation, updating the velocities, ran on a grid with 1/8 the number of cells of the density grid – which is the one where you really notice the blockiness.
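Here is a rough Python sketch of the half-resolution velocity idea: the density grid has twice the resolution on each axis, and each density cell gets its velocity by trilinearly interpolating the coarse grid. The cell-centre alignment and the function names are assumptions made for the example.

```python
def lerp(a, b, t):
    return a + (b - a) * t


def trilinear(grid, x, y, z):
    """Sample grid[z][y][x] at a fractional position, clamped to the grid.
    Assumes at least two cells along every axis."""
    depth, height, width = len(grid), len(grid[0]), len(grid[0][0])
    x = min(max(x, 0.0), width - 1.0)
    y = min(max(y, 0.0), height - 1.0)
    z = min(max(z, 0.0), depth - 1.0)
    x0 = min(int(x), width - 2)
    y0 = min(int(y), height - 2)
    z0 = min(int(z), depth - 2)
    tx, ty, tz = x - x0, y - y0, z - z0

    def row(zz, yy):
        return lerp(grid[zz][yy][x0], grid[zz][yy][x0 + 1], tx)

    near = lerp(row(z0, y0), row(z0, y0 + 1), ty)
    far = lerp(row(z0 + 1, y0), row(z0 + 1, y0 + 1), ty)
    return lerp(near, far, tz)


def coarse_value_at_fine_cell(coarse, fx, fy, fz):
    """Look up a half-resolution grid at the centre of fine cell (fx, fy, fz).
    Fine centre fx + 0.5 lands at coarse coordinate (fx + 0.5) / 2 - 0.5."""
    return trilinear(coarse,
                     (fx + 0.5) * 0.5 - 0.5,
                     (fy + 0.5) * 0.5 - 0.5,
                     (fz + 0.5) * 0.5 - 0.5)


def cell_count(width, height, depth):
    return width * height * depth
```

The factor of eight falls straight out of the numbers: `cell_count(64, 32, 32)` is 65536 and `cell_count(128, 64, 64)` is 524288.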
It worked. It ran in realtime at framerates of at least 30fps, with a grid res of 128x64x64 for the densities and 64x32x32 for the velocities. At the time of release it was probably the fastest and most powerful CPU grid solver ever made. It was even compared to the NVIDIA 8800 demo – which was running at a significantly higher resolution, but on vastly superior hardware and without lighting. We used it in media error, although we were so rushed with the demo that it only got used in a rather boring way – as “smoke in a box”. And it made me want to do something bigger and better.