<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>direct to video</title>
	<atom:link href="http://directtovideo.wordpress.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://directtovideo.wordpress.com</link>
	<description>Just another WordPress.com weblog</description>
	<lastBuildDate>Mon, 12 Dec 2011 15:18:00 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='directtovideo.wordpress.com' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>direct to video</title>
		<link>http://directtovideo.wordpress.com</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://directtovideo.wordpress.com/osd.xml" title="direct to video" />
	<atom:link rel='hub' href='http://directtovideo.wordpress.com/?pushpress=hub'/>
		<item>
		<title>numb res.</title>
		<link>http://directtovideo.wordpress.com/2011/05/03/numb-res/</link>
		<comments>http://directtovideo.wordpress.com/2011/05/03/numb-res/#comments</comments>
		<pubDate>Tue, 03 May 2011 16:39:23 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[demoscene]]></category>
		<category><![CDATA[fluid dynamics]]></category>
		<category><![CDATA[particles]]></category>
		<category><![CDATA[realtime rendering]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=364</guid>
		<description><![CDATA[Numb Res by CNCD &#38; Fairlight pouet exe version video video (anaglyph 3d) youtube vimeo Begin It was Easter. We made a new demo for The Gathering 2011. Yeah, that&#8217;s right &#8211; in Norway, not in Germany. I really wanted to do a new demo because I&#8217;ve been collecting new routines all winter, and it was [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Numb Res by CNCD &amp; Fairlight</strong></p>
<p><a href="http://directtovideo.files.wordpress.com/2011/04/screenshot1.jpg"><img class="alignnone size-medium wp-image-365" title="numb res" src="http://directtovideo.files.wordpress.com/2011/04/screenshot1.jpg?w=945&#038;h=528" alt="numb res. get it?" width="945" height="528" /></a></p>
<p><a title="pouet" href="http://www.pouet.net/prod.php?which=56900" target="_blank">pouet</a> <a title="exe" href="ftp://ftp.scene.org/pub/parties/2011/thegathering11/demo/cncdflt_numb_res.zip" target="_blank">exe version</a> <a title="video" href="http://scene.org/file.php?file=%2Fparties%2F2011%2Fthegathering11%2Fdemo%2Fcncdflt_numb_res.mp4&amp;fileinfo" target="_blank">video</a> <a title="video 3d" href="http://scene.org/file.php?file=%2Fparties%2F2011%2Fthegathering11%2Fdemo%2Fcncdflt_numb_res_anaglyph.mp4&amp;fileinfo" target="_blank">video (anaglyph 3d)</a> <a title="youtube" href="http://youtu.be/LTOC_ajkRkU?hd=1" target="_blank">youtube</a> <a title="vimeo" href="http://www.vimeo.com/23216699">vimeo</a></p>
<p><strong>Begin</strong></p>
<p>It was Easter. We made a new demo for <a title="The Gathering" href="http://www.gathering.org">The Gathering 2011</a>. Yeah, that&#8217;s right &#8211; in Norway, not in Germany. I really wanted to do a new demo because I&#8217;ve been collecting new routines all winter, and it was high time they got into the wild. So about 3 weeks before Easter Jani and I started bouncing ideas around (&#8220;something with fluids&#8221; was the sum total of that, I think). Then we went on the hunt for music. As some may know, we don&#8217;t have an active musician we work with regularly in Fairlight or CNCD anymore; we have to outsource. So I dropped a message on Facebook half-jokingly asking if anyone had a spare soundtrack. I&#8217;m not sure whether that was a good idea or not, but I spoke to Ruairi (RC55), who put me in touch with Tom Wright (aka <a title="stereo wildlife" href="http://stereowildlife.co.uk/" target="_blank">Stereo Wildlife</a>). He&#8217;s produced a beautiful new album and agreed to let us use one of the tracks &#8211; and even did a bit of remixing to make it fit the demo. So, music was ready from day 1. This is such a huge bonus when making a demo; it meant we could completely design around it, plan out what scenes we wanted straight away and know they&#8217;d fit.</p>
<p>The demo was envisaged as a &#8220;small project&#8221; &#8211; a relatively low budget production. Low budget meaning less development time, fewer resources. Weeks to make by a small team. Frameranger for example is a very &#8220;high budget&#8221; demo &#8211; lots of people, over a year in the making, tonnes of art assets and specifically made effects, and lots and lots of wasted work. This one is very different; there&#8217;s only one hand-modelled mesh in the whole thing that&#8217;s &#8220;rendered&#8221; properly (the head at the start and end), although there&#8217;s lots of meshes used for other things in the demo. We wanted an effect-led production. The first thing that happened was that Jani designed the numbers scene in Lightwave: creating meshes for each number, placing them in the scene, timing them and making a camera path for the whole lot. Meanwhile I was working on effect development. Then Jani developed the introduction part with the head more or less on his own, and modelled and tweaked the tracks for the fluid parts while I worked on fleshing out the numbers scene with elements and effects. Then we integrated and worked together to finish. With a week or so to go there was a touch of panic and it looked like we weren&#8217;t going to get there; but in the end we found ourselves more or less done 5 days before the competition. For once we had time to polish, tweak and optimise. Hope it shows..</p>
<p>As an aside: The Gathering was a great event for us not least because they also held the <a title="Scene.org Awards" href="http://awards.scene.org">Scene.org Awards</a>, which recognise the best demoscene productions from last year. We got 11 nominations and after a very rock &amp; roll ceremony full of glitz and fireworks came away with 4 awards: Ceasefire for best music, Agenda Circling Forth for best effects, technical achievement and the cherry on the cake: best demo of 2010. Ooooh. Apparently we just missed out on Public&#8217;s Choice by a few points &#8211; but hey, no accounting for taste.. <img src='http://s1.wp.com/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /> </p>
<p><strong>32. Particles. Again?</strong></p>
<p>I&#8217;ve realised over time that I&#8217;m not really a traditional &#8220;democoder&#8221;. I&#8217;m a graphics researcher who happens to prefer to show his new work off in whatever demo we make next. That probably goes some way to explaining why I do things the way I do: researching and improving on certain areas (like particle systems or fluid dynamics. but not ribbons. bitches.). Some would say that fluids or particles are effects: you &#8220;do&#8221; fluids for a scene in a demo, then you go &#8220;do&#8221; something completely different. I don&#8217;t subscribe to that. For me the achievement in a demo like this is not to implement fluids: we first used fluid dynamics in a demo 5 years ago. The challenge is to move the field on &#8211; to do something new with it that nobody else has managed to do in realtime yet, or not on the same scale. Of course there&#8217;s a point where this gets lost on the viewer, and maybe it does just become &#8220;nice particles&#8221; to the uninitiated.</p>
<p>Although the natural reaction of some people will be &#8220;oh, particles again &#8211; nothing new!&#8221; &#8211; this is probably the biggest technical leap we&#8217;ve made for a demo since Blunderbuss. Instead of concentrating on the amount of particles and simply using them to render 3D scenes with a few modifiers on top, we concentrated on the cleverness of the particles: the simulation itself and the rendering/shading. In this demo the particles are smart. They&#8217;re going somewhere.</p>
<p>Particles are just a primitive like polygons or lines &#8211; not interesting in themselves. Creating and rendering a lot of them is easy. Making them do something interesting and look good is a completely different kettle of fish.</p>
<p>So let&#8217;s talk about what we did this time to make particles do something interesting and look good..</p>
<p><strong>93. Smoothed Particle Hydrodynamics (SPH)</strong></p>
<p><a href="http://en.wikipedia.org/wiki/Smoothed-particle_hydrodynamics" target="_blank">SPH</a> is a form of fluid dynamics which uses particles, rather than a grid, to store the fluid and transport the forces and densities. This lets you represent more detail at higher resolution than a grid would given the same memory / performance limitations; it&#8217;s not limited to a certain area of space; it makes collisions more practical; and it&#8217;s a better fit for liquid effects. It&#8217;s the scheme used in professional offline packages like Realflow, used for all those nice liquid splashy effects you see in ads and movies &#8211; which take hours to simulate, let alone render. Good SPH is for me one of those holy grails of effects development (like realtime radiosity). The thing is, the quality and scope of effects you can do with it is directly dependent on the number of particles &#8211; and so is the difficulty in pulling it off. If you have a few thousand you can make some droplet effects; with 10s of thousands you can make some nice splashes; and with 100s of thousands or millions, you can start to make really amazing running water simulations.</p>
<div id="attachment_377" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_49.jpg"><img class="size-full wp-image-377" title="Early tests with SPH fluids" src="http://directtovideo.files.wordpress.com/2011/05/screenshot_49.jpg" alt="Early tests with SPH fluids" width="1024" height="576" /></a><p class="wp-caption-text">Early tests with SPH fluids</p></div>
<div id="attachment_378" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_50.jpg"><img class="size-full wp-image-378" title="Early tests with SPH fluids - with environment" src="http://directtovideo.files.wordpress.com/2011/05/screenshot_50.jpg" alt="Early tests with SPH fluids - with environment" width="1024" height="576" /></a><p class="wp-caption-text">Early tests with SPH fluids - with environment</p></div>
<p>The problem with SPH in realtime is it&#8217;s really really hard. The simple explanation of the algorithm is: &#8220;take all the particles near my particle and perform some force exchange between them&#8221;. The force exchange is easy; the &#8220;all the particles near my particle&#8221; is a bitch. On GPU it&#8217;s even more of a bitch; and in 3D it becomes an order of magnitude more of a bitch.</p>
<p>Other demos have featured SPH before; <a href="http://www.pouet.net/prod.php?which=56458" target="_blank">FR-063</a> performed it on the CPU with (what looks like) between 1,000 and 10,000 particles. The current bleeding edge for 3D SPH in realtime is around 250,000 particles, working on a top end GPU using CUDA and with simple point rendering (and no effects or anything else on top). The current bleeding edge for 3D SPH on DX9 &#8211; i.e. with no compute shader / CUDA &#8211; is erm.. I don&#8217;t actually think it&#8217;s been done.</p>
<p>The problem is simply the neighbourhood search. You end up with a variable amount of fast-moving particles affecting each particle, where it&#8217;s hard to pick an upper bound &#8211; so the spatial database is hard to construct. If you solve the neighbourhood search, you can solve SPH.</p>
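<p>To make the neighbourhood search concrete, here is a minimal CPU sketch of the textbook approach: bucket the particles into a uniform grid whose cells are the size of the smoothing radius, then only visit the 27 surrounding cells. This is not the reformulated search used in the demo (that isn&#8217;t described here). The names <code>build_grid</code>, <code>neighbours</code> and <code>density</code> are made up for illustration, and the poly6 kernel is the standard one from the SPH literature, assumed rather than taken from the demo.</p>

```python
import math
from collections import defaultdict


def cell_of(p, h):
    """Integer grid cell containing point p, for cell size h."""
    return (math.floor(p[0] / h), math.floor(p[1] / h), math.floor(p[2] / h))


def build_grid(positions, h):
    """Bucket particle indices by cell. Bucket sizes vary freely,
    which is the part that is hard to bound on a GPU."""
    grid = defaultdict(list)
    for i, p in enumerate(positions):
        grid[cell_of(p, h)].append(i)
    return grid


def neighbours(i, positions, grid, h):
    """Indices of all particles within distance h of particle i (i included)."""
    px, py, pz = positions[i]
    cx, cy, cz = cell_of(positions[i], h)
    found = []
    for dx in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dz in (-1, 0, 1):
                for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                    qx, qy, qz = positions[j]
                    r2 = (px - qx) ** 2 + (py - qy) ** 2 + (pz - qz) ** 2
                    if r2 <= h * h:
                        found.append(j)
    return found


def density(i, positions, grid, h, mass=1.0):
    """SPH density at particle i: sum of mass * W_poly6(r, h) over neighbours."""
    k = 315.0 / (64.0 * math.pi * h ** 9)
    px, py, pz = positions[i]
    total = 0.0
    for j in neighbours(i, positions, grid, h):
        qx, qy, qz = positions[j]
        r2 = (px - qx) ** 2 + (py - qy) ** 2 + (pz - qz) ** 2
        total += mass * k * (h * h - r2) ** 3
    return total
```

<p>The force exchange is just more sums of this shape over the same neighbour list; everything expensive lives in <code>neighbours</code>.</p>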
<p>The demo features up to 500,000 particles running under 3D SPH in realtime on the GPU, with surface tension and viscosity terms; this is in combination with collisions, meshing, high end effects like MLAA and depth of field, and plenty of lighting effects. On DirectX9. It&#8217;s <em>fast</em>. Almost impossibly fast. How? We found a new approach to SPH where we can re-form the neighbourhood search term to something much easier to solve on a GPU. Meaning we can, honestly, get very close to what a program like Realflow can do over hours of simulation &#8211; but in realtime. And that, for me, is what demo coding (and realtime graphics) is all about.</p>
<span style="text-align:center; display: block;"><a href="http://directtovideo.wordpress.com/2011/05/03/numb-res/"><img src="http://img.youtube.com/vi/-fZMqjQp4c0/2.jpg" alt="" /></a></span>
<p>There are 4 scenes which directly show &#8220;fluids&#8221; in the demo; a couple more use SPH in places for one great quality it has: it makes the particles spread out really nicely rather than bunch together randomly. In each of the fluid scenes it&#8217;s basically a load of particles dropped at the top of a very long track, and left to get on with it. The camera captures only a part of the action at any time &#8211; the great battle of &#8220;design vs showing off code&#8221; resulted in something that probably doesn&#8217;t completely sell the effect, but it does make something more enjoyable to watch. And that too is what democoding is about..</p>
<p>I thought it&#8217;d be nice to show it in isolation, so I put a couple of screenshots and a video above. Aside from that one embedded video &#8211; apparently wordpress is a little bitch and won&#8217;t let me embed more than one video link into a blog post &#8211; you can also check the reverse angles <a title="SPH test 2" href="http://www.youtube.com/watch?v=TP_vMKWyR1E">here</a> and <a title="SPH test 3" href="http://www.youtube.com/watch?v=tFS90yu9Gs4">here</a>. Those and the above screenshots show an initial test shot we did with 3D SPH &#8211; we drop 250,000 particles, and let them run with SPH and collisions against a mesh (handled as a signed distance field). Look, it splashes about and shit like that. All completely in realtime. Oooooooh. If nothing else, being able to run it in realtime makes it a lot easier to tweak. You get instant results &#8211; you don&#8217;t have to wait for any simulations to calculate. In these days of youtube and the prevalence of netbooks, perhaps high end realtime graphics doesn&#8217;t have the same relevance to the audience that it did 15 years ago &#8211; but it sure matters a huge amount when you&#8217;re actually making something. The benefit to the workflow is huge.</p>
<p><strong>12. Signed Distance Fields</strong></p>
<p>I touched on this for Ceasefire, but it was this production where we finally got them working and used them in anger: the use of signed distance fields for arbitrary collisions (and attraction) with particles. We take polygon meshes, convert them into signed distance fields using distance to triangle measurements and place the results in a volume texture, giving us the means for fast collision ray tests. This is absolutely invaluable when using fluid dynamics because otherwise the particles fly off merrily into space. So we have particles flowing around a head; particles flowing down a track carried by SPH; and particles being blown by a 3d fluid effect into the form of a word. All using signed distance fields.</p>
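<p>As a rough sketch of how a particle can collide against a signed distance field: sample the distance, and if it is negative (inside the surface) push the particle out along the field gradient and reflect the inward part of its velocity. In the demo the field lives in a volume texture built from a mesh; here an analytic sphere stands in for that lookup, and <code>sphere_sdf</code>, <code>sdf_gradient</code>, <code>collide</code> and the <code>restitution</code> value are all illustrative assumptions.</p>

```python
import math


def sphere_sdf(p, centre=(0.0, 0.0, 0.0), radius=1.0):
    """Stand-in for a volume texture lookup: negative inside, positive outside."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, centre))) - radius


def sdf_gradient(sdf, p, eps=1e-4):
    """Normalised central-difference gradient, i.e. the outward surface normal."""
    g = []
    for axis in range(3):
        hi, lo = list(p), list(p)
        hi[axis] += eps
        lo[axis] -= eps
        g.append((sdf(hi) - sdf(lo)) / (2.0 * eps))
    length = math.sqrt(sum(c * c for c in g)) or 1.0
    return [c / length for c in g]


def collide(sdf, pos, vel, restitution=0.5):
    """Resolve one particle against the field; returns (new_pos, new_vel)."""
    d = sdf(pos)
    if d >= 0.0:
        return list(pos), list(vel)  # outside the surface: nothing to do
    n = sdf_gradient(sdf, pos)
    # d is negative, so subtracting d * n moves the particle out to the surface.
    new_pos = [p - d * c for p, c in zip(pos, n)]
    vn = sum(v * c for v, c in zip(vel, n))
    new_vel = list(vel)
    if vn < 0.0:  # still heading into the surface: bounce
        new_vel = [v - (1.0 + restitution) * vn * c for v, c in zip(vel, n)]
    return new_pos, new_vel
```

<p>An attractor is the same lookup with the sign of the push reversed and a small weight, which is roughly why one field can serve both purposes.</p>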
<p>We used them for a lot more besides particle effects, though. They&#8217;ve become an integral part of our rendering pipeline. That will become more apparent the next time we do something featuring a lot of solid 3D.. but they&#8217;ve opened up a lot of doors.</p>
<p>One clear example of SDF usage comes in the first &#8220;fluid&#8221; scene &#8211; falling drops collide with invisible words. This also neatly demonstrates the &#8220;art vs code&#8221; issue &#8211; we&#8217;re simulating 250,000 particles under SPH running down a long 3D track, and the camera shows a small subsection of those. The collision with the words actually uses two affectors: we used a collision node to make the particles bounce off the 3D words (using an SDF version of the mesh), which worked great &#8211; but it means you only see the top of the words. <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  So we added a second affector &#8211; a low weighted mesh attractor which pulls the particles towards points on the faces of the mesh. This helped the particles slowly run down and also pulls them in from 3d space towards the words. It also added to the surface tension effect by keeping them attracted to the words even after they fall off the end.</p>
<p><strong>65. Particle Shading</strong></p>
<p>In my original post on my particle system a year or more ago I talked about how we had support for opacity shadow maps for self shadowing on particles. Since Blunderbuss we didn&#8217;t actually use that much &#8211; we&#8217;ve mainly got away with unlit particles, using the shading and lighting from the source meshes. But I&#8217;ve been working on some new techniques and had to make use of them..</p>
<p>The major problem with opacity shadow maps is depth aliasing &#8211; you only have a limited set of depth samples (16 in my case) with which to represent the scene, and it&#8217;s not enough. They tend not to be spread evenly across the particles either. So I tried a few new methods:</p>
<p><strong>252. Volume Shading</strong></p>
<p>This method borrows heavily from slice-wise volume rendering: the particles are sorted in light space by depth, nearest to furthest, and rendered in slices to composite the image. In this case though we only care about the shadow result: the values are written into the per-particle shading buffer used in the final particle render.</p>
<p>The sorted particles are rendered into the shadow map in batches &#8211; typically we used 64 batches per particle system. Per batch we additively render the batch particles into the shadow map, then project the shadow map onto the particles in the next batch: the value read from the shadow map is considered the amount of shadow on that particle from particles closer to the light.</p>
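<p>The batching loop can be sketched on the CPU like this. It is a toy version under loud assumptions: each particle covers exactly one shadow map texel, opacity is a constant, and <code>volume_shadows</code> is a made-up name. The real thing renders sprites into render targets on the GPU.</p>

```python
def volume_shadows(particles, map_size, num_batches, opacity=0.1):
    """particles: list of (u, v, depth) with integer texel coords in light space.
    Returns the shadow amount for each particle, in the original order."""
    # Sort by distance from the light, nearest first.
    order = sorted(range(len(particles)), key=lambda i: particles[i][2])
    shadow_map = [[0.0] * map_size for _ in range(map_size)]
    shadow = [0.0] * len(particles)
    batch_len = max(1, -(-len(order) // num_batches))  # ceiling division

    for start in range(0, len(order), batch_len):
        batch = order[start:start + batch_len]
        # 1. Read: shadow cast by all earlier (closer to the light) batches.
        for i in batch:
            u, v, _ = particles[i]
            shadow[i] = min(1.0, shadow_map[v][u])
        # 2. Write: add this batch's opacity for the batches that follow.
        for i in batch:
            u, v, _ = particles[i]
            shadow_map[v][u] += opacity
    return shadow
```

<p>Note that only the position in the sorted order matters, never the depth value itself, and that particles within one batch cannot shadow each other; with a single batch nothing is shadowed at all.</p>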
<div id="attachment_375" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_42.jpg"><img class="size-full wp-image-375" title="Rendering using an opacity shadow map" src="http://directtovideo.files.wordpress.com/2011/05/screenshot_42.jpg" alt="opacity shadow map version" width="1024" height="576" /></a><p class="wp-caption-text">Rendering using an opacity shadow map</p></div>
<div id="attachment_376" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_40.jpg"><img class="size-full wp-image-376" title="Rendering using volumetric shadowing" src="http://directtovideo.files.wordpress.com/2011/05/screenshot_40.jpg" alt="Rendering using volumetric shadowing" width="1024" height="576" /></a><p class="wp-caption-text">Rendering using volumetric shadowing</p></div>
<p>The clever bit is, this method doesn&#8217;t care about the actual depth of the particle: it only cares about the position of the particle in the sorted sequence. No depth writes are required and transparency is supported without any problems. One additional benefit of the technique is that we can blur the shadow map a bit after each batch, giving a scattering effect. If one had the power to do it and could render one particle per batch, it&#8217;d give a perfect shadowing result. As it is, the batch sizes give some slice aliasing.</p>
<p>Unfortunately the slice aliasing was too much of a problem with large systems and the technique is also a bit too slow &#8211; and generates a lot of render target swaps. So I came up with something better..</p>
<p><strong>15.</strong><em><strong> &#8220;Stochastic&#8221; </strong></em><strong>Shadow Mapping</strong></p>
<p>This isn&#8217;t the same as the stochastic shadow mapping paper that was recently presented, but the name makes a certain amount of sense for the effect anyway. <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  The basic idea is something I&#8217;ve tried a few times on and off since 2009. The idea is that if your particles don&#8217;t overlap pixels in view space, you could render them as solid &#8211; using regular shadowmapping and lighting techniques. Of course this is rarely the case in a render &#8211; because particle systems rely on lots of small elements overlapping and blending to look solid and nice. However, what if you do render them as single pixels and make them not overlap, and then perform a full screen 2D operation to upscale each point and make them overlap and blend?</p>
<p>We applied that approach to shadow maps generated from particles. The particles are rendered as single points to a very large shadow map; this gives us a reasonable chance that the particles won&#8217;t overlap. It&#8217;s just like a spatial hash &#8211; with a very simple hashing function and no collision handling.. Then, when sampling, we read from the map using a large kernel and sum up the amount of filled pixels which pass the shadow map test to give a shadowing result.</p>
<div id="attachment_392" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_56.jpg"><img class="size-full wp-image-392" title="Stochastic shadowing in action, on something that is definitely not a semen cell." src="http://directtovideo.files.wordpress.com/2011/05/screenshot_56.jpg" alt="Stochastic shadowing in action, on something that is definitely not a semen cell." width="1024" height="576" /></a><p class="wp-caption-text">Stochastic shadowing in action, on something that is definitely not an artistic interpretation of a sperm cell.</p></div>
<p>But there&#8217;s a twist: in order to improve the quality, cope with hash collisions and reduce aliasing, we perform a temporal reprojection step. When writing the shadow map each frame a random sub-pixel offset is applied to each particle which varies every frame; this means we get a different set of collisions, so different particles become visible each frame. Then when sampling the shadow map we blend the result with the previous frame, so the results adjust smoothly over time. By combining these two things we get a very nice, soft, reasonably alias-free shadow solution which is also efficient to render. No sorting required. The final shadow value per particle is written into a buffer and used at particle render time.</p>
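<p>Here is a small CPU sketch of the three pieces: writing jittered single-texel points, sampling with a wide kernel, and blending with the previous frame. It is only an illustration under assumptions: coordinates are light-space values in [0, 1), &#8220;passing the shadow map test&#8221; is read as &#8220;a stored point is closer to the light than the receiver&#8221;, and the names, kernel radius, bias and blend factor are invented for the example.</p>

```python
import random


def render_points(particles, size, rng):
    """Write each particle as one texel into a sparse depth map.
    Colliding particles simply fight it out: the nearest depth wins."""
    depth_map = {}
    for x, y, z in particles:
        # Random sub-texel offset, different every frame, so a different
        # subset of colliding particles survives each time.
        u = int(x * size + rng.random() - 0.5)
        v = int(y * size + rng.random() - 0.5)
        u = min(size - 1, max(0, u))
        v = min(size - 1, max(0, v))
        if (u, v) not in depth_map or z < depth_map[(u, v)]:
            depth_map[(u, v)] = z
    return depth_map


def sample_shadow(depth_map, x, y, z, size, radius=2, bias=1e-3):
    """Fraction of kernel texels holding a point closer to the light than z."""
    cu, cv = int(x * size), int(y * size)
    hits = 0
    for dv in range(-radius, radius + 1):
        for du in range(-radius, radius + 1):
            occluder = depth_map.get((cu + du, cv + dv))
            if occluder is not None and occluder < z - bias:
                hits += 1
    return hits / float((2 * radius + 1) ** 2)


def temporal_blend(previous, current, alpha=0.1):
    """Move the stored per-particle shadow a little towards this frame's value."""
    return previous + (current - previous) * alpha
```

<p>Per frame you would call <code>render_points</code> with fresh jitter, then for every particle run <code>sample_shadow</code> and feed the result through <code>temporal_blend</code> into its slot in the shading buffer.</p>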
<p>I also experimented with the technique for the actual rendering of the particles to the main frame &#8211; rendering single points with Z test and blurring the buffer out, with some per-pixel sorting during the composite, to create softened particles but without the need for a full particle sort. Unfortunately it didn&#8217;t give us the visual fidelity we needed; we relied on the blending of particles, the variable sizes and the sprites used. Could be more applicable in a future project though.</p>
<p><strong>536. Meshing (Marching Cubes)</strong></p>
<p>I suppose it&#8217;s the obvious step, isn&#8217;t it. Democoders love metaballs. Being able to render particles as meshes using metaballs is something we&#8217;ve wanted to do for ages because it moves us towards the &#8220;liquid&#8221; look &#8211; the Realflow-style look. We&#8217;ve been here before: in Frameranger we rendered around 50,000 metaballs in realtime by generating a potential field, converting it into a signed distance field and raymarching it. Results were promising but not perfect: being able to generate an actual triangle mesh has some side benefits, like being able to post process the mesh and adjust it with tension &#8211; something we really wanted to do to get closer to that Realflow look I keep going on about.</p>
<p>Marching cubes gives two issues to solve: generating the potentials, and then triangulating them. We already worked out how to generate the potentials some time ago for Frameranger, although a bit of work was required to scale it up to 250,000 particles. The second part is more difficult: you need to generate an arbitrary amount of geometry data from that potential field with triangle and vertex counts that change every frame. Naturally, we could quite easily make an implementation which just generates the worst case: treat every cell in the volume as if it was contributing triangles, then write degenerates for the invalid ones. That actually works &#8211; but it&#8217;s prohibitive for large volumes. One cell can contribute up to 5 triangles, and with a 128^3 volume we&#8217;d be looking at 10 million triangles &#8211; which isn&#8217;t great. 256^3 volumes would effectively be impossible. What we need is a way to only process and send triangles for the cells that are active.</p>
<p>This is problematic because we can&#8217;t generate index or vertex buffers on the GPU, we can&#8217;t generate drawcalls on the GPU (so we can&#8217;t vary how many primitives are rendered on the GPU) and we can&#8217;t use the CPU &#8211; because the potential field is on the GPU and it&#8217;d be far too slow to get it back to CPU. And even if we could, the CPU probably isn&#8217;t up to the task of generating the geometry fast enough anyway. And even if it was, we&#8217;d have to send all the triangle data back to the GPU again. So we&#8217;re stuck with the GPU &#8211; and yet we don&#8217;t have a way to vary the number of cells we render triangles for.</p>
<p><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot2.jpg"><img class="alignnone size-full wp-image-371" title="metaballs " src="http://directtovideo.files.wordpress.com/2011/05/screenshot2.jpg" alt="metaballs in numb res" width="1024" height="576" /></a></p>
<p>It seems impossible. However, Gernot Ziegler came up with a nice solution a while ago: histopyramids. This is a way of performing stream compaction on the GPU: it takes a big sparse buffer, and moves all the filled elements to the start of the buffer. A bit like a sort, but much more efficient. This gives us exactly what we need: we generate the (sparse) potential grid and use histopyramid compaction to move all the filled elements to the start. Then we use an occlusion query to count the number of active cells, and use the CPU to issue just enough batches to cover the triangles for that count. The actual vertices are generated using a pixel shader and vertex texture fetch is used to read them.</p>
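<p>For anyone who hasn&#8217;t met histopyramids, this is the idea in miniature. The sketch below is a 1D CPU version with pairwise sums, assuming a power-of-two input; the GPU version works on 2D textures with 2x2 reductions (much like building mipmaps), and the function names here are just for illustration.</p>

```python
def build_pyramid(active):
    """active: list of 0/1 flags (length a power of two), one per cell.
    Returns the levels of partial sums, base level first, total at the top."""
    levels = [list(active)]
    while len(levels[-1]) > 1:
        prev = levels[-1]
        levels.append([prev[i] + prev[i + 1] for i in range(0, len(prev), 2)])
    return levels


def kth_active(levels, k):
    """Walk down from the top to find the base index of the k-th active cell
    (k is 0-based). Each output element can do this walk independently."""
    index = 0
    for level in reversed(levels[:-1]):
        index *= 2                 # step down to the left child
        if k >= level[index]:      # not enough actives in the left child:
            k -= level[index]      # skip them and take the right child
            index += 1
    return index


def compact(active):
    """Indices of all active cells, packed at the start of the output."""
    levels = build_pyramid(active)
    total = levels[-1][0]          # the count the CPU needs for its batches
    return [kth_active(levels, k) for k in range(total)]
```

<p>The top of the pyramid is the active cell count, and the walk in <code>kth_active</code> is what lets output element <em>k</em> find its source cell without any scatter writes.</p>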
<p>Result!</p>
<p><strong>4. Bokeh Glows</strong></p>
<p>I&#8217;ve had this effect on the back burner for a few years but finally got round to actually finishing it.. Bokeh is the term relating to the effect of circular or shaped highlights in a depth of field effect, shaped by the aperture and the imperfections of a camera lens. Or something. They make DOF look really nice. I&#8217;ve tried before by using a really big circular kernel for a regular DOF effect with an HDR input and leaving it at that, and it actually does work, but I wanted to see if I could get some shaped bokehs and really overblow it. So I tried something with point sprites.</p>
<div id="attachment_386" class="wp-caption alignnone" style="width: 1034px"><a href="http://directtovideo.files.wordpress.com/2011/05/screenshot_81.jpg"><img class="size-full wp-image-386" title="bokeh" src="http://directtovideo.files.wordpress.com/2011/05/screenshot_81.jpg" alt="bokeh" width="1024" height="576" /></a><p class="wp-caption-text">bokeh, innit. turned up to max, of course.</p></div>
<p>The basic idea is to work out where on screen bokehs would happen, and render point sprites at those points. I did this using the following method:</p>
<p>- Bilinear downsample the screen (in several steps), storing the 2d position (UV) of the brightest point of the 4 values of the quad that were read to a render target.</p>
<p>- Use those 2d positions to read a blurred version of the original frame. Perform some thresholding to pick out the points which pass. Generate colour values for the points.</p>
<p>- Temporally smooth positions and colours using positions from last frame, apply some attack and decay.</p>
<p>- Render a load of point sprites using vertex texture fetch to read the positions and colours, rendering the sprites to the screen. (With some additional magic to make it look good.)</p>
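<p>The first two steps above can be sketched on the CPU as follows. This is a simplification with assumptions called out: the image is a plain 2D list of brightness values with power-of-two dimensions, the threshold is applied to the carried brightness rather than to a separately blurred frame, and the function names are hypothetical.</p>

```python
def downsample_brightest(level):
    """level: 2D list of (brightness, (u, v)) entries. Halves the resolution,
    keeping for each 2x2 quad the entry with the highest brightness."""
    height, width = len(level), len(level[0])
    return [
        [
            max(
                level[y][x], level[y][x + 1],
                level[y + 1][x], level[y + 1][x + 1],
                key=lambda entry: entry[0],
            )
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def find_highlights(image, steps, threshold):
    """Returns a list of ((u, v), brightness) candidates for bokeh sprites."""
    height, width = len(image), len(image[0])
    # Start with every pixel carrying its own centre position as a UV.
    level = [
        [(image[y][x], ((x + 0.5) / width, (y + 0.5) / height))
         for x in range(width)]
        for y in range(height)
    ]
    for _ in range(steps):
        level = downsample_brightest(level)
    # Thresholding: only sufficiently bright points become sprites.
    return [(uv, b) for row in level for b, uv in row if b >= threshold]
```

<p>Each downsample step halves the number of candidate sprites, so the number of steps directly sets the upper bound on how many point sprites get drawn.</p>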
<p><strong>72. Post Process Antialiasing (MLAA)</strong></p>
<p>This is the first demo since 2009 (Frameranger, in fact) that we&#8217;ve released which actually features polygons being rendered as polygons. Happily, time has moved on, and so has our renderer. One of the major bugbears I had with the deferred renderer is lack of antialiasing &#8211; but fortunately a whole bunch of post process antialiasing techniques got invented in the last couple of years. MLAA is the technique du jour, and we use an implementation in our renderer. It&#8217;s great.</p>
<p>We do two little twists in our version to make it cool: firstly we use a lot of stencil optimisation so only the active edges get the big-ass shader applied to do the actual MLAA (or in fact get any of the process after the edge detect applied). And secondly.. there&#8217;s an ugly problem with MLAA in that it actually cocks up quite badly in a certain case. The technique relies on checking for horizontal or vertical edges. But where you have a pixel which is both a horizontal and vertical edge, it messes it up. Which breaks about 1/4 of the diagonal edges you have to deal with, so it&#8217;s pretty noticeable. Our oh so clever technique for fixing that is.. do the MLAA twice. <img src='http://s0.wp.com/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  The second time we flip the whole image in x and y, then MLAA it and flip it back. Genius huh? .. no? Well, it makes the polygonal scenes look good, and fortunately the stenciled version is so fast the extra hit isn&#8217;t really noticeable.</p>
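<p>The double pass itself is tiny when written out. In this sketch <code>aa_pass</code> is a placeholder for whatever MLAA implementation you have (it is not implemented here), and <code>flip_xy</code> and <code>antialias_twice</code> are illustrative names.</p>

```python
def flip_xy(image):
    """Mirror a 2D list of pixels in both x and y (a 180 degree rotation)."""
    return [row[::-1] for row in image[::-1]]


def antialias_twice(image, aa_pass):
    """Run aa_pass normally, then again on the flipped image, and flip back.
    aa_pass is any function taking and returning a 2D list of pixels."""
    first = aa_pass(image)
    second = aa_pass(flip_xy(first))
    return flip_xy(second)
```

<p>Any corner case that the pass only handles from one direction gets picked up from the opposite direction on the second run.</p>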
<p><strong>42. Stereoscopic 3D</strong></p>
<p>We really wanted to do something with 3D for a while, but sadly we don&#8217;t have any true 3D hardware (*cough* donations please *cough*). We decided quite early on that we were going to go for a pretty much black &amp; white look &#8211; so it would actually be feasible to use the good old red / cyan anaglyph method. 3D isn&#8217;t as easy as just turning it on, though. It takes some effort to make it work well, give a good effect and not strain your eyes. We tuned it quite carefully and the setup of the scenes really helps &#8211; the first scene is slow and quite static so it lets your eyes adjust, the camera movements are quite smooth and in a single direction so they&#8217;re easy to track, and so on and so on.</p>
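<p>For reference, the red / cyan composite is about as simple as stereo gets, which is why a near-monochrome look suits it. A minimal sketch, assuming two already-rendered eye images as 2D lists of RGB tuples in the 0..1 range; the luminance weights are the standard Rec. 601 ones, and the function names are made up for the example.</p>

```python
def luminance(rgb):
    """Rec. 601 luma of an (r, g, b) tuple."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def anaglyph(left, right):
    """Grey anaglyph: red channel from the left eye, green and blue (cyan)
    from the right eye. left and right are same-sized 2D lists of RGB tuples."""
    return [
        [(luminance(l), luminance(r), luminance(r))
         for l, r in zip(left_row, right_row)]
        for left_row, right_row in zip(left, right)
    ]
```

<p>The hard part is not this composite but choosing the eye separation and convergence per scene so the result is comfortable to watch.</p>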
<p>Do watch the demo in 3D, it&#8217;s really made for it. We&#8217;re going to make a proper HD 3D video with left &amp; right splits soon for those with real 3d setups.</p>
<p><strong>End</strong></p>
<p>I guess what&#8217;s interesting for me about this demo is that it was so much easier to make than many we&#8217;ve done. It just kind of came together; we started early enough, we got the music at the start, we  didn&#8217;t have any major problems, nobody disappeared or dropped out, everything showed up on time, we didn&#8217;t completely overstretch ourselves and come up with some ideas that couldn&#8217;t be done, and we had time at the end to go over it and tweak and polish things, and we&#8217;re really happy with how it turned out. It&#8217;s like the way it&#8217;s supposed to go but never does. It doesn&#8217;t work for everyone (not very bombastic, you see) but it seems the people who got it really got it and like it, which is what matters. Maybe we&#8217;ve actually cracked it.. or maybe next time&#8217;ll be a royal screwup.  Have to wait and see..</p>
<p>An amusing realisation hit me the other day. We&#8217;ve unintentionally managed to make a demo which is entirely full of sexual references. There&#8217;s a load of massive sperm cells; there&#8217;s what looks like a female gender symbol, made up of little sperm cells; there&#8217;s a load of sperm falling down and colliding off things; and then there&#8217;s a big river of .. well, it&#8217;s not much of a stretch in context to call that fluid &#8220;spunk&#8221;, is it? It only dawned on me after Dixan commented that it was &#8220;finally a good demo about semen&#8221; on pouet, and I started thinking about it.</p>
<p>Shit.</p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2011/05/03/numb-res/feed/</wfw:commentRss>
		<slash:comments>27</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/04/screenshot1.jpg?w=300" medium="image">
			<media:title type="html">numb res</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_49.jpg" medium="image">
			<media:title type="html">Early tests with SPH fluids</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_50.jpg" medium="image">
			<media:title type="html">Early tests with SPH fluids - with environment</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_42.jpg" medium="image">
			<media:title type="html">Rendering using an opacity shadow map</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_40.jpg" medium="image">
			<media:title type="html">Rendering using volumetric shadowing</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_56.jpg" medium="image">
			<media:title type="html">Stochastic shadowing in action, on something that is definitely not a semen cell.</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot2.jpg" medium="image">
			<media:title type="html">metaballs </media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/05/screenshot_81.jpg" medium="image">
			<media:title type="html">bokeh</media:title>
		</media:content>
	</item>
		<item>
		<title>come see me talk at gdc 2011.</title>
		<link>http://directtovideo.wordpress.com/2011/02/25/come-see-me-talk-at-gdc-2011/</link>
		<comments>http://directtovideo.wordpress.com/2011/02/25/come-see-me-talk-at-gdc-2011/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 14:57:20 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[realtime rendering]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=312</guid>
		<description><![CDATA[To all the gamedev people who were savvy enough to get their company to pay for their plane ticket: come see me at GDC! I&#8217;ll be introducing the new version of PhyreEngine &#8211; 3.0 &#8211; to the world. It has tools and everything. I&#8217;ll also be talking about the new rendering work we&#8217;ve done on [...]]]></description>
			<content:encoded><![CDATA[<p>To all the gamedev people who were savvy enough to get their company to pay for their plane ticket: <a href="http://schedule.gdconf.com/session/12455">come see me at GDC!</a></p>
<p>I&#8217;ll be introducing the new version of PhyreEngine &#8211; 3.0 &#8211; to the world. It has tools and everything. I&#8217;ll also be talking about the new rendering work we&#8217;ve done on PS3 lately. That includes:<br />
-a particle system on SPU (which took a lot of ideas from the one I&#8217;ve presented on this blog which ran on GPU. But now on SPU.)<br />
-a new take on MLAA on PS3<br />
-deferred lighting on SPU and our rendering engine in general.</p>
<p>And then I get to talk about NGP. If you&#8217;re thinking of (or are currently) developing for the device, you might be interested in knowing what you can do on it graphics-wise and how it went adding NGP support for our engine. This I shall attempt to impart.</p>
<p>Plus, I&#8217;ll be giving out a free NGP devkit to the first 10 people through the door!* So come along! Thursday March 3rd, 3:00-4:00, Room 302, South Hall.</p>
<p><a href="http://schedule.gdconf.com/session/12455"><strong>GDC 2011</strong></a></p>
<p>*lies</p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2011/02/25/come-see-me-talk-at-gdc-2011/feed/</wfw:commentRss>
		<slash:comments>13</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>
	</item>
		<item>
		<title>ceasefire (all falls down).</title>
		<link>http://directtovideo.wordpress.com/2011/02/25/ceasefire-all-falls-down/</link>
		<comments>http://directtovideo.wordpress.com/2011/02/25/ceasefire-all-falls-down/#comments</comments>
		<pubDate>Fri, 25 Feb 2011 11:13:19 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[demoscene]]></category>
		<category><![CDATA[particles]]></category>
		<category><![CDATA[realtime rendering]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=285</guid>
		<description><![CDATA[Ceasefire (All falls down..) by CNCD vs Fairlight &#8211; 2nd place, Assembly 2010 combined demo competition. Capped.tv Youtube Pouet Download executable binary (This is late. Really really late. Sorry. I&#8217;ve been busy! Honest.) It&#8217;s become traditional for us to do something for Assembly (in Helsinki, Finland, 5-8 aug 2010). This year we wanted to do [...]]]></description>
			<content:encoded><![CDATA[<p><strong>Ceasefire (All falls down..)</strong> by <strong>CNCD vs Fairlight</strong> &#8211; 2nd place, Assembly 2010 combined demo competition.<br />
<a href="http://capped.tv/cncd_fairlight-ceasefire_all_falls_down">Capped.tv</a> <a href="http://www.youtube.com/watch?v=vQ2iQQvofCE&amp;feature=related">Youtube</a> <a href="http://www.pouet.net/prod.php?which=55558">Pouet</a> <a href="ftp://ftp.scene.org/pub/parties/2010/assembly10/demo/ceasefire_all_fall_down_by_cncd_vs_fairlight.zip">Download executable binary</a></p>
<p>(This is late. Really really late. Sorry. I&#8217;ve been busy! Honest.)<br />
It&#8217;s become traditional for us to do something for <a href="http://www.assembly.org">Assembly</a> (in Helsinki, Finland, 5-8 aug 2010). This year we wanted to do a demo that continued from Agenda with the particle theme, but took things further &#8211; we felt like we barely scratched the surface of what was possible. And we actually started quite early, almost three months before. The core plan and direction was laid down and we organised the soundtrack. We wanted to try and really plan something out and make something big.</p>
<p><img src="http://directtovideo.files.wordpress.com/2011/02/ceasefire02.jpg" alt="ceasefire" /><br />
<em>100% particles</em></p>
<p>Unfortunately when man makes plans.. well, it completely didn&#8217;t work out. The soundtrack didn&#8217;t come out as we hoped, the demo plot was far too bound to the soundtrack and the visuals were far too bound to the plot &#8211; we were at the mercy of it. Every scene was required, every part needed for it to make sense. We realised the whole thing wasn&#8217;t going to happen. So we started again. We hunted around for possible tracks and in the end <a href="http://hunz.com.au/">Hunz</a> came to the rescue &#8211; he let us use his beautiful track &#8220;All Falls Down&#8221; and also remixed it for us to fit the direction on a very short timescale.</p>
<p>So about that direction &#8211; well, the original plan for the demo was this sort of time-shifted end-of-the-world meet your maker theme where a city gets destroyed by some sort of holocaust, but then a phoenix rises from its ashes. It was going to be great! Trust me. Well, happily the new soundtrack &#8211; with strong vocals leading the way &#8211; actually did support this theme, but we were able to do something more loose than we had originally planned &#8211; disaster-related scenes, but less of a central plot to be reliant on.</p>
<p>Naturally the engine had matured a bit since Agenda, and we now benefitted from overall better performance, as well as a number of new features and effects; in particular lines / hair, displacement mapping of particles and collisions with distance fields. There were also a few effects I made specifically for the demo: fire using fluid solvers, raytraced spheres and a tidal wave thing. I&#8217;ll go through some of those in turn.</p>
<p><strong>Hair</strong></p>
<p>A natural step when you&#8217;ve got a particle system is to try linking the particles together with lines so you get something like hair, and that&#8217;s how this started out. Then you&#8217;ve got two immediate issues to overcome: how to get the right particles linked together so it isn&#8217;t a jumbled mess; and how to make them move in a way that appears connected. Fortunately if you solve the latter you&#8217;re a long way to fixing the former.</p>
<p>Firstly, we assume that particles next to each other in the texture are part of the same line, up until some line length is reached. For simplicity&#8217;s sake all the lines contain the same number of particles, and that number is a power of two so a number of lines fit neatly into the particle texture. Lines are arranged solely in the X direction of the particle texture and can&#8217;t spread onto multiple rows: i.e. the maximum line length allowed is the width of the particle texture. With this arrangement you&#8217;ve got a pretty easy way of finding the particles that make up one line, of finding the next and previous particles in the line and so on. For example, in a 1024&#215;1024 particle texture and a line length of 256, I have 4 lines per row &#8211; 4096 lines in total.</p>
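<p>The indexing is simple enough to write down. A small sketch in Python using the numbers from the example above; the helper names are mine, not anything from the engine:</p>

```python
TEXTURE_WIDTH = 1024
TEXTURE_HEIGHT = 1024
LINE_LENGTH = 256  # particles per line, a power of two

LINES_PER_ROW = TEXTURE_WIDTH // LINE_LENGTH
TOTAL_LINES = LINES_PER_ROW * TEXTURE_HEIGHT

def line_of(x, y):
    # Index of the line that the particle at texel (x, y) belongs to.
    return y * LINES_PER_ROW + x // LINE_LENGTH

def neighbours(x, y):
    # Previous and next particle in the same line, or None at the ends.
    pos_in_line = x % LINE_LENGTH
    prev_texel = (x - 1, y) if pos_in_line > 0 else None
    next_texel = (x + 1, y) if LINE_LENGTH - 1 > pos_in_line else None
    return prev_texel, next_texel
```

<p>Because a line never wraps onto another row, the neighbours are always just one texel to the left and right, which is what makes the lookup cheap in a shader.</p>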
<p>Connected movement is achieved by using a spring solver. Particles attempt to maintain a certain distance from their connected neighbours in a line by pushing and pulling towards them; several iterations of that are performed per update. So it&#8217;s simply a case of looking at the next and previous particles in the line and moving the particle towards or away from its neighbours as appropriate. End points can be anchored if we want.</p>
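<p>One common way to write that kind of solver is as an iterated distance constraint. The sketch below (Python with NumPy, on the CPU, for one line) is that generic technique and an assumption on my part about the details; the real thing runs on the GPU and the exact update rule may differ.</p>

```python
import numpy as np

def relax_line(points, rest_length, iterations=8, anchor_first=True):
    # points: (n, 3) array of particle positions along one line.
    pts = points.astype(float).copy()
    for _ in range(iterations):
        for i in range(len(pts) - 1):
            delta = pts[i + 1] - pts[i]
            dist = np.linalg.norm(delta)
            if dist == 0.0:
                continue
            # Positive when the pair is stretched, negative when compressed.
            correction = delta * (dist - rest_length) / dist
            if anchor_first and i == 0:
                pts[i + 1] -= correction          # anchored end stays put
            else:
                pts[i] += 0.5 * correction
                pts[i + 1] -= 0.5 * correction
    return pts
```

<p>More iterations make the line stiffer; fewer make it stretchier, which is a handy artistic control.</p>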
<p>Ah, but why do the previous and next particles actually make sense as line neighbours in the first place? Can&#8217;t they be anywhere in space? No, because I have a special emitter that emits particles in a suitable way &#8211; i.e. as lines in the first place. This can be done using a random direction, or using normals from a mesh, or to fill a mesh, or along contours of a distance field. If they start off in a good shape, and there&#8217;s a spring solver on them to keep them in a good shape, they stay in a good shape. Easy.</p>
<p>For rendering we have a couple of options: line primitives or camera-facing quad strips. Quads have the advantage of having actual thickness, but they&#8217;re slower to render and have to be at least a minimum thickness or they get culled by the hardware. We tessellate at render time using Catmull-Rom splines so lines can be smoother &#8211; that&#8217;s just done in the vertex shader. We use opacity shadow maps just like the particles use &#8211; so the lines are self shadowed nicely.</p>
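<p>The spline evaluation is the standard uniform Catmull-Rom formula. Here is a CPU-side sketch in Python with NumPy; the tessellate helper and the choice to duplicate the end points are assumptions for illustration, since in the demo this happens per vertex in the shader.</p>

```python
import numpy as np

def catmull_rom(p0, p1, p2, p3, t):
    # Uniform Catmull-Rom segment between p1 and p2, with t in [0, 1].
    t2 = t * t
    t3 = t2 * t
    return 0.5 * (
        2.0 * p1
        + (p2 - p0) * t
        + (2.0 * p0 - 5.0 * p1 + 4.0 * p2 - p3) * t2
        + (3.0 * p1 - 3.0 * p2 + p3 - p0) * t3
    )

def tessellate(points, steps_per_segment):
    # Duplicate the end points so the curve spans the whole line.
    padded = np.vstack([points[0], points, points[-1]])
    out = []
    for i in range(len(points) - 1):
        p0, p1, p2, p3 = padded[i:i + 4]
        for s in range(steps_per_segment):
            out.append(catmull_rom(p0, p1, p2, p3, s / steps_per_segment))
    out.append(points[-1])
    return np.array(out)
```

<p>The curve passes through every original particle position, so the smoothed hair still sits exactly where the simulation put it.</p>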
<p>The shading had quite a lot of faking involved too, actually. I used a blend between a few colours; a dark tone which is used as an &#8220;occluded&#8221; colour near the root, and a lighter &#8220;unoccluded&#8221; tone; then a couple of tones to randomly pick between for each strand of hair.</p>
<p><img src="http://directtovideo.files.wordpress.com/2010/02/horse_06.jpg" alt="hair" /><br />
<em>*Unreleased material alert!* The hair effect when used on a horse, a while ago</em></p>
<p>Naturally as with all these particle things, the issue isn&#8217;t about numbers, it&#8217;s about control &#8211; and that was the trick: emitting to fill a mesh (a match stick in this case), getting all blown about and then reforming into that mesh again. It turned out the curl noise affector worked great on lines because it has spatial continuity &#8211; it made it look like hair underwater, which is exactly what we wanted.</p>
<p><strong>Fire</strong></p>
<p>I spent some time looking into how to do a good fire effect with the help of some Siggraph papers. Fire is quite hard to do properly &#8211; you have to capture the large-scale and small-scale movements. The really good way to do fire is to use a massive 3D fluid solver which is big enough to capture the small-scale details &#8211; but that&#8217;s completely prohibitive in terms of memory and performance. So there&#8217;s an approximation. The basic theory is, you use a small number of screen-aligned 2D slices each running their own separate 2D fluid solver; and you blend the input velocity and density across the slices so they all have pretty similar source data, which means they all move in a way that makes sense across the slices. Then you add some procedural fluid flow (read: curl noise) on top to add detail.</p>
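<p>If curl noise is new to you: you take a smooth scalar (or vector) potential and use its curl as the velocity, which gives a flow with no sources or sinks. A 2D sketch in Python with NumPy follows; the potential here is a made-up sum of sines standing in for a real noise function, and the divergence helper is only there to check the property numerically.</p>

```python
import numpy as np

def potential(x, y):
    # Stand-in for a noise function (assumption): any smooth scalar field works.
    return np.sin(1.3 * x + 0.7) * np.cos(0.9 * y) + 0.5 * np.sin(2.1 * x * y)

def curl_velocity(x, y, eps=1e-4):
    # 2D curl of the potential: v = (d(psi)/dy, -d(psi)/dx).
    dpsi_dx = (potential(x + eps, y) - potential(x - eps, y)) / (2.0 * eps)
    dpsi_dy = (potential(x, y + eps) - potential(x, y - eps)) / (2.0 * eps)
    return dpsi_dy, -dpsi_dx

def divergence(x, y, h=1e-3):
    # Numerical check that the field is (nearly) divergence-free.
    vx_hi, _ = curl_velocity(x + h, y)
    vx_lo, _ = curl_velocity(x - h, y)
    _, vy_hi = curl_velocity(x, y + h)
    _, vy_lo = curl_velocity(x, y - h)
    return (vx_hi - vx_lo) / (2.0 * h) + (vy_hi - vy_lo) / (2.0 * h)
```

<p>That lack of divergence is why it reads as fluid-like detail when layered on top of a coarse solver.</p>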
<p>The way I started was to follow the paper and use particles for inputs. You render them as particles extruded into quads to capture the motion, rendering both density &amp; temperature and velocity into the slices as MRTs; then you apply 2D fluid solvers to the slices, apply some procedural motions and render the slices view aligned with some shader to generate colour from temperature. Well, it turned out to be a total bitch. It appeared the paper left out a few critical details, and it didn&#8217;t work out quite the way I hoped. The biggest problem was one of scale &#8211; getting a fire that would work for a big volume of it &#8211; like the heads of some tikis &#8211; was very different to one that worked for a small one like the burning head of a match. Also we couldn&#8217;t get quite as many slices as we wanted because it was just too heavy with large, high resolution fluid simulations, even in 2D. The particles also didn&#8217;t give a clean and smooth enough result, even when extruded into quads.</p>
<p>In the end we ditched particles as inputs totally and used meshes instead. Well, GBuffers anyway. I rendered the meshes to GBuffers and blended those into the fire buffers, weighted by depth from slice and generating velocities using perlin noise and the screen space normals. This gave a much cleaner result which was more controllable and a massive amount faster. Still a total bitch to get the scales working well for different fires, though.</p>
<p>&nbsp;<br />
<em>Evolution of the fire effect</em><br />
<img src="http://directtovideo.files.wordpress.com/2011/02/tikifire2.jpg" alt="evolution of fire" /><br />
&nbsp;<br />
<img src="http://directtovideo.files.wordpress.com/2011/02/firenew02.jpg" alt="evolution of fire" /><br />
&nbsp;<br />
<img src="http://directtovideo.files.wordpress.com/2011/02/firenew03.jpg" alt="evolution of fire" /><br />
<em>See, it got better</em><br />
&nbsp;</p>
<p>And then there was the rendering. You would think it&#8217;d be easy to map a floating point temperature value into good looking colours, but it wasn&#8217;t. I also had to blend them across the slices and with other scene elements, and there just didn&#8217;t seem to be a mode that made it look good. It took an age of tweaking and I never was satisfied with it.</p>
<p>In the end we got.. something. I wasn&#8217;t totally happy with the effect but it did add something to the demo that wasn&#8217;t particles.  It looked pretty good when applied to that fucking phoenix at the end though.</p>
<p><strong>Raytraced spheres</strong></p>
<p>Problem: render a reasonably large number (let&#8217;s say 100s) of moving spheres that can overlap in screen space, and are all refractive, with a reasonable degree of accuracy. Solution? Let&#8217;s see.. they need to refract the background which is easily achieved through render to texture; that alone could be achieved with a simple rasterisation-based approach. But they also need to refract each other given that they could overlap a lot &#8211; and that overlapping makes rasterisation inappropriate, and a raytracing solution would be better. Oh, and we also need it to not eat too much frame time given that it&#8217;s a small part of a much larger scene, so that &#8211; combined with the large number of spheres &#8211; prohibits a simple brute force approach of checking the ray against each sphere per pixel and then again for the refractions.</p>
<p><img src="http://directtovideo.files.wordpress.com/2011/02/parade03.jpg" alt="Spheres, during development" /><br />
<em>Raytraced spheres turned into particles, early in development. This effect was a right pain</em></p>
<p>What I needed was a way of reducing the problem down to a smaller set of spheres per pixel which are likely to affect the ray at that pixel. One way would be to build a 3D spatial database for the spheres and use that to trace more efficiently, but that isn&#8217;t all that pixel shader friendly &#8211; or easy to update per frame. So I cut a few corners and went for a 2D approach. The idea was, at a low resolution I worked out which spheres overlapped each pixel and stored those spheres in render targets; then at a high resolution I only consider the spheres in those render targets to trace through, rather than all of them. In order to cope with refractions I had to be a bit generous on the overlap test, but it worked well. The low resolution classification step was a long shader that looped through the large number of spheres &#8211; sorted front to back and roughly pre-classified on CPU to only check those vaguely near the pixel &#8211; and gathered the first 4 that overlapped, writing them to MRTs. The high resolution tracing shader loaded the 4 spheres from the render targets and checked them for ray intersections, then traced the ray through for refractions, finally getting an exit direction to look up the back buffer. 4 spheres was usually enough overlap to get believable refractions &#8211; and hey, we were going to turn it all into particles anyway, so there was room for error.. wait, what was that about overkill?</p>
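<p>The two-pass structure can be sketched on the CPU like this (Python with NumPy). It is heavily simplified and full of assumptions: an orthographic camera looking down +z, arrays instead of render targets, made-up function names, and only the first hit is found rather than a refracted path. The margin parameter plays the role of the &#8220;generous&#8221; overlap test.</p>

```python
import numpy as np

MAX_PER_TILE = 4

def classify(spheres, width, height, tile, margin=0.0):
    # spheres: rows of (cx, cy, cz, radius), already sorted front to back.
    tiles_x, tiles_y = width // tile, height // tile
    bins = -np.ones((tiles_y, tiles_x, MAX_PER_TILE), dtype=int)
    for ty in range(tiles_y):
        for tx in range(tiles_x):
            x0, y0 = tx * tile, ty * tile
            count = 0
            for i, (cx, cy, cz, r) in enumerate(spheres):
                # Distance from the sphere centre to the tile rectangle.
                dx = max(x0 - cx, 0.0, cx - (x0 + tile))
                dy = max(y0 - cy, 0.0, cy - (y0 + tile))
                if (r + margin) ** 2 >= dx * dx + dy * dy:
                    bins[ty, tx, count] = i
                    count += 1
                    if count == MAX_PER_TILE:
                        break
    return bins

def first_hit(spheres, bins, px, py, tile):
    # Ray along +z through the pixel centre; only tests the cached spheres.
    x, y = px + 0.5, py + 0.5
    best_index, best_z = -1, np.inf
    for i in bins[py // tile, px // tile]:
        if i == -1:
            break
        cx, cy, cz, r = spheres[i]
        d2 = (x - cx) ** 2 + (y - cy) ** 2
        if r * r >= d2:
            z = cz - np.sqrt(r * r - d2)
            if best_z > z:
                best_index, best_z = i, z
    return best_index
```

<p>The per-pixel cost is now bounded by the cache size of 4 rather than by the total sphere count, which is the whole point of the low resolution pass.</p>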
<p>I&#8217;ve used this approach before to render large numbers of metaballs (1000s) too; the problem is that with a lot of balls you start to need a lot of overlapped spheres per pixel, and you simply can&#8217;t cache enough, so it breaks down. To do 1000s of metaballs you need a different approach, but that&#8217;s something for another post..</p>
<p><strong>Particle fun</strong></p>
<p>One of the main scenes in the demo involves a street of buildings which gets blown up, one building at a time, into particle explosions. That got.. pretty heavy. Each building was built of 1m particles, so we ended up pushing 10m particles per frame through the render. Ow. That was just not going to fly as regular particles where we maxed out any reasonable GPU at 2m &#8211; and blew all kinds of memory limits with more than that - so we had to do some things to cut it down.</p>
<p><img src="http://directtovideo.files.wordpress.com/2011/02/ceasefire01.jpg" alt="blow that shit up" /><br />
<em>New PC shadebob record</em></p>
<p>The first idea was &#8220;static particles&#8221;. The idea was, don&#8217;t do all the simulation and sorting the particles go through; just use the position and colour textures that were pregenerated for emission from a mesh, and pass them straight to the particle renderer. The particles could be pre-sorted in that texture for a rough camera direction so it looked alright. This obviously slices the amount of work done per frame a lot. The particles would be static though, but we could use displacement mapping effects (see later) to add some movement. We could also fake them fading in and out for lifetime cycles.</p>
<p>This trick bought us a lot of the time back; we could actually render the scene with this and get some sort of sensible framerate. But we didn&#8217;t want a static scene, we wanted to explode the buildings. So I devised a scheme of smoke and mirrors, whereby a building is static particles until it explodes, and then switches seamlessly to a proper particle system. Buut, you can&#8217;t very well keep them all as particle systems after they explode because it wastes loads of VRAM, which we&#8217;re already pushing too hard; so I wait until the explosion gets almost static and then switch them to an imposter by rendering them to a texture.</p>
<p><strong>Displacement Mapping</strong></p>
<p>Displacement mapping was used to add a per-frame offset to particle positions. This is done at render time only; well, not actually in the vertex shader, but as a pre-pass just before the render which processes the position buffer. That means it&#8217;s a temporary operation &#8211; it doesn&#8217;t have to persist to the next frame so it&#8217;s not part of the simulation, so the results don&#8217;t get stored and eat memory. So it works on static particles like on the street scene, which is ideal because we needed to add some movement there.</p>
<p>I added a bunch of operators &#8211; audio-based FFT modifiers, perlin noise movement modifiers, and things using images. We used it for some pulsing audio effects and a few other bits and pieces. Simple but oh, so effective.</p>
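<p>The key property is that the displaced positions are thrown away after each frame. A tiny sketch in Python with NumPy, where a sine wave stands in for the audio, noise and image operators (the function name and parameters are made up):</p>

```python
import numpy as np

def displaced_positions(base_positions, time, amplitude=0.1, frequency=2.0):
    # Returns a temporary, per-frame copy; base_positions is never modified.
    # The sine offset stands in for the audio / noise / image operators.
    phase = base_positions[:, 0] * frequency + time
    offset = np.zeros_like(base_positions)
    offset[:, 1] = amplitude * np.sin(phase)
    return base_positions + offset
```

<p>Since the stored buffer is untouched, the same pregenerated positions can be reused every frame with a different offset on top.</p>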
<p><strong>Depth of field</strong></p>
<p>Jani came up with this and it worked out a total treat. The idea is that we had so many particles that we could achieve a depth of field look just by randomising the positions a bit at render time (in vertex shader), where the randomness is controlled by the distance from focus. It took a fair few goes for him to explain it to me in a way that I understood, but once we got there I added it and it totally worked &#8211; it looked great. We could use it for focus pulls, &#8220;blurring out&#8221; shots and so on.</p>
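<p>In code the idea boils down to a few lines. This is a CPU sketch in Python with NumPy, assuming a simple linear blur amount based on distance from the camera; the real version lives in the vertex shader and the parameter names here are made up.</p>

```python
import numpy as np

def dof_jitter(positions, camera_pos, focus_distance, blur_scale, rng):
    # Offset each particle by a random vector whose size grows with its
    # distance from the focal distance (measured from the camera).
    dist = np.linalg.norm(positions - camera_pos, axis=1)
    blur = blur_scale * np.abs(dist - focus_distance)
    noise = rng.uniform(-1.0, 1.0, size=positions.shape)
    return positions + noise * blur[:, None]
```

<p>Particles exactly in focus don&#8217;t move at all, and with enough particles the scattered ones blend into something that looks like a blur.</p>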
<p><img src="http://directtovideo.files.wordpress.com/2011/02/ceasefire03.jpg" alt="ceasefire" /><br />
<em>Particle randomisation for depth of field</em></p>
<p><strong>Distance fields</strong></p>
<p>The subject of collisions with particles against meshes had come up before. Like that of real particle fluids &#8211; i.e. SPH &#8211; or rigid bodies or meshing, it usually gets met with &#8220;in realtime? fuck off&#8221; or &#8220;yea.. I bet in 5 years we&#8217;ll be doing that&#8221; or &#8220;I&#8217;ll get to it when I&#8217;m done adding the radiosity solver&#8221; or some other smartass coder vs artist remark. Like what we used to say about shadows in the 90s. Of course, those arguments always end up evaporating because it actually gets done in the end when someone comes up with a practical, simple, workable way of doing it. And so it is here. All the hype about distance fields made me get around to writing a proper mesh to signed distance field conversion routine for some effect or other, and I realised it would make perfect sense to use for particle collisions. With meshes.</p>
<p>It&#8217;s a pretty simple routine; get the particle position in the space of the distance field, see if it&#8217;s inside, work back to find the 0 contour and the field normal at the hit point and then do something. Like move the particle and set some bounce velocity.  So I did it and we used it with the tidal wave scenes, and it was great! Particles colliding with logos, with 3d scenes, and so on.</p>
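<p>Roughly, in Python with NumPy. Big assumption: an analytic sphere stands in for the sampled mesh distance field, and the response is a plain damped reflection; the names and the bounce factor are invented for the example.</p>

```python
import numpy as np

def sphere_sdf(p, centre=np.zeros(3), radius=1.0):
    # Analytic stand-in for a sampled mesh distance field (assumption):
    # negative inside, positive outside, zero on the surface.
    return np.linalg.norm(p - centre) - radius

def sdf_normal(sdf, p, eps=1e-4):
    # Field gradient by central differences, normalised.
    grad = np.array([
        sdf(p + np.array([eps, 0, 0])) - sdf(p - np.array([eps, 0, 0])),
        sdf(p + np.array([0, eps, 0])) - sdf(p - np.array([0, eps, 0])),
        sdf(p + np.array([0, 0, eps])) - sdf(p - np.array([0, 0, eps])),
    ])
    return grad / np.linalg.norm(grad)

def collide(sdf, position, velocity, bounce=0.5):
    d = sdf(position)
    if d >= 0.0:
        return position, velocity          # outside: nothing to do
    n = sdf_normal(sdf, position)
    position = position - d * n            # step back out to the 0 contour
    vn = np.dot(velocity, n)
    if vn >= 0.0:
        return position, velocity          # already moving outwards
    velocity = velocity - (1.0 + bounce) * vn * n   # reflect and damp
    return position, velocity
```

<p>Stepping out by the stored distance only lands exactly on the surface if the field holds true distances, which is one reason a broken conversion shows up so visibly.</p>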
<p>Well, it <em>would</em> have been great if the routine had worked. It didn&#8217;t; the mesh to distance field conversion was broken, so parts of the field were all wrong and it produced all kinds of funny results. We managed to fudge the effect enough to get through the demo but it wasn&#8217;t until months later that I realised the mistakes and made something that really worked properly. In the demo it works in a few places but it&#8217;s not quite what it should have been.. so you get a few splashes off the logo and some collisions with what basically ended up as boxes in the subway scene.</p>
<p>The good news is I fixed it since, and it&#8217;s brilliant. So many applications for it; although the real challenge is in getting an accurate signed distance field of an arbitrary complex mesh efficiently in the first place, and that was what took so long to solve. It probably deserves a whole article on its own so let&#8217;s leave it there.</p>
<p><strong>Water </strong></p>
<p>I don&#8217;t know how this came about, but someone &#8211; might have been me actually &#8211; had the idea of using an ocean water effect and making the particles follow it. That water routine is so old. I&#8217;ve had it working since about 2003 and never actually used it in a demo, although it was planned for a couple and didn&#8217;t make it. It&#8217;s the implementation of Tessendorf&#8217;s FFT-based ocean water simulation, and it gives you a nice realistic ocean water heightfield which people usually use for meshes. I remember at the time I wrote it it worked fast on something like 32&#215;32 or 64&#215;64 grids on a PC CPU (due to the inverse 2D FFT you need to do), which wasn&#8217;t all that good looking. Since then Caspar did one on the PS3 running on SPU which ran at 256&#215;256 if I remember right; fortunately PC CPUs caught up and now I can run it at a decent resolution pretty comfortably. If you want to know how the ocean routines work, google Tessendorf FFT ocean water and you&#8217;ll no doubt be presented with a load of material.</p>
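<p>For a flavour of it, here is a stripped-down sketch of the FFT approach in Python with NumPy: build a random spectrum shaped by the Phillips spectrum, animate it with the deep water dispersion relation, and inverse-FFT it into a heightfield. The parameter names and default values are mine and untuned, and it leaves out the choppiness and normal calculations.</p>

```python
import numpy as np

def ocean_heightfield(n=64, patch_size=100.0, wind=(1.0, 0.0), wind_speed=10.0,
                      amplitude=1e-3, time=0.0, seed=0):
    g = 9.81
    rng = np.random.default_rng(seed)
    # Wave vectors laid out in FFT order.
    k1d = 2.0 * np.pi * np.fft.fftfreq(n, d=patch_size / n)
    kx, ky = np.meshgrid(k1d, k1d)
    k = np.sqrt(kx * kx + ky * ky)
    k_safe = np.where(k == 0.0, 1.0, k)
    # Phillips spectrum; the k = 0 term is zeroed out.
    wind_dir = np.asarray(wind) / np.linalg.norm(wind)
    largest_wave = wind_speed ** 2 / g
    cos_wind = (kx * wind_dir[0] + ky * wind_dir[1]) / k_safe
    phillips = (amplitude * np.exp(-1.0 / (k_safe * largest_wave) ** 2)
                / k_safe ** 4 * cos_wind ** 2)
    phillips = np.where(k == 0.0, 0.0, phillips)
    # Random complex amplitudes at time zero.
    xi = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
    h0 = xi * np.sqrt(phillips / 2.0)
    # h0 evaluated at -k: reverse both axes, then shift index 0 back into place.
    h0_neg = np.roll(h0[::-1, ::-1], 1, axis=(0, 1))
    omega = np.sqrt(g * k)
    spectrum = (h0 * np.exp(1j * omega * time)
                + np.conj(h0_neg) * np.exp(-1j * omega * time))
    # The spectrum is Hermitian, so the inverse FFT is real up to rounding.
    return np.fft.ifft2(spectrum).real
```

<p>Calling it with increasing time values gives an animated, tiling patch of water; the inverse 2D FFT is the expensive part, which is why grid size was the limiting factor back then.</p>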
<p><img src="http://directtovideo.files.wordpress.com/2011/02/particlesfluids01.jpg" alt="water" /><br />
<em>Original version of the water effect</em></p>
<p><img src="http://directtovideo.files.wordpress.com/2011/02/subway02.jpg" alt="ceasefire" /><br />
<em>The water in the subway scene, later</em></p>
<p>That was the first step; but then we started messing with it. We had a subway scene where we wanted to fill it with water and make it look like a wave was crashing through it. In an ideal (fantasy) world that would be done with proper fluid dynamics; I thought it&#8217;d be better (i.e. achievable) if we faked it by taking the ocean effect and applying some magical space modifier to it to warp it into the shape of a wave. Simple.. a wave curl is a bit like some warped bell curve shifted and curled around by a twist / vortex equation. Right? Except somehow I was attempting to do this really late at night not all that long before the deadline, and I just couldn&#8217;t get it for ages and ages. GCSE maths is hard.</p>
<p><img src="http://directtovideo.files.wordpress.com/2011/02/ceasefire04.jpg" alt="ceasefire" /><br />
<em>The subway scene</em></p>
<p><strong>Post Processing</strong></p>
<p>I have to quickly mention the post processing effects &#8211; well, <em>effect</em> &#8211; that we used to make the screen all break up and look like a broken video recording. A lot of people moaned about it, some liked it. Personally I love it. It&#8217;s a combination of a load of different small things which go together to make something cool. We mix between a load of distortions using sinewaves and noise &#8211; some on scanlines, some on blocks; stretching, offsetting and flipping the screen; and then this frame-holding effect where we keep a history of a few frames and randomly hold them or jump between them for a little while. There&#8217;s something really satisfying about taking a scene you&#8217;ve spent ages lovingly crafting, and then messing it up on purpose.</p>
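<p>The frame-holding part is the easiest bit to show. A small sketch in plain Python, with the class name, probabilities and hold lengths all invented for the example; frames can be any object, such as a texture handle:</p>

```python
import random
from collections import deque

class FrameHold:
    # Keeps a short history of frames and sometimes replays an old one.
    def __init__(self, history=4, hold_chance=0.2, max_hold=6, seed=0):
        self.frames = deque(maxlen=history)
        self.hold_chance = hold_chance
        self.max_hold = max_hold
        self.rng = random.Random(seed)
        self.held = None
        self.hold_left = 0

    def process(self, frame):
        self.frames.append(frame)
        if self.hold_left > 0:
            # Still holding: keep showing the same old frame.
            self.hold_left -= 1
            return self.held
        if self.hold_chance > self.rng.random():
            # Start holding a randomly chosen frame from the history.
            self.held = self.rng.choice(list(self.frames))
            self.hold_left = self.rng.randint(1, self.max_hold)
            return self.held
        return frame
```

<p>Feed it every rendered frame and display whatever comes back; the stutter comes from the held frames, and the jumps come from picking a different history slot each time.</p>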
<p>So there it is &#8211; we tried to make plans, it didn&#8217;t work out, and we made something much quicker instead. I&#8217;m really glad I got to work with Hunz, and I&#8217;m happy with some of the routines that were put together pretty fast. Demo compos, like war, can be the source of great innovation and technical advancement &#8211; if things have to get done, they get done. Yep, demo compos are a lot like war actually. Except you can watch them with a few beers in the grandstand of a hockey arena, not on CNN.</p>
<p>I happened to do a seminar at Assembly which is <a href="http://vimeo.com/14458079">here</a> &#8211; if you want to watch 50 minutes of me discussing how we made our recent demos and at the same time being a cocky little shit. Go on, you know you want to.</p>
<p><em>Coming soon: all the new things we&#8217;ve been doing between when the content of this blog post was actually fresh and relevant, and now..</em></p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2011/02/25/ceasefire-all-falls-down/feed/</wfw:commentRss>
		<slash:comments>7</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/ceasefire02.jpg" medium="image">
			<media:title type="html">ceasefire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/02/horse_06.jpg" medium="image">
			<media:title type="html">hair</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/tikifire2.jpg" medium="image">
			<media:title type="html">evolution of fire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/firenew02.jpg" medium="image">
			<media:title type="html">evolution of fire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/firenew03.jpg" medium="image">
			<media:title type="html">evolution of fire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/parade03.jpg" medium="image">
			<media:title type="html">Spheres, during development</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/ceasefire01.jpg" medium="image">
			<media:title type="html">blow that shit up</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/ceasefire03.jpg" medium="image">
			<media:title type="html">ceasefire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/particlesfluids01.jpg" medium="image">
			<media:title type="html">water</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/subway02.jpg" medium="image">
			<media:title type="html">ceasefire</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2011/02/ceasefire04.jpg" medium="image">
			<media:title type="html">ceasefire</media:title>
		</media:content>
	</item>
		<item>
		<title>scene.org awards thinks we&#8217;re alright.</title>
		<link>http://directtovideo.wordpress.com/2011/02/17/scene-org-awards-thinks-were-alright/</link>
		<comments>http://directtovideo.wordpress.com/2011/02/17/scene-org-awards-thinks-were-alright/#comments</comments>
		<pubDate>Thu, 17 Feb 2011 12:50:00 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=304</guid>
		<description><![CDATA[Look, sorry &#8211; I&#8217;ve not updated this blog in months and I feel a bit bad about it. It&#8217;s not because I&#8217;ve given up on the whole thing and become a hermit living on herring in the uninhabited part of the Scottish isles &#8211; oh no. It&#8217;s the opposite &#8211; I&#8217;ve been so busy actually [...]]]></description>
			<content:encoded><![CDATA[<p>Look, sorry &#8211; I&#8217;ve not updated this blog in months and I feel a bit bad about it. It&#8217;s not because I&#8217;ve given up on the whole thing and become a hermit living on herring in the uninhabited part of the Scottish isles &#8211; oh no. It&#8217;s the opposite &#8211; I&#8217;ve been so busy actually making shit that I haven&#8217;t had time to finish that big writeup of Ceasefire that&#8217;s been sitting in the outbox for months, let alone talk about all the new things we&#8217;ve got going on that you&#8217;ll see soon enough. They make the old stuff look a bit silly.</p>
<p>But anyway. Just wanted to mention: we got some nominations for the <a href="http://awards.scene.org/">Scene.org Awards 2010</a>. <a href="http://www.pouet.net/prod.php?which=55558">Ceasefire</a> got nods for best demo and best soundtrack (for Hunz &#8211; well deserved!), and <a href="http://www.pouet.net/prod.php?which=54603">Agenda Circling Forth</a> scored a record 7 nominations: best demo, graphics, effects, direction, original concept, technical achievement and public&#8217;s choice. Our C64 superstars also got a nomination for best demo on an oldschool platform for <a href="http://www.pouet.net/prod.php?which=56000">We Are New</a>.</p>
<p>You can actually vote in the public choice category (for us) right now <a href="http://awards.scene.org/voting.php">if you feel like it</a>. </p>
<p><a href="http://awards.scene.org/voting.php">Go on.</a> </p>
<p><a href="http://awards.scene.org/voting.php">Do it.</a></p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2011/02/17/scene-org-awards-thinks-were-alright/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>
	</item>
		<item>
		<title>agenda vs siggraph.</title>
		<link>http://directtovideo.wordpress.com/2010/08/10/agenda-vs-siggraph/</link>
		<comments>http://directtovideo.wordpress.com/2010/08/10/agenda-vs-siggraph/#comments</comments>
		<pubDate>Tue, 10 Aug 2010 08:32:08 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=281</guid>
		<description><![CDATA[Just wanted to mention &#8211; Agenda Circling Forth got shown in the Live Realtime Demos show at Siggraph 2010! This show was strictly speaking for interactive pieces only, and ours was a realtime non-interactive demo. So I did some work to make it interactive via the use of a webcam &#8211; basically you can wave [...]]]></description>
			<content:encoded><![CDATA[<p>Just wanted to mention &#8211; Agenda Circling Forth got shown in the <a href="http://www.siggraph.org/s2010/for_attendees/live_real_time_demos">Live Realtime Demos</a> show at Siggraph 2010! </p>
<p>This show was strictly speaking for interactive pieces only, and ours was a realtime non-interactive demo. So I did some work to make it interactive via the use of a webcam &#8211; basically you can wave your hands around in front of the camera and it uses optical flow to calculate a motion buffer, which is used to manipulate the velocities of the particles, the fluid effects and the intensity of some of the post processing. It worked out quite nicely actually. I might have to make an exe available at some point..</p>
<p>I wasn&#8217;t able to attend in person so my friend Steve McAuley kindly presented it at the show. Unfortunately there were some technical difficulties at the show and when they attached a webcam to the PC it made it really unstable, so they ended up showing the non-interactive version. Still, the work on interactivity isn&#8217;t wasted &#8211; I&#8217;m sure it&#8217;ll be useful for other projects in future. </p>
<p>Many thanks to the organisers and to Steve for doing the business over there. Hopefully we&#8217;ll be able to do this again in future.</p>
<p>Next up: <a href="http://pouet.net/prod.php?which=55558">this</a>..</p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2010/08/10/agenda-vs-siggraph/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>
	</item>
		<item>
		<title>agenda circling forth.</title>
		<link>http://directtovideo.wordpress.com/2010/04/19/agenda-circling-forth/</link>
		<comments>http://directtovideo.wordpress.com/2010/04/19/agenda-circling-forth/#comments</comments>
		<pubDate>Mon, 19 Apr 2010 11:35:02 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[demoscene]]></category>
		<category><![CDATA[realtime rendering]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=223</guid>
		<description><![CDATA[Anyone remember what happened this Easter weekend? I&#8217;m a bit hazy about it myself &#8211; because I was in Germany living it up at the last ever Breakpoint. A party / festival that&#8217;s been running on the Easter weekend for the past 8 years in Bingen am Rhein, it was [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda005.jpg" alt="agenda circling forth" /></p>
<p><a href="http://www.youtube.com/watch?v=ON4N0yGz4n8">youtube</a> <a href="http://capped.tv/fairlight_cncd-agenda_circling_forth">capped.tv</a> <a href="http://cappedtv.dkev.org/vhq/fairlight_cncd-agenda_circling_forth.mp4">video</a> <a href="ftp://ftp.untergrund.net/breakpoint/2010/pc_demo/agenda.zip">download</a> <a href="http://www.pouet.net/prod.php?which=54603">pouet</a></p>
<p>Anyone remember what happened this Easter weekend? I&#8217;m a bit hazy about it myself &#8211; because I was in Germany living it up at the last ever <a href="http://breakpoint.untergrund.net/">Breakpoint</a>. A party / festival that&#8217;s been running on the Easter weekend for the past 8 years in Bingen am Rhein, it was a unique gathering of 1000+ creative and technical types who go there to show their work or see what the others have done, but stay for the massive, massive party. I can&#8217;t overstate how much I&#8217;ve enjoyed it over the past few years. The atmosphere is unique &#8211; it&#8217;s amazing to see the enthusiasm everyone there had just for being part of it. </p>
<p>So much has happened to me personally at that event on previous occasions: hotel food fights; TV appearances I can&#8217;t remotely remember; big-screen and stage appearances I&#8217;ll always remember; appalling hungover football performances; all-nighters working in the freezing cold, where you had to get a coffee just to hold it; close brushes with hospitalisation; and I never even got thrown out once! (I was &#8220;helped out&#8221; once though. Cheers Docd!) My record of actually getting something finished for it is patchy; we have won there before, but most years we either end up working all weekend in the hotel to just make the deadline, or giving up immediately and going on a massive bender instead &#8211; because the party was too much fun to miss. But seeing as it was the last one ever, we thought we owed it to get something done and support the event. And to actually get it done beforehand. And then go on that massive bender I was talking about. </p>
<p>A couple of months ago we started thinking about what we could do in the time available. Frameranger was our last big piece &#8211; a blockbuster, the kind of piece you go into a competition with being pretty confident you&#8217;re going to win &#8211; but that took far too much time, effort and pain to want to repeat in a hurry. Besides, one problem with Breakpoint is Farbrausch &#8211; they co-organise Breakpoint, and it was very likely they were going to show up with another massive production like <a href="http://www.pouet.net/prod.php?which=30244">Debris</a>, which effectively defined the scene in 2007 and whose influence still ripples across it. Without 6 months or a year to work on something we probably wouldn&#8217;t be able to compete with them if they decided to push something big out. We also realised that we didn&#8217;t actually <em>want</em> to compete: it would be better to make something that we liked, that was enjoyable to produce, that had a bit more depth to it, showed more maturity, and that perhaps had some more relevance outside of the scene than the big multi-part spectacular we could make that might win. So, there and then we gave up on winning. That was actually quite liberating, and we got on with making it. </p>
<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda001.jpg" alt="agenda circling forth" /></p>
<p>First we tried to source a soundtrack. Our first idea was to chance it: contact a major label we admired and ask them if we could use a certain track we liked. Unfortunately they never even bothered to reply, so we started talking to some musician friends of ours to try and sort something out. They showed us a big load of material that they had been working on, we picked something out, and they did a great job fixing it up. Result &#8211; we were hooked up with the ideal soundtrack from day one and could design the whole piece around it.<br />
<em>Note: since release, some copyright issues have come to light with the soundtrack due to the large volume of material sampled from one source: &#8220;Queen of the Universe&#8221; by Socrates. This didn&#8217;t get credited in the original release of the production because of a misunderstanding between the musicians and the artists; we&#8217;re working to resolve it ASAP.</em></p>
<p>We decided early on that the approach we should take with the visuals was to try and do everything with particles using a developed version of the particle system from Blunderbuss. It meant we had the ability to make something much more organic and flowing &#8211; every part of the screen could be moving all the time, and it would give it an abstract element that&#8217;s hard to achieve with polygons alone. However, this time we wanted to combine particles with actual graphics and make a full piece with multiple scenes from it. </p>
<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda012.jpg" alt="agenda circling forth" /></p>
<p>Usually we develop the tech and the visuals in tandem &#8211; and sometimes the tools too &#8211; which often causes a lot of problems. The good thing for this project was that for once we actually had most of the core tech done before we started on the visuals. What we had from Blunderbuss gave us the basics &#8211; lots of particles, sorted and rendered with shadows, and a few effects like (fake) fluids on top. But it was essentially quite simple &#8211; one (point) emitter at a time, affectors affect all the particles, and the affectors and emitters were quite basic in themselves. It worked for that piece but it wasn&#8217;t enough for something larger and more complicated. What it lacked was control &#8211; we needed to combine multiple emitters, control particle counts per emitter, and add some more advanced features to cope with using meshes and scenes with animation, targeting into other meshes and scenes, more advanced affectors and much more control and quality in the rendering. And to not completely destroy the frame rate on the way. </p>
<p><strong>Particles and meshes</strong></p>
<p>Emitting from meshes was something I&#8217;d already worked out &#8211; I have a routine that generates a big texture containing lots of positions spawned at random places on the polygonal surface of the model. This was simply done using random barycentric coordinates on each triangular face of the mesh. The number of random positions per triangle is weighted by the area of the triangle, so the points are evenly spread across the mesh and you get a pretty solid object. On top of this I added support for sampling the model&#8217;s material colours, vertex colours and textures and storing those resulting colour values in a texture too. </p>
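<p>As a rough illustration of area-weighted surface sampling, here is a minimal CPU-side sketch in Python. It is not the routine from the demo: the function names are made up for this example, triangles are picked at random in proportion to their area rather than given a fixed per-triangle count, and the square-root warp on the barycentric coordinates is one common way (assumed here) to keep points uniform inside each triangle.</p>

```python
import math
import random

def triangle_area(a, b, c):
    """Area of a 3D triangle from the cross product of two edges."""
    ux, uy, uz = (b[i] - a[i] for i in range(3))
    vx, vy, vz = (c[i] - a[i] for i in range(3))
    cx, cy, cz = uy * vz - uz * vy, uz * vx - ux * vz, ux * vy - uy * vx
    return 0.5 * math.sqrt(cx * cx + cy * cy + cz * cz)

def sample_surface(triangles, count, rng):
    """Return `count` points spread over the mesh, weighted by triangle area."""
    areas = [triangle_area(*tri) for tri in triangles]
    points = []
    # Bigger triangles are chosen more often, so density stays even over the surface.
    for a, b, c in rng.choices(triangles, weights=areas, k=count):
        # Square-root warp keeps the barycentric samples uniform over the triangle.
        r1, r2 = math.sqrt(rng.random()), rng.random()
        u, v, w = 1.0 - r1, r1 * (1.0 - r2), r1 * r2
        points.append(tuple(u * a[i] + v * b[i] + w * c[i] for i in range(3)))
    return points
```

<p>Calling <code>sample_surface(triangles, 100000, random.Random(1))</code> gives a list of positions that could then be written into a position texture; colours could be sampled with the same barycentric weights.</p>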
<p>Adding support for skinned, animated meshes as emitters and targets was more difficult. The first task was to be able to apply skin/bone transforms in pixel shaders to calculate the animated position of the particle representation at the current point in time. That was quite straightforward: I added the skin weights and bone indices as additional textures, then the bone matrices themselves in a dynamic 1d texture which could be looked up by the bone indices. The skinning code was basically a copy-paste from the vertex shader version I already had, reading bone matrices from texture lookups instead of from vertex shader constants. Spawning a particle at this final resulting skinned position worked fine &#8211; the spawned particles appeared where the mesh was posed at the current frame. However, it gave an ugly motionblur-esque trail to the movement because the particles didn&#8217;t move with the animation &#8211; they spawned where the animation posed them and then the affectors (fluid, forces etc) took over. </p>
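<p>The skinning step can be sketched on the CPU as standard linear blend skinning, with a plain list standing in for the 1d bone matrix texture. This is an assumption about the general technique, not the shader from the demo; the 3x4 matrix layout and the names <code>transform_point</code>, <code>skin_position</code> and <code>bone_table</code> are hypothetical.</p>

```python
def transform_point(m, p):
    """Apply a 3x4 affine matrix (three rows of [r0, r1, r2, t]) to a 3D point."""
    return tuple(m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3] for r in range(3))

def skin_position(rest_pos, bone_indices, bone_weights, bone_table):
    """Linear blend skinning: weighted sum of the point moved by each bone.

    `bone_table` plays the role of the bone matrix texture, indexed by bone index.
    """
    out = [0.0, 0.0, 0.0]
    for index, weight in zip(bone_indices, bone_weights):
        moved = transform_point(bone_table[index], rest_pos)
        for axis in range(3):
            out[axis] += weight * moved[axis]
    return tuple(out)
```

<p>In a shader the loop would run over a fixed number of influences per point, with the indices and weights fetched from their own textures.</p>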
<p>What I needed was to combine the spawning with an affector step which also moved the particles with the animated pose, by calculating the current position and moving them to it. I had set it up so that each particle had a unique and corresponding entry in the mesh position texture so it was easy to follow which particle was tied to which point on the mesh, so it was also straightforward to add &#8211; but it didn&#8217;t do the job either, because I&#8217;d swapped following the affectors for following the animation. I needed a blend of those two functions which allowed a particle to follow the animation &#8220;a certain amount&#8221;, and follow the affectors too. That&#8217;s where it all starts to get fuzzy &#8211; there&#8217;s no &#8220;right result&#8221; for that, so I just had to work on what looked good. </p>
<p>I computed the current and previous frame&#8217;s skinned mesh position for the particle using the skinning routine. Then I had a weighting function based on the particle&#8217;s position compared to the mesh position, which also factored in a function based on the life of the particle. I computed that weight for the particle&#8217;s previous position and against the previous mesh position, and for the current positions, and picked the greater of the two weights; then I used that weight to blend between the particle&#8217;s current position and the current mesh position. The result was that the particle was able to become more affected by the mesh as it got closer to it, until it eventually &#8220;stuck&#8221; to the mesh and followed it through the animation &#8211; until it got towards the end of its life, when it stopped being affected as much and gradually fell away from the anim (under control of other affectors). </p>
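<p>The blend described above can be sketched like this. The post does not give the actual weighting function, so <code>stick_weight</code> below is purely hypothetical (a linear distance falloff multiplied by a fade-out late in life), as are the <code>radius</code> and <code>release_at</code> parameters; only the max-of-two-weights and the final blend follow the description.</p>

```python
import math

def stick_weight(particle_pos, mesh_pos, life, radius=1.0, release_at=0.8):
    """Hypothetical weight: 1 on the mesh, 0 beyond `radius`, fading out late in life."""
    proximity = max(0.0, 1.0 - math.dist(particle_pos, mesh_pos) / radius)
    # `life` runs from 0 (just spawned) to 1 (dead); start letting go after `release_at`.
    hold = 1.0 if life <= release_at else max(0.0, (1.0 - life) / (1.0 - release_at))
    return proximity * hold

def follow_mesh(p_prev, p_curr, m_prev, m_curr, life):
    """Blend the particle towards the animated mesh point by the greater of two weights."""
    w = max(stick_weight(p_prev, m_prev, life), stick_weight(p_curr, m_curr, life))
    return tuple(p + (m - p) * w for p, m in zip(p_curr, m_curr))
```

<p>Taking the greater of the previous-frame and current-frame weights is what lets a particle that was sitting on the mesh last frame keep up when the mesh moves a long way in a single frame.</p>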
<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda013.jpg" alt="agenda circling forth" /></p>
<p>Being able to emit from one mesh wasn&#8217;t good enough &#8211; we wanted to throw a whole Lightwave scene at it with multiple objects and animation and even modifiers like Fertilizer (making meshes appear to grow in over time) and animated visibility, and it would figure it out. This just meant that we had to combine all the meshes into one big soup when generating the particles &#8211; as long as we kept the object ID per particle in the texture, it was possible to match up the particle to information about the source mesh &#8211; transforms, visibility &#8211; stored in 1d texture lookup tables, and perform the necessary processing in the pixel shader. </p>
<p><strong>Multiple emitters and materials</strong></p>
<p>Part of the requirement for building a more complex scene was that we had more than one mesh / scene emitting particles at once. The naive solution was to just add more particle systems, but &#8211; apart from the performance implications &#8211; this had a fundamental flaw: the particles weren&#8217;t in the same render targets anymore so they didn&#8217;t sort against each other. In some places this was not a problem &#8211; it was an easy way to solve e.g. the background / sky particles &#8211; but it wasn&#8217;t sufficient for a complex scene. We needed to be able to emit from multiple emitters and share the same render targets. So I assigned each emitter a scissor region dynamically which controlled which part of the spawn information targets they could write to, and in turn which particles were spawned from each emitter. I also preserved the ID of the emitter which spawned a particle in the particle GBuffers. </p>
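<p>One way to picture the scissor regions: treat the spawn texture as rows of particles and hand each emitter a contiguous band of rows sized by its share of the particle budget. The sketch below assumes exactly that layout (row bands, relative shares, string emitter IDs), none of which is stated in the post.</p>

```python
def allocate_scissor_rows(texture_height, emitter_shares):
    """Split the rows of a spawn texture into one contiguous band per emitter.

    `emitter_shares` maps a (hypothetical) emitter ID to its relative share of particles.
    Returns {emitter_id: (first_row, end_row)} with end_row exclusive.
    """
    total = sum(emitter_shares.values())
    regions, start, accumulated = {}, 0, 0.0
    for emitter_id, share in emitter_shares.items():
        accumulated += share
        # Rounding the running total keeps the bands gap-free and covers every row.
        end = round(texture_height * accumulated / total)
        regions[emitter_id] = (start, end)
        start = end
    return regions

def emitter_for_row(regions, row):
    """Find which emitter owns a given row, i.e. which emitter spawns that particle."""
    for emitter_id, (first, end) in regions.items():
        if first <= row < end:
            return emitter_id
    return None
```

<p>Because the bands are recomputed from the shares, changing the shares over time would shift particles between emitters without touching the shared render targets.</p>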
<p>Those particle GBuffers are starting to look more and more like a deferred renderer. The emitter index can be used to access all sorts of things that can now be controlled per emitter rather than globally &#8211; e.g. material colour, diffuse, ambient, particle size and so on &#8211; just like we had in the deferred renderer using a material index or object index. We can also use the emitter index to look up a table of transforms &#8211; so we can choose to move the particles with the emitter they came from. </p>
<p><strong>Screen-space emitters</strong></p>
<p>The particle system started to look increasingly like a deferred renderer, but what about that deferred renderer we had for rendering polygons? It&#8217;s not producing anything that makes it to the final render in the demo, but it still has a role. The GBuffers produced when rendering solid objects are now used to emit particles from. The depth buffer can be used to reconstruct the world position of a pixel on screen; the colour buffer provides the base material / texture colour; and we can even run the usual lighting passes and post-fx and obtain a buffer of lit, shaded pixels as would be rendered to the final screen. Most of the background scenes were rendered with lighting and SSAO before the particles took on the colour; they were then lit additionally by the particle lighting and shadowing. </p>
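<p>Reconstructing a position from the depth buffer might look like the sketch below. It makes several assumptions the post does not spell out: depth is stored as linear distance along the view axis, the projection is a symmetric perspective looking down -Z, and pixel (0, 0) is the top-left corner. It returns a view-space position; getting to world space would need one more multiply by the inverse view matrix, left out here.</p>

```python
import math

def view_position_from_depth(px, py, width, height, linear_depth, fov_y_deg):
    """Rebuild the view-space position of a pixel from its linear depth."""
    # Pixel centre to normalised device coordinates in [-1, 1], y pointing up.
    ndc_x = (px + 0.5) / width * 2.0 - 1.0
    ndc_y = 1.0 - (py + 0.5) / height * 2.0
    tan_half_fov = math.tan(math.radians(fov_y_deg) * 0.5)
    aspect = width / height
    # Scale the view ray by depth to land on the surface seen through this pixel.
    x = ndc_x * tan_half_fov * aspect * linear_depth
    y = ndc_y * tan_half_fov * linear_depth
    return (x, y, -linear_depth)
```

<p>Each reconstructed position, paired with the lit colour at the same pixel, is enough to spawn one particle.</p>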
<p>This gives you something that&#8217;s 3d-in-2d &#8211; 2.5d? &#8211; so it isn&#8217;t as solid as emitting in 3d from a mesh, but it has the advantage of looking much more solid (from the initial perspective it was emitted from) with far fewer particles than when emitting from a mesh. </p>
<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda016.jpg" alt="agenda circling forth" /></p>
<p>A side issue was how to make affectors override other affectors, given that they only produce a velocity buffer. That was quite simple &#8211; I sorted them using a controllable key and changed the blend mode. Whereas most affectors (velocity, fluid) blend additively, certain affectors (mesh / image attractors) render towards the end of the list and blend linearly &#8211; so they override the motions of the additive affectors but blend by their affecting weight. We also wanted to be able to tie emitters only to certain affectors, and this was handled again with a 1d lookup table on emitter index. </p>
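<p>The sort-then-blend idea can be shown for a single particle as follows. The dictionary fields (<code>key</code>, <code>mode</code>, <code>velocity</code>, <code>weight</code>) are invented for this sketch, and applying the weight to additive affectors as well is an assumption; on the GPU the two branches would correspond to additive and alpha-style blend states on the velocity buffer.</p>

```python
def combine_affectors(affectors):
    """Accumulate one particle's velocity from a list of affector outputs.

    Each affector is a dict with hypothetical fields: `key` (sort order),
    `mode` ("add" or "blend"), `velocity` (3-tuple) and `weight` in [0, 1].
    """
    velocity = (0.0, 0.0, 0.0)
    for aff in sorted(affectors, key=lambda a: a["key"]):
        v, w = aff["velocity"], aff["weight"]
        if aff["mode"] == "add":
            # Additive blend: forces and fluid simply pile up.
            velocity = tuple(c + w * n for c, n in zip(velocity, v))
        else:
            # Linear blend: replaces what is already there in proportion to the weight.
            velocity = tuple(c * (1.0 - w) + n * w for c, n in zip(velocity, v))
    return velocity
```

<p>With a weight of 1 a late-sorted attractor completely overrides the earlier forces; with a weight of 0.5 it meets them halfway.</p>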
<p>Another thing we wanted was to be able to keyframe the effect of an affector on a particle, and other properties of a particle, over the particle&#8217;s life time. Given everything was done on the GPU and needed to be efficient, arbitrary keyframe data was never going to be practical &#8211; so we used a simple approximation that still gave us control: we attached 1D bezier curves for many properties. They can be evaluated very efficiently in a shader and they still give a decent amount of control. </p>
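<p>A 1D Bezier curve is cheap because it is just a polynomial in the normalised age of the particle. The sketch below assumes a cubic curve with four scalar control values (the post does not say which degree was used), and the function names are made up for the example.</p>

```python
def bezier_1d(p0, p1, p2, p3, t):
    """Evaluate a 1D cubic Bezier curve at t in [0, 1] using the Bernstein form."""
    s = 1.0 - t
    return s * s * s * p0 + 3.0 * s * s * t * p1 + 3.0 * s * t * t * p2 + t * t * t * p3

def affector_strength(age, lifetime, controls):
    """Strength of an affector for a particle, keyed on its normalised age."""
    t = min(max(age / lifetime, 0.0), 1.0)
    return bezier_1d(*controls, t)
```

<p>Control values of <code>(0, 0, 1, 1)</code> give a smooth ease-in from no effect at birth to full effect at death; four floats per property fit in a single shader constant or texel.</p>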
<p>We added a few new affectors to the engine; not least was proper fluid dynamics support. I&#8217;ve had GPU versions of 2D and 3D Navier-Stokes grid solvers for quite some time, and I tied in the 2D one to drive particles. 2D solvers are very effective in the right use-case, even in a 3D scene: we put them on a plane in 3D space, projected the particles onto that plane and sampled the velocity, and applied a falloff towards the edges of the plane and on the z distance from the plane. This did the job neatly for destroying the moon &#8211; the one place we needed &#8220;proper fluid dynamics&#8221; to make it look good. </p>
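<p>Driving 3D particles from a 2D solver on a plane could be sketched as below. The solver itself is replaced by a stand-in callback <code>velocity_at(u, v)</code>, and the linear falloff shapes, the square plane and the <code>reach</code> parameter are all assumptions made for the example rather than details from the post.</p>

```python
def _dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))

def sample_plane_velocity(pos, origin, axis_u, axis_v, normal, size, reach, velocity_at):
    """Velocity a 2D solver on a plane contributes to a particle at `pos`.

    `axis_u`, `axis_v`, `normal` are unit vectors; the grid covers [0, size] on
    both axes. `velocity_at(u, v)` is a stand-in for sampling the solver grid.
    """
    rel = tuple(p - o for p, o in zip(pos, origin))
    # Project onto the plane: (u, v) are grid coordinates, z is distance off the plane.
    u, v, z = _dot(rel, axis_u), _dot(rel, axis_v), _dot(rel, normal)
    if not (0.0 <= u <= size and 0.0 <= v <= size):
        return (0.0, 0.0, 0.0)
    # Fade towards the edges of the plane and with distance from it.
    edge = min(u, size - u, v, size - v) / (0.5 * size)
    depth = max(0.0, 1.0 - abs(z) / reach)
    vu, vv = velocity_at(u, v)
    scale = min(edge, 1.0) * depth
    # Lift the 2D velocity back into 3D along the plane axes.
    return tuple(scale * (vu * a + vv * b) for a, b in zip(axis_u, axis_v))
```

<p>The falloff means particles drifting off the plane are handed back to the other affectors gradually instead of popping at the boundary.</p>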
<p><strong>Making the demo</strong></p>
<p>As usual, we were running late. With 4 weeks to go we had a few sketches of ideas and the basis for some scenes, but nothing was too far along. The problem was that we had the music and the technical plan already sorted, but we didn&#8217;t have a solid visual concept and story locked down &#8211; just a few bits of graphics and some test scenes. The first scene that was laid down was the flowers, which was the first thing we really tried to do with it and existed in some form for several weeks. That helped us nail down our look and flow, and got the tech more or less finalised too. Much of the time was spent with Jani trying out ideas in the tool, and me fixing all the many things that didn&#8217;t work and responding to feature requests. With 2 weeks to go we hit the point where we couldn&#8217;t go any further without a fully fleshed out concept and we rapidly went through a few revisions &#8211; some were quite tight and story-driven and others much more vague. Finally we hit upon something that would flow and we got busy making the extra content. </p>
<p>In the last week things finally started to move. We went into full-on crunch, and worked late into the night every evening. My day started at 6am and ended around 1-2am; living on coffee, squeezing work on the piece into spare minutes at lunchtime or on the train, and then hammering on at it late into the night. It turns out you can adjust quite quickly to less than 5 hours sleep a night as anyone with kids would probably know, but still &#8211; apologies to any of my friends or colleagues who thought I looked like a big stupid zombie that week. Of course I was totally 100% mentally switched on at all times. Honestly. </p>
<p>The final stages of production on any large project are always a bit painful. The start of a project is a slow, steady high &#8211; you have so many possibilities and the deadline where you actually have to deliver something seems so far away, and it&#8217;s all about ideas and the fun of trying to implement them. But then there&#8217;s a horrible point where you realise you actually have to get something made pretty soon, and you have nothing. From then on it&#8217;s a constant stream of ups and downs &#8211; something goes right or somebody does something great and you feel like it&#8217;s all going to work out, and you&#8217;re on a high; then something doesn&#8217;t go to plan and you feel like it&#8217;s just never going to happen and you&#8217;re right back down again. This gets more and more extreme until the end, where you feel this huge wave of relief / joy / anger / exhaustion (depending on how it ended up). The whole process is a bit like romancing a really high-maintenance nymphomaniac. Who&#8217;s on uppers. And never stops calling you up during the day. And keeps making you buy her shoes. Highly enjoyable in some ways but you sure suffer for it in others. And somehow you always forget enough of the downsides to want to repeat the experience a few months later. </p>
<p>As the week progressed the demo moved forward a lot. The scene with the running people, then the intro and the final part were built in quick succession. The part with the creature in the forest was done last &#8211; that was the one part I actually built myself, although Jani did a lot of work on it afterwards to make it into something decent. One advantage with building everything from particles is that it hides a multitude of sins &#8211; you don&#8217;t need the same level of polish and work on the models and textures as if you were showing them as plain 3D because it becomes so vague when it gets turned to particles anyway. By the end of the week we still seemed a long way from finishing, but somehow on thursday night &#8211; after a final almost-all-nighter &#8211; it all came together.</p>
<p>It was a very strange situation &#8211; we were done <em>early</em>. Let me illustrate the significance of that: that the last production I submitted to Breakpoint was entered during the competition while the 8th entry was currently <em>playing</em> on the big screen. So on past experience I had expected a certain amount of pressure this time. I think I&#8217;ve come to enjoy that a little bit over the years &#8211; even rely on it &#8211; so having almost nothing to do during the event was a little disconcerting. This was the first time I can remember us having time to sit and polish something in years. Naturally I spent most of the time living it up instead, but Jani used the extra time wisely and kept polishing it. New versions appeared over the weekend, each one getting better and better, until Sunday when we packed the final version. Naturally, in keeping with tradition, we did our best to ignore the deadline; shortly after it had passed, and after a kind announcement by KB over the loudspeaker to remind us to enter, I wandered up to the organisers area with the finished piece. </p>
<p><img src="http://directtovideo.files.wordpress.com/2010/04/agenda007.jpg" alt="agenda circling forth" /></p>
<p>For the whole time we worked on the project, I didn&#8217;t expect to win. Although I was really happy with how it turned out I thought it could go down like a lead balloon in the competition &#8211; it&#8217;s slow, abstract and it doesn&#8217;t have the crowd-pleasing bling you need to win big. It was a risk for us to do something like this, and I hadn&#8217;t thought of it as a competition piece at all. Yet somehow, in a strong field and up against our old German friends from Farbrausch, it won out. I still don&#8217;t really understand it, and I received the prize in a bit of a state of shock. I was half thinking &#8220;there&#8217;s been a mistake; run for it before they change their minds&#8221;. </p>
<p>In this scene of ours it&#8217;s easy to get obsessed with winning. But if there&#8217;s something I&#8217;ve learnt from this it&#8217;s that it&#8217;s so much better just to make something you want to make and are happy with; competition, winning, that&#8217;s something that happens sometimes and won&#8217;t happen other times, but either way it doesn&#8217;t really matter. It&#8217;s the icing on the cake if it happens, but the cake still tastes pretty good un-iced. I never liked marzipan anyway. </p>
<p>In the end the piece is a series of scenes that were connected by the common motifs that they followed through it, and a loose storyline driven in part by the music. Yep, you got it &#8211; I don&#8217;t want to explain the content of the piece too much. It&#8217;s much better if you draw your own conclusions. Some underlying themes are explored, and you&#8217;re welcome to look for them or just take it at face value. </p>
<p>People have said to me, &#8220;what&#8217;s next, aren&#8217;t you bored of particles?&#8221; &#8211; and I say &#8220;no&#8221;. Particles/points are a primitive, just like polygons. We haven&#8217;t got bored of polygons yet after what &#8211; 30 years+? There&#8217;s much more that can be done, and we&#8217;ve only scratched the surface of what&#8217;s possible. New ideas and new hardware make more things happen all the time. We&#8217;ll be back. I&#8217;m not sure in what form, but watch this space.</p>
<p><em>There&#8217;s been quite a lot of coverage of Breakpoint that we&#8217;ve benefited from &#8211; including a feature on the German TV channel <a href="http://www.youtube.com/watch?v=zM7tMJP2rpU">3Sat</a>.</em> </p>
<p><em>By the way, you really do need a good GPU to watch this in realtime. The &#8220;detail settings&#8221; we had for Blunderbuss to make it watchable on low-end hardware didn&#8217;t work here because the scenes were much more complex and we had to tune it for one detail setting &#8211; the highest. So you&#8217;ll need something top-of-the-line (think GeForce 280) to be able to enjoy it. Don&#8217;t worry about running it at less than highest resolution though &#8211; you won&#8217;t gain much from 1080p over 720p, for example. CPU and memory don&#8217;t make much difference.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2010/04/19/agenda-circling-forth/feed/</wfw:commentRss>
		<slash:comments>17</slash:comments>
<enclosure url="http://cappedtv.dkev.org/vhq/fairlight_cncd-agenda_circling_forth.mp4" length="202041662" type="video/mp4" />
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda005.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda001.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda012.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda013.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda016.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/04/agenda007.jpg" medium="image">
			<media:title type="html">agenda circling forth</media:title>
		</media:content>
	</item>
		<item>
		<title>vimeo.</title>
		<link>http://directtovideo.wordpress.com/2010/03/05/vimeo/</link>
		<comments>http://directtovideo.wordpress.com/2010/03/05/vimeo/#comments</comments>
		<pubDate>Fri, 05 Mar 2010 17:41:38 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=216</guid>
		<description><![CDATA[I feel slightly dirty about this, but I&#8217;ve joined vimeo. Did I suddenly get overcome by an urge to get a bit more web2.0, you ask? No, it&#8217;s because I had to in order to submit something to the Victoria&#38;Albert Museum&#8217;s Decode:Recode gallery. The idea being: you take their ident piece as source code, mess [...]]]></description>
			<content:encoded><![CDATA[<p>I feel slightly dirty about this, but I&#8217;ve joined vimeo. Did I suddenly get overcome by an urge to get a bit more web2.0, you ask? No, it&#8217;s because I had to in order to submit something to the Victoria&amp;Albert Museum&#8217;s <a href="http://www.vam.ac.uk/microsites/decode/recodegallery/index?page=1">Decode:Recode gallery</a>. The idea being: you take their ident piece as source code, mess with it and send it back to them.</p>
<p>We were looking at the other entries a couple of weeks ago, and several things struck us: 1. most of them didn&#8217;t change the code at all except for messing with the colours; 2. the colours were a bit.. Dutch to start with; 3. the nicer ones were generally done with After Effects messing with the video, not running realtime. It felt like it was time the demoscene struck back and offered an education on the fine art of realtime graphics. So, that&#8217;s what we did. Paul offered up a 512 byte intro done on the Spectrum, and I made something for high end PCs. Very, very high end PCs. </p>
<p><img src="http://directtovideo.files.wordpress.com/2010/03/decode002.jpg" alt="decode" /></p>
<p>The piece uses a particle system generated off a voxelised version of their logo (the only thing I preserved from the original), rendered as 250,000 cubes and affected by various swarming modifiers, then lit with a crafty ambient occlusion / radiosity tracer that raycasts through a voxel set version of the particle system. Insert the usual lighting + shadows, cameras, finely selected colours (not Dutch!) and post processing effects here, and we&#8217;re done. A few hours&#8217; work well spent.</p>
<p><img src="http://directtovideo.files.wordpress.com/2010/03/decode001.jpg" alt="decode" /></p>
<p>Anyway. Check the video on Vimeo <a href="http://vimeo.com/9888065">here</a>, and check my Vimeo page <a href="http://vimeo.com/user3247986">here</a>. Now I&#8217;ve got it, I might as well use it. I&#8217;m afraid the encoding on the vimeo video is terrible, but there&#8217;s a high quality version available for download &#8211; I might upload a proper one at some point if anyone wants it.</p>
<p><em>Update:</em> My Decode:Recode made it into the Metro newspaper in the UK! I&#8217;ve uploaded a scan <a href="http://directtovideo.files.wordpress.com/2010/04/metro_decode.pdf">here</a>. </p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2010/03/05/vimeo/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/03/decode002.jpg" medium="image">
			<media:title type="html">decode</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/03/decode001.jpg" medium="image">
			<media:title type="html">decode</media:title>
		</media:content>
	</item>
		<item>
		<title>ambient occlusion in frameranger.</title>
		<link>http://directtovideo.wordpress.com/2010/01/15/ambient-occlusion-in-frameranger/</link>
		<comments>http://directtovideo.wordpress.com/2010/01/15/ambient-occlusion-in-frameranger/#comments</comments>
		<pubDate>Fri, 15 Jan 2010 09:13:43 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=117</guid>
		<description><![CDATA[Following on from deferred rendering, here&#8217;s how we added ambient occlusion / indirect illumination to Frameranger. Baking ambient occlusion was standard practice in our demos for some time now. You make the scene, go into Lightwave and hit the big red button called &#8220;bake&#8221;, and it comes back a bit later with something nice. Well [...]]]></description>
			<content:encoded><![CDATA[<p>Following on from deferred rendering, here&#8217;s how we added ambient occlusion / indirect illumination to Frameranger.</p>
<p>Baking ambient occlusion has been standard practice in our demos for some time now. You make the scene, go into Lightwave and hit the big red button called &#8220;bake&#8221;, and it comes back a bit later with something nice. Well, it&#8217;s a bit more complex than that, but you get the point. However, our scenes in Frameranger were a) numerous, b) on the large side and c) full of dynamic stuff. Which means that a) there&#8217;ll be a lot of lightmap textures, b) the light map textures will either be massive or look shit (or both), and c) the lightmaps won&#8217;t work anyway because most of the geometry doesn&#8217;t exist in Lightwave in the first place (or it&#8217;s moving). Oh, and there&#8217;s a 64 MB file size limit in the demo competition at Assembly and most other demo competitions &#8211; ridiculous given the amount of memory and bandwidth we have nowadays, but a rule&#8217;s a rule. So it soon became apparent that baking ambient occlusion was not going to happen. It would have to either not be done, or be done in a different way. 3D doesn&#8217;t look too hot without some form of indirect lighting, so it looked like we&#8217;d need something clever to do it in the engine.</p>
<p>SSAO, then?</p>
<p>No. Allow me to now share my dislike of SSAO. I admit it &#8211; in the past I might have evangelised about it at some point in a conference talk or sample application and I&#8217;ve even used it in a couple of demos. Back in 2007 when I first started using it, it seemed all cool and novel, like a magic silver bullet for a very difficult and long-standing problem. But it&#8217;s now 2009, and for the last couple of years every other &#8220;my first game engine&#8221; image of the day on gamedev.net has managed an even worse bastardisation of the effect than the last one. The &#8220;SSAO look&#8221; has become something you can spot a mile off.</p>
<p>Let&#8217;s call SSAO what it really is: &#8220;crease darkening&#8221;. It picks out creases in the z buffer, and darkens them. The size of those creases, the amount of artefacts and the quality of approximation to real AO depends on the implementation, the amount of GPU time you&#8217;re willing to sacrifice and the scene itself &#8211; I bet we&#8217;ve all seen nice images of some object sitting on a plane and neatly getting shaded in a way that looks just like the cleverly constructed reference image from a Mental Ray render. But what they didn&#8217;t show was what happens when you put a massive wall by it some way off out of shot, and how the reference image and the SSAO version suddenly look completely different. And that&#8217;s the problem with SSAO &#8211; it&#8217;s great at capturing small crevices, and shit at capturing the large-scale global effects of ambient occlusion. The geometry we had in Frameranger wasn&#8217;t very crevicey, and we really needed those large-scale effects.</p>
<p>Having SSAO might be better than not having SSAO. But it is certainly not the final solution to the indirect lighting problem. And don&#8217;t even get me started on SSDO for faking radiosity. Sounds great in theory huh? SSAO kind of works, faked colour bleeds kind of works, so if we glue them together we get.. one reason why artists don&#8217;t like coders helping out on the visuals.</p>
<p>In short, SSAO went out the window quite early on. We tried it and it didn&#8217;t work for the scenes we had. So it was time to come up with something better. Unfortunately, there were no magic solutions &#8211; no one technique we could use to say &#8220;generate AO for everything&#8221;. There are a lot of techniques that can be used to generate ambient occlusion effects in realtime, or at least interactively, and each of them works for a certain case &#8211; some only work for small scenes, others for rigid or static objects or a certain rough shape. Until the hardware gets sufficiently powerful to make a silver bullet that kills the problem, we&#8217;re stuck with taking all of these methods and working out what we can do for each case &#8211; how best to solve the specific problem, the specific scene, at hand.</p>
<p>I tried out a lot of different techniques. Here&#8217;s a quick list:</p>
<ul>
<li>Ambient occlusion fields</li>
<li>Analytical methods (the &#8220;ao for a sphere&#8221; calculation)</li>
<li>Shadow maps / &#8220;loads of lights&#8221;</li>
<li>Ray marching through voxel sets</li>
<li>Signed distance fields</li>
<li>Heightmap ray marching</li>
<li>Hand-made approximations</li>
</ul>
<p>Now to go through those in a bit more detail. Skip along if you&#8217;ve heard it before.</p>
<p><strong>Ambient occlusion fields</strong> (as described <a href="http://www.tml.tkk.fi/~janne/aofields/">here</a>).</p>
<p>A method for precomputed, static AO for an object stored in a volume texture, used for casting AO onto other objects. All you need to do per frame is read the volume texture in a deferred pass, and write the value out. It&#8217;s a bit like a modern, 3D, accurate version of the blob shadow. It looks nice, but it only works for rigid or static meshes &#8211; and those meshes had better be quite small, because the bigger it is the bigger a volume texture you need &#8211; for small objects a 32x32x32 or smaller can suffice, but you&#8217;d want larger for bigger objects. You also need a bigger map the wider the area you want to spread the AO out over. Generating it needs a lengthy precalc or offline storage of the volume texture &#8211; you need to cast a bunch of rays at every volume cell to work out the AO, so it&#8217;s pretty dog-slow to update in realtime. I experimented with it a while back, and it gives nice soft results, although it doesn&#8217;t work too well for self-occlusion.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/10/ao_field.jpg" alt="Ambient occlusion fields - early test" /></p>
<p>I discarded this one &#8211; I was thinking about using it for the car and spider, but it was too slow to generate and not updatable in realtime. I did however use a similar idea for the arcs in the first scene, which I&#8217;ll get onto later.</p>
<p>A quick word on the analytical AO techniques &#8211; iq covered it <a href="http://iquilezles.org/www/articles/sphereao/sphereao.htm">here</a> much better than I could, but in short: you can analytically calculate the occlusion effect for a given object, such as a sphere, pretty easily. If only we just had a bunch of spheres. I tried making sphere-tree versions of some of my objects but it looked rubbish, so I moved on.</p>
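<p>For reference, here is a minimal sketch of the sphere case in Python. It assumes the sphere sits entirely above the horizon of the shaded point, where the occlusion reduces to the cosine of the angle to the sphere times (radius / distance) squared. The function name and argument layout are made up for illustration and are not taken from the linked article.</p>

```python
import math

def sphere_occlusion(point, normal, centre, radius):
    """Occlusion cast by one sphere onto a surface point (0 = none, 1 = full).

    Assumes a unit-length normal and a sphere fully above the point's horizon.
    """
    to_centre = [c - p for c, p in zip(centre, point)]
    dist = math.sqrt(sum(v * v for v in to_centre))
    cos_theta = sum(n * v for n, v in zip(normal, to_centre)) / dist
    # Cosine-weighted share of the hemisphere covered by the sphere.
    return max(0.0, cos_theta) * (radius * radius) / (dist * dist)
```

<p>A unit sphere two units straight above the point covers a quarter of the cosine-weighted sky; a sphere behind the surface contributes nothing.</p>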
<p>Now a completely different approach: using <strong>shadow mapping for ambient occlusion</strong>. I&#8217;ve used this idea a few times, first in <a href="http://www.pouet.net/prod.php?which=14110">Fresh!</a> &#8211; although as a precalc. It works something like this: ambient occlusion is meant to be &#8220;the amount of ambient / sky light which reaches the point&#8221; &#8211; so the easiest way to calculate that is to cast a load of rays out from the point and see how many don&#8217;t hit anything. In GPU land, where we presently reside, you render the scene from your point with a 180 degree FOV (or near to it) to a small buffer and see how many of the pixels don&#8217;t get filled. But that implies that to generate a 1024&#215;1024 lightmap you&#8217;d need to render the scene about a million times, which is insane. So instead of that we can flip the problem on its head. Render the scene from the outside in, using fake lights and rendering to shadow maps, a few hundred times. Then apply the lights / shadow maps as usual to each pixel in the lightmap, and sum up how many of the lights each pixel was in shadow for. That means you only need to render the scene a few hundred times instead of a million &#8211; which is a great optimisation in my book. I wish I&#8217;d thought of this first, cos it&#8217;s clever, but <a href="http://www.andrew-whitehurst.net/amb_occlude.html">I didn&#8217;t</a>.</p>
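<p>The summing step can be sketched on the CPU in a few lines of Python. This is only an illustration under loud assumptions: the fake lights are spread over the sky with a Fibonacci spiral (my choice, not something from the original code), the scene is a single sphere, and the shadow map lookup is replaced by an analytic ray test. Only the idea of counting the lights each point is shadowed from comes from the description above.</p>

```python
import math

def fibonacci_hemisphere(count):
    """Roughly even light directions over the upper (y > 0) hemisphere."""
    golden = math.pi * (3.0 - math.sqrt(5.0))
    dirs = []
    for i in range(count):
        y = (i + 0.5) / count
        r = math.sqrt(1.0 - y * y)
        dirs.append((r * math.cos(golden * i), y, r * math.sin(golden * i)))
    return dirs

def ray_hits_sphere(origin, direction, centre, radius):
    """Stand-in for a shadow map lookup: is this light blocked by the sphere?"""
    oc = [c - o for c, o in zip(centre, origin)]
    along = sum(a * b for a, b in zip(oc, direction))
    if along > 0.0:
        closest_sq = sum(v * v for v in oc) - along * along
        return radius * radius > closest_sq
    return False

def ambient_visibility(point, lights, centre, radius):
    """Fraction of the fake sky lights that reach the point (1 = fully open)."""
    shadowed = sum(1 for d in lights if ray_hits_sphere(point, d, centre, radius))
    return 1.0 - shadowed / len(lights)
```

<p>With 256 lights and a unit sphere two units above the point, about 13% of the lights end up blocked, which matches the solid angle the sphere covers in the hemisphere.</p>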
<p>For realtime, rendering the scene a few hundred times still sounds pretty bad. If you really cut it down you could get away with around 50-100 passes, render depth only with low resolution geometry &#8211; and now we&#8217;re almost approaching something that could actually run. In fact I used this in realtime in a 4k called <a href="http://video.google.com/videoplay?docid=7516420937688305260&amp;hl=en#">Glitterati</a> back in 2006, and we&#8217;ve got a lot more GPU power nowadays.</p>
<p>For my new deferred implementation I wanted to up the speed and the quality. The main optimisation was to cache the results between frames &#8211; after all, for relatively static scenes the ambient occlusion doesn&#8217;t change very much from frame to frame &#8211; so I could use fewer shadow map passes per frame while getting the effect of more passes, increasing the quality.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/10/realtime_ao002.jpg" alt="early version of shadow map ao with no interpolation or colours" /><img src="http://directtovideo.files.wordpress.com/2009/10/realtime_ao001.jpg" alt="added colours " /></p>
<p>The way I did this was to limit the effect to a grid. This effect only works on a limited area anyway as it&#8217;s using shadow maps &#8211; which are limited in the size they can work on without the aliasing becoming too much of an issue &#8211; so gridding it didn&#8217;t hurt too much. I created a 3D grid of points which was something like 256x16x256 &#8211; much larger in X and Z than Y, which fit the scenes at hand quite well. This mapped to a number of slices of 2D 256&#215;256 textures, and I used 4 channels to store 4 Y slices per 2D slice. This gave me a format which was quite easy to render to and sample with filtering. To render to it, I simply worked out the world position per pixel in the slice texture, sampled all the shadow maps for that position and averaged the results.</p>
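<p>To make the packing concrete, here is a small Python sketch of one way the addressing could work. The grid size comes from the numbers above; the exact order in which Y layers map to slices and channels is my assumption, as are the function names.</p>

```python
GRID_X, GRID_Y, GRID_Z = 256, 16, 256   # grid size quoted in the post
CHANNELS = 4                             # four Y layers per RGBA texel
SLICE_COUNT = GRID_Y // CHANNELS         # 4 slice textures of 256x256

def world_to_cell(pos, grid_min, grid_max):
    """Map a world position to integer grid coordinates, clamped to the grid."""
    sizes = (GRID_X, GRID_Y, GRID_Z)
    cell = []
    for p, lo, hi, size in zip(pos, grid_min, grid_max, sizes):
        t = (p - lo) / (hi - lo)
        cell.append(min(size - 1, max(0, int(t * size))))
    return tuple(cell)

def cell_to_texel(x, y, z):
    """Return (slice index, texel x, texel y, channel) for a grid cell.

    Assumed layout: consecutive Y layers share a texel, one per channel.
    """
    slice_index, channel = divmod(y, CHANNELS)
    return slice_index, x, z, channel
```

<p>With this layout a 256x16x256 grid needs only four RGBA textures, and filtering in X and Z comes for free from the texture sampler.</p>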
<p>I sampled from it using my deferred normal and depth buffers, by working out which 2D position to sample from per pixel and processed several values in a kernel around it. I used the normal and distance from my deferred position to weight the samples, so they had a directional element to them.</p>
<p>I also wanted to avoid having to re-render all the shadow maps and sample from them every frame. This turned out to be quite easy &#8211; I had a rolling buffer of shadow maps packed onto a 2D slice texture &#8211; fortunately they didn&#8217;t need to be high resolution. In total I had 64 shadow maps of 256&#215;256 each packed onto a single 2048&#215;2048 texture in an 8&#215;8 arrangement. I re-rendered a number of shadow maps each frame. The grid was only updated using the re-rendered shadow maps, like a rolling sum &#8211; so first I subtracted the effect of the previous data I had for the shadow maps, and then I re-rendered them and added the new effects back in. Of course, the problem was that there was a lag when things moved, as the shadow maps updated over several frames to reflect the changes, but it meant that I could trade update rate for speed. It proved very effective for background geometry as a lightmap replacement.</p>
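<p>A rough CPU sketch of the two pieces described here, in Python: where each shadow map lives on the atlas, and the rolling sum that swaps one map&#8217;s old contribution for its new one. The class and function names are hypothetical, and storing the old contribution per map is a simplification for the sketch; on the GPU the old effect would be evaluated from the previous shadow map data instead.</p>

```python
ATLAS_SIZE, MAP_SIZE = 2048, 256
MAPS_PER_ROW = ATLAS_SIZE // MAP_SIZE   # 8, giving the 8x8 arrangement

def atlas_origin(map_index):
    """Top-left texel of a shadow map inside the packed atlas."""
    row, col = divmod(map_index, MAPS_PER_ROW)
    return col * MAP_SIZE, row * MAP_SIZE

class RollingOcclusion:
    """Keeps a running sum of per-shadow-map shadowing for each grid cell."""

    def __init__(self, map_count, cell_count):
        self.map_count = map_count
        self.contrib = [[0.0] * cell_count for _ in range(map_count)]
        self.total = [0.0] * cell_count

    def update_map(self, map_index, new_values):
        """Replace one map's contribution: subtract the old, add the new."""
        old = self.contrib[map_index]
        for cell, value in enumerate(new_values):
            self.total[cell] += value - old[cell]
        self.contrib[map_index] = list(new_values)

    def occlusion(self, cell):
        """Average shadowing over all maps, refreshed or not."""
        return self.total[cell] / self.map_count
```

<p>Refreshing only a handful of maps per frame keeps the cost flat; the price is the lag mentioned above, since a moved object only shows up once every map has cycled through.</p>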
<p><img src="http://directtovideo.files.wordpress.com/2009/10/realtime_ao003.jpg" alt="realtime ao in the frameranger city" /><br />
<img src="http://directtovideo.files.wordpress.com/2009/10/realtime_ao004.jpg" alt="realtime ao in the frameranger city - 2" /></p>
<p>I tried it out on the city in Frameranger. The main problem is the speed/memory vs quality trade-off &#8211; for the size of the scene, it needed bigger shadow maps and a bigger grid, and the quality just wasn&#8217;t good enough with acceptable performance. Works well with a smaller scene though.</p>
<p><strong>Raytracing voxels</strong></p>
<p>Going back to &#8220;ambient occlusion as a ray tracing problem&#8221; again &#8211; wouldn&#8217;t it be clever to make an easily raytraceable version of my geometry, then cast a load of rays at it and see how many don&#8217;t hit anything? &#8220;Easily raytraceable&#8221; on a GPU for arbitrary geometry probably means &#8220;voxels in a volume texture&#8221;. And, as we demo sceners have got to know well recently, an easy way to raytrace a voxel field is to convert it into a signed distance field and then ray march through it with sphere marching &#8211; it greatly reduces the number of samples on the ray march that you need to take, potentially skipping large (empty) parts of the field and getting to the answer quickly. Given that AO likes a bit of noise &#8211; ironically although it looks &#8220;worse&#8221; it makes it look more like a render, and therefore more believable and &#8220;better&#8221; &#8211; you could spread the rays randomly and get away with quite few per pixel. Combined with distance field tracing it might come up with something sensible performance-wise.</p>
<p>I did try this approach out briefly, but it just wasn&#8217;t quite fast enough (or good looking enough at acceptable speed). But then I remembered an old talk from <a href="http://www.jshopf.com/blog/?p=43">Alex Evans at Siggraph</a>. It turns out that signed distance fields have a useful fringe benefit &#8211; if you take a few samples from them and perform some dirty function on the results, the result can look a lot like ambient occlusion. It&#8217;s quite efficient, and a lot of 4k intros have been using the approach recently to <strong>approximate ambient occlusion using signed distance fields</strong>. Inigo Quilez talks about this in <a href="http://iquilezles.org/www/material/nvscene2008/nvscene2008.htm">one of his presentations</a> on distance field raytracing.</p>
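<p>Sphere marching itself is tiny. Here is a minimal Python sketch; the analytic sphere stands in for the volume texture lookup, and the step limits and epsilon are arbitrary values chosen for the example.</p>

```python
import math

def sphere_sdf(p, centre=(0.0, 0.0, 0.0), radius=1.0):
    """Signed distance to a sphere; stands in for sampling a distance volume."""
    return math.dist(p, centre) - radius

def sphere_trace(origin, direction, sdf, max_steps=64, max_dist=20.0, eps=1e-4):
    """March along a unit-length ray; return the hit distance or None on a miss."""
    t = 0.0
    for _ in range(max_steps):
        p = tuple(o + t * d for o, d in zip(origin, direction))
        dist = sdf(p)
        if eps > dist:
            return t
        # The field guarantees nothing is closer than dist, so step that far.
        t += dist
        if t > max_dist:
            break
    return None
```

<p>A ray fired straight at the unit sphere from three units away lands on the surface in two steps, which is the whole appeal: empty space gets skipped in big jumps.</p>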
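<p>One common form of that &#8220;dirty function&#8221;, sketched in Python: step out along the normal, and wherever the field reports a surface nearer than the distance stepped, count the shortfall as occlusion. The step size, falloff and strength below are assumptions picked for the example, not values from Frameranger or from either talk.</p>

```python
def sdf_ambient_occlusion(p, n, sdf, steps=5, step_size=0.1, strength=2.0):
    """Approximate AO from a few distance field samples along the normal.

    Returns 1.0 for a fully open point and smaller values near other geometry.
    """
    occlusion = 0.0
    weight = 1.0
    for i in range(1, steps + 1):
        h = i * step_size
        sample = tuple(a + b * h for a, b in zip(p, n))
        # In open space the field equals h; anything nearby makes it smaller.
        occlusion += weight * max(0.0, h - sdf(sample))
        weight *= 0.5
    return max(0.0, 1.0 - strength * occlusion)
```

<p>On a bare ground plane the samples read back exactly the distance stepped, so nothing darkens; park a sphere next to the point and the shortfall shows up straight away.</p>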
<p>The main difficulty with this approach was that my scene did not currently exist as a distance field. Or as voxels in a volume texture. It was a bunch of polygons. So, how do you convert a mesh to a signed distance field? Option 1: compute the closest distance to triangle for all the triangles, and take the closest result. Option 2: voxelise the mesh and convert it to a signed distance field. I used both approaches. I wanted to do it on the GPU for speed, and so I rendered to a volume slice map &#8211; a 2D render target containing the slices of a 3D texture laid out in sequence.</p>
<p>The first method I used for offline computation, and only for static meshes &#8211; although it was still done on the GPU; the pixel shader for closest distance from point to triangle for a given triangle isn&#8217;t too difficult to do, and it can be optimised (I used an octree to reduce the areas where I computed the actual result) but the time to compute for a reasonably serious triangle mesh was too long for real time.</p>
<p>The second method I solved by rendering the object&#8217;s depth to the 6 views of a cube map around the object, from the outside pointing in. Then I performed a slightly dubious process whereby I try and work out the closest distance to the surface by something close to guesswork. It&#8217;s like this: given a 3D point inside the cube, sample the depths for that point from the 6 faces of the cubemap. This is enough to tell you if the point is inside the object, and get the closest of the 6 distances to the point &#8211; giving you a signed distance. Of course one sample isn&#8217;t enough to handle more than the simplest shapes, so you actually do a kernel around the point, check the different results and pick the closest from that. If this was a research paper I&#8217;d do a diagram. Never mind. This solution gives good results for convex shapes, and pretty decent results for any shape where you don&#8217;t have completely inaccessible points from an exterior viewpoint &#8211; and it runs realtime, so I could use it to work with animated objects. Unfortunately the &#8220;signed distance&#8221; it gives isn&#8217;t exactly the closest possible result &#8211; more of a &#8220;reasonable guess&#8221;.</p>
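<p>In lieu of the diagram, a heavily simplified Python sketch of the single-sample version. Assumptions, all mine: the six views are orthographic and look inwards from the faces of the cube [-1, 1]^3, the depth maps are replaced by an analytic function for a centred sphere, and the kernel search is left out, so a point that every view misses just gets infinity.</p>

```python
import math

def sphere_depth(radius):
    """Stand-in for six rendered depth maps of a sphere centred in the cube."""
    def depth(axis, sign, u, v):
        # Every orthographic view of a centred sphere looks the same,
        # so axis and sign are accepted but not needed here.
        inside_sq = radius * radius - u * u - v * v
        if inside_sq >= 0.0:
            return 1.0 - math.sqrt(inside_sq)
        return math.inf
    return depth

def guess_signed_distance(p, depth):
    """Guess a signed distance from one depth lookup per cube face."""
    gaps = []
    for axis in range(3):
        u, v = (p[i] for i in range(3) if i != axis)
        for sign in (-1.0, 1.0):
            point_depth = 1.0 + sign * p[axis]
            # Positive gap: this view sees the point in front of the surface.
            gaps.append(depth(axis, sign, u, v) - point_depth)
    if max(gaps) > 0.0:
        return min(g for g in gaps if g > 0.0)   # outside the object
    return max(gaps)                              # inside: all six views blocked
```

<p>For a sphere of radius 0.5 this returns -0.5 at the centre and 0.3 for a point at x = 0.8, both of which happen to be exact; off-axis points are where it turns into the &#8220;reasonable guess&#8221;.</p>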
<p>This problem was a pain to solve, but worth it. It opened a few doors for implementations of other effects like CSG, and for using meshes as inputs to fluid dynamics solvers working on level sets. It worked alright for small objects like the car, but had no chance on something like the city &#8211; it relied on the resolution of the volume texture, and would need a massive one to handle the city.</p>
<p>The ambient occlusion result it gave was alright for the polygonal objects I tried it on, but I found better and less memory/computation heavy solutions. The really good application of this technique came along later when I was working with the aforementioned liquid effects using level sets. See, in that case I already *had* a signed distance field that I was using to raytrace the effect in the first place &#8211; I didn&#8217;t have to mess around and generate one. So I simply took that as an input and used my already-existing ambient occlusion code to render AO for it. And hey, it worked!</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/10/fluid_ao.jpg" alt="Distance-field raytraced effect with AO" /></p>
<p>This was great for the fluid dynamics scenes, but it was overkill for handling things like the car. So I simplified it..</p>
<p>I&#8217;m not completely proud of this, but here goes: the AO solution for the car and the spider in the end was basically a glorified .. blob shadow. A fancy one updated in realtime and with some additional terms and trickery, but still essentially a blob shadow. And why not? Essentially what I had to produce, when you break it down, was what looked like the ambient occlusion effect of the car on the ground &#8211; which is usually some variation on a flat plane. Which is a pretty good fit for a blob shadow.</p>
<p>So I rendered the object from above and stored the depth value (so, the height value) in a 2D render target. Then I sampled that by using the GBuffer depth and normal values, projecting the derived world space position into the heightmap&#8217;s space, reading the height value from the map and doing some monkeying around with it to create a darkness value based on the distance between the heightmap&#8217;s position and the read position &#8211; so the value got darker as they got closer together. I took multiple samples with a randomly rotated Poisson disc, and calculated a blurred/softened result with that. In a way it was a bit like that ATI DOF technique, actually. I blurred the blob heightmap as well, because well &#8211; it made it look better. If you&#8217;re faking anyway, why stop? And the results were pretty good! It did the job it was required to do neatly and efficiently &#8211; and it was easy to combine multiple blob shadows because they were applied in the deferred render / lighting stage, so I could just render them in screen space and blend them together.</p>
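<p>A minimal CPU sketch of the blob-shadow lookup described above, in Python rather than shader code. Everything named here is an assumption for illustration: the function blob_ao, the fixed DISC offsets standing in for a real Poisson disc, the use of None for empty heightmap texels, and the linear darkness falloff standing in for the unspecified &#8220;monkeying around&#8221;.</p>

```python
import math

# Small fixed set of offsets in the unit disc; a stand-in for a real Poisson disc.
DISC = [(0.0, 0.0), (0.7, 0.1), (-0.3, 0.6), (-0.6, -0.4), (0.2, -0.8)]

def blob_ao(heightmap, x, y, z, radius=2.0, max_gap=3.0, rotation=0.0):
    """Darkness in [0, 1] for a point at map coords (x, y) and height z.

    heightmap[row][col] is the object height seen from above, or None where
    the object does not cover the texel. `rotation` rotates the sample disc;
    a renderer would randomise it per pixel.
    """
    rows, cols = len(heightmap), len(heightmap[0])
    c, s = math.cos(rotation), math.sin(rotation)
    total = 0.0
    for ox, oy in DISC:
        rx = (ox * c - oy * s) * radius
        ry = (ox * s + oy * c) * radius
        px = min(cols - 1, max(0, int(round(x + rx))))
        py = min(rows - 1, max(0, int(round(y + ry))))
        h = heightmap[py][px]
        if h is None or z > h:
            continue  # nothing above this sample, so no darkening
        gap = h - z
        # Darker as the object gets closer to the shaded point.
        total += max(0.0, 1.0 - gap / max_gap)
    return total / len(DISC)
```

<p>Averaging over the disc is what softens the edge of the shadow; blurring the heightmap itself, as the post does, would be a separate pre-pass that is not shown here.</p>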
<p><img src="http://directtovideo.files.wordpress.com/2010/01/spider_blobao01.jpg" alt="" /></p>
<p>If there&#8217;s a lesson there, it&#8217;s that you shouldn&#8217;t discard simple, old-fashioned techniques just because they&#8217;re simple and old-fashioned. Sometimes you can get more joy and much better visual results from combining a bunch of simpler techniques in a clever way than you can from trying to find one magic super-technique which handles everything.</p>
<p>But I still needed a solution for the city / environment. There was an effect I came up with a couple of years ago but never used in a demo which calculated ambient occlusion in realtime for a very specific case: a heightfield (heightmap) rendered as boxes. This is getting back to the idea of raytracing ambient occlusion again &#8211; the trick is, as with the other cases, to simplify the representation of the stuff to render to get it into some form that you can raytrace through easily. In this case, the representation is a heightmap, which can be raytraced efficiently by raymarching through it in 2D. You project your world-space position into the heightmap&#8217;s space, then march through the heightmap in the ray&#8217;s direction, and at each point you compare your ray&#8217;s height against the heightmap&#8217;s height. The simple way is to just compare and see when the height of the ray is below the height of the heightmap, but you can do a bit of maths with the two values instead and get a better result for occlusion. Finally, return the total occlusion for all the rays you cast for the pixel.<br />
Cutting out one axis means you don&#8217;t need as many rays per pixel, and I could get good results with casting only 5-10 rays (spread with noise) per pixel and with only around 8 march steps per ray. i.e. the kind of figures that can actually work in realtime.</p>
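<p>The march described above can be sketched on the CPU roughly like this. It is an illustration under assumptions, not the shader from the demo: rays are spread evenly instead of with noise, every ray climbs at a fixed slope (the hypothetical rise parameter), and the &#8220;bit of maths&#8221; for soft occlusion is guessed as the overlap divided by the distance travelled.</p>

```python
import math

def heightfield_ao(heightmap, x, y, z, num_rays=8, num_steps=8,
                   step_len=1.0, rise=0.5):
    """Occlusion in [0, 1] for a point at map cell (x, y) and height z.

    heightmap[row][col] holds heights. Each ray marches in 2D across the
    map and climbs `rise` height units per unit of distance travelled.
    """
    rows, cols = len(heightmap), len(heightmap[0])
    total = 0.0
    for r in range(num_rays):
        angle = 2.0 * math.pi * r / num_rays
        dx, dy = math.cos(angle), math.sin(angle)
        ray_occlusion = 0.0
        for s in range(1, num_steps + 1):
            dist = s * step_len
            px = int(round(x + dx * dist))
            py = int(round(y + dy * dist))
            if not (0 <= px < cols and 0 <= py < rows):
                break  # the ray has left the map
            ray_height = z + rise * dist
            overlap = heightmap[py][px] - ray_height
            if overlap > 0.0:
                # Soft term: how far the terrain pokes above the ray,
                # relative to how far away it is.
                ray_occlusion = max(ray_occlusion, min(1.0, overlap / dist))
        total += ray_occlusion
    return total / num_rays
```

<p>With the defaults this is 8 rays of 8 steps, so 64 heightmap reads per shaded point, which is in the range the post quotes as workable in realtime.</p>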
<p><img src="http://directtovideo.files.wordpress.com/2010/01/block_heightfield_ao.jpg" alt="block heightfield ambient occlusion" /></p>
<p>Naturally it works for boxes because you just render a box per pixel in the heightmap, and the heightmap is a perfect fit to the boxes: you&#8217;ve got a 2D and 3D representation of the same thing and you don&#8217;t lose any information between them. But can it be stretched further? Why stop at boxes? Why not render any scene you like to a heightmap, and use that heightmap to raytrace the ambient occlusion for that scene? Of course, the quality is going to depend on how well a heightmap can approximate the scene &#8211; if there are a lot of overlaps, holes and interior details it&#8217;ll start to break down. But how much does it matter?</p>
<p><img src="http://directtovideo.files.wordpress.com/2010/01/cubewall01.jpg" alt="cube wall - logo and object with heightfield ao" /></p>
<p>It turns out that for the city, it didn&#8217;t matter that much at all. A city can map quite well to a heightmap superficially &#8211; buildings and roads and so on. The great thing about it is that it efficiently captured the larger-scale occlusion &#8211; precisely the stuff that SSAO lost &#8211; without rendering a huge amount of stuff every frame. It just needed the heightmap and the rays, which were cast off the GBuffer. All the overlapping stuff, bridges etc, didn&#8217;t work perfectly of course, but hey &#8211; who gives a shit? It looked good! One way this could be used is to bake the local AO for the buildings themselves &#8211; i.e. the windows of a building on that building&#8217;s walls &#8211; and blend it with this technique used for the global AO.</p>
<p><img src="http://directtovideo.files.wordpress.com/2010/01/city_heightao.jpg" alt="heightfield ao used on the city scene" /></p>
<p>One final thing remained. The first scene with the wave of arcs needed a solution and none of the methods worked well. It was fully dynamic &#8211; the geometry was procedural and animated; there were loads of arcs; the scene covered a lot of ground; and the ambient occlusion was absolutely key to the look &#8211; it would look flat and lifeless without it. Quite a challenge.</p>
<p>After trying out numerous things, I came up with a solution that was completely specific (read: hardcoded) to the scene in question. I knew that I was dealing with an object &#8211; an arc &#8211; that could be defined easily and mathematically, and that I had to deal with a large number of them (hundreds) but that their movement pattern was procedural, they only moved in Z and rotated, their rotation was centred on Y=0, and each occupied its own unique space on the X axis. I needed to cast AO from each arc onto the other arcs, and also on the ground and any other objects around (e.g. the car) &#8211; but the range of the AO effect around an arc could be limited.</p>
<p>First, I came up with a solution to generate the occlusion effect for one arc on any given point. Initially I defined it mathematically but in the end I just made a 2D texture which was the result of projecting the arc side-on and blurring it &#8211; my 2d &#8220;arc AO map&#8221;. In the shader I projected the world space point into &#8220;arc space&#8221; and sampled the arc AO map, and combined that with a falloff term using the distance to the arc&#8217;s major plane (and some other dirty bits of maths to beautify it). That gave me a nice AO shape for one arc.</p>
<p><img src="http://directtovideo.files.wordpress.com/2010/01/arc_aomap.jpg" alt="the ao map for an arc" /></p>
<p>Then, to solve it for ALL the arcs in the scene, I created a 1d lookup texture which gave me some information about each arc &#8211; its position, rotation, size and so on &#8211; packed into 4 channels. I defined the arcs and the lookup texture in such a way that they were sorted in X &#8211; so I could project my world space position into &#8220;arc lookup texture space&#8221;, and I&#8217;d instantly have the closest arc in X; I could step back and forward through the texture to find the neighbours. So my solution was simple: take N samples from the 1d lookup texture around my point; calculate the AO for each arc it defined; accumulate the AO using some dirty blend function; and output the value. By changing N I could trade off performance and quality. It worked great!</p>
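<p>A rough sketch of that lookup, with a plain Python list standing in for the 1d texture. The names arc_index, single_arc_ao and arcs_ao are hypothetical, evenly spaced arcs are an assumption, the per-arc term is a placeholder for sampling the blurred arc AO map, and multiplying visibilities together is only one guess at what the &#8220;dirty blend function&#8221; might be.</p>

```python
def arc_index(x, x_min, spacing, count):
    """Index of the arc nearest to world-space x (arcs sorted, evenly spaced in X)."""
    i = int(round((x - x_min) / spacing))
    return max(0, min(count - 1, i))

def single_arc_ao(point, arc, ao_range):
    """Placeholder per-arc occlusion: linear falloff with distance in X.

    The real version would project the point into arc space and sample
    the arc AO map; only the distance-to-plane falloff is modelled here.
    """
    dist = abs(point[0] - arc["x"])
    return max(0.0, 1.0 - dist / ao_range)

def arcs_ao(point, arcs, x_min, spacing, neighbours=2, ao_range=2.0):
    """Accumulate AO from the nearest arc and `neighbours` arcs on each side."""
    centre = arc_index(point[0], x_min, spacing, len(arcs))
    visibility = 1.0
    for i in range(centre - neighbours, centre + neighbours + 1):
        if 0 <= i < len(arcs):
            visibility *= 1.0 - single_arc_ao(point, arcs[i], ao_range)
    return 1.0 - visibility
```

<p>The neighbours parameter plays the role of N: a larger value visits more table entries per point and can only add occlusion, which is the performance versus quality trade-off mentioned above.</p>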
<p><img src="http://directtovideo.files.wordpress.com/2009/09/frameranger_inprogress4.jpg" alt="arcs with ao" /></p>
<p>I think that last technique sums up the point I&#8217;m trying to get across. Generating complex rendering effects like ambient occlusion in realtime for arbitrary scenes with loads of animation that are completely beyond your control is very difficult &#8211; a single technique that works great for anything just doesn&#8217;t really exist. But how many scenes are you working with that are really like that? I had a scene which, at first look, seemed to be impossible to generate good quality ambient occlusion for in real time. By breaking it down, working out ways to approximate it and mixing and matching techniques, it&#8217;s often possible to find that solution. If you know you&#8217;re only dealing with cubes or you can define it mathematically, if you know your scene is basically all sitting in a single plane, or if there are small static pieces you can bake &#8211; e.g. the local AO for a car &#8211; then exploit those things. Use all the knowledge you can to break the scene down into pieces that you <em>can</em> handle.</p>
<p>Deferred rendering is a really great help for this because it makes applying all those separate pieces very easy &#8211; you can just make a screen-sized buffer, render the different AO elements separately sampling from the GBuffers for position, normal and object ID data, and blend the different AO elements into that screen-sized AO buffer. Then use that AO buffer as part of your lighting equations, and you&#8217;re golden.</p>
<p>It can be done.</p>
<p>It&#8217;s just a bloody pain.</p>
<p>But it&#8217;s worth it. </p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2010/01/15/ambient-occlusion-in-frameranger/feed/</wfw:commentRss>
		<slash:comments>18</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/ao_field.jpg" medium="image">
			<media:title type="html">Ambient occlusion fields - early test</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/realtime_ao002.jpg" medium="image">
			<media:title type="html">early version of shadow map ao with no interpolation or colours</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/realtime_ao001.jpg" medium="image">
			<media:title type="html">added colours </media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/realtime_ao003.jpg" medium="image">
			<media:title type="html">realtime ao in the frameranger city</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/realtime_ao004.jpg" medium="image">
			<media:title type="html">realtime ao in the frameranger city - 2</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/fluid_ao.jpg" medium="image">
			<media:title type="html">Distance-field raytraced effect with AO</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/01/spider_blobao01.jpg" medium="image" />

		<media:content url="http://directtovideo.files.wordpress.com/2010/01/block_heightfield_ao.jpg" medium="image">
			<media:title type="html">block heightfield ambient occlusion</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/01/cubewall01.jpg" medium="image">
			<media:title type="html">cube wall - logo and object with heightfield ao</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/01/city_heightao.jpg" medium="image">
			<media:title type="html">heightfield ao used on the city scene</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2010/01/arc_aomap.jpg" medium="image">
			<media:title type="html">the ao map for an arc</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/09/frameranger_inprogress4.jpg" medium="image">
			<media:title type="html">arcs with ao</media:title>
		</media:content>
	</item>
		<item>
		<title>deferred rendering in frameranger.</title>
		<link>http://directtovideo.wordpress.com/2009/11/13/deferred-rendering-in-frameranger/</link>
		<comments>http://directtovideo.wordpress.com/2009/11/13/deferred-rendering-in-frameranger/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 12:43:31 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[demoscene]]></category>
		<category><![CDATA[realtime rendering]]></category>
		<category><![CDATA[anti aliasing]]></category>
		<category><![CDATA[antialiasing]]></category>
		<category><![CDATA[deferred rendering]]></category>
		<category><![CDATA[graphics]]></category>
		<category><![CDATA[lighting]]></category>
		<category><![CDATA[real time]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=88</guid>
		<description><![CDATA[(This is going to get technical. Fast.) I&#8217;m a big fan of deferred rendering as you might have gathered from my GDC 09 talk. So it made sense that for Frameranger, and subsequent projects, I moved my demo engine over from what was essentially a forward renderer to a complete deferred renderer. I wanted to [...]]]></description>
			<content:encoded><![CDATA[<p>(This is going to get technical. Fast.)</p>
<p>I&#8217;m a big fan of deferred rendering as you might have gathered from my <a href="http://research.scee.net/files/presentations/gdc2009/DeferredLightingandPostProcessingonPS3.ppt">GDC 09</a> talk. So it made sense that for Frameranger, and subsequent projects, I moved my demo engine over from what was essentially a forward renderer to a complete deferred renderer. I wanted to share some of the experience here. There are some good introductions to deferred rendering out there, like <a href="http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter09.html">this one</a>, which cover the basics so I don&#8217;t have to.</p>
<p>Note, I&#8217;m working on DX9 &#8211; DX10 wasn&#8217;t an option at the time of development (and still isn&#8217;t, really, until the supporting OSes take the vast majority of the market) &#8211; so that adjusts my available feature set. No depth buffer reads, no hardware MSAA.</p>
<p>So, why go deferred?</p>
<ul>
<li>Only need one geometry pass for the main render. Previously we usually needed a z prepass for performance, and sometimes even another separate pass for motion blur velocities and for depths for SSAO and other effects. For some of the stuff we had to render which was very high poly or had a lot of draw calls, or where it wasn&#8217;t polygons at all, only having to do them once is important. (Shadows still always cost extra, though.)</li>
<li>Separate rasterisation and shading. Not having to worry about lighting and so on at the geometry stage means that the geometry stage becomes simple, and most geometry shares one shader &#8211; we don&#8217;t need so many combinations of ubershader anymore. Combined with the reduced number of geometry passes it means we can reduce our batch count, amount of state changes and number of shaders we have to generate by a lot.</li>
<li>Lighting as a 2D post process. Apply as many lights as you want, efficiently, and without having to build many uber shader combinations. As many lights as we want, as many types of lights as we want, mixing shadowed and unshadowed, potentially handling 1000s of lights.</li>
<li>Spatially optimise lighting and complex shading. Only apply light to the pixels which are actually in the light&#8217;s area of effect.</li>
<li>Only shade pixels with complex lighting shaders once (per light) &#8211; fewer issues with overdraw from geometry rendering.</li>
<li>Use the additional GBuffer information to do some more interesting post fx and rendering.</li>
</ul>
<p>So there were a lot of things we wanted a piece of. Unfortunately there are some major downsides too, which is why I hadn&#8217;t gone down this road before:</p>
<ul>
<li>No hardware antialiasing (on DX9). For me this had been the killer up to now. I don&#8217;t like unantialiased renders. This time around though the benefits were so big that I decided to just screw it and worry about antialiasing later.</li>
<li>Memory and rendering overhead &#8211; we need to store those fat-ass GBuffers somewhere, and filling and reading them is a fairly fixed cost every frame. That smooths out the render performance: the complex cases get faster (or work at all), but the simple cases are potentially a lot slower. I decided our general case is complex enough not to care about the simple cases.</li>
<li>Potentially reduced material flexibility. We can&#8217;t just hack a shader which computes some lighting and messes with the equations for that one material &#8211; we have to do everything en-masse in 2D processes.</li>
<li>Alpha stuff still has to be handled in a second, forward rendered pass. Which means we still need a working forward renderer.</li>
</ul>
<p>So, I managed to sufficiently minimise in my head how much I cared about the downsides and just crack on and implement the deferred renderer to see what happened. It was actually very easy to do &#8211; a day or two&#8217;s work had the whole thing up and running with multiple types of light working. Then there were weeks or months of work in adding all the interesting stuff on top.</p>
<p>The first task was to get the geometry rendered to GBuffers. When rendering the GBuffers it&#8217;s important to minimise the number of channels and the bit depth required &#8211; the more there is, the more memory it takes and the slower the render gets. You also have to consider how you want to read the data &#8211; you probably don&#8217;t need the colour info until late in the day, whereas the normals are needed a lot &#8211; so don&#8217;t pack something you need a lot with the colours, because it&#8217;ll be a wasted additional GBuffer read.<br />
I used 3 or 4 MRTs for my GBuffer rendering, depending on whether or not motion blur was enabled, where each MRT was 32 bits wide. The channels I rendered to GBuffers were:</p>
<p>Colour+ObjectIndex : RGBA8888;<br />
Normal+MaterialIndex : RGBA8888;<br />
Depth : Float32<br />
VelocityBuffer (optional &#8211; only if motion blur enabled) : RG Float16<br />
and of course a D24S8 depthstencil. Which I can&#8217;t read from, because DX9 doesn&#8217;t allow it. How annoying. If I could, I could skip that float32 depth, like I would on PS3. But I can&#8217;t. Damnit. Actually it works on ATI, so it&#8217;s just NVIDIA that&#8217;s the problem. Dear NVIDIA: you guys managed to hack in almost every other feature under the sun using 4CCs and other tricks, so how about adding depth buffer value reads for DX9, and multisampled buffer reads while you&#8217;re there? Go on &#8211; I bet it&#8217;s there in the drivers already, and I just don&#8217;t know what the magic incantation is to access it.</p>
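<p>To make the layout above concrete, here is a small sketch of packing the Normal+MaterialIndex target into four bytes and of the memory the whole set costs. The function names are made up for illustration, and the scale-and-bias encoding of the normal is an assumption: the post only says the normal was written out as XYZ.</p>

```python
def pack_normal_material(normal, material_index):
    """Pack a unit normal and an 8-bit material index into 4 bytes (RGBA8888)."""
    rgb = []
    for c in normal:
        # Map [-1, 1] to [0, 255]; clamp in case the input is slightly off unit length.
        rgb.append(min(255, max(0, int(round((c * 0.5 + 0.5) * 255.0)))))
    return bytes(rgb + [material_index])

def unpack_normal_material(texel):
    """Inverse of pack_normal_material; the normal comes back quantised."""
    normal = tuple(b / 255.0 * 2.0 - 1.0 for b in texel[:3])
    return normal, texel[3]

def gbuffer_bytes(width, height, motion_blur):
    """Total size of the 32-bit render targets plus the D24S8 depthstencil."""
    targets = 4 if motion_blur else 3
    return width * height * 4 * (targets + 1)
```

<p>At 1280x720 that comes to 14,745,600 bytes without the velocity buffer and 18,432,000 bytes with it, before any of the other buffers a frame needs.</p>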
<p>One thing that comes up a lot when discussing deferred renderers is the storage of normals. Nearaz has a <a href="http://aras-p.info/texts/CompactNormalStorage.html">really good summary / investigation</a> of the different methods. I&#8217;m lazy, so initially I just wrote out the normal as an XYZ, deciding that if I ever needed an extra GBuffer channel I&#8217;d fix it and use XY in a compacted form and recreate Z. So far I didn&#8217;t.</p>
<p>Next I added the basic light types. It&#8217;s pretty easy to move the lighting code over from a forward render &#8211; it just needed the extra code to pull the values out of the GBuffers and back project to recreate positions, rather than pull everything out of the vertex interpolators.<br />
Spot lights were trivial, of course. A single projected shadow map did the job. Point lights proved annoying because of another D3D limitation &#8211; I couldn&#8217;t create a depth stencil cubemap, and I wanted to render to a depth stencil for efficiency on render and for free hardware PCF &#8211; so I ended up making a &#8220;virtual cube map&#8221; which spread the faces out on a 2D texture. Finally, for directional lights I implemented a varying number of cascaded shadow maps. Directionals proved to be by far the most used light type, and I put some work into making it calculate a good set of shadows for the splits.</p>
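<p>One way such a &#8220;virtual cube map&#8221; lookup could work is sketched below: pick the face from the dominant axis of the direction, then offset into that face&#8217;s tile in the 2D texture. The 3x2 tile layout, the face numbering and the in-face axes are all assumptions made for this example, and they deliberately do not follow the D3D cube map convention.</p>

```python
def virtual_cube_uv(direction, cols=3, rows=2):
    """Map a direction to (face, u, v) in a cols x rows atlas of cube faces.

    Faces are numbered +X, -X, +Y, -Y, +Z, -Z. u and v are in [0, 1]
    across the whole atlas.
    """
    x, y, z = direction
    ax, ay, az = abs(x), abs(y), abs(z)
    major = max(ax, ay, az)
    if major == 0.0:
        raise ValueError("direction must be non-zero")
    if major == ax:
        face, a, b = (0 if x > 0 else 1), y, z
    elif major == ay:
        face, a, b = (2 if y > 0 else 3), x, z
    else:
        face, a, b = (4 if z > 0 else 5), x, y
    # Project onto the face: a / major and b / major lie in [-1, 1].
    face_u = a / major * 0.5 + 0.5
    face_v = b / major * 0.5 + 0.5
    col, row = face % cols, face // cols
    return face, (col + face_u) / cols, (row + face_v) / rows
```

<p>A real implementation would also have to deal with filtering across tile edges, for example by padding each tile, which this sketch ignores.</p>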
<p>The nice trick with a deferred renderer is that I can apply the splits separately to the screen in 2D, rather than sampling all of them per pixel. I first render a series of view-aligned planes at the depth of each split, front to back, into the depth+stencil, marking the stencil where they pass the depth test. This gives me a series of stencil masks that I can use to test against when rendering full screen quads, one per split, which sample just that one split&#8217;s shadow map each.</p>
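<p>The effect of those stencil masks can be mimicked on the CPU as below. This is only an analogue of the result, not of the stencil operations themselves: given per-pixel view depths and the far distance of each split (both hypothetical inputs), it produces one mask per split so that each pixel is touched by exactly one split&#8217;s pass.</p>

```python
import bisect

def split_masks(depths, split_far):
    """Build one boolean mask per cascade split.

    depths: per-pixel view-space depths.
    split_far: far distance of each split, in ascending order.
    A pixel belongs to the first split whose far distance is not less than
    its depth; pixels beyond the last split are in no mask.
    """
    masks = [[False] * len(depths) for _ in split_far]
    for pixel, depth in enumerate(depths):
        split = bisect.bisect_left(split_far, depth)
        if len(split_far) > split:
            masks[split][pixel] = True
    return masks
```

<p>Each full screen pass then only does shadow map reads where its own mask is set, instead of every pixel testing every split.</p>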
<p><img src="http://directtovideo.files.wordpress.com/2009/10/deferred_2points.jpg" alt="An early test with deferred lighting using two point lights." /></p>
<p>The initial work of adding the lighting was quite easy, but it instantly proved the benefits of deferred shading. Previously, just adding a new type of light, changing lighting code or adding more influencing lights cost work &#8211; a new shader code path which had to be propagated through the ubershaders &#8211; increasing the compile time every time &#8211; and then into the code to select the ubershaders. There was a hard limit on the complexity of each light and the number of lights, because of the hard limit on the size of shaders and, more pressingly, the number of textures that could be used at once. But now it was simply a case of adding another 2D pass, and a single piece of shader code which could be edited and reloaded over and over again easily. I had never bothered to add all the different light types or cascaded shadow map support to my forward renderer because it required too many permutations and too many simultaneous textures, but now I had it all working easily, and finally could handle shadows well from massive scenes with directional lights. Adding deferred rendering had already paid off.</p>
<p>Now, what about those classic problems with deferred rendering?</p>
<p><strong>Flexibility</strong></p>
<p>The next issue was that of flexibility &#8211; how to get materials that look different to each other. With forward renders it&#8217;s easy &#8211; you just make a different shader for the material, and change the behaviour of that material. But with deferred renderers it doesn&#8217;t work like that &#8211; everything is done in a series of 2D passes on the whole screen. So, you have to render extra information into the GBuffers which tell those passes how to produce the correct behaviour for each pixel. Unfortunately GBuffers get fat fast if you add a lot of parameters. Most of our parameters varied per material, so I simply created a material palette as I was rendering the objects of all the unique materials, and wrote two indices to the GBuffers &#8211; the object index, and the material index. Both limited to 256 indices (lists updated per frame). Bad news? Well, if we had that many draw calls we&#8217;d be screwed performance-wise anyway, so a 256 material+object limit didn&#8217;t matter.</p>
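<p>A small sketch of that palette idea: materials are registered as objects are rendered, each gets an index that fits in one 8-bit GBuffer channel, and the deferred passes use the index to fetch the parameters. The class and method names, and the parameter dictionaries, are hypothetical; in the renderer the table would live in textures rather than a Python list.</p>

```python
class MaterialPalette:
    """Per-frame table of material parameters addressed by an 8-bit index."""

    MAX_ENTRIES = 256

    def __init__(self):
        self.entries = []
        self._index_by_name = {}

    def index_for(self, name, params):
        """Return the palette index for a material, adding it on first use."""
        if name in self._index_by_name:
            return self._index_by_name[name]
        if len(self.entries) >= self.MAX_ENTRIES:
            raise OverflowError("material palette is full")
        self._index_by_name[name] = len(self.entries)
        self.entries.append(dict(params))
        return len(self.entries) - 1

    def lookup(self, index):
        """What a deferred pass does with the index read from the GBuffer."""
        return self.entries[index]
```

<p>Rebuilding the palette each frame keeps the indices dense, so only materials that are actually drawn count towards the 256 limit.</p>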
<p>As development continued, the data in our material palette grew and grew. By the end it was several textures worth &#8211; with data for fresnel coefficients, how to apply envmaps, various light equation constant modifiers, and so on. It enabled us to easily adjust the material parameters without adding a lot to our GBuffers. I&#8217;ve heard it said that this doesn&#8217;t work &#8211; because you want to vary a lot per pixel, like specular gloss, specular power, and so on, and this doesn&#8217;t allow it. I&#8217;ve also heard that you need lots of different shaders to get a good look. Well, it was never the case for us. Probably 90% of our geometry always went through the same default shader; we didn&#8217;t adjust that much per pixel in textures except where it was really needed &#8211; it added a lot more work to the art side as well as more space requirements.</p>
<p>Another useful thing about the material palette came to light later on when optimising the renderer to reduce draw call counts. The only remaining per-material parameters that were used when rendering the mesh were the textures and the material index &#8211; everything else was in the material palette. That meant that it was easy to merge meshes together and store a palette index in a vertex buffer channel I didn&#8217;t need (I used vertex colour alpha). This meant I could pack the meshes down completely except where the textures differed, but still allow on-the-fly editing of the separate material properties. In addition I could actually change the material index per pixel if I wanted &#8211; e.g. using a mask texture to select between two materials &#8211; with very little overhead. This exposed a whole new set of tricks and went some way to solving the problem of not being able to vary material properties using textures.</p>
<p>Besides that, where we did need something special there were a few tricks we could use when applying shading in the deferred passes. The material ID + object ID could be used to mask in whole special materials for certain objects (or parts of objects). For example, the car had a special paint shader that was masked in. Each material palette entry had a world-space bounding box which was accumulated for all the objects which used it per frame; this was used to generate accurate-enough masks for 2D passes quickly and efficiently. And when it came to that extra bit of per-pixel data we needed but just didn&#8217;t have &#8211; we generated it. A simple function of the position and normal was plenty enough to sample a dirt texture or noise texture for fading reflections in and out or adjusting the blurriness. It&#8217;s a basic, hacked up form of deferred texturing, and it did the job nicely. Fortunately it&#8217;s really easy in my renderer to add extra passes and stages into the rendering pipeline, so this was something we could use a lot to customise things.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/10/carshader01.jpg" alt="custom shader used for the car paint, applied in the deferred render" /></p>
<p>The deferred approach obviously worked great for lights. Number of lights is usually the big sell for deferred rendering. But it worked great for environment maps too. In Frameranger we had quite a lot of shiny stuff &#8211; e.g. a car and a robot &#8211; and we wanted to handle it by multiple dynamic environment maps. With the deferred render it was easy to apply. We attached dynamic envmap nodes to things so they moved around, and then attached different objects as inputs and outputs. The inputs get rendered to the envmap, and the outputs get the envmap applied to them. To apply, I generated a small dynamic 1d mask texture which mapped the object indices to white or black &#8211; so I could sample it per pixel using the object index and determine whether that object was affected by the envmap. I calculated world-space bounds for the objects which received the envmap and used that to roughly stencil in the shader, and applied the envmap additively, adjusting it using parameters from the material tables. To control fresnel reflection we wanted something better than the usual single &#8220;fresnel power&#8221; parameter. In Lightwave you can create an envelope for it and explicitly control the fresnel response over the range of the incident angle values, and I wanted something similar &#8211; so I exposed 4 control values and used a 1d bezier curve to interpolate them. Worked great &#8211; you could generate a very flexible response with it.</p>
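<p>The four-value fresnel control can be sketched as a 1d cubic bezier evaluated over the incident angle. The function name is made up, and using 1 minus the cosine of the angle as the curve parameter is an assumption; the post does not say how the angle was mapped onto the curve.</p>

```python
def fresnel_response(controls, cos_theta):
    """Evaluate a 1D cubic bezier through four control values.

    controls: (c0, c1, c2, c3); c0 is the response when looking straight
    at the surface, c3 the response at grazing angles.
    cos_theta: cosine of the angle between the view vector and the normal.
    """
    c0, c1, c2, c3 = controls
    t = 1.0 - max(0.0, min(1.0, cos_theta))
    u = 1.0 - t
    return (u * u * u * c0
            + 3.0 * u * u * t * c1
            + 3.0 * u * t * t * c2
            + t * t * t * c3)
```

<p>The curve always starts at the first control value and ends at the last, and the two middle values pull the response up or down in between, which gives more shape control than a single power term.</p>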
<p><strong>Non-polygonal elements</strong></p>
<p>We had to render more than just triangle-based meshes. Some of the effects &#8211; specifically the liquids / fluid dynamics &#8211; were raytraced using distance fields. Rather than write a special shading path to handle their lighting I decided to just add these into the deferred rendering pipeline and use the routines that were already in place. This had the benefit of making them interact properly with the other objects in the scene, casting shadows onto them, receiving shadows from them, working with dynamic environment maps and so on &#8211; it meant the effects looked properly part of the scene, not just floating in space separate from everything else.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/11/fluid_jar_01.jpg" alt="raytraced fluid deferred rendered and mixed with poly elements" /></p>
<p>It was pretty easy to add. I raytraced the distance fields and output the results straight to the GBuffer &#8211; depth, normal, colour, etc. This also meant the ZBuffer was correct so the effects overlapped properly with the rest of the geometry. For the shadow map rendering passes I just raytraced them from the light&#8217;s point of view straight into the depth shadow map. It worked out nicely without too much effort.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/11/fluid_jar_02.jpg" alt="raytraced fluid with deferred rendering, mixed with polygonal elements" /></p>
<p><strong>Alpha stuff</strong></p>
<p>Alpha stuff doesn&#8217;t like deferred rendering &#8211; sad but true. Actually I have found a way around it so you can perform deferred rendering on some alpha stuff too &#8211; I&#8217;ll get onto that in another post &#8211; but in terms of general alpha blended stuff, you&#8217;re limited to using a forward render which mimics the look of the deferred rendered geometry. Fortunately in Frameranger we didn&#8217;t have that much generic alpha mesh stuff to deal with &#8211; the particles, smoke, light beams and so on were already special cases or handled with effects in other ways &#8211; so it wasn&#8217;t a massive deal. We also avoided treating punch-through alphas or cutouts as &#8220;alpha&#8221; by using alpha testing dithered with a random noise map. As it turned out I just used my ubershadered forward rendering code for the alpha stuff like a &#8220;legacy&#8221; pass. One compromise I made was to skip shadow receiving for alpha stuff &#8211; although it would have been possible, it would have meant I&#8217;d have had to keep the shadow maps around longer than the deferred passes, whereas at present the same maps could be reused for all the lights. In reality, the only real alpha stuff we had to deal with in this way were a couple of transparent bits on the car.</p>
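<p>The dithered alpha test mentioned above boils down to comparing each pixel&#8217;s alpha against a per-pixel threshold read from a tiled noise map. A minimal sketch, with hypothetical helper names and a seeded pseudo-random map standing in for whatever noise texture was actually used:</p>

```python
import random

def make_noise_map(size, seed=0):
    """Square map of thresholds in [0, 1), tiled across the screen."""
    rng = random.Random(seed)
    return [[rng.random() for _ in range(size)] for _ in range(size)]

def alpha_test_dithered(alpha, px, py, noise):
    """Keep the pixel only if its alpha beats the noise threshold there."""
    size = len(noise)
    return alpha > noise[py % size][px % size]
```

<p>Every surviving pixel is fully opaque, so it can go through the GBuffers like any other geometry; the fraction of pixels that survive is roughly equal to the alpha value, which reads as partial coverage once the image is filtered or blurred.</p>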
<p><strong>Antialiasing</strong></p>
<p>One of the major downsides of deferred rendering is the inability to apply standard hardware MSAA to it. On some architectures it&#8217;s possible to use MSAA when rendering the GBuffers, but you have to do the slow bit &#8211; using the GBuffers to perform deferred rendering passes &#8211; on a per-sample basis. That is, for 4x MSAA, you have to light 4x as many pixels and then average the results at the end. Our aim is to achieve antialiasing quality comparable to what MSAA gives forward renderers, but at the cost of deferred rendering with unantialiased rendertargets &#8211; i.e. only lighting the number of actual pixels on screen. With deferred rendering much of the cost of rendering is pushed to the deferred, 2D passes, so it&#8217;s important to avoid incurring a large performance cost there when adding antialiasing &#8211; scaling that cost by 4 to support 4x MSAA is not feasible.<br />
On consoles or DX10 the natural starting point is to render the geometry to MSAA GBuffers and to try and optimise the lighting process so you don&#8217;t need to light every sample. Indeed, I outlined how to optimise the process on Playstation 3 in my <a href="http://research.scee.net/files/presentations/gdc2009/DeferredLightingandPostProcessingonPS3.ppt">GDC 09 presentation</a>. That can reduce the number of additional samples you have to light to only around 20-30% more than the number of pixels in the unantialiased buffer, which is a great improvement but still costs.<br />
On DX9 the problem is even worse because you can&#8217;t read the individual samples from an MSAA buffer, so MSAA is completely unusable for deferred rendering there.</p>
<p>So, plan B then.</p>
<p>There are several ways to tackle antialiasing of deferred renderers. First, you could just render everything 2x or 4x the size, light it as usual, and downsample it at the very end. It looks nice &#8211; much like MSAA on a forward render, really &#8211; and it&#8217;s easy to add. But it has exactly the effect on framerate you might imagine, so it&#8217;s not a practical solution for realtime. So the usual way people try and do it is to fake it &#8211; perform a 2D post process on the result of the deferred render which somehow works out where the edges are and fixes them in a way that looks like they were rendered with antialiasing. This approach is apparently rife on Xbox 360 titles where the hardware&#8217;s dubious memory arrangement makes using proper MSAA on HD framebuffers problematic.</p>
<p>So, how would that magic post process work exactly? Step 1 &#8211; detecting edges &#8211; is easy, particularly in a deferred renderer where we have a ton of information around to help. An edge detection kernel filter applied to the depth, normal and material index/object index usually gives great results, far superior to using a colour buffer for our purposes. Step 2 &#8211; antialiasing those edges &#8211; is a little bit more difficult. It&#8217;s important to remember that what antialiasing is doing is over-sampling: generating a load of possible values and averaging them. The usual approach with post-process AA is to blur the neighbouring pixels on the edges, which is really the opposite of what we wanted. So it doesn&#8217;t really work. For Frameranger I experimented a lot and managed a reasonable attempt at it which used noise, a poisson disc and some magic weighting of the samples &#8211; and it looked marginally better than a typical edge blur &#8211; more like an &#8220;edge dither&#8221;.</p>
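<p>To make step 1 concrete, here is a minimal CPU-side sketch in Python rather than shader code. It assumes the depth and normal GBuffers are plain 2D lists; the function name <code>detect_edges</code> and both thresholds are made up for illustration and are not the values used in Frameranger.</p>

```python
def detect_edges(depth, normals, depth_threshold=0.1, normal_threshold=0.9):
    """Return a 2D 0/1 mask marking pixels that sit on a depth or normal discontinuity.

    depth   : 2D list of floats (one depth per pixel)
    normals : 2D list of unit-length (x, y, z) tuples
    Both thresholds are illustrative assumptions.
    """
    height, width = len(depth), len(depth[0])
    mask = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Compare each pixel with its right and lower neighbour.
            for dx, dy in ((1, 0), (0, 1)):
                nx, ny = x + dx, y + dy
                if nx >= width or ny >= height:
                    continue
                depth_jump = abs(depth[y][x] - depth[ny][nx]) > depth_threshold
                a, b = normals[y][x], normals[ny][nx]
                dot = a[0] * b[0] + a[1] * b[1] + a[2] * b[2]
                crease = dot < normal_threshold
                if depth_jump or crease:
                    # Mark both sides of the discontinuity as edge pixels.
                    mask[y][x] = 1
                    mask[ny][nx] = 1
    return mask
```

<p>A material or object index buffer could be added as a third test in the same loop (mark an edge wherever the index changes between neighbours).</p>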
<p>Here are some screenshots of the edge noise technique. As you can see, it doesn&#8217;t entirely look like antialiasing. Actually as an effect, adding noise to the edges, it was alright &#8211; but as antialiasing it wasn&#8217;t a good substitute. It appeared that if I wanted to achieve the look of antialiasing I was going to have to move away from using only the 2D results as input and render the geometry differently in the first place &#8211; try and gain some more information that way. By the way, all the antialias comparison screenshots should be matched so you can download the images and diff them or toggle back and forth between them if you want to see the differences.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/11/dirty_edge_noise_off.jpg" alt="edge noise off" /><br />
<img src="http://directtovideo.files.wordpress.com/2009/11/dirty_edge_noise_on.jpg" alt="edge noise on" /></p>
<p>The second approach I used borrowed from temporal reprojection techniques, and a very old way of antialiasing in an OpenGL example. In that example, antialiasing was done by rendering the scene over and over again to an accumulation buffer and jittering the projection matrix by a sub-pixel amount each time. It&#8217;s basically stochastic sampling of the render but flipped around so that you render the screen with one stochastic offset, then again with another offset etc. and average the results at the end. When you offset the projection slightly you cause the edges of the triangles to move slightly, and you get slightly different coverage and a different set of aliasing artefacts &#8211; when you average enough of them together you get an antialiased image.</p>
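<p>The accumulation idea can be sketched on the CPU with a toy scene. This is only an illustration under stated assumptions: the &#8220;scene&#8221; is a single vertical edge at a hypothetical position <code>EDGE_X</code>, and instead of jittering a projection matrix the sketch shifts the sample position inside each pixel, which has the same effect for this toy case. The four jitter offsets are an arbitrary choice.</p>

```python
EDGE_X = 2.5  # hypothetical scene: everything left of this vertical line is covered


def covered(x, y):
    """Point-sample the toy scene: 1.0 inside the shape, 0.0 outside."""
    return 1.0 if x < EDGE_X else 0.0


def render(width, height, offset_x, offset_y):
    """Render one aliased frame, sampling each pixel centre plus a sub-pixel offset."""
    return [[covered(px + 0.5 + offset_x, py + 0.5 + offset_y)
             for px in range(width)]
            for py in range(height)]


def accumulate(width, height, offsets):
    """Average several jittered renders, like an accumulation buffer would."""
    total = [[0.0] * width for _ in range(height)]
    for offset_x, offset_y in offsets:
        frame = render(width, height, offset_x, offset_y)
        for y in range(height):
            for x in range(width):
                total[y][x] += frame[y][x]
    return [[value / len(offsets) for value in row] for row in total]


# Four sub-pixel offsets on a regular 2x2 grid (an illustrative choice).
JITTER = [(-0.25, -0.25), (0.25, -0.25), (-0.25, 0.25), (0.25, 0.25)]
```

<p>With these offsets, the pixel that straddles the edge ends up half covered after averaging, whereas a single unjittered render gives it a hard 0 or 1.</p>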
<p>Sadly, as you might expect, rendering the scene loads of times per frame isn&#8217;t too practical for realtime. But we can take the basic idea and split it over frames &#8211; so each frame we slightly jitter the projection matrix when we render, and we blend the current frame on top of the previous frames with a low alpha value. That works great and gives you a really nice antialiased image.. as long as nothing moves. As soon as you move you get big ugly motion-blur-like trails. So what we need to do is try and fix the case where it moves.</p>
<p>I split &#8220;movement&#8221; into two cases: object movement and camera movement. Camera movement is where temporal reprojection comes into play. How this works is, when we&#8217;re blending the current frame onto the previous frame we don&#8217;t just use the same pixel in the previous frame. Instead we project the current frame&#8217;s pixel back into the previous frame&#8217;s camera space by using a combination of the inverse view projection for the current frame and the view projection from the previous frame &#8211; which also requires the pixel depth, which is fortunately kicking around in a GBuffer &#8211; then sample the pixel there and blend it. This is basically trying to track positions in world space as they move in screen space. It does indeed fix most of the camera movement artefacts. Of course, problems do occur &#8211; at the edge of the screen, or where pixels that were occluded become unoccluded and vice versa. To cancel those, I weight the alpha of the blend by the world-space distance between the previous and current frame pixels. What it&#8217;s trying to work out is &#8220;is it really the same pixel&#8221;, and if it isn&#8217;t, don&#8217;t try and blend it &#8211; just overwrite it. Conveniently that can be used to fix object motions too. Finally, I use an edge detect mask on the whole thing so only the edges get blended.</p>
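<p>Here is a rough sketch of the reprojection and the &#8220;is it really the same pixel&#8221; test, in Python. To keep it short it assumes a toy 2D pinhole camera that only slides along x, so <code>unproject</code> and <code>project</code> stand in for the inverse view projection of the current frame and the view projection of the previous frame. The names, <code>FOCAL</code>, the base weight and the distance tolerance are all hypothetical.</p>

```python
import math

FOCAL = 1.0  # hypothetical focal length of the toy camera


def project(world, cam_x):
    """World (x, z) -> (screen_x, depth) for a camera at cam_x looking down +z."""
    world_x, world_z = world
    return (FOCAL * (world_x - cam_x) / world_z, world_z)


def unproject(screen_x, depth, cam_x):
    """Screen position plus GBuffer depth -> world (x, z); the inverse of project."""
    return (cam_x + screen_x * depth / FOCAL, depth)


def reproject(screen_x, depth, cam_x_now, cam_x_prev):
    """Find where the surface under the current pixel was on screen last frame."""
    world = unproject(screen_x, depth, cam_x_now)
    prev_screen_x, _ = project(world, cam_x_prev)
    return world, prev_screen_x


def history_weight(world_now, world_prev, base_weight=0.9, tolerance=0.05):
    """Weight for the previous frame: full if both frames saw the same world
    point, fading linearly to zero as the two world positions drift apart."""
    distance = math.dist(world_now, world_prev)
    return base_weight * max(0.0, 1.0 - distance / tolerance)


def resolve(current, history, weight):
    """Blend the current colour with the reprojected history colour."""
    return current * (1.0 - weight) + history * weight
```

<p>In use, <code>world_prev</code> would be rebuilt from the previous frame&#8217;s depth sampled at <code>prev_screen_x</code>. When a pixel has just become unoccluded the two world positions disagree, the weight drops to zero and the current colour simply overwrites the history.</p>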
<p>Here&#8217;s some images to compare (off and on):</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_off01.jpg" alt="temporal AA off" /><img src="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_on01.jpg" alt="temporal AA on" /></p>
<p>Well, it works! It actually works pretty nicely. The thing about temporal techniques is that they settle over time &#8211; so when you get a relatively static screen it quickly converges to an antialiased image, becoming more aliased as movement is introduced. For a situation where you didn&#8217;t have too much movement on screen it&#8217;s a good technique. The problem is, for Frameranger we had some very fast camera movements &#8211; it just wasn&#8217;t working well enough.</p>
<p>Here&#8217;s some shots showing how it looks when the red cube is in motion (off and on again) &#8211; and with a small camera motion too. As you can see, some aliasing does creep back in.<br />
<img src="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_motion_off01.jpg" alt="temporal aa off in motion" /><img src="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_motion_on01.jpg" alt="temporal on in motion" /></p>
<p>Finally, over time, I came up with an actual real solution to deferred rendering with antialiasing. Unfortunately it requires an additional geometry pass, but it gives you the look of proper MSAA. The idea is to render the GBuffers, lighting passes and so on to a non-MSAA buffer, then re-render the geometry to an MSAA buffer, with a shader that just samples the buffer containing the lighting results using a bilateral filter based on depth. Then resolve that MSAA buffer to give you an antialiased result. This is in a way quite similar to a light prepass using inferred lighting, but the MSAA pass <em>only</em> samples the lighting results buffer &#8211; it doesn&#8217;t need to compute any shading of its own.</p>
<p>We know that the reason MSAA is efficient is it only runs the pixel shader once per pixel, but generates a depth/stencil value for each sample in the pixel &#8211; so depth test+write is performed multiple times. This means that for primitive interiors with no intersections, the value of the pixel will be the same as for the non-MSAA buffer &#8211; only the edges of primitives and the intersections between primitives will have different values. This technique exploits this. We generate the depth values for MSAA by re-rendering the primitives; we just need to work out what colour to write for each pixel on each primitive. On edge pixels for a non-MSAA buffer, the final pixel value will be that of the front-most primitive. On edge pixels for a resolved MSAA buffer, the pixel value will be an average of the MSAA sample values &#8211; the value of the front-most primitive per sample. This technique estimates the value per primitive to write by using the bilateral filter to look around a small area around the pixel in the same screen location as the one it&#8217;s currently shading on the non-MSAA buffer, and asking &#8220;which of these probably came from the primitive I&#8217;m rendering, and is therefore a good estimate&#8221;. Or in practice a weighted average of the pixels in the area, weighted by the difference between the non-MSAA depth and the primitive&#8217;s pixel depth.</p>
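<p>The depth-weighted lookup can be sketched as follows. This is a CPU-side illustration in Python, not the shader itself: the lit buffer and the non-MSAA depth buffer are plain 2D lists, and the Gaussian falloff, <code>radius</code> and <code>sigma</code> are assumptions picked for the example.</p>

```python
import math


def bilateral_sample(light, depth, x, y, primitive_depth, radius=1, sigma=0.1):
    """Estimate the lit colour for the primitive being re-rendered at pixel (x, y).

    light, depth    : 2D lists from the non-MSAA lighting and depth buffers
    primitive_depth : depth of the primitive currently being rasterised
    Neighbours whose stored depth is close to primitive_depth get a high weight,
    so the result is taken from pixels that probably belong to this primitive.
    """
    height, width = len(light), len(light[0])
    total = 0.0
    weight_sum = 0.0
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            # Clamp to the buffer edges.
            sx = min(max(x + dx, 0), width - 1)
            sy = min(max(y + dy, 0), height - 1)
            diff = depth[sy][sx] - primitive_depth
            weight = math.exp(-(diff * diff) / (2.0 * sigma * sigma))
            total += weight * light[sy][sx]
            weight_sum += weight
    if weight_sum < 1e-12:
        # Nothing nearby matches this primitive; fall back to the centre pixel.
        return light[y][x]
    return total / weight_sum
```

<p>On an edge pixel where the non-MSAA buffer stored the far surface, calling this with the near primitive&#8217;s depth pulls in the near surface&#8217;s lit colour from the neighbouring pixel, which is what the MSAA samples covered by that primitive need.</p>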
<p>Bilateral upsampling is an extremely useful technique for fixing up edges. It&#8217;s also a very good way to upsample lower resolution soft particle buffers, for example. I&#8217;m pretty happy with this method, and it&#8217;s what we&#8217;re now using for the much improved Frameranger final version (due out soon!). It&#8217;s made a lot of difference to the quality and cleanness of the look, with a pretty acceptable overhead. It scales to 4x, 8x or more MSAA samples nicely and only impacts the cost for that one pass, which has a reasonably simple shader and output bandwidth requirement (compared to the GBuffer stages or the deferred lighting passes). It actually has some similarities to <a href="http://graphics.cs.uiuc.edu/~kircher/publications.html">Inferred Lighting</a>, although my method is just for antialiasing and not for shading.</p>
<p><img src="http://directtovideo.files.wordpress.com/2009/11/bilateral_aa_off1.jpg" alt="bilateral aa off" /><br />
<img src="http://directtovideo.files.wordpress.com/2009/11/bilateral_aa_on1.jpg" alt="bilateral aa on" /></p>
<p>Now, if you happen to be on a more flexible API or piece of hardware than me where you can read samples from MSAA buffers &#8211; like a PS3 &#8211; there&#8217;s an optimisation / extension to this &#8211; you can use the same technique but avoid the re-render. It goes like this: render the GBuffers to MSAA targets; resolve the buffers using a point sampling scheme &#8211; e.g. &#8220;pick top left sample&#8221;; run the lighting processes on these resolved buffers; now perform an additional fullscreen pass: read in your original MSAA GBuffers, then for each MSAA sample from that buffer, perform a bilateral filter w.r.t. depth/normal, sampling from the resolved GBuffers and light buffer, to weight the resolved light buffer samples for that MSAA sample from the GBuffer. That will give you a bilateral upsampled light result per MSAA sample of the GBuffer, and you can average them in the shader and write out one final antialiased light value. Clearly you only need to run these shaders on edges if you wish to optimise it further. So there you go &#8211; a practical solution to deferred rendered antialiasing which only needs one geometry rendering pass, and lets you perform lighting on a single-sample screen-sized buffer without worrying about MSAA at that stage. Shame it doesn&#8217;t work on DX9, because then for me it would be the ideal solution.</p>
<p>Coming up in part 2 : ambient occlusion.</p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2009/11/13/deferred-rendering-in-frameranger/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/deferred_2points.jpg" medium="image">
			<media:title type="html">An early test with deferred lighting using two point lights.</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/carshader01.jpg" medium="image">
			<media:title type="html">custom shader used for the car paint, applied in the deferred render</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/fluid_jar_01.jpg" medium="image">
			<media:title type="html">raytraced fluid deferred rendered and mixed with poly elements</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/fluid_jar_02.jpg" medium="image">
			<media:title type="html">raytraced fluid with deferred rendering, mixed with polygonal elements</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/dirty_edge_noise_off.jpg" medium="image">
			<media:title type="html">edge noise off</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/dirty_edge_noise_on.jpg" medium="image">
			<media:title type="html">edge noise on</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_off01.jpg" medium="image">
			<media:title type="html">temporal AA off</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_on01.jpg" medium="image">
			<media:title type="html">temporal AA on</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_motion_off01.jpg" medium="image">
			<media:title type="html">temporal aa off in motion</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/temporaledgeaa_motion_on01.jpg" medium="image">
			<media:title type="html">temporal on in motion</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/bilateral_aa_off1.jpg" medium="image">
			<media:title type="html">bilateral aa off</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/11/bilateral_aa_on1.jpg" medium="image">
			<media:title type="html">bilateral aa on</media:title>
		</media:content>
	</item>
		<item>
		<title>a thoroughly modern particle system.</title>
		<link>http://directtovideo.wordpress.com/2009/10/06/a-thoroughly-modern-particle-system/</link>
		<comments>http://directtovideo.wordpress.com/2009/10/06/a-thoroughly-modern-particle-system/#comments</comments>
		<pubDate>Tue, 06 Oct 2009 15:30:56 +0000</pubDate>
		<dc:creator>directtovideo</dc:creator>
				<category><![CDATA[demoscene]]></category>
		<category><![CDATA[fluid dynamics]]></category>
		<category><![CDATA[realtime rendering]]></category>

		<guid isPermaLink="false">http://directtovideo.wordpress.com/?p=57</guid>
		<description><![CDATA[During the making of Frameranger, I spent some time looking into making a &#8220;modern particle system&#8221;. Particles have been around for ever and ever, and by and large they haven&#8217;t changed that much in demos over the last 5-10 years. You simulate the particles (around 1000 &#8211; 100,000 of them) on the CPU, animating them [...]]]></description>
			<content:encoded><![CDATA[<p><img src="http://directtovideo.files.wordpress.com/2009/10/blunderbuss02.jpg" alt="particles in blunderbuss" /></p>
<p>During the making of Frameranger, I spent some time looking into making a &#8220;modern particle system&#8221;. Particles have been around for ever and ever, and by and large they haven&#8217;t changed that much in demos over the last 5-10 years. You simulate the particles (around 1000 &#8211; 100,000 of them) on the CPU, animating them using a mix of simple physics, morphs and hardcoded magic; sort them back to front if necessary, and then upload the vertex buffers to the GPU where they get rendered as textured quads or point sprites. The CPU gets hammered by simulation and sorting, and the GPU has to cope with filling all of the alpha blended, textured pixels. </p>
<p>However, particles in the offline rendering / film world have changed a lot. Counts in the millions, amazing rendering, fluid dynamics controlling the motion. Renderers like <a href="http://www.youtube.com/watch?v=7c4WYzr30B0">Krakatoa</a> have produced some amazing images and animations. I spent some time looking around on the internet at all sorts of references and tried to nail down what those renderers had but I didn&#8217;t &#8211; and therefore needed. This is something I do a lot when developing new effects or demos. Why bother looking at what&#8217;s currently done realtime? That&#8217;s already been done. :)</p>
<p>I decided on the following key things I needed:<br />
1. Particle count. I want more. I want to be able to render sand or smoke or dust with particles. That means millions. 1 million would be a good start.<br />
2. Spawning. Instead of just spawning from a simple emitter, I want to be able to spawn them using images or meshes.<br />
3. Movement. I want to apply fluid dynamics to the particles to make them behave more like smoke or dust. And I want to morph them into things, like meshes or images &#8211; not just use the usual attractors and forces.<br />
4. Shading. To look better the particles really need some form of lighting &#8211; to look like millions of little things forming a single solid-ish whole, not millions of little things moving randomly and independently.<br />
5. Sorting. Good shading implies not additive blending, which implies sorting. </p>
<p>The problem with simulating particles on the CPU is that no matter how fast the simulation code on the CPU is, you&#8217;re going to hit two bottlenecks sooner or later: 1. you have to get that vertex data to the GPU &#8211; and that can make you bandwidth limited; and 2. you need to sort the particles back to front if you want to shade them nicely, which gets progressively slower the more you have. Fortunately given shader model 3 and up, it&#8217;s quite doable to make a particle system simulate on the GPU. You make big render targets for the particle positions, colours and so on; simulate in a pixel shader; and use vertex texture fetch to read from that texture in the vertex shader and give you an output position. Easy. Not quite &#8211; simulating on the GPU brings its own set of problems, but on to that later. Modern GPUs are sufficiently fast to easily be able to perform the operations to simulate millions of particles in the pixel shader, and outputting 1 million point sprites from the vertex shader is doable.</p>
<p>Shading is the biggest problem here, and the shading problem is mainly a lighting problem. Lighting for solid objects means a mix of diffuse and specular reflection, shadows and global illumination. Computing diffuse and specular reflection requires a normal, something which particles do not really have, unless we fake it. So that was my first line of attack &#8211; generate a normal for the particle. It would need to be consistent with the shape of the system, locally and globally, if it was going to give a good lighting approximation. I tried to use the position of the particle to generate a normal. It turns out that&#8217;s rather difficult if you&#8217;ve got something other than a load of static particles in the shape of a sphere or box. Then I tried to use a mesh as an emitter and use the underlying normal from the mesh for the particle. It did work, but of course once the particle moves away from its spawn point it becomes less and less accurate.</p>
<p>The image here shows particles generated from the car mesh in Frameranger, matching the shading and lighting.<br />
<img src="http://directtovideo.files.wordpress.com/2009/10/particlecar_small.jpg" alt="particles mapped to the car from frameranger" /></p>
<p>I needed a better reference, so I looked away from solid objects and had a look at how you would light a volumetric object &#8211; e.g. a cloud. Which in real life is actually millions upon millions of little particles, so maybe it makes quite a good match to lighting, well.. particles. It works out as a model of scattering and absorption. You cast light rays into the volume, and ray march through it. Whenever the ray hits a cell that isn&#8217;t empty, a bit of the light gets absorbed by the cell and a bit of it scattered along secondary rays in different directions, and the rest passes on to the next cell. The cell&#8217;s brightness is the amount of light remaining on the ray when it gets to that cell. Scattering properly is hideously slow and expensive so we&#8217;ll just completely ignore it, and instead add a global constant to fake it (a good old &#8220;ambient&#8221; term). That just leaves us with marching rays through the volume and subtracting a small amount per cell, scaled by the amount of stuff in the cell. This actually works great, and I&#8217;ve used it for shading realtime smoke simulations &#8211; with a few additional constraints, like fixing to directional lights only and from a fixed direction, you can do it pretty efficiently. It looks superb too.</p>
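<p>For a single light ray, the absorption-only march boils down to a few lines. This is a minimal Python sketch, assuming the volume has already been reduced to a list of densities along the ray; the <code>absorption</code> and <code>ambient</code> values are arbitrary illustration constants.</p>

```python
def march_light(densities, absorption=0.5, ambient=0.1):
    """March one light ray through a row of cells, nearest to the light first.

    densities : amount of "stuff" in each cell along the ray
    Returns the brightness of each cell: the light still left on the ray when it
    arrives, plus a constant ambient term standing in for scattering.
    """
    remaining = 1.0
    brightness = []
    for density in densities:
        brightness.append(remaining + ambient)
        # Absorb a small amount, scaled by how much stuff is in this cell.
        remaining = max(0.0, remaining - absorption * density)
    return brightness
```

<p>For example, with <code>ambient=0.0</code> the densities <code>[1.0, 0.0, 1.0, 1.0]</code> give brightnesses <code>[1.0, 0.5, 0.5, 0.0]</code>: the empty second cell absorbs nothing, so the third cell sees the same light as the second.</p>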
<p>The problem is that the particles are not in a format that is appropriate for ray marching (like a volume texture). But the look is great &#8211; we just need a way of achieving it for particles. What we&#8217;re dealing with is semi-transparent things casting shadows, so it makes sense to research how to handle that. The efficient way of handling shadows for things nowadays is to use shadow maps. But shadow maps only work for opaque things &#8211; they give you the depth of the closest thing at each point in a 2D projection of light space. For alpha things you need more information than that, because otherwise the shadows will be solid. </p>
<p>Or do you? The first thing I tried was very simple &#8211; to use exponential shadow maps. Exponential shadowmaps have a great artefact / bug where the shadow seems to fade in close to the caster, and this is usually annoying &#8211; but for semi-transparent stuff we can use it to our advantage. Yep, plain old exponential shadowmaps actually work pretty well as shadowmaps for translucent objects &#8211; as long as those translucent objects aren&#8217;t all that translucent (e.g. smoke volumes). The blur step also makes small casters soften with those around them. It&#8217;s pretty fast too, and it almost drops into your regular lighting pipeline. But, for properly transparent (low alpha) stuff like particles, it&#8217;s not quite good enough. </p>
<p>The really nice high end offline way is to use deep shadow maps. That basically gives you a function or curve that gives you the shadow intensity at a given depth value. It&#8217;s usually generated by buffering up all the values written to each pixel in the map (depth and alpha), sorting them, and fitting a curve to them which is stored. Unfortunately it doesn&#8217;t map too well to pixel shader hardware. However there is a discrete version which is much simpler &#8211; opacity shadow maps. For this you divide depth into a series of layers and, at each layer, sum up the alpha values of the stuff that lies between the light and that layer. On modern GPUs that&#8217;s actually pretty easy &#8211; you can fit 16 layers into 4 MRTs of 4 channels each, and render them in one pass! Unfortunately it&#8217;s not expandable beyond that without adding more passes, but it&#8217;s good enough to be getting on with &#8211; as long as you don&#8217;t need to cover such a large depth range that the layers end up too spaced out. But this gives us nice shadows which work with semi-transparent stuff properly. You could even do coloured shadows if you didn&#8217;t mind fewer layers or multiple passes.</p>
<p>The next issue is how to apply that shadow information to the particles &#8211; it requires sampling from 4 maps plus a bunch of maths, and isn&#8217;t all that quick. If we did it for every pixel rendered for the particles, it&#8217;d hammer the already stressed pixelshader. If the particles are small enough we could just sample it once per particle in the vertex shader &#8211; but it&#8217;s too many textures to sample. Fortunately the solution is easy &#8211; just calculate a colour buffer using the fragment shader, with all the lighting and shading information per particle in it, and sample that in the vertex shader. The great thing about that is it&#8217;s really similar in concept to the deferred renderer I&#8217;ve already got for solid geometry. You have a buffer containing positions and other information; you perform the lighting in multiple passes, one per light, blending into a composite buffer; then sample that composite buffer to get the particle colour when rendering to the screen. It&#8217;s so similar in fact to the deferred rendering pipeline that I can use almost the same lighting code, and even the same shadow maps from solid geometry to apply to the particles too &#8211; so particles can cast shadows on geometry, and geometry can cast shadows on particles.</p>
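<p>As a rough illustration of opacity shadow maps for a single shadow-map texel, here is a Python sketch. It assumes each layer stores the summed alpha of everything between the light and that layer, that lookups interpolate linearly between layers, and that the light reaching a point is simply one minus the accumulated opacity, clamped at zero. Those choices and the function names are assumptions for the example.</p>

```python
def build_opacity_layers(particles, near, far, num_layers=16):
    """Accumulate particle alpha into depth layers for one shadow-map texel.

    particles : list of (light_space_depth, alpha) pairs
    Layer i sits at depth near + (i + 1) * step and stores the summed alpha of
    every particle closer to the light than that depth.
    """
    step = (far - near) / num_layers
    layers = [0.0] * num_layers
    for depth, alpha in particles:
        for i in range(num_layers):
            layer_depth = near + (i + 1) * step
            if depth < layer_depth:
                layers[i] += alpha
    return layers


def light_reaching(layers, depth, near, far):
    """Fraction of light arriving at a given depth, interpolating between layers."""
    num_layers = len(layers)
    step = (far - near) / num_layers
    samples = [0.0] + layers  # opacity is zero at the near plane
    u = min(max((depth - near) / step, 0.0), float(num_layers))
    k = min(int(u), num_layers - 1)
    frac = u - k
    opacity = samples[k] * (1.0 - frac) + samples[k + 1] * frac
    return max(0.0, 1.0 - opacity)
```

<p>With 16 layers, the list maps directly onto 4 render targets of 4 channels each. The interpolation also shows the spacing problem: the wider the depth range, the further apart the layers sit and the more a particle&#8217;s shadow gets smeared along the ray.</p>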
<p><img src="http://directtovideo.files.wordpress.com/2009/10/particlelighting02.jpg" alt="particle lighting 01" /><img src="http://directtovideo.files.wordpress.com/2009/10/particlelighting01.jpg" alt="particle lighting 02" /></p>
<p>This shading pipeline &#8211; compositing first to a buffer, one pixel per particle &#8211; opens up new options. We can do all the same tricks we do in deferred rendering, like indexing a lookup table which contains material parameters for example. Or apply environment maps as well as lights. Or perform more complicated operations like using the particle&#8217;s life to index a colour lookup texture and change colour over the life of the particle &#8211; make it glow at first then fade down. It allows multiple operations to be glued together as separate passes rather than making many combinations of one shader pass. </p>
<p>So, we have a particle colour in a buffer. The next job is to render the particles to the screen. We&#8217;ve gone to so much effort to colour them well that we need to consider sorting &#8211; back to front &#8211; so it actually looks right. This could be problematic &#8211; we&#8217;ve got 1 million+ particles to sort, all moving independently and potentially quite quickly and randomly, and it has to be done on the GPU not the CPU &#8211; we can&#8217;t be pulling them back to CPU just to sort.</p>
<p>I had read some papers on sorting on the GPU but I decided it looked totally evil, so I ignored them. My first sorting approach was basically a bucket sort on GPU. I created a series of &#8220;buckets&#8221; &#8211; between 16 and 64 slices the size of the screen, laid out on a 2d texture (which was massive, by the way), with z values from the near to the far plane. Then I rendered the particles to that slice target, and in the vertex shader I worked out which slice fit the particle&#8217;s viewspace z value, and offset the output position to be in that slice. So, in one pass I had rendered all the particles to their correct &#8220;buckets&#8221; &#8211; all I had to do was to blend the buckets to the main screen from back to front, and I got a nicely sorted particle render which rendered efficiently &#8211; not much slower than not sorting at all. Unfortunately it had some problems &#8211; it used an awful lot of VRAM for the slice target, and the granularity of the slices was poor &#8211; they were too spread out, so sometimes all the particles would end up in one slice and not be sorted at all. I improved the Z ranges of the slices to fit the approximate (i.e. guessed) bounding box of the particle system, but it still didn&#8217;t have great precision. In the end on Frameranger the VRAM requirements were simply too high, and I had to drop the effect. It turns out that the layers method is very useful for other things though, like rendering particles into volumes or arbitrary-layered opacity shadow maps.</p>
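<p>The slice selection itself is tiny. Here is a CPU-side Python sketch of the bucketing and the back-to-front composite order. It assumes particles are <code>(view_z, name)</code> tuples and that larger z means further from the camera; the function names are invented for the example.</p>

```python
def slice_index(view_z, near, far, num_slices):
    """Pick the depth slice for a view-space z, clamped to the valid range."""
    t = (view_z - near) / (far - near)
    return min(num_slices - 1, max(0, int(t * num_slices)))


def bucket_particles(particles, near, far, num_slices=16):
    """Drop each (view_z, name) particle into its slice in a single pass."""
    buckets = [[] for _ in range(num_slices)]
    for particle in particles:
        buckets[slice_index(particle[0], near, far, num_slices)].append(particle)
    return buckets


def draw_order(buckets):
    """Composite slices back to front: the farthest slice is drawn first."""
    order = []
    for bucket in reversed(buckets):
        order.extend(bucket)
    return order
```

<p>Particles that land in the same slice keep their arrival order, which is exactly the granularity problem described above: with 4 slices over a depth range of 0 to 10, particles at z = 5.0 and z = 5.2 share a slice and are not sorted against each other.</p>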
<p>When I revisted the particle effect, I knew the sorting had to be fixed. I looked back at the papers on GPU sorting, specifically the one in <a href="http://http.developer.nvidia.com/GPUGems2/gpugems2_chapter46.html">GPU Gems</a>. They seemed very heavyweight &#8211; a sort of a 1024&#215;1024 buffer (i.e. 1 million particles) would require 210 passes over that buffer per frame, which is completely unfeasible on a current high end GPU. But there was one line which caught my attention &#8211; &#8220;This will allow us to use intermediate results of the algorithm that converge to the correct sequence while we do more passes incrementally&#8221;. One of the sorting techniques would work over multiple frames &#8211; i.e. for each iteration of the algorithm, the results would be more sorted than the previous iteration &#8211; it would not give randomly changing orders, but converge on a sorted order. Perfect &#8211; we could split the sort over N frames, and it would get better and better each frame. That&#8217;s exactly what I did, and it actually worked great. It used much less memory than the bucket sort method and gave better accuracy too &#8211; and the performance requirements could be scaled as necessary in exchange for more frames needed to sort.</p>
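<p>Here is a small Python sketch of a sort that can be spread over frames like this. It uses odd-even transposition sort, which is one simple algorithm with the &#8220;every pass leaves things more sorted&#8221; property; that choice is an assumption for illustration, not necessarily the algorithm used here. The helper <code>merge_network_passes</code> just reproduces the 210 figure, which matches the pass count of a sorting network such as bitonic merge sort, log2(N)(log2(N)+1)/2 with N = 2^20.</p>

```python
def merge_network_passes(count):
    """Passes needed by a bitonic-style sorting network for a power-of-two count."""
    stages = count.bit_length() - 1  # log2(count) when count is a power of two
    return stages * (stages + 1) // 2


def sort_pass(keys, phase):
    """One odd-even transposition pass, sorting far to near (descending).

    phase 0 compares pairs (0,1), (2,3), ...; phase 1 compares (1,2), (3,4), ...
    Each pair is independent of the others, which is what makes a pass parallel.
    """
    for i in range(phase, len(keys) - 1, 2):
        if keys[i] < keys[i + 1]:
            keys[i], keys[i + 1] = keys[i + 1], keys[i]


def inversions(keys):
    """Count pairs that are still in the wrong order for a descending sort."""
    return sum(1 for i in range(len(keys))
               for j in range(i + 1, len(keys)) if keys[i] < keys[j])


def sort_over_frames(keys, passes_per_frame, frame_count):
    """Run a few passes per frame and record how unsorted the keys still are."""
    history = []
    phase = 0
    for _ in range(frame_count):
        for _ in range(passes_per_frame):
            sort_pass(keys, phase)
            phase ^= 1
        history.append(inversions(keys))
    return history
```

<p>Each swap removes exactly one out-of-order pair, so the count in <code>history</code> never goes up from one frame to the next. <code>passes_per_frame</code> is the knob that trades per-frame cost against the number of frames needed to settle.</p>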
<p>There are some irritations with simulating particles on the GPU. Each particle must be treated independently and you have to perform a whole pass on all the particles simultaneously. It makes things which are trivial on the CPU, like counting how many particles you emitted so far that frame, very difficult or not feasible at all on the GPU. But it&#8217;s a rather important thing to solve &#8211; you often need to be able to emit particles slowly over time, rather than all at once. The first way I tried to solve that was to use the location of the particle in the position buffer. I would for example emit the particles in the y range 0 to 0.1 on the first frame, then 0.1 to 0.2 on the next, and so on. It worked to a point, but fell down when I started randomising the particles&#8217; lifetimes &#8211; I needed to emit different particles at different times. Then I realised something useful. If you&#8217;re dealing with loads and loads of something &#8211; like a near infinite amount &#8211; then doing things randomly is as good as doing things correctly. I.e. I don&#8217;t need to correctly emit say 100 particles this frame &#8211; I just need to try to emit e.g. roughly 1% of particles this frame, and if I&#8217;ve got enough particles in the first place, it&#8217;ll look alright. The trick is making sure that 1% has the right amount of randomness. </p>
<p>I&#8217;ll explain. The update goes like this: 1. Generate a buffer of new potential spawn positions for particles. 2. Update the particle position buffer by reading the old positions, applying the particle velocities to them, and reducing the life; then, if the life is less than 0, pick the corresponding value from the spawn position buffer and write that out instead. So each frame I generate a whole set of spawn positions for the particles, but they only get used if the particle dies. But how to control the emission? Clearly if I put a value in the spawn buffer which has an initial life of less than 0 and it gets used, it&#8217;ll get killed by the renderer anyway and the next frame around it&#8217;ll respawn again &#8211; i.e. the particle never gets rendered and doesn&#8217;t really get spawned either. So if I want to control the number of particles emitted, I just limit the number of values in the spawn buffer each frame that have an initial life greater than 0. </p>
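<p>As a rough CPU-side sketch of that update, assuming positions and velocities are tuples and lives are plain floats (the real thing is a pixel shader over textures, and these names are hypothetical):</p>

```python
def update_particles(positions, velocities, lives,
                     spawn_positions, spawn_lives, dt):
    """One simulation step; every particle slot is handled independently.

    Velocities are left untouched on respawn here, which is a
    simplification: the post does not describe that part.
    """
    new_positions, new_lives = [], []
    for pos, vel, life, s_pos, s_life in zip(
            positions, velocities, lives, spawn_positions, spawn_lives):
        life -= dt
        if life < 0.0:
            # Dead: take this slot's entry from the spawn buffer. If that
            # entry has a negative life too, the slot stays dead and simply
            # tries again next frame.
            new_positions.append(s_pos)
            new_lives.append(s_life)
        else:
            new_positions.append(tuple(p + v * dt for p, v in zip(pos, vel)))
            new_lives.append(life)
    return new_positions, new_lives
```

<p>Emission is then controlled purely by how many spawn entries carry a positive life each frame; the update itself never counts anything.</p>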
<p>How do I choose which spawn values have valid lives? It needs to be a good spread, because the emission life is also randomised &#8211; some particles die earlier than others and need respawning. If I simply use a rolling window it&#8217;s not random enough and particles stop being spawned properly. If I actually randomly choose, it&#8217;s too random &#8211; it becomes dependent on framerate, and on a fast machine the particles just all get spawned &#8211; the randomness makes it run through the buffer too fast. So, what I did was a compromise between them &#8211; a random value that slowly changes in a time-dependent way.</p>
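<p>The post does not spell out the exact function, so the following is only one plausible reading of &#8220;a random value that slowly changes in a time-dependent way&#8221;: give every slot a fixed random value, then slide a window across those values using elapsed time rather than frame count. The table, the <code>drift_rate</code> parameter and all names are assumptions.</p>

```python
import random


def make_random_table(count, seed=1234):
    """Fixed per-slot random values in [0, 1), like a static noise texture."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(count)]


def spawn_enabled(rand_value, time_seconds, emit_fraction, drift_rate):
    """True if this slot's spawn entry should get a positive life now.

    The window position depends on elapsed time only, so the selection
    does not run through the buffer faster on a faster machine.
    """
    phase = (rand_value + time_seconds * drift_rate) % 1.0
    return phase < emit_fraction


def enabled_set(table, time_seconds, emit_fraction, drift_rate):
    """Indices of all slots allowed to spawn at the given time."""
    return {
        i for i, r in enumerate(table)
        if spawn_enabled(r, time_seconds, emit_fraction, drift_rate)
    }
```

<p>With enough particles, roughly <code>emit_fraction</code> of the slots are eligible at any moment, they are scattered through the buffer rather than contiguous, and the eligible set drifts gradually instead of being reshuffled every frame.</p>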
<p>The other nice thing about this spawn buffer was that it made it easy to combine multiple emitters. I could render some of the spawn buffer from one emitter, some from another, and it would &#8220;just work&#8221;. One of the first emitters I tried was a mesh emitter. The obvious way would be to emit particles from the vertices, but this only worked well for some meshes &#8211; so instead I generated a texture of random positions on the mesh surface. I did this by first determining the total area of all the triangles in the mesh, then spawning a number of particles for each triangle equal to the total number of particles * (triangle area / total area). To spawn random positions I just used a random barycentric coordinate. </p>
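<p>Here is a small sketch of that area-weighted sampling, assuming triangles are given as three (x, y, z) tuples. The fold-back step used to keep the barycentric coordinates uniform is a standard trick and my addition; the post only says &#8220;a random barycentric coordinate&#8221;.</p>

```python
import math
import random


def triangle_area(a, b, c):
    """Half the length of the cross product of two edge vectors."""
    ab = [b[i] - a[i] for i in range(3)]
    ac = [c[i] - a[i] for i in range(3)]
    cross = (
        ab[1] * ac[2] - ab[2] * ac[1],
        ab[2] * ac[0] - ab[0] * ac[2],
        ab[0] * ac[1] - ab[1] * ac[0],
    )
    return 0.5 * math.sqrt(sum(x * x for x in cross))


def random_point_in_triangle(a, b, c, rng):
    """Uniform point from a random barycentric coordinate."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:
        u, v = 1.0 - u, 1.0 - v  # fold the outer half back into the triangle
    w = 1.0 - u - v
    return tuple(w * a[i] + u * b[i] + v * c[i] for i in range(3))


def sample_mesh_surface(triangles, total_count, seed=0):
    """Spawn total_count * (area / total area) points on each triangle."""
    rng = random.Random(seed)
    areas = [triangle_area(*tri) for tri in triangles]
    total_area = sum(areas)
    points = []
    for tri, area in zip(triangles, areas):
        count = round(total_count * area / total_area)
        points.extend(random_point_in_triangle(*tri, rng) for _ in range(count))
    return points
```

<p>Because of the rounding, the result can be a few points over or under <code>total_count</code> on a real mesh, which hardly matters at these particle counts.</p>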
<p>Here&#8217;s an early test case with particles generated for a logo mesh and being affected by fluid.<br />
<img src="http://directtovideo.files.wordpress.com/2009/10/particlelogo01_small.jpg" alt="particle logo 01" /><img src="http://directtovideo.files.wordpress.com/2009/10/particlelogo02_small.jpg" alt="particle logo 02" /><img src="http://directtovideo.files.wordpress.com/2009/10/particlelogo03_small.jpg" alt="particle logo 03" /><img src="http://directtovideo.files.wordpress.com/2009/10/particlelogo04_small.jpg" alt="particle logo 04" /></p>
<p>Finally I needed some affectors. Of course I did the usual forces, but I wanted fluid dynamics. The obvious idea was to use a 3D grid solver and drive the particles by its velocities. Well, that wasn&#8217;t great. The main problem was that the grid was limited to a small area, and the particles could go anywhere. Besides, the fluid solver was quite slow to update at a decent resolution. So I used a much simpler method that generated much better results &#8211; procedural fluid flows (thanks Mr Bridson). Essentially this fakes up a velocity field by using differentials of a Perlin-style noise field to generate fluid-like eddies &#8211; &#8220;curl noise&#8221;. By layering several of these on top of each other, combined with some simple velocities, it looked very much like fluid. </p>
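<p>For the curious, here is a 2D sketch of the curl noise idea. It assumes a smooth sum of sines as the potential instead of real Perlin noise, and finite differences for the derivatives; both are stand-ins chosen to keep the example short. In 3D you would take the curl of a vector potential instead.</p>

```python
import math


def potential(x, y):
    """Smooth scalar field standing in for Perlin-style noise."""
    return (math.sin(1.3 * x + 0.7) * math.cos(0.9 * y)
            + 0.5 * math.sin(2.1 * x + 1.7 * y))


def curl_velocity(x, y, eps=1e-4):
    """2D curl of the potential: (d(psi)/dy, -d(psi)/dx).

    A field built this way has zero divergence, which is what makes
    the motion look like an incompressible fluid.
    """
    dpsi_dx = (potential(x + eps, y) - potential(x - eps, y)) / (2.0 * eps)
    dpsi_dy = (potential(x, y + eps) - potential(x, y - eps)) / (2.0 * eps)
    return dpsi_dy, -dpsi_dx


def divergence(x, y, h=1e-3):
    """Numerical divergence of the velocity field, for checking only."""
    du_dx = (curl_velocity(x + h, y)[0] - curl_velocity(x - h, y)[0]) / (2.0 * h)
    dv_dy = (curl_velocity(x, y + h)[1] - curl_velocity(x, y - h)[1]) / (2.0 * h)
    return du_dx + dv_dy
```

<p>Layering is then just a matter of summing several such fields at different scales before moving the particle.</p>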
<p>The one remaining affector was something to attract particles to images. To do this I generated a texture from the image where each pixel contained the position of the closest filled pixel in the source image &#8211; a bit like a distance field but storing the closest position rather than the distance. Then, in the shader I projected the particle into image space, looked up the closest pixel and used that to calculate a velocity, weighted by the distance from the pixel. With a bit of randomness and adjustment to stop it affecting very new or very old particles, it worked a charm. </p>
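<p>A brute-force sketch of that closest-position texture and the resulting pull, assuming the image is a small grid of 0/1 values with at least one filled pixel. The function names and the <code>strength</code> parameter are invented for the example, and a real implementation would want something faster than checking every filled pixel.</p>

```python
def closest_filled_map(mask):
    """For every pixel, store the (x, y) of the nearest filled pixel.

    mask is a list of rows of 0/1 values and must contain at least one 1.
    """
    filled = [(x, y)
              for y, row in enumerate(mask)
              for x, value in enumerate(row) if value]
    result = []
    for y, row in enumerate(mask):
        out_row = []
        for x in range(len(row)):
            nearest = min(
                filled,
                key=lambda p: (p[0] - x) ** 2 + (p[1] - y) ** 2,
            )
            out_row.append(nearest)
        result.append(out_row)
    return result


def attraction_velocity(pos, closest, strength):
    """Velocity toward the closest filled pixel, growing with distance."""
    dx = closest[0] - pos[0]
    dy = closest[1] - pos[1]
    return strength * dx, strength * dy
```

<p>Storing the position rather than the distance means the shader gets a direction for free from a single lookup.</p>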
<p><img src="http://directtovideo.files.wordpress.com/2009/10/blunderbuss01.jpg" alt="particles running under a fluid sim and attracted to an image" /></p>
<p>And there we have it &#8211; a &#8220;modern&#8221; particle system that works on DirectX 9 &#8211; no CUDA required! I&#8217;m sure this will develop over time. With better GPUs the particle counts will go up fast &#8211; between 4 and 16 million is workable already on a top-end GeForce, and it&#8217;ll go up and up with future hardware generations. In fact I have a host of other renderers for the particles besides this simple one &#8211; things to do metaballs, volume renders and clouds, for example &#8211; and a load of other improvements, but that can wait for another demo.</p>
<p>By the way, there&#8217;s a nice thing about GPU particles that maybe isn&#8217;t immediately obvious. You&#8217;re writing all the behaviour code (emission, affectors&#8230;) in shaders, right? And you can probably reload your shaders on the fly in your working environment. All of a sudden it makes development a lot easier. You don&#8217;t need to recompile and reload the executable every time you change the code; you can simply edit and reload the shader in the live environment. Great, eh?</p>
]]></content:encoded>
			<wfw:commentRss>http://directtovideo.wordpress.com/2009/10/06/a-thoroughly-modern-particle-system/feed/</wfw:commentRss>
		<slash:comments>44</slash:comments>
	
		<media:content url="http://0.gravatar.com/avatar/065a5e82831983a73bad903e7390cb22?s=96&#38;d=identicon&#38;r=G" medium="image">
			<media:title type="html">directtovideo</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/blunderbuss02.jpg" medium="image">
			<media:title type="html">particles in blunderbuss</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlecar_small.jpg" medium="image">
			<media:title type="html">particles mapped to the car from frameranger</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelighting02.jpg" medium="image">
			<media:title type="html">particle lighting 01</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelighting01.jpg" medium="image">
			<media:title type="html">particle lighting 02</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelogo01_small.jpg" medium="image">
			<media:title type="html">particle logo 01</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelogo02_small.jpg" medium="image">
			<media:title type="html">particle logo 02</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelogo03_small.jpg" medium="image">
			<media:title type="html">particle logo 03</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/particlelogo04_small.jpg" medium="image">
			<media:title type="html">particle logo 04</media:title>
		</media:content>

		<media:content url="http://directtovideo.files.wordpress.com/2009/10/blunderbuss01.jpg" medium="image">
			<media:title type="html">particles running under a fluid sim and attracted to an image</media:title>
		</media:content>
	</item>
	</channel>
</rss>
