Game Development Community

Screen Space Ambient Occlusion

by Lorne McIntosh · in Torque Game Engine Advanced · 09/09/2008 (7:08 pm) · 114 replies

Hey everyone,

I've developed a Screen Space Ambient Occlusion shader pack for TGEA. It will apply real time ambient occlusion onto any dif, dts, or animating dts in your scene. Check out the video on my website. I'm selling the shader pack through my site to anyone who wants to have great looking ambient occlusion in their levels! Enjoy.

www.ubiqvisuals.com/index.php?option=com_content&view=article&id=46

www.ubiqvisuals.com/images/ssao_sample.jpg

About the author

Ubiq Visuals is a software and creative content developer for the entertainment industry. Our vision is to provide inspiration and the tools for soon-to-be game designers and creative minds of all ages.

#41
09/22/2008 (8:07 pm)
Hi Dave,

A couple questions:
1) I'd be curious to know what video card you're using. All our computers here have GeForce 8800 GT cards in them, and at home I have a GeForce 8800 GTX. So I'm afraid that's pretty much been the extent of our hardware testing.

2) I'd also be curious to know how many polygons are in the scene. Actually a screenshot would be cool if you wouldn't mind.

3) And what resolution is this at? What anti-aliasing mode?

The fact that it's a single DTS shape makes me wonder if it's an over-draw issue. Interiors (I believe) have better optimization in this regard.

If overdraw is the issue there's one trick that we might want to try. Out of the box, the SSAO Kit actually throws away the depth information once it's gathered it (well, it keeps a copy but DirectX doesn't "know" about it). Instead, this depth information can be used to eliminate over-draw on subsequent render-passes (ie. it won't waste time drawing the "inside" walls of your building only to overwrite them later with the outside walls). I disabled this trick because there's a bug that garbles up the editor viewport when it's used. In the final shipping build though you probably won't care about the editor, so we could safely turn it on.
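As a rough illustration of why keeping the depth information can cut overdraw, here is a minimal C++ sketch that models a single pixel. It is not code from the kit: `countShadedFragments` and its parameters are hypothetical names, and the less-equal depth test is an assumption chosen so the nearest surface still passes after a pre-pass.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified model of ONE pixel. `fragmentDepths` lists the fragments that
// land on it, in draw order (smaller value = closer to the camera).
// `initialDepth` is what the depth buffer holds before the colour pass:
// the far plane (1.0) if it was cleared, or the nearest depth if the
// result of a depth pre-pass was kept.
std::size_t countShadedFragments(const std::vector<float>& fragmentDepths,
                                 float initialDepth)
{
   assert(initialDepth >= 0.0f && initialDepth <= 1.0f);

   float depthBuffer = initialDepth;
   std::size_t shaded = 0;
   for (float depth : fragmentDepths)
   {
      if (depth <= depthBuffer)   // less-equal depth test (assumed)
      {
         depthBuffer = depth;     // fragment is visible so far: shade it
         ++shaded;
      }
   }
   return shaded;
}
```

With three walls drawn back to front at depths 0.9, 0.6 and 0.3, a cleared depth buffer shades all three fragments, while a buffer pre-filled with 0.3 shades only the front one.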

Give me a day or two and I'll see if I can't come up with a way to nicely integrate it with the kit (like pre-processor flag or something) - or at least provide you with an ugly hack to try :)

Lorne
#42
09/22/2008 (8:15 pm)
Gonna get this for sure
#43
09/22/2008 (8:31 pm)
Okay Dave, try this and let me know what your FPS is like:

In renderDepthMgr.cpp, instead of:
line 155     mTextureTargetRef->attachTexture(GFXTextureTarget::DepthStencil, *mDepthTexHandle);

Change it to:
line 155     mTextureTargetRef->attachTexture(GFXTextureTarget::DepthStencil, GFXTextureTarget::sDefaultDepthStencil);

If that improves it much, I'll work to make it a little nicer (you'll notice reflection passes are messed up).
#44
09/23/2008 (8:32 am)
Lorne - I just wanted to note that I get an FPS hit of about 50% at resolutions of roughly 1024 and over, but at around 800 it's only a hit of a few frames. It might be my not-so-stellar ATI X1650 graphics card, but I thought I'd let you know.

edit: Pat answers this below

And any hints at what your release in December is? Is it more TGEA add-ons?
#45
09/23/2008 (9:14 am)
In all fairness, SSAO is something that would be in the highest tier of options in a modern game, for those with probably a GeForce 8800 series card or higher. I think in Crysis it is only available if you are running DirectX 10 with settings on high. SSAO is something you are only going to run on a nice gaming rig.
#46
09/23/2008 (9:41 am)
James, I don't know what you are talking about. It runs great on my crappy test machine at the lower resolutions, as expected on that hardware. I mentioned it because Lorne has only tested on that one card, so I thought I'd offer it. It was not a criticism of the pack or of the technique's performance in general.

I don't want to get into a "Crysis is king, gaming rigs only need apply" type of discussion.
#47
09/23/2008 (9:59 am)
No difference Lorne, here are the settings on the display options panel.

ATI Radeon HD 2600 Pro w/ 256MB
Res: 1024x768
32 bpp
Texture quality: auto
FSAA Off
DRL On
Dyn Shadows ON
HDR OFF
Multiple Dyn Shadows ON
Max Lights 10
Atlas Lights 16
Shadow Quality High
Shadow Detail High
#48
09/23/2008 (10:07 am)
James,
That's not true. This feature is entirely bandwidth limited. It takes many samples from a full-screen texture for each pixel of output. GPUs are actually not nearly as good at doing image processing things like this as people seem to think they are. SSAO is much better suited to being a CPU task; we just don't have that option quite yet.

It should be obvious how resolution impacts the performance of SSAO. You are not only raising the memory requirement for storing the targets, not only increasing the bandwidth requirements of sampling that buffer, but you are making a significantly higher number of texture samples each frame. (Then there's the blur on top of that.)

If the resolution is 800x600, the depth target is 800 * 600 * 2 bytes in size (R16 target format), so your SSAO buffer is about 0.9 MB, and you are making 480,000*n texture samples from that buffer per frame, where n is the number of taps in the SSAO. If you are doing an 8-tap SSAO, you are making 3,840,000 texture samples per frame at an 800x600 resolution.

Jump that up one step to 1024x768. The depth target is now 1.5 MB, and you are making 786,432*n samples from that 1.5 MB texture per frame.

Now say you are running at 1920x1080 (1080p)...
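To make the arithmetic above concrete, here is a small C++ sketch that computes the same figures for any resolution. It assumes, as in the numbers above, a 2-byte-per-pixel R16 depth target and `taps` samples per output pixel, and it ignores the blur pass. The names `SsaoCost` and `estimateSsaoCost` are made up for this example.

```cpp
#include <cassert>
#include <cstdint>

// Per-frame SSAO cost estimate following the arithmetic in this post.
// Assumptions: R16 depth target (2 bytes per pixel), `taps` depth samples
// per output pixel, blur pass not counted.
struct SsaoCost
{
   std::uint64_t pixels;        // output pixels per frame
   std::uint64_t targetBytes;   // size of the depth target in bytes
   std::uint64_t samples;       // depth-texture samples per frame
};

SsaoCost estimateSsaoCost(std::uint32_t width, std::uint32_t height,
                          std::uint32_t taps)
{
   assert(width > 0 && height > 0 && taps > 0);

   const std::uint64_t bytesPerPixel = 2;   // R16
   SsaoCost cost;
   cost.pixels      = std::uint64_t(width) * height;
   cost.targetBytes = cost.pixels * bytesPerPixel;
   cost.samples     = cost.pixels * taps;
   return cost;
}
```

For 1920x1080 with 8 taps this works out to 2,073,600 pixels, a depth target of 4,147,200 bytes (roughly 4 MB), and 16,588,800 texture samples per frame, a bit more than four times the 800x600 figure.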


@Dave:
Dave, what you should see is a constant hit at a given resolution for a given video card. The performance hit for SSAO should be pretty constant, even as the scene increases in geometric complexity. There will be some additional performance lost simply because the pre-pass needs to draw more things, however doing a z-fill pass prior to a forward pass is a standard practice, and a fill-rate optimization. Something with a depth-write pixel shader is going to be almost as fast. (Technically it's 4x slower, because turning off color-writes and writing only to z/stencil is 4x faster than a color write.)
#49
09/23/2008 (11:34 am)
My 8600GT gets only a 4fps drop; which is actually a boost, as I'm now able to forgo clunky arse difs and stick with a pure dts solution.
img175.imageshack.us/img175/8013/image1co1.th.png
Shapes are from the awesome industrial packs, made by Dexsoft-Games
#50
09/23/2008 (12:32 pm)
Sure Erik, but you're still missing out on shadows (ie, the baked ones on DIF's), right?
#51
09/23/2008 (2:18 pm)
@Dave
Thanks for trying it anyway. I've never noticed much of an improvement using that technique myself (2 or 3 FPS), but I thought maybe my test scenes just weren't complex enough to get the benefit. If anyone ever gets a massive boost from it, please let me know.

There's one more thing we can try. By doing 8 samples per pixel instead of 16, we can sacrifice some quality for speed.

In UbiqAmbientOcclusionShaderP.hlsl, change:
line 98   for(int i=0; i<2; i++)

to this:
line 98   for(int i=0; i<1; i++)

And on line 137 (near the end) change 1/16.0f to 1/8.0f

Let me know what that does. Once again, I've never seen a dramatic change myself, but it might work for you.
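For anyone wondering why the 1/16.0f has to change along with the loop count, here is a CPU-side C++ sketch of the averaging step. It is not the shader itself: it assumes each pass of the loop takes 8 samples (inferred from 2 passes = 16 and 1 pass = 8), and `sampleOcclusion` is a hypothetical stand-in for the shader's depth comparison.

```cpp
#include <cassert>

// Assumption: each pass of the outer loop takes 8 occlusion samples.
const int kSamplesPerPass = 8;

// Averages per-sample occlusion values (each in [0, 1]) over `passes`
// loop iterations. `sampleOcclusion` is a hypothetical callback, not part
// of the kit.
float averageOcclusion(int passes, float (*sampleOcclusion)(int sampleIndex))
{
   assert(passes > 0);

   float sum = 0.0f;
   for (int i = 0; i < passes; i++)
      for (int j = 0; j < kSamplesPerPass; j++)
         sum += sampleOcclusion(i * kSamplesPerPass + j);

   // The divisor must match the number of samples actually taken:
   // 16 for two passes, 8 for one.
   return sum / float(passes * kSamplesPerPass);
}

// Test input: every sample reports full occlusion.
float fullyOccluded(int /*sampleIndex*/) { return 1.0f; }
```

With a matching divisor, a fully occluded pixel averages to 1.0 for either pass count. If the loop drops to one pass but the factor stays at 1/16.0f, the same pixel comes out at 0.5, so the effect would look half as strong.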
#52
09/23/2008 (3:24 pm)
Lorne, I still haven't had the time to test this pack out yet, but I want to point out that I appreciate all your comments and support in this thread. These quality vs performance changes are lovely, and your explanations of the techniques used are awesome.

Thanks again.
#53
09/23/2008 (4:11 pm)
@Stefan Yeah, I don't have baked in shadows. In some spots they are glaringly missing; I'll deal with that eventually (or someone else will code a solution). Otherwise, I can pretty well hide the fact with proper art placements. The speed increase I got from using dts almost exclusively (interiors of course still need portals) more than makes up for such a minor thing as baked shadows.
#54
09/24/2008 (3:54 pm)
Lorne,

Getting terrain depth into all this wasn't as trivial as I had hoped. Doesn't seem like Terrain responds to RIT_Object anymore, or I'm simply doing the depth pass too early. Not sure.

In any case, given that we do have the depth information now - is there a chance some of your future pack additions will handle water depth shading? I got it implemented via fixed functions and vertex blending, but it's kinda heavy on the framerate.
#55
09/24/2008 (4:28 pm)
Hey Stefan,

Yeah, I had some trouble myself getting Terrain Blocks to output on the depth-pass. I finally hacked it by duplicating the for-loop where the terrain block renders itself, and just set up my depth shader on the first loop. It was pretty ugly, kinda slow, and happened at the wrong time (the depth pass should really be completed *first* before any actual rendering). But it worked well enough to demonstrate to me that SSAO looks strange on terrain, so I happily removed it!

If you have the depth info though, I think it's fairly trivial to do per-pixel water-depth effects. Just edit the water pixel shader to read the depth texture, and modulate the transparency based on the difference between the water surface and whatever is underneath. Small differences = transparent. Large differences = opaque. Tom Spilman had a cool screenshot of this technique in action.
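The mapping from depth difference to opacity could look something like the following C++ sketch. In practice it would live in the water pixel shader. The name `waterAlpha` and the `fadeDistance` tuning value are hypothetical, and both depths are assumed to be linear and in the same units.

```cpp
#include <algorithm>
#include <cassert>

// Maps the gap between the water surface and the scene behind it to an
// opacity in [0, 1]: small gap = transparent, large gap = opaque.
// `fadeDistance` (hypothetical) is the gap at which the water becomes
// fully opaque.
float waterAlpha(float sceneDepth, float waterSurfaceDepth, float fadeDistance)
{
   assert(fadeDistance > 0.0f);

   // How much water the view ray passes through before hitting the scene.
   const float gap = sceneDepth - waterSurfaceDepth;

   // Clamp to [0, 1] (the same job saturate() does in HLSL).
   return std::min(std::max(gap / fadeDistance, 0.0f), 1.0f);
}
```

With a `fadeDistance` of 4 units, a shoreline pixel with no gap is fully transparent, a 2-unit gap gives 0.5, and anything 4 units or deeper is fully opaque.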

Ubiq doesn't currently have any plans to release a water depth shader... but we'll take this as a suggestion :)

Lorne
#56
09/24/2008 (5:06 pm)
Aye, I'm aware how to sample it, I just can't figure out how to actually get the depth data from the terrain. I guess the software solution will work for now though. Thanks again.

Looking forward to hearing about the next pack!
#57
09/26/2008 (8:45 am)
Cool resource. Just picked it up!
#58
09/26/2008 (10:01 am)
Stefan,
Quote:
I just can't figure out how to actually get the depth data from the terrain.
Do you mean:
-How do I output depth data from a terrain render
or
-How do I retrieve depth from the depth-target while I am rendering terrain.
#59
09/26/2008 (9:13 pm)
I purchased this yesterday, and have integrated it correctly, but I'm seeing some anomalies in my scene. I have some trees that have a single polygon with alpha for a clump of smaller branches - I'm seeing the edges of this polygon get shaded even though it's transparent. The Zbuffer appears to be updating correctly (it's stock stuff - look at the branch edges, you can see a halo where the zbuffer was written and you can see the sky). Is there anything that can be done about this? Here is a screenshot with the effect exaggerated:

www.aztica.com/images/OakError.jpg
#60
09/26/2008 (9:38 pm)
Hey Jaimi,

Wow, looks like a nice screenshot otherwise! I like those rocks...

I'm a little surprised to see this. Transparent ("translucent" in Torque parlance) objects shouldn't be included in the depth bin; the code checks for translucency specifically to keep them out.

Take a look at your renderInstMgr.cpp file and double check the changes we made there. I suspect there may be a *tiny* mistake. Specifically double-check this part:

It should be:
      case RIT_Begin:
         mRenderBins[GFXBin_Begin]->addElement( inst );
         break;

      default:
         mRenderBins[GFXBin_MiscObject]->addElement( inst );
         break;
      }

      //Ubiq: all non-translucent instances go in the depth bin
      if(inst->matInst && (inst->type == RIT_Mesh || inst->type == RIT_Interior))
      {
         mRenderBins[GFXBin_Depth]->addElement( inst );
      }

      PROFILE_END();
   }

   PROFILE_START(RenderInstMgr_specialBin);

   // handle extra insertions
   if( inst->matInst )
   {

If you placed that addition in the wrong spot (like outside the else statement), then that would explain it.
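To show why the placement matters, here is a stripped-down C++ model of the routing. The types are hypothetical stand-ins, not TGEA's, and the if/else shape (translucent instances handled in the first branch, everything else in the else) is an assumption based on the description above.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration only; these are not TGEA's types.
enum InstType { Mesh, Interior, Other };

struct Inst
{
   bool hasMaterial;
   bool translucent;
   InstType type;
};

// Returns true if the instance ends up in the depth bin.
// `checkInsideElse` models where the Ubiq block was pasted: inside the
// non-translucent (else) branch, or after it.
bool landsInDepthBin(const Inst& inst, bool checkInsideElse)
{
   const bool wantsDepth =
      inst.hasMaterial && (inst.type == Mesh || inst.type == Interior);
   bool inDepthBin = false;

   if (inst.translucent)
   {
      // Translucent instances are binned here and never see the else branch.
   }
   else
   {
      if (checkInsideElse && wantsDepth)
         inDepthBin = true;   // correct placement: opaque instances only
   }

   // Misplaced copy: runs for every instance, translucent or not.
   if (!checkInsideElse && wantsDepth)
      inDepthBin = true;

   return inDepthBin;
}
```

In this model a translucent mesh such as an alpha-tested branch card stays out of the depth bin when the check sits inside the else, but lands in it when the check is placed after the if/else, which would produce the kind of shaded edges in the screenshot.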

If it's not that, we'll have to get more creative :)

Lorne