Dec 31

Hemispherical Screen-Space Ambient Occlusion (SSAO) for Deferred Renderers using OpenGL/GLSL

My first blog entry! Wow. What should I say? Probably ‘hello all’. But now, let’s get started.

I’m currently working on two projects for university courses, both based on deferred renderers. The first one I’m doing alone: it is my own deferred rendering implementation, which I mainly use to prototype new stuff at the moment. It should be finished sometime in March, when I have to present the whole thing to the professor. The course I’m doing this for is called ‘Design and Implementation of a Rendering-Engine’, and I thought it would be nice to build a simple deferred renderer with basic model loading using the Assimp library and a hierarchical scenegraph implementation. I’ve already finished this project for the most part, with only some code cleanup left to do. Maybe I will write a follow-up post about this project.

The other project I’m working on is for a course called ‘Realtime Rendering’, which I’m doing together with a colleague from university. For this course we have to implement a rendering engine from scratch using OpenGL or DirectX and make it feel like a demoscene production, with automatic camera movement showing off the shader effects we implemented. It is for this course that I started working on SSAO.

OK, wow, that was a lot of preamble without saying anything about SSAO yet. So let’s get to it. What is ambient occlusion and why do we need it? Let’s do this quick and dirty, without going into much detail about the physical background involved. Consider a real-life example: Imagine yourself in your room, sitting on a chair in front of your desk. It’s night, the window blinds and the door are closed, and therefore it’s pretty dark. Now let’s turn on the desk light, shining down on the desk. The light rays hit the desk (direct illumination) and reflect off it, illuminating part of your room (indirect illumination). If you now look around your room, preferably at a corner near the ceiling, you will most likely see a darkening effect in the corners. That is basically the effect we want to imitate when we speak of ambient occlusion.

Some approaches for SSAO do not use a hemispherical approach but simply sample the entire sphere (the first prominent example being the original Crysis by Crytek). Let’s have a look at possible situations for this method.

By sampling an entire sphere around the points p1 to p3, we mostly end up with too much darkening. Consider point p1: although it is not actually occluded by any geometry near it, half of its sample points (those in the negative hemisphere) count as occluding geometry. Point p2 is even darker because it’s in a corner region, whereas p3 is an exposed point on the edge of an object and therefore appears lighter.

If we consider these points as described in the example above, this gives good and somewhat correct results, but it’s not entirely clear why p1 should be darkened at all. This kind of sampling leads to the well known darker version of SSAO as used in Crysis (see the screenshot below).

SSAO as used in Crysis. Note that even planar regions that don't have local occluders get this dark, greyish color.

To prevent the overall darkening seen at point p1 above, it would be better if we could just sample points in the positive hemisphere of the point. There is actually a pretty simple way to achieve this, which is easy to implement if you are using a deferred approach (or even a light-pre-pass renderer). While the initial method used in Crysis only needs access to a depth buffer to reconstruct positions, we also need access to the normals to align our hemisphere. Have a look at the following figure to better understand the hemispherical ambient occlusion approach we will discuss now.

Hemispherical SSAO

The hemisphere around a point p is aligned by its normal N. The local surroundings of the point are sampled by tracing rays over its hemisphere. If a ray intersects local geometry it is counted as an occluder. Rays that don’t hit local geometry do not contribute to the ambient occlusion of point p. Note that in contrast to the original SSAO method we disregard points below the surface entirely.

So after we’ve dealt with the theory and even found an advantage of using this method, let’s finally get to the practical part.
I have already mentioned above that hemispherical SSAO is kind of a no-brainer when using any kind of deferred rendering approach, because we already have all the data we need at our disposal, but let’s go through it again. We need:

  • Some way to reconstruct the 3D position (either in World- or View-Space) of a point. I personally use a hardware depth-buffer that only stores the depth information, but you could of course also use, for instance, a render-target that stores the complete set of coordinates in a 128-bit RGBA texture (e.g. RGBA32F), or you could directly store the View-Space Z-Coordinate in an R32F texture. This is your choice and depends on your implementation.
  • We also need access to the normal for the current position. Again, this is implementation dependent, and how you store your normals is up to you.

So what is the idea behind implementing this hemispherical SSAO? After we’ve rendered our initial G-Buffer pass, we have access to the Depth/Position- and Normal-Texture for our scene. We then run a Fullscreen-Pass, so nothing fancy has to happen in the Vertex-Shading stage. The entire magic happens in the Fragment-Shader. For each fragment we first reconstruct its position and normal. We then take several samples in the local vicinity of that point in order to calculate the ambient occlusion. We do that based on two parameters:

  • The distance of the sample point to the current fragment’s point, and
  • the angle between the current fragment’s normal and the normalized vector pointing from the current fragment’s position in the direction of the sample point.

The first parameter allows us to smoothly fade the ambient occlusion value based on the distance value. Using the smoothstep function of GLSL we can actually perform a blending between 0 and 1 based on the distance and also automatically limit the influence of samples that are too far away from our current fragment.
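To make the distance fade concrete, here is a rough CPU-side sketch in Python rather than shader code. The `smoothstep` helper mirrors the GLSL built-in; the name `distance_weight` and the choice to fade between the threshold and twice the threshold are assumptions made for illustration, so treat them as one possible setup and not as the only way to do it.

```python
def smoothstep(edge0, edge1, x):
    """Mirror of GLSL smoothstep: 0 below edge0, 1 above edge1, Hermite blend between."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def distance_weight(dist, distance_threshold):
    """Hypothetical helper: weight 1 for nearby samples, fading smoothly to 0.

    Assumption: the fade starts at distance_threshold and ends at twice that value.
    """
    return 1.0 - smoothstep(distance_threshold, 2.0 * distance_threshold, dist)
```

With a threshold of 5, a sample at distance 3 keeps its full weight, one at 7.5 gets half, and anything beyond 10 is ignored.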

We calculate the second parameter by taking the dot product between the fragment’s normal and the normalized vector pointing from the fragment’s position to the sample position. This allows us to do two things: first, we can determine whether the sampled point lies in the hemisphere, and second, we can use the result to vary our ambient occlusion value. The dot product gives us the cosine of the angle between our two normalized vectors (it’s important that they are normalized, otherwise it gives wrong results!). The cosine is 1 at 0°, falls to 0 at 90° and reaches -1 at 180°. Anything beyond 90° therefore yields a negative value, which corresponds to the negative hemisphere that we want to ignore anyway, so we can simply clamp the dot product to the range 0 to 1.
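The angle term can be sketched on the CPU like this. It is a small Python illustration, not shader code, and `normalize` and `angle_weight` are hypothetical names. Returning a weight of 0 for a sample that coincides with the fragment is an assumption made here to avoid a division by zero.

```python
import math


def normalize(v):
    """Return v scaled to unit length (a zero vector is returned unchanged)."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        return tuple(v)
    return tuple(c / length for c in v)


def angle_weight(normal, frag_pos, sample_pos):
    """Clamped cosine between the normal and the direction to the sample.

    Samples in the negative hemisphere give a negative cosine and are clamped to 0.
    """
    direction = normalize(tuple(s - p for s, p in zip(sample_pos, frag_pos)))
    unit_normal = normalize(normal)
    cos_angle = sum(n * d for n, d in zip(unit_normal, direction))
    return max(cos_angle, 0.0)
```

A sample straight above the surface gets weight 1, a sample in the surface plane gets 0, and anything below the surface is clamped to 0 as well.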

Therefore, we get two values that vary between zero and one. By simply multiplying them we stay in the range 0 – 1, which is what we want for our AO value to begin with, and we get a pretty good ambient occlusion value, which, again, ignores outliers (sample points too far away from the current position) and does not sample points from the negative hemisphere.
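Putting both terms together for a single fragment could look roughly like the following Python sketch. Several things here are assumptions for illustration: the function name `ambient_occlusion`, the fade range (threshold to twice the threshold), averaging over the samples, and returning 1 minus the average so that 1 means fully lit. The normal is expected to be unit length already.

```python
import math


def ambient_occlusion(frag_pos, normal, sample_positions, distance_threshold):
    """Hypothetical per-fragment AO: 1.0 = unoccluded, 0.0 = fully occluded."""
    if not sample_positions:
        return 1.0
    occlusion = 0.0
    for sample_pos in sample_positions:
        to_sample = [s - p for s, p in zip(sample_pos, frag_pos)]
        dist = math.sqrt(sum(c * c for c in to_sample))
        if dist == 0.0:
            continue  # sample coincides with the fragment, no direction to test
        # Angle term: clamped cosine between the normal and the sample direction.
        cos_angle = sum(n * c / dist for n, c in zip(normal, to_sample))
        angle_term = max(cos_angle, 0.0)
        # Distance term: smoothstep fade from threshold to 2 * threshold (assumed).
        t = min(max((dist - distance_threshold) / distance_threshold, 0.0), 1.0)
        distance_term = 1.0 - t * t * (3.0 - 2.0 * t)
        occlusion += angle_term * distance_term
    return 1.0 - occlusion / len(sample_positions)
```

On a flat surface all neighbouring samples lie in the surface plane, every angle term is 0, and the function returns 1.0. This is the brighter result on planar regions that hemispherical sampling is meant to give.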

The one thing I haven’t talked about yet is how to choose sample points and how many sample points are needed, which are both really important decisions. You could probably get away with randomly chosen offsets around your current fragment’s position.

One possibility is to use a noise texture to get these offsets, which has the advantage that you can easily vary the number of samples you want to take.

I used a different method that is based on so-called low-discrepancy sequences, which are often used in Monte-Carlo methods like Path-Tracing. Simply put, such sequences are quasi-random numbers that, in contrast to actual random numbers, have a predictable and almost uniform distribution (the ‘almost’ is really important here, because a perfectly regular distribution would give noticeable patterns). Therefore they are good choices whenever you have to sample something ‘kind-of’ randomly but also evenly. Two prominent examples of sample sets that fit these requirements are the Halton sequence and Poisson-Disk samples (the latter are, strictly speaking, not a low-discrepancy sequence, but they share the evenly spread property). I will stop this excursion into LD-sequences here, because I simply don’t have the time to go deeper.
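For readers who have not met the Halton sequence before, here is a minimal Python sketch of how such points can be generated. It uses the standard radical-inverse construction with bases 2 and 3; the function names are made up for this example, and this is not the sample set used in the renderer described here.

```python
def radical_inverse(index, base):
    """Mirror the digits of `index` (written in `base`) around the decimal point."""
    result = 0.0
    fraction = 1.0 / base
    while index > 0:
        result += (index % base) * fraction
        index //= base
        fraction /= base
    return result


def halton_2d(count):
    """First `count` points of the 2D Halton sequence (bases 2 and 3) in [0, 1)^2."""
    return [(radical_inverse(i, 2), radical_inverse(i, 3))
            for i in range(1, count + 1)]
```

In base 2 the first values are 0.5, 0.25, 0.75, and so on: each new point lands in the largest remaining gap, which is where the even coverage comes from.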

Just know that in this case, and because I had good experiences with these LD-sequences in the past (when implementing Soft-Shadows and Imperfect Shadow Maps), I used Poisson-Disk samples to generate 16 sample points in the range -1 to 1 on a unit disk. So in my approach I take sixteen samples per fragment on a Fullscreen-Quad, which of course scales performance-wise with screen resolution. If performance is of importance you could take fewer samples and/or try to reduce the resolution of your AO texture. In my implementation and on my hardware I have not run into problems yet, but my renderer is still in a pretty early state.
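If you want to generate your own Poisson-Disk offsets, one simple option is naive dart throwing: propose random points on the unit disk and keep only those that are far enough from every point accepted so far. The Python sketch below illustrates that idea under assumed parameters (`min_distance`, `seed` and `max_attempts` are made-up names and values). It is not how the 16 samples in this post were produced.

```python
import math
import random


def poisson_disk_samples(count, min_distance, seed=0, max_attempts=10000):
    """Dart throwing on the unit disk; may return fewer than `count` points
    if min_distance is too large for the disk."""
    rng = random.Random(seed)
    samples = []
    attempts = 0
    while len(samples) < count and attempts < max_attempts:
        attempts += 1
        x, y = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        if x * x + y * y > 1.0:
            continue  # outside the unit disk
        # Accept only if no previously accepted sample is closer than min_distance.
        if all(math.hypot(x - sx, y - sy) >= min_distance for sx, sy in samples):
            samples.append((x, y))
    return samples
```

Since the offsets are constant, you would typically generate them once offline and paste them into the shader as a constant array.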

Finally, here’s the fragment shader code:

Now let me finish with a screenshot I’ve taken from my renderer which uses the Crytek Sponza as a scene to show off the hemispherical SSAO approach.

Hemispherical SSAO as implemented in my renderer: Notice the overall brighter AO term on planar surfaces due to the hemispherical sampling. Also due to the usage of the dot-Product we get a much more detailed AO term.

One final word: It’s called ambient occlusion for a reason. It should only be incorporated in the ambient term of your image. So keep AO out of the directly illuminated parts. Why? Consider the real-life example I mentioned above. If you point your desk light into the dark corner, then suddenly there is no ambient occlusion in the corner any more.




  1. rili

    Hi Christoph,
    thanks for this post, i find it very interesting and wanted to implement it for my deferred light renderer. i am not getting the results that u show in your screenshot. was wondering about the two variables :

    uniform float distanceThreshold;
    uniform vec2 filterRadius;

    What values do u give these two variables, and what do u mean in the comment in line 62 where u write “has to be in Texture-Space” ?

    i hope u reply in time since i need it for my class project.

    thanks in advance

  2. evolve


    The distanceThreshold is a variable that regulates the “strength” of the AO-term based on the distance between the current surface position (“viewPos” in the shader code) and the sample position (“samplePos” in the shader). If a sample is near the surface position I want it to have more influence (more AO influence) than a sample that is further away. So the distanceThreshold has to be a World-Space distance (or more accurately a View-Space distance, but since their scale is the same it shouldn’t make a difference). In my scene the value is 5, but again, it depends on your scene’s size and you most likely have to experiment to get a value that works for you.

    The filterRadius scales the offset given by the Poisson-Disk samples in texture-space. Texture space is the [0,1]-space in which you sample your texture. The Poisson-Disk samples are distributed in a [-1,1]-space, as you can see from the values. If we used these values directly to sample the texture, we would basically sample all over the texture and not just within the vicinity of the current sample. But of course, the AO-term should only be sampled in close vicinity to the current sample. So the filterRadius is actually a scale in screen-space that determines where the samples should be placed. In my implementation I use 10 pixels in both directions (horizontal and vertical). It is calculated as (10 / screenWidth, 10 / screenHeight). If you scale the Poisson-Disk samples with this value you make sure that your resulting “sampleTexCoord” values are within a 10-pixel radius of your current sample.
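    The arithmetic described above can be checked with a small Python sketch. The function names are hypothetical; only the (10 / screenWidth, 10 / screenHeight) formula and the offset scaling are taken from the explanation.

```python
def filter_radius(radius_pixels, screen_width, screen_height):
    """Convert a radius in pixels into texture-space units per axis."""
    return (radius_pixels / screen_width, radius_pixels / screen_height)


def sample_tex_coord(tex_coord, disk_offset, radius):
    """Offset a texture coordinate by a [-1, 1] disk sample scaled by the radius."""
    return (tex_coord[0] + disk_offset[0] * radius[0],
            tex_coord[1] + disk_offset[1] * radius[1])
```

    For a 1000 x 500 target, a 10-pixel radius becomes (0.01, 0.02), so a disk offset of (1, -1) at the screen centre samples (0.51, 0.48).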

    I hope this helps you and clarifies how these values should be used.

  3. Christian


    Thanks for the post! I really like the implementation using the Poisson Disk for sampling and comparing the position with the normal. Very clean and easy to understand. I was only familiar with the “generate-sample –> project position to get tex-coord –> lookup (linear) depth –> compare”-approach, as outlined e.g. here. I don’t know if you have any experience comparing the two (normal/sample-position angle vs. binary sample-depth pass-fail), but if you do I’d love to hear what you think.

    Also, am I right in thinking an improvement to the current implementation would be to divide filterRadius by viewPos.z (and adjust filterRadius accordingly so it looks good), so that far-away pixels use a smaller sample radius? Right now, the SSAO “border” around nearby objects will have the same (on-screen) thickness as around far-away objects (right?), which doesn’t really make sense (the borders should have a constant world-space size, not a constant screen-space size).

    Finally, I tried doing this using normals reconstructed from depth (I’m not doing deferred shading, so I don’t have normals available during post-processing). It actually works pretty well, except for some artifacts around the borders of objects where the normals are wrong. I uploaded a screenshot here if you want to take a look. Actually seems like a pretty decent solution IMO.


    1. evolve

      Hi Christian,
      thank you for the reply and sorry for the late response. I haven’t tested all the SSAO approaches but I’m familiar with the “original” approach that you mention and I also describe it a little in my post at the top. One problem I was running into when using it, was that the resulting picture was prone to noise, which I didn’t like at all. A simple blur would’ve reduced this behavior but I just didn’t want to go that route. Instead I opted for trying something different that also incorporated normals and not only depth information.

      As for your second point: I think you’re right, it actually makes a lot more sense to keep the sample radius constant in world/view-space rather than screen-space. Have you tried that already? I think dividing by viewPos.z could be a solution (which of course would require some tweaking to make it look good).
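      A depth-scaled radius along those lines could be sketched as follows. This Python illustration is untested against a real renderer. The name `scaled_filter_radius` and the `reference_depth` parameter are assumptions. Note that in OpenGL view space, points in front of the camera have a negative z, so the sketch divides by the absolute depth rather than by viewPos.z directly.

```python
def scaled_filter_radius(base_radius, view_z, reference_depth=1.0):
    """Shrink the texture-space radius with distance from the camera.

    base_radius is the radius to use at `reference_depth` (hypothetical tuning value).
    """
    depth = max(abs(view_z), 1e-6)  # guard against division by zero at the eye
    scale = reference_depth / depth
    return (base_radius[0] * scale, base_radius[1] * scale)
```

      A fragment twice as far away then gets half the on-screen radius, which keeps the sampled neighbourhood roughly constant in view space.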

      Nice work with reconstructing the normals from depth. I presume you use a gradient approach to reconstruct them? Looks good and can reconstruct the most noticeable normal-based effects. If you don’t have access to a normal buffer, this seems like a viable solution. Of course a separate normal buffer could give you some more detail if you’re using normal mapping on your models, but overall these effects are very subtle and most of the time not that noticeable in a moving picture.

  4. Robert


    I don’t understand why your calculatePosition() takes both a coordinate and a depth. I think a position is usually reconstructed from a depth value, so what is the coordinate for? On line 62 you are passing sampleDepth*2-1, and I am not sure what is happening there. I maintain an R32F texture with depth values that I convert to view space positions. I am trying to port this to HLSL and am having a problem that I think is isolated to your calculatePosition function calls.

    Thank you for this tutorial, I would really love to get it up and running.

    1. evolve

      Hi Robert,

      You not only want to reconstruct a linear depth value in View-Space from a hardware depth buffer, but a complete position in view-space, therefore we also need to reconstruct the X and Y View-Space coordinates. The calculatePosition() function assumes exactly this, so we provide not only the depth value read from the depth-texture but also the texture-coordinate. A simple variant to reconstruct the position from these values would be to “unproject” them using the inverted projection matrix (= the inverse of the projection matrix you used during creation of the depth-buffer). To reconstruct the view-space position, you would end up with something like inverseProjectionMatrix * vec4(texCoord.x * 2 - 1, texCoord.y * 2 - 1, depthFromTexture * 2 - 1, 1), followed by a division of the result by its w component. This gives you the view-space position reconstructed from the depth value and a texture coordinate. We need that * 2 - 1 because we have to transform from texture space, which is in the [0, 1] range, to projective space, which ranges from -1 to 1. The projection matrix transforms to this space; the inverse projection matrix, however, assumes your values are already in that space. Since both the texcoords and the depth value you read from the depth-texture are in [0, 1], we first need to transform them to [-1, 1], and then you can multiply with the inverse projection to get to view space.
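      As a way to sanity-check a port, the unprojection can be tried on the CPU. The Python sketch below assumes a standard symmetric OpenGL perspective matrix (as produced by gluPerspective-style code) and a default [0, 1] depth range. For that matrix, multiplying by the inverse projection and dividing by w reduces to the closed form used here. All function names are hypothetical and this is not the calculatePosition() from the shader.

```python
import math


def perspective_terms(fov_y_degrees, aspect, near, far):
    """Non-zero entries of a standard OpenGL perspective matrix (assumed convention)."""
    f = 1.0 / math.tan(math.radians(fov_y_degrees) / 2.0)
    return {"p00": f / aspect, "p11": f,
            "p22": -(far + near) / (far - near),
            "p23": -2.0 * far * near / (far - near)}


def project(view_pos, proj):
    """View space -> (texture coordinate, hardware depth), everything in [0, 1]."""
    x, y, z = view_pos
    w = -z  # clip-space w for this matrix
    ndc = (proj["p00"] * x / w, proj["p11"] * y / w,
           (proj["p22"] * z + proj["p23"]) / w)
    return (ndc[0] * 0.5 + 0.5, ndc[1] * 0.5 + 0.5), ndc[2] * 0.5 + 0.5


def reconstruct_view_position(tex_coord, depth, proj):
    """Inverse of project(): [0, 1] inputs -> [-1, 1] NDC -> view space."""
    ndc_x = tex_coord[0] * 2.0 - 1.0
    ndc_y = tex_coord[1] * 2.0 - 1.0
    ndc_z = depth * 2.0 - 1.0
    view_z = -proj["p23"] / (ndc_z + proj["p22"])
    return (ndc_x * -view_z / proj["p00"],
            ndc_y * -view_z / proj["p11"],
            view_z)
```

      Projecting a known view-space point and feeding the resulting texture coordinate and depth back through the reconstruction should return the original point, which is a handy test when porting to HLSL (where the depth range and matrix conventions differ).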

      If you use a R32F texture for your depth, chances are you’re not using a hardware depth buffer. Could be that you’re already storing linear depth, but you still need to reconstruct your X, Y position.

      I recommend reading through a series of posts by Matt Pettineo/MJP, who covers practically every possible way of reconstructing position from depth: http://mynameismjp.wordpress.com/2009/03/10/reconstructing-position-from-depth/

      Hope this helps you a little.

      1. Robert

        Thanks evolve, it helped a lot. I got it up and running and it looks great. You did a fantastic job.

  5. James Gangur

    Firstly, great tutorial! Much easier to follow than any other I’ve seen for AO.

    Anyway, I’m trying to implement this, and have nearly got it (I think). However, I’m getting . On the left is the AO map, on the right is my (view space) normals. is my attempt at your shader. What am I doing wrong?

    1. James Gangur

      It seems I failed badly with those links. I think you can see what I meant though.

    2. James Gangur

      Don’t worry, got it working. Thanks for the tutorial :)

      1. evolve

        Didn’t have the time to look over it myself, but glad to see you got it fixed. What was the problem? Surely, nothing big.

