My first blog entry! Wow. What should I say? Probably ‘hello all’. But now, let’s get started.
I’m currently working on two projects for some university courses that are both based on deferred renderers. The first one I’m doing alone: it’s my own deferred rendering implementation, which I mainly use to prototype new stuff at the moment. It should be finished sometime in March, when I have to present the whole thing to the professor. The course I’m doing this for is called ‘Design and Implementation of a Rendering-Engine’, and I thought it would be nice to build a simple deferred renderer with basic model-loading using the Assimp library and a hierarchical scenegraph implementation. I’ve already finished this project for the most part, with only some code cleanup to do. Maybe I will do a follow-up post about this project.
The other project I’m working on is for a course called ‘Realtime Rendering’, on which I work together with a colleague from university. For this course we have to implement a rendering engine from scratch using OpenGL or DirectX and make it feel like a demo-scene, with automatic camera movement showing off the shader effects we implemented. It is for this course that I started working on SSAO.
Ok, wow, so a lot of preamble without having said anything about SSAO yet. So let’s get to it. What is ambient occlusion and why do we need it? Let’s just do this quick and dirty, without going into much detail about the physical background involved. We consider a real-life example: Imagine yourself in your room, sitting on a chair in front of your desk. It’s night, the window blinds and the door are closed and therefore it’s pretty dark. Now let’s turn on our desk light, shining down on the desk. The light-rays hit the desk (direct illumination) and reflect off it, illuminating part of your room (indirect illumination). If you now look around your room, preferably at a corner of the ceiling, you will most likely see a darkening effect in the corners. That is basically the effect we want to imitate when we speak of ambient occlusion.
Some approaches to SSAO do not sample a hemisphere but simply the entire sphere around a point (the first prominent example being the original Crysis by Crytek). Let’s have a look at possible situations for this method.
If we consider these points as described in the example above, this gives good and somewhat correct results, but it’s not entirely clear why p1 should be darkened at all. This kind of sampling leads to the well-known darker look of the SSAO used in Crysis (see the screenshot below).
To prevent this overall darkening from happening, as in point p1 above, it would be better if we could just sample points in the positive hemisphere of the point. There is actually a pretty simple way to achieve this, which is easy to implement if you are using a deferred approach (or even a light-pre-pass renderer). While the initial method used in Crysis only needs access to a depth buffer to reconstruct positions, we also need access to the normals to align our hemisphere. Have a look at the following figure to better understand the hemispherical ambient occlusion approach we will discuss now.
So after we’ve dealt with the theory and even found an advantage of using this method, let’s finally get to the practical part.
I have already mentioned above that hemispherical SSAO is kind of a no-brainer when using any kind of deferred rendering approach, because we already have all the data we need at our disposal, but let’s go through it again. We need:
- Some way to reconstruct the 3D position (either in World- or View-Space) of a point. I personally use a hardware depth-buffer that only stores the depth information, but you could of course also use, for instance, a render-target that stores the complete set of coordinates in a 128-bit RGBA float texture, or you could directly store the View-Space Z coordinate in an R32F texture. This is your choice and depends on your implementation (see the little sketch right after this list).
- We also need access to the normal for the current position. Again, this is implementation-dependent, and how you store your normals is up to you.
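As a side note: if you store only the hardware depth like I do, getting back to a linear view-space depth works along these lines (a sketch, assuming a standard perspective projection; this is what the zNear/zFar uniforms in the shader below would be used for):

// Linear view-space depth from a hardware depth-buffer value d in [0, 1],
// assuming a standard perspective projection with planes zNear and zFar.
float linearDepth(float d, float zNear, float zFar)
{
    float ndcZ = d * 2.0 - 1.0; // back to normalized device coordinates
    return (2.0 * zNear * zFar) / (zFar + zNear - ndcZ * (zFar - zNear));
}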
So what is the idea behind implementing this hemispherical SSAO? After we’ve rendered our initial G-Buffer pass, we have access to the Depth/Position- and Normal-Textures for our scene. We then run a Fullscreen-Pass, so nothing fancy has to happen in the Vertex-Shading stage (a matching vertex shader sketch is shown right before the fragment shader below). The entire magic happens in the Fragment-Shader. For each fragment we first reconstruct its position and normal. We then take a number of samples in the local vicinity of that point in order to calculate the ambient occlusion. We do that based on two parameters:
- The distance of the sample point to the current fragment’s point, and
- the angle between the current fragment’s normal and the normalized vector pointing from the current fragment’s position in the direction of the sample point.
The first parameter allows us to smoothly fade the ambient occlusion based on the distance. Using the smoothstep function of GLSL we can blend between 0 and 1 based on the distance and automatically limit the influence of samples that are too far away from our current fragment.
We calculate the second parameter by computing the dot product between the fragment’s normal and the vector pointing from the fragment’s position to the sample position. This allows us to do two things: First, we can determine whether the sampled point lies in the hemisphere, and secondly, we can use this result to vary our ambient occlusion value. The dot product gives us the cosine of the angle between our two normalized vectors (it’s important that they are normalized, otherwise it would give wrong results!), and the cosine is 1 at 0°, falls to 0 at 90° and reaches -1 at 180° (the angle between two vectors never exceeds 180°). We can therefore clamp the result of the dot product to the range 0 to 1, because the negative range corresponds to the negative hemisphere, which we want to ignore anyway.
Therefore, we get two values that vary between zero and one. By simply multiplying the two we stay in the range 0 – 1, which is what we want for our AO value to begin with, and we get a pretty good ambient occlusion value which, again, ignores outliers (sample points too far away from the current position) and does not sample points from the negative hemisphere.
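Written out, the estimate the shader below computes per fragment is (with $N$ samples, fragment position $p$, normal $n$, sample positions $p_i$ and distance threshold $t$):

$$\mathrm{AO} = 1 - \frac{1}{N}\sum_{i=1}^{N}\Bigl(1 - \mathrm{smoothstep}\bigl(t,\,2t,\,\lVert p_i - p\rVert\bigr)\Bigr)\,\max\Bigl(0,\ n\cdot\tfrac{p_i - p}{\lVert p_i - p\rVert}\Bigr)$$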
The one thing I haven’t talked about yet is how to choose the sample points and also how many samples are needed, both of which are really important decisions. You could probably get away with randomly chosen offsets around your current fragment’s position.
One possibility is to use a noise texture to get these offsets, which has the advantage that you can easily vary the number of samples you want to take.
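A minimal sketch of that variant (noiseTexture and noiseScale are hypothetical names, not part of my implementation):

// Fetch a pseudo-random offset in [-1, 1]^2 from a tiling noise texture.
// noiseScale stretches the noise over the screen (e.g. screenSize / noiseSize).
vec2 randomOffset(sampler2D noiseTexture, vec2 uv, vec2 noiseScale)
{
    return texture(noiseTexture, uv * noiseScale).xy * 2.0 - 1.0;
}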
I used a different method that is based on so-called low-discrepancy sequences, which are often used in Monte-Carlo methods like Path-Tracing. Simply put, such sequences are pseudo-random numbers that, in contrast to actual random numbers, have a predictable and almost uniform distribution (the ‘almost’ is really important here, because a perfectly regular distribution would give noticeable patterns). They are therefore a good choice whenever you have to sample something that should be sampled ‘kind-of’ randomly but also evenly. Two prominent examples of sequences that fit these requirements are the so-called Halton-Sequence and Poisson-Disk samples. I’ll stop this excursion into LD-sequences here, because I simply don’t have the time to go deeper, but a tiny sketch of the Halton construction follows below.
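For the curious, this is the basic building block of the Halton-Sequence, the radical inverse (a sketch, written in GLSL for consistency; pairing base 2 and base 3 gives nicely distributed 2D points):

// Radical inverse of index i in base b: mirror the digits of i around the
// decimal point; this yields the i-th Halton number in [0, 1).
float radicalInverse(int i, int b)
{
    float f = 1.0;
    float r = 0.0;
    while (i > 0)
    {
        f /= float(b);
        r += f * float(i - (i / b) * b); // i mod b
        i /= b;
    }
    return r;
}

vec2 halton2D(int i) // 2D sample points from bases 2 and 3
{
    return vec2(radicalInverse(i, 2), radicalInverse(i, 3));
}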
Just know that in this case, and because I had good experiences with these LD-sequences in the past (when implementing Soft-Shadows and Imperfect Shadow Maps), I used Poisson-Disk samples to generate 16 sample points in the range -1 to 1 on a unit disk. So in my approach I take sixteen samples per fragment on a Fullscreen-Quad, which of course scales performance-wise with screen resolution. If performance is of importance you could take fewer samples and/or reduce the resolution of your AO texture. In my implementation and on my hardware I have not run into problems yet, but my renderer is still in a pretty early state.
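As mentioned above, nothing fancy happens in the Vertex-Shading stage. A fullscreen pass vertex shader can be as simple as the following sketch (only texcoordFS matches the inputs of my fragment shader; the ViewQuadPosFS input is omitted here, and the quad’s vertices are assumed to already be in normalized device coordinates):

#version 330

layout(location = 0) in vec2 position; // fullscreen quad in NDC, [-1, 1]

out vec2 texcoordFS;

void main()
{
    texcoordFS = position * 0.5 + 0.5; // map NDC to [0, 1] texture coordinates
    gl_Position = vec4(position, 0.0, 1.0);
}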
Finally, here’s the fragment shader code:
#version 330

in vec2 texcoordFS;
in vec4 ViewQuadPosFS;

uniform sampler2D normalTexture;
uniform sampler2D depthTexture;
uniform mat4 invProjection;
uniform float zNear;
uniform float zFar;
uniform float distanceThreshold;
uniform vec2 filterRadius;

out vec4 fragColor;

const int sample_count = 16;
const vec2 poisson16[] = vec2[](    // These are the Poisson-Disk samples
    vec2( -0.94201624,  -0.39906216 ),
    vec2(  0.94558609,  -0.76890725 ),
    vec2( -0.094184101, -0.92938870 ),
    vec2(  0.34495938,   0.29387760 ),
    vec2( -0.91588581,   0.45771432 ),
    vec2( -0.81544232,  -0.87912464 ),
    vec2( -0.38277543,   0.27676845 ),
    vec2(  0.97484398,   0.75648379 ),
    vec2(  0.44323325,  -0.97511554 ),
    vec2(  0.53742981,  -0.47373420 ),
    vec2( -0.26496911,  -0.41893023 ),
    vec2(  0.79197514,   0.19090188 ),
    vec2( -0.24188840,   0.99706507 ),
    vec2( -0.81409955,   0.91437590 ),
    vec2(  0.19984126,   0.78641367 ),
    vec2(  0.14383161,  -0.14100790 ));

vec3 decodeNormal(in vec2 normal)
{
    // restore normal, USE YOUR CODE HERE
    // (sketch, assuming the view-space XY components are stored and Z points
    //  towards the camera)
    return vec3(normal, sqrt(1.0 - dot(normal, normal)));
}

vec3 calculatePosition(in vec2 coord, in float depth)
{
    // restore position, USE YOUR CODE HERE
    // (sketch: unproject from NDC with the inverse projection matrix,
    //  assuming a default depth range of [0, 1])
    vec4 ndc = vec4(coord * 2.0 - 1.0, depth * 2.0 - 1.0, 1.0);
    vec4 viewPos = invProjection * ndc;
    return viewPos.xyz / viewPos.w;
}

void main()
{
    // reconstruct position from depth, USE YOUR CODE HERE
    float depth = texture(depthTexture, texcoordFS).r;
    vec3 viewPos = calculatePosition(texcoordFS, depth);

    // get the view-space normal, USE YOUR CODE HERE
    vec2 normalXY = texture(normalTexture, texcoordFS).xy * 2.0 - 1.0;
    vec3 viewNormal = decodeNormal(normalXY);

    float ambientOcclusion = 0.0;
    // perform AO
    for (int i = 0; i < sample_count; ++i)
    {
        // sample at an offset specified by the current Poisson-Disk sample
        // and scale it by a radius (has to be in Texture-Space)
        vec2 sampleTexCoord = texcoordFS + (poisson16[i] * filterRadius);
        float sampleDepth = texture(depthTexture, sampleTexCoord).r;
        vec3 samplePos = calculatePosition(sampleTexCoord, sampleDepth);

        vec3 sampleDir = normalize(samplePos - viewPos);

        // angle between SURFACE-NORMAL and SAMPLE-DIRECTION (vector from
        // SURFACE-POSITION to SAMPLE-POSITION)
        float NdotS = max(dot(viewNormal, sampleDir), 0.0);
        // distance between SURFACE-POSITION and SAMPLE-POSITION
        float VPdistSP = distance(viewPos, samplePos);

        // a = distance function
        float a = 1.0 - smoothstep(distanceThreshold, distanceThreshold * 2.0, VPdistSP);
        // b = dot-Product
        float b = NdotS;

        ambientOcclusion += (a * b);
    }

    // the AO term ends up in the alpha channel
    fragColor.a = 1.0 - (ambientOcclusion / float(sample_count));
}
Now let me finish with a screenshot taken from my renderer, which uses the Crytek Sponza scene to show off the hemispherical SSAO approach.
One final word: It’s called ambient occlusion for a reason. It should only be incorporated into the ambient term of your image, so keep AO out of the directly illuminated parts. Why? Consider the real-life example I mentioned above: if you point your desk light into the dark corner, then suddenly there is no ambient occlusion in the corner any more.
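In shader terms that simply means scaling only the ambient contribution by the AO factor, along these lines (a sketch; the variable names are placeholders, not my actual lighting code):

// Apply AO only to the ambient term; direct lighting stays untouched.
// 'ao' is the value sampled from the AO texture generated by the pass above.
vec3 shade(vec3 ambient, vec3 diffuse, vec3 specular, float ao)
{
    return ambient * ao + diffuse + specular;
}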