I couldn’t resist trying to create an SSAO shader again, and this time the result at least looks a little like ambient occlusion. I’m not using any blurring at all yet, and the falloff I use seems to be a little wrong. The AO effect is nearly invisible when combined with the direct lighting, but I don’t know why yet…

A torus example:

One more AO only:

It looks quite promising… What about its complexity? Is it too expensive to compute? Oh, and a little thought about the directional lighting: perhaps your blending function for combining the AO with the lit object is the problem.

You're supposed to use the AO as the ambient term.


(AO + NdotL) * colormap + NdotH * specmap
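Written out per color channel, the suggested blend is just this (a plain-Python sketch of the math; all names here are made up for illustration):

```python
# Illustration of the suggested blend, per color channel. The AO value
# replaces the constant ambient term, so it is added to the diffuse factor
# (N.L) before modulating the color map; specular (N.H) is added on top.
def shade(ao, n_dot_l, n_dot_h, colormap, specmap):
    return [(ao + n_dot_l) * c + n_dot_h * s
            for c, s in zip(colormap, specmap)]
```

For example, with `ao = 0.25`, `n_dot_l = 0.5` and no specular, a white color map shades to 0.75 grey.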

Also, it looks like the normals on that model are incorrect; there are odd cracks and seams around the model.

With 8 samples at 640x480 resolution (I output directly to the screen for now) I get 80 fps in the non-optimized version, so it’s slow…

With 32 samples I get 25 fps only.

Yes, the blending is probably what’s wrong, but I’m no OpenGL blending expert, so it will take some time for me to figure out the correct setup.

Mamoko_Fan: I am currently just blending with the already prerendered scene. Do you suggest that I should use deferred rendering instead?

And the cracks and seams, do you mean the thing behind the torus? It’s a surface beneath it, but also a bug that made the falloff very steep.

Current version (32 samples):

And how it should look (Light traced), sorry for the outline:

So I apparently have some problems in my algorithm. The lower parts of the legs are very wrong, for example.

maybe we can find what's wrong if you describe your shader a bit. how do you derive the world space position etc?

i would love to test with your model in my own version btw.

have some older tests here:

Take a look at this screenshot:


I found the model on some “free model” site on the internet. Should I send it to your email or upload it somewhere? It’s an obj.

I do all the calculations in screen space. Is there a reason why one should transform to world space and then back when the depth map is just in 2D? I don’t think I want my sample radius to decrease with distance.

Currently I’m unsure how to align the vectors (in the hemisphere) to the normal at the current fragment.
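For what it's worth, the sign(dot) flip trick used later in the thread can be sketched numerically (plain Python, names are mine): a sphere sample that points away from the normal is simply mirrored into the hemisphere.

```python
def dot3(a, b):
    # 3-component dot product
    return sum(x * y for x, y in zip(a, b))

def to_hemisphere(sample, normal):
    # Mirror a unit-sphere sample into the hemisphere around the normal:
    # if it points away from the surface, negate it. The result always has
    # a non-negative dot product with the normal.
    return sample if dot3(sample, normal) >= 0.0 else [-x for x in sample]
```

This avoids true alignment (building a tangent basis at the fragment) at the cost of a slightly uneven sample distribution.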


Aha, I see. I don’t know why it looks like that. So I will not use that model anymore, because I use the normals when calculating the AO, so it’s important that they are all correct. Thanks for pointing it out.

With samples in a hemisphere (cheating version, just skipping the ones outside) (16 samples):


Just an update. I think I’ve solved most of the problems now, except for some kind of artifacts that are visible when moving the camera. Probably just some mistake in the code. I haven’t implemented proper blending with the scene yet.

The blur filter currently doesn’t take normals and depths into account; I don’t know how to implement that without affecting performance too much.

Currently a fullscreen render with 16 samples gives about 60 fps on my machine.

The passes:

Source of the SSAO pass (some variable names might be odd, and the rayMap should be a 1D texture):

uniform sampler2D depthMap;
uniform sampler2D rnm;
uniform sampler2D normalMap;
uniform sampler2D rayMap;

varying vec2 vTexCoord;

uniform float totStrength;
uniform float strength;
uniform float offset;
uniform float falloff;
uniform float rad;

#define SAMPLES 16

/// Unpack a [0-1] float value from a 4D vector where each component is an 8-bit integer
float unpackFloatFromVec4i(const vec4 value)
{
   const vec4 bitSh = vec4(1.0 / (256.0 * 256.0 * 256.0), 1.0 / (256.0 * 256.0), 1.0 / 256.0, 1.0);
   return dot(value, bitSh);
}

void main(void)
{
   // get the depth of the current fragment (must be unpacked)
   float currentPixelDepth = unpackFloatFromVec4i(texture2D(depthMap, vTexCoord));

   // current fragment coords in screen space
   vec3 ep = vec3(vTexCoord.xy, currentPixelDepth);

   // grab a normal for reflecting the sample rays later on
   vec3 fres = normalize((texture2D(rnm, vTexCoord * offset).xyz * 2.0) - vec3(1.0));
   // get the normal of the current fragment
   vec3 norm = normalize((texture2D(normalMap, vTexCoord).xyz * 2.0) - vec3(1.0));
   float bl = 0.0;
   vec2 coord;
   coord.y = 0.5;
   // adjust the radius for the depth (not sure if this is good..);
   // a uniform can't be written to, so use a local copy
   float radD = rad / currentPixelDepth;
   for (int i = 0; i < SAMPLES; ++i)
   {
      // calculate the next coord for the vector
      coord.x = float(i) / float(SAMPLES);
      // get a vector (randomized inside of a sphere with radius 1.0) from a texture and reflect it
      vec3 ray = radD * reflect((texture2D(rayMap, coord).xyz * 2.0) - vec3(1.0), fres);
      // if the ray points outside the hemisphere then flip its direction
      vec3 se = ep + sign(dot(ray, norm)) * ray;
      // get the depth of the occluder fragment
      float occluderDepth = unpackFloatFromVec4i(texture2D(depthMap, se.xy));
      // get the normal of the occluder fragment
      vec3 occNorm = normalize((texture2D(normalMap, se.xy).xyz * 2.0) - vec3(1.0));
      // if depthDifference is negative, the occluder is behind the current fragment
      float depthDifference = currentPixelDepth - occluderDepth;
      // use the difference between the normals as a weight
      float normDiff = 1.0 - dot(occNorm, norm);
      // the falloff equation: starts at falloff and is kind of 1/x^2 falling
      bl += totStrength * step(falloff, depthDifference) * normDiff * (1.0 - smoothstep(falloff, strength, depthDifference));
   }
   // output the result
   float ao = 1.0 - bl / float(SAMPLES);
   gl_FragColor.r = ao;
}

If someone can spot the error I would be happy :)
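One way to sanity-check the depth packing outside the shader: the `unpack` below mirrors the shader's bitSh dot product, while `pack` is my assumption of the matching write side (it isn't shown in the thread). Real RGBA8 storage adds 8-bit quantization, which this plain-float sketch ignores.

```python
import math

def fract(x):
    return x - math.floor(x)

def pack(depth):
    # Split depth in [0,1) into four base-256 "digits"; the last channel is
    # the most significant, matching unpackFloatFromVec4i's weights.
    raw = [fract(depth * s) for s in (256.0**3, 256.0**2, 256.0, 1.0)]
    # Subtract from each channel the part already carried by the next-finer one
    return [raw[0],
            raw[1] - raw[0] / 256.0,
            raw[2] - raw[1] / 256.0,
            raw[3] - raw[2] / 256.0]

def unpack(v):
    # Same dot product as the shader's unpackFloatFromVec4i
    weights = (1.0 / 256.0**3, 1.0 / 256.0**2, 1.0 / 256.0, 1.0)
    return sum(c * w for c, w in zip(v, weights))
```

The carry subtraction makes the sum telescope, so `unpack(pack(d))` returns `d` exactly up to float rounding.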

Is your SSAO very GPU intensive?

It's quite a lot of operations per pixel, but worst is the number of texture samples per pixel (3 + 16*3 = 51!!).

I haven't tested it in a real situation yet so I don't know how much it will affect the framerate.

When testing with only a small and simple scene (10,000 tris) I get 60 fps (could have been capped by v-sync). But the code can be optimized more.

using a floating point texture you get away with one texture sample for normal+depth. also, why don't you create the "inside sphere" test vectors as constants in the shader instead of a texture you have to sample? that will drop your texture sampling a lot

Ok thanks for your suggestions. :slight_smile:

I've never used floating point textures but I will try it now. Getting away with one texture sample? Do you mean I should use an R32G32B32A32F texture with the depth in the A component and the normal in the rest? That could be fast… :slight_smile:

The "inside sphere" vectors were in an array before, but I moved them to a texture for easier testing with different sets of vectors (I have a small program that generates the textures). I will switch back to a const array and see if I get some speedup.
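Such a const array can be generated offline; here's a sketch of the kind of generator involved (my own version, not the author's actual tool): rejection-sample points inside the unit sphere. To store them in an RGB texture instead, they'd be remapped with `v * 0.5 + 0.5`, which is what the shader's `* 2.0 - 1.0` decode undoes.

```python
import random

def random_in_unit_sphere(rng):
    # Rejection sampling: draw points in the [-1,1]^3 cube until one falls
    # inside the unit sphere, giving a uniform distribution over the sphere's
    # volume.
    while True:
        v = [rng.uniform(-1.0, 1.0) for _ in range(3)]
        if sum(x * x for x in v) <= 1.0:
            return v

rng = random.Random(1234)  # fixed seed so the generated array is reproducible
samples = [random_in_unit_sphere(rng) for _ in range(16)]
```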

yeah, and a 16f precision texture is fine too.

I hope you'll get something nice put together, the shots look really shweet! :smiley:

Thanks :slight_smile: I hope so too, and that my solution won’t look too ridiculous when MrCoder releases his version. :wink:

I made the changes and now performance is considerably better.

90k triangles, 640x480 window size, gives 600 fps!!

And how it looks (no blur and only 8 samples in a 640x480):

It does not look very good now, but I think it will be much better with some blur. I will try a simple Gaussian blur first.

you can also render the ssao at a lower resolution to gain a lot of fps and then blur it at the higher resolution (bilateral upsampling)

Methius said:

While learning shaders, I have been trying to get SSAO to work.

Here's the research I've found so far:

I tried to implement the shader from JKlint on GameDev. The problem is that I don't know how to populate the depth texture. It's the same problem I run into when following the Russian tutorial.

Do you mean how to create a depth texture of the view of the object into a quad, for example, or a depth texture that is applied on the object? For the first problem there's an example in DepthOfFieldRenderPass in jme2.0 - at least if that's what you're looking for. I think it can be applied with some modifications for the second case as well. It could even be done as a full screen shader if the rendering of the texture is based not on the global scene z depth of the points but on z coordinates relative to the separate 'mesh centers'. err, probably i'm not clear enough.


JKlint's shader is wrong so don't bother implementing it.

The easiest approach is to create a separate shader in which you render the linear depth to a texture. In my shader I just grab the linear depth in the vertex shader and write it out in the fragment shader.
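For clarity, "linear depth" here means mapping eye-space distance linearly into [0,1] between the clip planes, rather than using the hyperbolic value a regular depth buffer stores. A minimal sketch of that mapping (the `near`/`far` names are mine):

```python
def linear_depth(eye_z, near, far):
    # Map eye-space depth to [0,1] linearly between the near and far planes,
    # unlike the non-linear value a regular depth buffer stores.
    return (eye_z - near) / (far - near)
```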


I read a little about bilateral upsampling but don't know yet how to implement it. I guess it's something like setting mipmapping to nearest and doing your own implementation of linear interpolation, but with weights from normal and depth differences.

Thanks for the tip. I'm starting from the ground up now because I want to understand how to do it correctly. :slight_smile:

yeah, in the simplest form, just do a blur that respects boundaries, i.e. don't pick colors for the blur that are too far away in depth or normal. similar to what you need to do in other filters to prevent bleeding.
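In 1-D, that boundary-respecting blur might look like this (a sketch under my own assumptions; the box kernel and threshold are made up, and a normal-difference test would be added the same way as the depth test):

```python
# Sketch of an edge-aware blur on a 1-D row of (ao, depth) values:
# neighbours whose depth differs too much from the center pixel are
# skipped, so occlusion doesn't bleed across depth discontinuities.
def edge_aware_blur(ao, depth, radius=2, depth_eps=0.01):
    out = []
    for i in range(len(ao)):
        total, weight = 0.0, 0.0
        for j in range(max(0, i - radius), min(len(ao), i + radius + 1)):
            # reject samples across a depth discontinuity
            if abs(depth[j] - depth[i]) <= depth_eps:
                total += ao[j]
                weight += 1.0
        out.append(total / weight)  # weight >= 1: the center always passes
    return out
```

Across a depth edge the two sides stay untouched, while flat regions get a plain box blur.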