Ambient Occlusion

OK. So this is “cool” enough to share.

I was thinking last night about ways to improve the visuals of larger objects in a game scenario, such as architectural details or even entire buildings.
One thing in particular that I lack for these types of structures is ambient occlusion.

For models that use a single, non-tiling texture, adding ambient occlusion is no problem at all. We can just multiply an ambient occlusion texture onto the color texture. Standard stuff, easy peasy.
But what if we have a side or the facade of a building that uses tiling textures, or even multiple textures for a single model?
This is where per vertex ambient occlusion is a valid option.

Before I go on any further I’d like to address an obvious question.
Why don’t I use screen space ambient occlusion?

The answer is simple: I don’t really like it.
Let me elaborate: In almost every game where I’ve seen it used, it looks pretty terrible.
I feel it requires way too much maintenance and tweaking to really be worth it. I like the concept, and future real time ambient occlusion techniques will surely be better. But for right now, I’ll stick to manual approaches like the one I’ll be talking about.

So. Per vertex ambient occlusion.
This should be no mystery to anyone who is familiar with the terms by themselves.
Basically, though, it’s a set of data stored for every vertex of a model, containing a brightness or color shift for that vertex.
We can then modulate the color of a shaded pixel on a model by this value.

With this simple approach, we can get cheap and fairly accurate ambient occlusion for any model.
A while back I wrote a simple ambient occlusion generator that worked on a per texel basis.
I remember that that approach was slow and was sorely limited by the texturing of the model that was being processed.
Think about it: these are the exact points I brought up at the beginning of this post.
We can’t have tiling textures and we can’t have shared parts of the texture.

Anyway. The knowledge I gained back then was enough to let me work this thing out without thinking twice.

There’s of course the drawback of tessellation. This approach does not go over well with very low-polygon models.
The data per vertex gets interpolated across faces, which means a huge triangle will get severely darkened if only one of its vertices is occluded.
This is not a real problem, since my engine isn’t that hampered by triangle count; we can easily cram some extra triangles in there without trouble.
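To make the interpolation issue concrete, here is a minimal sketch (not the engine's actual code) of how a per-vertex occlusion value ends up at a point inside a triangle and how it modulates the shaded color. The function names and the plain barycentric weights are assumptions made for illustration; on real hardware the rasterizer does this interpolation for us.

```python
def interpolate_ao(vertex_ao, barycentric):
    """Blend the three per-vertex AO values with barycentric weights."""
    return sum(ao * w for ao, w in zip(vertex_ao, barycentric))


def shade(color, ao):
    """Modulate an already shaded RGB color by the occlusion term."""
    return tuple(channel * ao for channel in color)


# One fully occluded vertex (0.0) and two unoccluded ones (1.0).
triangle_ao = (0.0, 1.0, 1.0)

# Even at the centroid, far from the occluded corner on a big triangle,
# a third of the brightness is lost.
ao_at_centroid = interpolate_ao(triangle_ao, (1 / 3, 1 / 3, 1 / 3))
pixel = shade((1.0, 0.5, 0.2), 0.5)
```

Adding vertices shortens the distance over which a single dark vertex can bleed, which is why extra tessellation helps here.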

As with any ambient occlusion we get better results the more samples we use. More samples means we get a wider range of shading for any given vertex. We therefore get smoother results.
There’s also the possibility of blurring the shading across vertices to smooth things out even further if necessary.
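As a rough sketch of the sampling and blurring ideas above: the code below shoots random rays over the hemisphere around a vertex normal, counts how many hit an occluder, and then smooths the values across edge-connected vertices. Everything here is an assumption for illustration: spheres stand in for real scene geometry, the function names are made up, and a real generator would intersect rays with the actual triangles.

```python
import math
import random


def sample_hemisphere(normal, rng):
    """Return a random unit direction in the hemisphere around `normal`."""
    while True:
        d = (rng.uniform(-1, 1), rng.uniform(-1, 1), rng.uniform(-1, 1))
        length_sq = sum(c * c for c in d)
        if 1e-6 < length_sq <= 1.0:  # rejection sampling inside the unit ball
            break
    length = math.sqrt(length_sq)
    d = tuple(c / length for c in d)
    # Flip directions that point below the surface.
    if sum(a * b for a, b in zip(d, normal)) < 0.0:
        d = tuple(-c for c in d)
    return d


def ray_hits_sphere(origin, direction, center, radius):
    """True if the ray hits the sphere (assumes the origin is outside it)."""
    oc = tuple(c - o for c, o in zip(center, origin))
    t = sum(a * b for a, b in zip(oc, direction))  # closest approach along the ray
    if t <= 0.0:
        return False
    dist_sq = sum(c * c for c in oc) - t * t
    return dist_sq <= radius * radius


def vertex_ao(position, normal, occluders, samples, rng):
    """Fraction of sample rays that escape: 1.0 is fully open, 0.0 fully occluded."""
    hits = 0
    for _ in range(samples):
        d = sample_hemisphere(normal, rng)
        if any(ray_hits_sphere(position, d, c, r) for c, r in occluders):
            hits += 1
    return 1.0 - hits / samples


def blur_ao(ao, neighbours):
    """One smoothing pass: average each vertex with its edge-connected neighbours."""
    return [
        (ao[i] + sum(ao[j] for j in neighbours[i])) / (1 + len(neighbours[i]))
        for i in range(len(ao))
    ]
```

With N samples a vertex can only take N + 1 distinct occlusion values, which is one way to see why more samples give a wider, smoother range of shading.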

Here’s a thing without anything special. It’s not very exciting and it’s hard to tell what’s what.

And here’s the same thing but with a simple ambient occlusion term. It’s easier to tell what shape this thing actually has.

Now, I’m aware that the results aren’t that appealing. But as I mentioned above, there are several things one can do to make the results more pleasant.
It’s also important to remember that cases like this, where something only has an entirely flat texture applied, won’t happen often in a game. Textures mask a lot of errors in shading, which in this case is very useful. In many cases we can skip processing the data any further.
But blurring the result is definitely a step up. That is the next thing I will get into. I’ll also make sure to get some more exciting models to show off; that thing above isn’t very appealing, is it?

Back to it. 🙂


So yeah, I’ve been working on the performance and, not so surprisingly, most of the trouble I’m having is in fact remnants of past “fixes” and some other poorly planned code.
For example, the resource manager class that I use was up until now not able to load materials correctly. What the material then ended up doing was loading itself uniquely every time it was used. The resource manager’s job is to make sure we don’t load redundant resources. Cloned resources are redundant most of the time.
This wasn’t a performance problem per se, but contributed to the loading times and memory consumption.
It also failed to deal with binary models correctly. It just loaded a new model into memory every time. This was because I automatically detect what a model is, so that I can reject or load it. I have since switched the system to deal solely with binary models; it’s stable enough right now.
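The idea of a resource manager that refuses to load the same thing twice can be sketched in a few lines. This is only an illustration, assuming resources are identified by their file path; the class name, the loader callable and the example path are all hypothetical.

```python
class ResourceManager:
    """Caches loaded resources by path so each one is loaded only once."""

    def __init__(self, loader):
        self._loader = loader  # hypothetical callable: path -> resource
        self._cache = {}

    def get(self, path):
        # Load on first request; afterwards hand back the shared instance.
        if path not in self._cache:
            self._cache[path] = self._loader(path)
        return self._cache[path]


load_calls = []


def fake_load(path):
    """Stand-in loader that records every call instead of reading a file."""
    load_calls.append(path)
    return {"path": path}


manager = ResourceManager(fake_load)
first = manager.get("materials/wall.mat")
second = manager.get("materials/wall.mat")
```

Both requests return the same object, so load time and memory are paid once per resource rather than once per use.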

I finally implemented the code that stops spotlights from shading objects that aren’t really affected by them. I’m using a rather experimental approach in which a type of voxelization system determines the spotlight volume in 3D space. I use that volume to find the affected geometry. This isn’t really necessary, and I’ll admit it’s partially there just because it’s cool.
But it works alright at the moment anyway, so I’ll leave it in for the time being.
The resolution of the voxel lattice is pretty coarse and doesn’t come close to a perfect representation of the spotlight cone, but it provides a rather simple solution for this (I think) and I don’t need results that accurate.
This bought me a pretty big performance boost, and that’s what’s most important right now.
I say it’s partially because it’s cool; the other part of the reason is that, for a system that is mostly concerned with AABBs, I’m not really interested in putting in code to deal with cone intersections for those AABBs. Voxels are in fact AABBs by nature in my implementation, so there you go. Simple.
The shortcoming with AABBs for most things is of course that they waste space for angled objects, but I buy into the idea that it’s worth the speed.
I understand I could use the physics engine for a lot of these collision tests and such, and I’ll get to that soon enough, but I’ll try my own implementation for a while first.
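For the curious, a voxelized spotlight volume along these lines might look like the sketch below. It is a guess at the general shape of the technique, not the engine's implementation: voxels live on a world-aligned grid, a voxel counts as lit when its centre falls inside the cone, and an object is affected when its AABB overlaps any lit voxel. All names and parameters are assumptions.

```python
import math


def voxelize_spotlight(apex, axis, half_angle, light_range, voxel_size):
    """Return the set of voxel indices whose centre lies inside the cone.

    `axis` must be a unit vector. Only centres are tested, so the result is
    a coarse approximation of the cone.
    """
    cos_limit = math.cos(half_angle)
    lo = [math.floor((a - light_range) / voxel_size) for a in apex]
    hi = [math.floor((a + light_range) / voxel_size) for a in apex]
    voxels = set()
    for i in range(lo[0], hi[0] + 1):
        for j in range(lo[1], hi[1] + 1):
            for k in range(lo[2], hi[2] + 1):
                centre = [(n + 0.5) * voxel_size for n in (i, j, k)]
                to_centre = [c - a for c, a in zip(centre, apex)]
                dist = math.sqrt(sum(c * c for c in to_centre))
                if dist == 0.0 or dist > light_range:
                    continue
                along = sum(c * d for c, d in zip(to_centre, axis))
                if along / dist >= cos_limit:
                    voxels.add((i, j, k))
    return voxels


def aabb_touches_voxels(box_min, box_max, voxels, voxel_size):
    """True if the AABB overlaps at least one lit voxel."""
    lo = [math.floor(c / voxel_size) for c in box_min]
    hi = [math.floor(c / voxel_size) for c in box_max]
    return any(
        (i, j, k) in voxels
        for i in range(lo[0], hi[0] + 1)
        for j in range(lo[1], hi[1] + 1)
        for k in range(lo[2], hi[2] + 1)
    )


# A spotlight at the origin pointing down +Z with a 30 degree half angle.
lit = voxelize_spotlight((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.radians(30), 10.0, 1.0)
```

Since every voxel is itself an axis-aligned box, the object test never needs any cone math; the cone is only consulted once, when the lattice is built.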

This is pretty much it for right now.


… is pretty crappy at this point.
Overlooking the obvious performance sink found in some unfinished things around the engine, there are some other performance problems that I’ve noticed now that are quite severe.
I don’t exactly know what it is that’s causing this drop in framerate, but I am half suspecting that it’s the number of draw calls that gets to me.
This is expected, and is as far as I can tell only remedied through instancing or batching of geometry.
I will get into the latter.
For static geometry it’s a good thing to batch several geometrical surfaces into one, bigger model.
At this point it’s cheaper to push polygons in my engine than it is to draw an object a lot of times.
I haven’t yet worked out exactly how I will handle the batching of geometry on an implementation level as it’s not something I’ve done before. But it shouldn’t be that hard. And since it only concerns static geometry it doesn’t necessarily incur a performance drop (I should hope not, I’m aiming for the opposite of that), since we batch the geometry at startup.
My engine doesn’t need to handle a lot of dynamic entities, so this is a very good concept for me.
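A first stab at startup batching could be as simple as the sketch below. It assumes each static mesh is just a vertex list, a triangle index list and a world translation; a real engine would bake the full world matrix and would also have to group meshes by material. The dictionary layout and function name are made up for the example.

```python
def batch_static_meshes(meshes):
    """Merge meshes into one vertex list and one index list.

    Each mesh is a dict with 'vertices' (list of (x, y, z)), 'indices'
    (triangle list into its own vertices) and 'offset' (world translation).
    """
    vertices, indices = [], []
    for mesh in meshes:
        base = len(vertices)  # index of this mesh's first vertex in the batch
        ox, oy, oz = mesh["offset"]
        # Bake the world placement into the vertices.
        vertices.extend((x + ox, y + oy, z + oz) for x, y, z in mesh["vertices"])
        # Rebase indices so they point into the merged vertex list.
        indices.extend(base + i for i in mesh["indices"])
    return vertices, indices


tri = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
batched_vertices, batched_indices = batch_static_meshes([
    {"vertices": tri, "indices": [0, 1, 2], "offset": (0.0, 0.0, 0.0)},
    {"vertices": tri, "indices": [0, 1, 2], "offset": (5.0, 0.0, 0.0)},
])
```

Two separate draws become one, at the cost of a one-time merge when the level loads.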

The other thing I notice is that I need a way to split the game world into chunks, so that I can skip checking visibility for a lot of things when in fact none of them would be visible.
Ordinary frustum culling does a good job at this, but for scenes that have a high object density this becomes a bottleneck; the scene manager has to traverse the scene hierarchy and decide what objects are to be rendered.
If we split the world into chunks, we can skip drawing an entire room packed with objects if we find out the confines of the room aren’t even visible to the user.
This is a rather simple scene manager thing, but it’s something my engine doesn’t do at this point, so I’m going to put that in.
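The chunk idea can be sketched like this, assuming each chunk carries an AABB plus a list of the objects inside it and the frustum is given as planes of the form dot(normal, p) + d >= 0 for points on the inside. The names and data layout are assumptions, and objects in a visible chunk would still go through the usual per-object culling afterwards.

```python
def aabb_outside_plane(box_min, box_max, normal, d):
    """True if the whole box is on the negative side of the plane."""
    # Pick the corner that lies furthest along the plane normal.
    corner = tuple(hi if n >= 0 else lo for lo, hi, n in zip(box_min, box_max, normal))
    return sum(n * c for n, c in zip(normal, corner)) + d < 0


def objects_in_visible_chunks(chunks, frustum_planes):
    """Collect objects from chunks whose bounds touch the frustum."""
    candidates = []
    for chunk in chunks:
        if any(aabb_outside_plane(chunk["min"], chunk["max"], n, d)
               for n, d in frustum_planes):
            continue  # the whole room is skipped with a single test
        candidates.extend(chunk["objects"])
    return candidates


# A one-plane "frustum" that only sees the half space x >= 0.
planes = [((1.0, 0.0, 0.0), 0.0)]
chunks = [
    {"min": (-10.0, 0.0, 0.0), "max": (-5.0, 5.0, 5.0), "objects": ["crate"]},
    {"min": (1.0, 0.0, 0.0), "max": (6.0, 5.0, 5.0), "objects": ["lamp", "table"]},
]
candidates = objects_in_visible_chunks(chunks, planes)
```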

Now, there’s the issue of occlusion culling. I need a way to let my engine skip drawing rooms that are behind walls and such and therefore not visible to the user.
There are several possible algorithms to deal with this, but for my engine I’m thinking of rolling a custom routine. I think this routine will be less automated, but hopefully can be made more aggressive in that it really cuts away stuff we can’t see.
This is a real necessity for me because I’m looking to handle indoor scenarios well.
And some of the existing techniques to handle this are best suited for game worlds that are made out of CSG, something that I don’t use at all.

I’ll get to work implementing some of this stuff and put results on here about how the performance (hopefully) increases.

Also, there are of course some effects (like FXAA and soft shadows) and subsystems (like physics simulation) that suck up performance, and are things that I generally can’t help.

Oh boy.
OK. The very first thing I mention in this post is that I have some unknown performance problems. Well, I DO know what the problem is and it has been fixed. The problem was that I was generating AABBs for every single thing in the scene every frame. (It wasn’t the draw calls like I suspected.)
How does something like this happen? Good question, and if I find the answer to that I’ll tell you.
No but seriously, it was sort of a ghetto fix I made a while back just so that I could get the damn thing to work.
But in a real situation there’s no need to generate an AABB more than once for something that is static. Static in the sense that it will never move or animate as far as the player is concerned.
And why did this incur such a performance impact? Well, because the AABB has to be created using the OBB as a base. I therefore have to first apply the object’s matrix transform to the OBB, then create the AABB from the transformed vertex positions.
Doing this for every object in the scene every frame is out of the question, not to mention completely and utterly pointless in my situation.
This is performed on the CPU, mind you.
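The fix boils down to caching. Below is a minimal sketch, assuming the object's local bounding box is given by two corners and the world transform is a row-major 4x4 matrix; the class and attribute names are invented for the example. The eight box corners are transformed, the AABB is taken from their minimum and maximum, and static objects keep the result.

```python
import math
from itertools import product


def transform_point(m, p):
    """Apply a row-major 4x4 matrix to a point (w is assumed to be 1)."""
    x, y, z = p
    return tuple(m[r][0] * x + m[r][1] * y + m[r][2] * z + m[r][3] for r in range(3))


def world_aabb(local_min, local_max, matrix):
    """Transform the 8 corners of the local box and take their min and max."""
    corners = product(*zip(local_min, local_max))
    points = [transform_point(matrix, c) for c in corners]
    return (
        tuple(min(p[i] for p in points) for i in range(3)),
        tuple(max(p[i] for p in points) for i in range(3)),
    )


class SceneObject:
    def __init__(self, local_min, local_max, matrix, static=True):
        self.local_min, self.local_max = local_min, local_max
        self.matrix = matrix
        self.static = static
        self.rebuilds = 0  # only here to show how often the work is done
        self._cached = None

    def aabb(self):
        # Static objects compute their box once; dynamic ones every call.
        if self._cached is None or not self.static:
            self._cached = world_aabb(self.local_min, self.local_max, self.matrix)
            self.rebuilds += 1
        return self._cached


c = math.sqrt(0.5)  # cos and sin of 45 degrees
rotate_z_45 = [[c, -c, 0, 0], [c, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
crate = SceneObject((-1, -1, -1), (1, 1, 1), rotate_z_45)
for _ in range(100):  # stands in for 100 frames
    box = crate.aabb()
```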

Some updates


I’ve been fixing up the particle system some, to allow the features I never had the energy to do a while ago when I posted that video with the gargoyle face and stuff.
There’s still a lot missing that is necessary for a full featured particle system, but the basic stuff is working fine.
Except I’m going to have to redesign the system itself a little.
Right now it’s divided into particle effects and emitters.
The idea is you attach particle effects onto emitters and create effects in that manner.
But the way it’s designed it doesn’t actually allow more than one particle effect per emitter, or even multiple emitters sharing a particle effect. A little design flaw there. Oops.

I experimented a little with color grading as a post processing effect.
So far I think it’s a cool thing, even though I’m not a huge fan of it in general. It’s a fickle matter; it takes a lot of consideration and fine tuning to get right.
There’s also the issue of banding artifacts, which I understand is commonplace with color grading. I’ll have to experiment a little with different stuff to alleviate it to some degree. The banding really looks terrible in some cases, definitely not something you’d expect to see in modern engines.
In case someone wants to get a reference as to why I get this banding artifact: It’s because the loaded texture that contains the color information is just 8 bits per channel, meaning we don’t have a lot of resolution to stretch that to the rendered bit depth which is far higher.
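The effect of limited bit depth is easy to reproduce with a bit of arithmetic. The sketch below is a generic illustration, not the engine's grading code: it quantizes a smooth, dark gradient across a 1920 pixel wide screen to 8 and to 16 bits and counts how many distinct values survive. The gradient range and screen width are arbitrary assumptions.

```python
import math


def quantize(value, bits):
    """Round a 0..1 value to the nearest level representable with `bits` bits."""
    levels = (1 << bits) - 1
    return math.floor(value * levels + 0.5) / levels


width = 1920
# A smooth ramp from black to 10% grey, one value per pixel column.
gradient = [0.1 * x / (width - 1) for x in range(width)]

levels_8bit = len({quantize(v, 8) for v in gradient})
levels_16bit = len({quantize(v, 16) for v in gradient})
```

At 8 bits the whole ramp collapses into a couple of dozen steps, each many pixels wide, which is what shows up as visible bands. Dithering the result is one commonly used way to hide this.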

It’s come to the point where I’m going to start implementing the ‘final’ step of my custom 3d file format.
The binary step. (oooh, aaah) But really, it’s too slow to parse text, and the format is robust enough not to crash every second time I use it, so it’s that time.
The format is not that amazing really. It just supports the bare minimum of what I need, so there’s no awesome animation support like curves, morph targets and stuff like that in it ( …yet).
I’m not really planning to use this format forever. It was sort of my own idea that I would eventually switch to something else, to Collada perhaps. (*whisper* if the damned exporter support was better.)
I would use FBX, but I’m rather put off by the options for parsing it. Either parse the text-based file or use Autodesk’s FBX parsing library. I’m not even sure the specification is public so you could parse the binary yourself.
Anyway, it’s too heavy for my (rather) lightweight framework, but I might take a closer look at some point.

Uuh… I think that’s about it. I’m doing a ton of other stuff too, but nothing significant to write about.

Things and stuff

I’ve been spending some time rethinking and re-implementing a part of my material system to basically be more versatile than what it was.
Since my renderer still uses a sort of hybrid rendering, with a deferred approach handling the bulk of the lighting and standard forward rendering for the rest, what I can and can’t do is pretty apparent.
But. With my latest additions the renderer is way more full featured and can be used to make a plethora of effects.
I still don’t handle transparency that well. Sorting transparent geometry is always difficult to get right. So I probably won’t bother trying to make more of it than I have to. Basic back to front sorting will have to do.
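Basic back to front sorting really is basic. Here is a minimal sketch, assuming each transparent object is sorted as a whole by the distance from the camera to a single reference position; the dictionary layout and names are made up. Intersecting or very large surfaces will still sort incorrectly with this approach.

```python
def sort_back_to_front(objects, camera_pos):
    """Order objects so the farthest one from the camera is drawn first."""
    def distance_sq(obj):
        # Squared distance is enough for ordering and avoids a square root.
        return sum((p - c) ** 2 for p, c in zip(obj["position"], camera_pos))

    return sorted(objects, key=distance_sq, reverse=True)


transparent = [
    {"name": "window", "position": (0.0, 0.0, 2.0)},
    {"name": "smoke", "position": (0.0, 0.0, 9.0)},
    {"name": "glass", "position": (0.0, 0.0, 5.0)},
]
draw_order = [o["name"] for o in sort_back_to_front(transparent, (0.0, 0.0, 0.0))]
```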

Here’s a boring ol’ wall tile with a texture applied.
Now how this is being rendered is of course mostly up to the material attached to that object.
We can change the material to do something else:

Here I turned the material into a mirror-shine super glossy material.

I also added a “custom” routine for effects and stuff that fall out of the reach of my generalized pipeline so we can render those with custom shaders and blending operations and stuff.
The main problem with that is of course that they usually can’t take part in the scene’s lighting.
To remedy that, there’s a “super crisis-system” (this is not its actual name) that allows arbitrary shaders to access the scene’s lighting manually so we can at least get some of the lighting on it. Right now I’m planning on allowing access to 4 of the biggest nearby lights. I’m filtering out small and faint lights as these don’t affect the geometry as much (Duh).
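Picking the handful of lights that matter could look something like the sketch below. The scoring is an assumption: intensity divided by one plus the squared distance is just a simple stand-in for whatever attenuation the engine really uses, and the threshold, names and data layout are invented for the example.

```python
import heapq


def light_influence(light, point):
    """Rough estimate of how strongly a light affects a point."""
    d_sq = sum((l - p) ** 2 for l, p in zip(light["position"], point))
    return light["intensity"] / (1.0 + d_sq)


def pick_lights(lights, point, max_lights=4, min_influence=0.01):
    """Return up to `max_lights` lights, strongest first, skipping faint ones."""
    scored = [(light_influence(light, point), light) for light in lights]
    strong = [pair for pair in scored if pair[0] >= min_influence]
    best = heapq.nlargest(max_lights, strong, key=lambda pair: pair[0])
    return [light for _, light in best]


lights = [
    {"name": "sun", "position": (0.0, 10.0, 0.0), "intensity": 100.0},
    {"name": "lamp", "position": (1.0, 0.0, 0.0), "intensity": 2.0},
    {"name": "torch", "position": (2.0, 0.0, 0.0), "intensity": 3.0},
    {"name": "fire", "position": (3.0, 0.0, 0.0), "intensity": 5.0},
    {"name": "candle", "position": (0.0, 0.0, 1.0), "intensity": 0.5},
    {"name": "led", "position": (50.0, 0.0, 0.0), "intensity": 1.0},
]
chosen = [light["name"] for light in pick_lights(lights, (0.0, 0.0, 0.0))]
```

The chosen lights would then be passed to the custom shader as a small fixed-size array of uniforms.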

I finished putting in support for the last channel of the dedicated “mask” texture that is available through the material’s texture slots.
The “mask” texture is a special texture that stores 3 textures in one. One texture per channel.
It stores Specular Gloss, Fresnel and Reflection.
I find it a tiny bit unintuitive to cram textures into channels of other textures, but artists don’t generally seem to mind; there are plenty of other engines that do something similar.
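Channel packing itself is simple enough to show in a few lines. In this sketch the three grayscale maps are flat lists of 8-bit values, and the channel order (red for gloss, green for Fresnel, blue for reflection) is purely an assumption; the actual assignment in the engine is not stated here.

```python
# Assumed channel layout, for illustration only.
CHANNELS = {"gloss": 0, "fresnel": 1, "reflection": 2}


def pack_mask(gloss, fresnel, reflection):
    """Combine three grayscale maps into one list of (R, G, B) texels."""
    if not (len(gloss) == len(fresnel) == len(reflection)):
        raise ValueError("all three maps must have the same number of texels")
    return list(zip(gloss, fresnel, reflection))


def sample_mask(mask, texel, channel):
    """Read one packed value back as a 0..1 float, like a shader would."""
    return mask[texel][CHANNELS[channel]] / 255.0


mask = pack_mask([255, 128], [0, 64], [32, 255])
```

One texture fetch then yields all three values at once, which is the main reason engines pack maps this way.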

Aaah. What else?

Haha. 😀 Just for fun I tried putting in a monster from Doom 3 to see how the texture maps translated into my engine’s material system.

Hmm. I guess that’s about it for right now. I should really document what I actually do better. I feel like I forget more than half the things.


Not surprisingly, I’ve run into some more “trouble”.
I mentioned briefly quite a few posts back that I was suffering from some issues with alpha testing. Most notably, it wasn’t working at all.
I dug a bit deeper into it and it seems like my OpenGL context somehow got the idea that GL_ALPHA_TEST is in fact deprecated and therefore completely unsupported.
I admit I’m not entirely sure as to why this happened just now, and why it seems to be local to my current project…

I don’t really know what to make of it.
At first I thought it was a driver bug, seeing as how it stopped working suddenly. But then I updated my drivers like 2 times without any notable effect so that possibility isn’t a factor. (+ my older projects still work and they use GL_ALPHA_TEST.)
Now I don’t know what to say. I can still alpha test in my shader, but I fear that it may suffer performance impacts. And it already has, with my “vegetation test” a few posts back.
Though to be completely honest, it does seem like alpha testing is moving to a shader-based approach, if it hasn’t already. I think DirectX 10 and later perform alpha testing in shaders because the fixed-function alpha test that DirectX 9 offered was removed.
Seems to be reasonable that modern OpenGL versions use something akin to this…

Or something like that, I don’t really know. It’s kind of a mind-effer…

Edit: And yes, my OpenGL context is backwards compatible so that’s not the problem here.

Edit #2: Alright. My friend confirmed that GL_ALPHA_TEST and its kin are in fact deprecated. So I’m sticking to my manual alpha testing. It’s good to know I’m not going insane. The moment I saw that glEnable(GL_ALPHA_TEST) generated GL_INVALID_ENUM and some other errors no matter what I tried, I was all but sure I’d gone off the deep end.


OK. That is REALLY it.

I have solved the problem finally… No more light-bleeding, no more weirdness, no more nothing of that sort.

I’ve been fighting this problem for so long. So VERY long.

It never ceases to amaze me how the problem always lies in the absolute last thing you check. I feel this is an established concept, but I am blanking on the name used for it right now.
HOWEVER, it has also been proven (on my account, at least) that the problem may still exist and that I just think I’ve solved it but it still isn’t fixed.
Time will tell. But I really think this is the end of this for a while at least. (I hope)

Now, onto the actual problem.
This may sound exaggerated to some, but I’ve been having this same problem for pretty much as long as I care to admit.
I estimate it to be at least 1 year almost to the date.
The error is, however, hard to track down as it is located in a field where dozens of things could be wrong.
Add to that the fact that the error doesn’t show up that often so I’ve often found myself forgetting that it exists.

The previous posts outlined the idea that the deferred rendered scenery didn’t have the same problems the forward rendering did. This is not actually true, and I regret saying it with that level of certainty. Upon much closer inspection (debug information and stuff) the problems do exist, but they look a lot different from the forward rendering artifacts and aren’t even half as noticeable, so I overlooked them.

Anyway. The problem lies in the CPU side of the tangent space (not the shader like I suspected), in the model loading code where I actually generate the vertex tangent vectors.
The problem is not the code that generates the tangent vectors; the error is in when I was generating them.
I have to process my loaded models to remove cases where vertices share normals or texture coordinates, and this is a normal thing to do.
But I was generating the tangent vectors before that step, so I ended up having vertices that instead shared tangent vectors.
This explains why it only happened in some cases. I think I actually mentioned briefly in one of the earlier posts that I had noticed that the errors occur in areas where the tangents vary wildly, or point away from each other.
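To illustrate why the order matters, here is a small sketch of the corrected pipeline: split vertices first, then generate tangents on the split vertices. It is an assumption-heavy stand-in for the real loader. Corners are (position, normal, uv) tuples, two corners are merged only when all three match, and the tangent is the standard per-triangle one derived from the UV deltas.

```python
import math


def triangle_tangent(p, uv):
    """Tangent of one triangle from its positions and texture coordinates."""
    e1 = [p[1][i] - p[0][i] for i in range(3)]
    e2 = [p[2][i] - p[0][i] for i in range(3)]
    du1, dv1 = uv[1][0] - uv[0][0], uv[1][1] - uv[0][1]
    du2, dv2 = uv[2][0] - uv[0][0], uv[2][1] - uv[0][1]
    det = du1 * dv2 - du2 * dv1
    if abs(det) < 1e-12:  # degenerate UVs, no usable tangent
        return (0.0, 0.0, 0.0)
    return tuple((e1[i] * dv2 - e2[i] * dv1) / det for i in range(3))


def normalize(v):
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v) if length > 0.0 else (0.0, 0.0, 0.0)


def build_mesh(corners):
    """corners: flat list of (position, normal, uv), three per triangle."""
    # Step 1: split. Corners only share a vertex if position, normal AND uv match.
    index_of, vertices, indices = {}, [], []
    for corner in corners:
        if corner not in index_of:
            index_of[corner] = len(vertices)
            vertices.append(corner)
        indices.append(index_of[corner])
    # Step 2: only now accumulate triangle tangents onto the split vertices.
    tangents = [[0.0, 0.0, 0.0] for _ in vertices]
    for t in range(0, len(indices), 3):
        tri = indices[t:t + 3]
        tan = triangle_tangent([vertices[i][0] for i in tri],
                               [vertices[i][2] for i in tri])
        for i in tri:
            for axis in range(3):
                tangents[i][axis] += tan[axis]
    return vertices, indices, [normalize(t) for t in tangents]


up = (0.0, 0.0, 1.0)
corners = [
    # Triangle A, UVs follow the positions.
    ((0.0, 0.0, 0.0), up, (0.0, 0.0)),
    ((1.0, 0.0, 0.0), up, (1.0, 0.0)),
    ((0.0, 1.0, 0.0), up, (0.0, 1.0)),
    # Triangle B shares two positions with A but has mirrored UVs.
    ((0.0, 0.0, 0.0), up, (0.5, 0.0)),
    ((0.0, 1.0, 0.0), up, (0.5, 1.0)),
    ((-1.0, 0.0, 0.0), up, (1.5, 0.0)),
]
vertices, indices, tangents = build_mesh(corners)
```

Had the tangents been accumulated before the split, the two shared positions would have received the sum of two opposing tangents, which is the kind of wildly varying case where the artifacts showed up.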

But. It should be good now.