Ask how to render game objects on a programming forum and you will probably be told something along the lines of ‘give each class a virtual Render() function and call it on every object from the main render loop’. This approach is popular with beginners because they know all about object-oriented programming and not much about rendering.

It falls apart very quickly. In general, a single pass over all objects is not enough, and the order is important. The interface can be made more complex but it has to be duplicated in every object. And after all that, 99% of the objects are rendered in exactly the same way; either they are a model or (in a 2D game) a sprite, so the abstraction is useless.

The next step is to build the objects out of components, so each object gets a render component and the rendering loops over those instead. This is better, but not much. There’s still no overall structure to the scene, which makes it hard to control and optimise.

And then, if you know a lot about object-oriented programming and something about rendering too, you end up with the scene graph.

There are lots of game engines built around scene graphs (I’m not linking to them – they know who they are). It’s an idea that comes from graphical editing tools. The structure of the scene is a tree, so objects can be grouped together, which is very useful when you are editing a scene. You can have various types of nodes which afford various properties to their sub-nodes, so a tree can perform a variety of tasks. And this is true in a game too. Unfortunately, because the tree is a very general data structure, it’s not especially good at anything.

Let’s look at what you might want a scene graph to do:

  • Logical grouping of game objects
  • Relative transformations
  • Culling and visibility
  • Level of detail (LOD)
  • State management
  • Collision detection

Note that a lot of that has nothing to do with rendering. Mixing unrelated data means more complex code and slower processing. It should be split off into the appropriate subsystem instead. The game logic system can group objects together. The physics or animation systems can constrain objects and stick them together. Collisions are better handled by a specialised data structure containing only collision information.

That leaves culling/visibility, render states and LODs. Where does any of that benefit from a tree structure? On a modern CPU, it’s quicker to cull objects in a flat list than traverse a tree, even if there are thousands of them. At most, you could go one level deep by specifying ranges in the list with an combined bounding box. The same goes for states. Have a flat list and sort it by effect (an effect is a combination of render state and shaders). For LODs, put them all in the list and cull the unwanted ones according to distance from the camera.

And does the game code care about this sort of organisation? Generally, no. Just chuck the object in there and let the renderer do its job. This makes the interface to the renderer very simple. You can still build levels and models using a tree. It just doesn’t serve much purpose at runtime. Flatten it in a preprocessing step or on loading.

In summary, a game is not a collection of objects, and it’s not a 3D editing package. Design the code around what actually needs to happen, and it ends up a lot simpler, and works better.