6
\$\begingroup\$

I'm learning modern (>=3.1) OpenGL by coding a 3D turn based strategy game, using C++.

The maps are composed of 100x90 3D hexagon tiles that range from 50 to 600 tris (20 different types) + any player units on those tiles. My current rendering technique involves sorting meshes by shaders they use (minimizing state changes) and then calling glDrawElementsInstanced() for drawing. Still get solid 16.6 ms/frame on my GTX 560Ti machine but the game struggles (45.45 ms/frame) on an old 8600GT card. I'm certain that using an octree and fustrum culling will help me here, but I have a few questions before I start implementing it:

  1. Is it OK for an octree node to have multiple meshes in it (e.g. can a soldier and the hex tile he's standing on end up in the same octree node)?
  2. How is one supposed to treat changes in object postion (e.g. several units are moving 3 hexes down)? I can't seem to find good a explanation on how to do it.
  3. As I've noticed, soting meshes by shaders is a really good way to save GPU. If I put node contents into, let's say, std::list and sort it before rendering, do you think I would gain any performance, or would it just create overhead on CPU's end? I know that this sounds like early optimization and implementing + testing would be the best way to find out, but perhaps someone knows from experience?
\$\endgroup\$

1 Answer 1

4
\$\begingroup\$
  1. Yes. It's probably a good idea to stop subdividing your octree once you're down to some threshold number of meshes in the node, since at some point the cost of further subdivision can outweigh the cost of just drawing a few extra meshes.

    In fact, the other way around is necessary too: having a mesh in multiple nodes. If a mesh happens to be in a place where it straddles the boundary between multiple nodes, each of the nodes it touches will need to have a reference to it.

  2. When objects move, you keep track of which node boundaries they cross, and adjust the list of meshes in each node correspondingly - adding them to nodes they've entered and removing them from nodes they've exited.

    If your game contains a reasonably strict separation between moving and non-moving objects, it can also pay to have two octrees, one each for non-moving and moving objects. The level of subdivision for the moving ones could be lower, if appropriate, reducing the number of nodes and therefore the overhead of updating them. But this may or may not be a performance win; you'd have to profile to see.

    Depending on your situation, you might also find it necessary to dynamically split and merge octree nodes, if they end up with too many / too few meshes in them. This should be a relatively rare operation, though, and you might be able to get away without it.

  3. You can split rendering into two phases: visibility determination (deciding what to draw) and then actually drawing it. During the first phase you'd traverse the octree and accumulate visible meshes into a list; then during the second phase you'd sort that list by shader/material and then draw them.

    You can save some time by using a bucket sort here. Rather than literally tossing all the meshes into an std::list and then calling sort(), you'd maintain a list of buckets - one for each material - and each bucket contains a list to which you add the meshes as you find them during the octree traversal. The list of buckets is maintained from one frame to the next and kept sorted (only the contents of each bucket is cleared), so you do not need to re-sort them all the time; the meshes are already sorted by the time you finish traversing the octree.

BTW, for a tile-based game world with an isometric view, an octree might not even be the best choice of subdivision structure. Since meshes are likely to be all about the same size and to be fairly evenly spaced (as opposed to there being some areas with a dense concentration of meshes and other areas that are quite empty), a grid might serve you just as well as an octree, and be simpler to code. Instead of doing a hierarchical tree traversal, you'd just iterate over a rectangle of grid nodes enclosing the frustum, which is easy to calculate since it's just the AABB of the frustum in grid coordinates.

\$\endgroup\$
2
  • \$\begingroup\$ Great answer, a few notes/tips. Because the nodes in an octree have a fixed size an object can implicitly know in which node it is, which makes it cheap to update it while moving, however this also means that you might have a lot of empty nodes so also look at kD-Trees and BSP-Trees and see which one would suit your needs best (a trade-of between complexity of determining node locations and the number of empty nodes). \$\endgroup\$
    – Roy T.
    Commented Nov 20, 2012 at 0:11
  • \$\begingroup\$ Thank you very much for your answer. I believe, I'll be able to do it now. \$\endgroup\$
    – Manvis
    Commented Nov 20, 2012 at 6:23

You must log in to answer this question.

Not the answer you're looking for? Browse other questions tagged .