Procedural Overworld Generation, Part 2

In a previous post, I have showcased some of the capabilities of a system I’ve written to procedurally generate video game overworlds. In this follow-up post, I intend to go into details about how the algorithm behind it works.

As before, the screenshots of the generator will predominantly use graphics borrowed from Super Mario Bros. 3 (SMB3) as a placeholder. While this isn’t designed to generate overworlds for SMB3, it is designed to generate overworlds in the style of SMB3, making the graphics of that game very suitable to visualize the output.

My generator is written specifically to meet the needs and goals of a certain game (that is currently in an early idea stage). Since I need to talk about some of these needs to justify certain design decisions, I’ll refer to it as “the Game” so as to not reveal more than necessary.

(Note that since the publication of Part 1, development of the overworld generator has continued. As such, this post also talks about functionality not previously showcased.)

An image of a video game overworld using graphics from Super Mario Bros. 3.
An example of a generated overworld.

My standard approach for procedural generation is typically built on a hierarchical model of incremental generators. Start by generating things in a very free-form abstract sense. Afterwards, transform this abstract notion into something slightly more detailed and concrete. Then, do so again for another aspect of it. Repeat this progressive elaboration until you’ve gone from a vague outline to a finished product. As a metaphor: “Generate the skeleton before you generate the muscles, and generate the muscles before you generate the skin”.

As per this standard approach, my overworld generator can be described as a series of sub-generators, or phases:

  1. The Graph Generator
    • Creates the abstract structure of the map such as what levels there are and how they connect to each other without any concerns for geometry.
  2. The Layout Generator
    • Positions each node (levels, intersections, etc.) in a discrete grid.
    • Plots cardinally aligned roads with sharp 90º turns to connect the nodes in the manner described by the graph.
  3. The Railroad Generator
    • A portion of the Game revolves around trains. As such, a railroad needs to be a prominent feature of the overworld.
  4. The River Generator
    • Another portion of the Game revolves around ships. We generate rivers partially to fit this need, partially to make the world feel more alive and natural.
  5. The Landscape Generator
    • Generates all the otherwise “unnecessary” things, such as biomes, vegetation, mountains and other decorative elements that prevent the map from feeling empty.

Let’s dig a bit deeper and see what each step does.

Graph Generator

The first phase, the Graph Generator, is responsible for creating an abstract description of the structure of the overworld by randomly assembling nodes and edges into a graph. This is done without any geometrical considerations, such as where items are located spatially in reference to another. Instead, it only focuses on what exists and what connects to what.

Basic Structure

A network of nodes where the same few simple patterns have been randomly duplicated and joined together to create a slightly chaotic structure.
A partially generated graph, constructed by randomly generating and attaching small subgraphs. Two such subgraphs, a cycle and a branch, are in the process of being attached.

The first step of the Graph Generator is to randomly assemble a network of nodes. This graph starts out as a small cycle of nodes and grows over time by creating small shapes and attaching them to existing nodes at random. I call these shapes graph components, of which I have two kinds.

The cycle component creates between three and five nodes and connects them into a small cycle. An arbitrary node from this cycle is then connected to a randomly chosen node in the main graph.

The branch component creates three or four nodes. One of them, the hub, gets connected to the rest and to a random node from the main graph.

The overworld generator takes in a number of parameters to control various aspects of the generation. The first such parameter is the map complexity, which controls how many graph components should be added to the map. A small complexity creates minimalistic graph whereas a large complexity creates a huge amount of branching and localized cycles.

The overworld is primarily going to contain levels; each level gets its own node. The Game needs to be able to decide how many levels it wants in total, and thereby control how many nodes there are. The complexity parameter above gives a vague idea of how many nodes will be generated, but without any guarantees (by design). We therefore provide another parameter that influences the generation, the desired level count. If the previous steps didn’t generate enough nodes, we keep adding individual nodes (attaching each to one other node at random) until we have at least (level count + 2) nodes in the graph. (The two extra nodes will come into play later.)

Note that the level count only influences the lower bound for the number of nodes. This is acceptable for now; a later step will decide which nodes will actually become levels.

Three versions of the same network. In each subsequent version, a new connection between two distant nodes is added.
A graph with zero, one and two degrees of connectivity. The colors have been added to highlight how the structure is otherwise identical.

The previous two steps creates a somewhat disordered mess of branches and small cycles. As such, the overall structure is effectively a tree (modulo the localized cycles). The third and final parameter that controls the large-scale structure, the map connectivity, exists to enable a more cohesive structure. For each degree of connectivity, the generator picks the two nodes that are the farthest part and connects them, forming a large-scale cycle consisting of localized branches and small-scale cycles.

When going from zero to one degree of connectivity, this effectively means taking a pseudo-linear overworld structure and turning it into an overworld with roughly two main paths and numerous sub-paths. A second degree of connectivity would thus be something along the lines of a figure-eight graph, each half having two main paths (with corresponding sub-paths) through them.

As mentioned in the original showcase post, an increase in connectivity decreases the possibility of the map successfully generating; the likelihood of the map becoming non-planar grows very quickly as the map starts looping back onto itself more and more. Non-planar maps are neither desired for the Game nor supported by the rest of the generator.

Node Specialization

A network of nodes. The leftmost and rightmost nodes are labeled "Start" and "End" respectively. One node is labeled "Port", another one is labeled "Station". Two loosely connected nodes are both labeled "Cave".
A graph with some specialized nodes.

Due to specific design decisions of the Game, the graph needs certain nodes to have special properties. The second step of the Graph Generator therefore deals with specializing these nodes.

For uses of these techniques outside of the Game, this part is perhaps the one that has to change the most to instead meet other games’ needs.

As mentioned before, we generate at least (level count + 2) nodes. These two extra nodes represent the start of the game and the end of the game. We determine which two nodes should mark the start and end by picking the pair of nodes that are the farthest apart.

Another thing we want for the Game is one or more shortcut caves to give more options for traversing the overworld. This is similar, though not identical, to SMB3 pipes. The number of caves is another generator parameter. It functions very similarly to connectivity as it picks two distant nodes and connects them. Unlike connectivity, the nodes will not be connected with a road. Instead, a pair of linked cave nodes are generated and connected directly to the chosen pair. Another difference is that this link between the cave nodes is ignored for the purposes of distance calculations in subsequent generator steps; the two nodes are still considered far apart.

The game has two levels that are special from a generation point of view. I will refer to them as the Port (which is always on a river tile) and the Station (which is always next to a railroad track). Nodes for these are chosen such that they appear roughly in the middle of the overworld. An additional constraint is also imposed; the chosen nodes can’t have more than two connections to other nodes to prevent the level, roads and river/railroad to become too cluttered.

Two copies of the previous image. The first copy adds ascending numbers to all nodes except the start, end and cave nodes. The second copy is similar, except two more nodes are left without a number.
Two complete overworld graphs with different amounts of desired levels.

Since we may have generated too many nodes, it’s now time to determine which of these nodes should be the remaining levels.

To be blunt, my current implementation of this part of the generator is probably overly complex and I’m neither satisfied with the code nor what it generates. Since the exact details will hopefully change in the future, I’m not going to focus too much on them and instead handwave all the technicalities.

In short, the generator analyses how a player can traverse the overworld from the start to the end. Based on how often a given node has to be visited, the node’s importance is calculated. The node deemed the most important is marked as a level, then the second-most important node, and so on until the total number of levels match the level count parameter. The remaining unspecialized nodes are marked as intersection nodes, i.e. non-levels that the player can move to but not enter.

This importance measurement favors choke points. For instance, if all paths through the overworld go through a specific node, this system ensures that this node becomes a level, which by definition becomes required. If one instead prefers overworlds that maximize choice, this importance measurement can be inverted to instead avoid choke points.

Now that we know which nodes are relevant from a gameplay point of view, we could easily do additional pruning, such as removing dead ends without levels and other redundant features. Alternatively, one could instead place minor things in these locations, such as bonus rooms or item shops. However, none of these options are currently implemented in my generator.

Lastly, we need to determine the relative difficulties of each level, such that the game starts out easy and becomes progressively harder. My current algorithm is simply “the nth closest level to the start has difficulty n”. (These are then normalized to range from difficulty 0.0 to 1.0.) I’ve tried other approaches, including ones that also take the distance to the end into account. However, the simplest algorithm I could think of turned out to yield the (subjectively) best results.

What we now have is an abstract graph completely describing a finished overworld from a gameplay point of view. Everything that follows is the generation of how to present this overworld to the player.

Layout Generator

Phase one creates graphs describing overworlds. In my illustrations, I have presented these graphs as if they existed in a two-dimensional space. However, graphs are fundamentally non-geometric; nodes and edges don’t “exist” in a specific location. All visualizations of graphs, be it my illustrations or the final overworld displayed to the player, must solve the problem of where in space to place each node and how to draw each edge. In the case of the illustrations, I did it by hand. In the case of the overworld? Well, that’s what phase two, the Layout Generator, is for.

Simply put, the Layout Generator has two tasks: plot the X/Y coordinates of all nodes and draw cardinally aligned paths between those nodes corresponding to non-cave edges.


Two graphs with identical nodes and structures. The first positions its nodes semi-arbitrarily in space. The second includes a grid that all nodes are aligned to.
An overworld graph and one potential positioning.

Perhaps unintuitively, my generator handles assigning the X and Y coordinate as two separate steps.

We define the start node to be at X=0 and assign the other X coordinates using a variation on a breadth-first search. A node’s X coordinate is roughly proportional to how many steps away it is from the start node, albeit with a few special cases to tweak the outcome a bit. For instance, two connected nodes never occur on the same X coordinate, even if they’re equally far away from the start node. These create slightly more complex structures that I subjectively prefer.

Determining the Y coordinate of each node is more complicated. There are likely many ways to do this, but I’m generally happy about the results of my approach. Moving from left to right, we give each node a range of possible Y values between 0 and 1. The start node is assigned the range (0, 1). For each child (which doesn’t already have a Y range), we split the range into that many equally-sized pieces and give each child one of those ranges. This continues until all nodes have a range.

For example, if the graph consists of the start node and two children, the children will receive ranges (0, 0.5) and (0.5, 1), respectively. If that first child in turn has three children, those will receive the ranges (0, 0.167), (0.167, 0.333), and (0.333, 0.5), and so on.

The ranges generated above aren’t really that good. We therefore do three steps of post-processing on them. First, we normalize the ranges for each column so that all columns have ranges between 0 and 1. (This prevents vertically lopsided overworlds and tries to re-center everything.) Second, we pick the middle point of each node’s range as a preliminary Y coordinate. Third, we remap these coordinates such that all instances of the nth lowest preliminary coordinate (across the whole overworld) gets a final coordinate of n. (Some degree of rounding occurs in order to remap two “close enough” preliminary coordinates to the same final coordinate, increasing the chances of later columns sharing the Y coordinates of earlier columns.)

Lastly, all coordinates are multiplied by 4 in order to make room the roads between the nodes.


Two almost identical graphs. The first graph has its edges drawn as straight lines. The second graph has its edges drawn using cardinally aligned lines and 90 degree turns.
An overworld before and after routing.

Now that we know where all the nodes are, it’s time to figure out how create the roads between them. For the Game, we want these roads to have a few properties:

  • All roads should be cardinally aligned (i.e. going straight north, south, east or west).
  • Roads can turn, but only as sharp 90º turns.
  • Roads can only turn at points aligned to the same grid as the levels (i.e. where both the X and Y coordinates are divisible by 4).
  • Roads cannot overlap other roads or nodes. (No overpasses or tunnels.)

One important thing to note is that this process is not always possible. First and foremost, if the generated graph turns out to be non-planar, it is by definition impossible to accomplish this without overlaps. Second, given that the levels are positioned without any regard for road placement, an otherwise theoretically suitable graph could accidentally be rendered unsolvable.

An image showing 10 different examples of how you can draw cardinally aligned roads between two points using zero, one and two 90 degree turns.
The supported ways to draw roads between two nodes.

The generator’s roads have even more limitations. To keep the algorithm simple (and reduce the number of possible permutations), we only support a limited set of road shapes, creating additional constraints that may make the task impossible. The basic set of road shapes we support is as follows:

  • Straight road directly towards the target.
  • Road with a single 90º turn directly towards the target.
  • Road that immediately turns 90º, then makes another 90º turn directly towards the target.

With some rotations and reversing taken into account, the amount of possible shapes per road adds up to 19, though most are obviously non-applicable to any given situation.

The base algorithm for building the roads out of these options is surprisingly simple: “Try all the different permutations of valid roads until we find one without overlaps”. This might raise some eyebrows as the number of permutations is clearly exponential relative to the number of roads. In other words, trying all permutations sounds like it should be very slow. Luckily, there are some simple tricks we can do to improve the de facto speed of the algorithm.

The first step is some initial checks to rule out blatantly impossible graphs. As a simple example, if the graph contains a node with five or more edges, the routing cannot possibly succeed as there are only four allowed directions. We can immediately discard such graphs without even trying.

Second, we do some fail-fast partial routing where we focus on forced moves. Sometimes, there is only one valid option for any given road. It is immediately obvious that we need to draw that road in that exact way, no matter what choice is made for any of the other roads. This may in turn constrain more roads to only have one valid option, so we can immediately draw those as well. If we suddenly encounter a road with no valid options, then clearly we have some mutually exclusive forced moves, meaning the graph is inherently impossible.

This simple fail-fast O(n2) algorithm is significantly faster than the slow O(2n) algorithm, though it is very unlikely to complete the whole graph on its own. Even then, almost all routing failures occur during this fail-fast step, meaning that we usually don’t even need to reach the slow parts.

Let’s put some concrete numbers on this. I ran some tests and collected statistics from over 21,000 failed map generations: 98.05% failed during the fail-fast step, 1.11% exhausted all possible permutations and 0.81% gave up after a semi-arbitrary cap of 2000 permutations was reached.

Third, even when we do get to the slow parts, we order the roads in ascending order by number of still valid permutations. This ensures that we always focus on the more constrained roads and try to rule out their permutations as early as possible, hoping to find contradictions before we even reach the nodes with lots of permutations. This is still theoretically an O(2n) algorithm, but it’s faster in practice since we try to prune the search tree as fast as possible.

Fourth, another thing to keep in mind is that the number of roads is typically relatively small. I have no intention of using this algorithm for maps with millions of roads (which would be an infeasible permutation nightmare for these techniques). Time complexity is always a good tool to reason about an algorithm, but when the realistic scenarios still allow me to try to route hundreds of graphs per second, the fact that the algorithm is theoretically slow is rather moot. Sometimes, “slow and simple” isn’t necessarily worse than “fast and complex”, especially when modern computers are so fast that “slow” means “still pretty darn fast”.

If the routing fails (due to it being impossible or we simply give up), we discard the graph and start over from scratch, generating a new one and trying to route that one instead. In fact, this is the case with the vast majority of graphs. This generator often generates hundreds, thousands or even tens of thousands of graphs before it finds one that it can route.

For generators of complex structures with lots of weird constraints and standards to adhere to, I find that it’s often significantly easier to write two algorithms, one that just generates stuff at random and one that checks if that stuff meets our needs, than it is to write a single algorithm that generates things in a way that intentionally follows the requirements. As long as the code is fast enough that we can still find a suitable result within a comfortable amount of time (such as during a loading screen), it doesn’t matter if it was “intelligently designed to be good” or “just randomly happened to be good”. As such, “fast and disposable generation” is in general a go-to approach for me.

Either way, if the graph survives this trial by fire, we finally have a complete overworld layout with levels, intersections and roads. Are we good to go? Well, maybe. The question is if we want to show this map to the player.

Automated Censorship

A cropped screenshot from a fully generated and rendered overworld map. A square-shaped road formation has been blurred out.
A (censored) example of a map that, by chance, generated a set of roads forming an unacceptable shape.

I frequently attend meetings at work where I draw graphs and diagrams on a whiteboard. Given how many abstract drawings I make containing lots of random shapes, it has on occasion happened that I draw something and suddenly realize “Oh dear, this combination of shapes just happened to look like something vulgar”, at which point I have to awkwardly apologize and alter the drawing to be less inappropriate. (I find Euler diagrams to be the most common source of this.)

Computers don’t work that way. They can’t reflect on what they’re doing and come to sudden realizations about unintended consequences of its work. The only criteria for what a computer considers to be “good” are the criteria that some programmer wrote in the code. Computers have no “understanding”, no “common sense”, only a set of explicitly written rules. If there are unwritten rules, such as “you should obviously not make anything offensive”, then the computer is blissfully ignorant of them.

With procedural generation, this is especially dangerous. The job description of a procedural generator effectively boils down to “assemble random stuff in random ways”. As per the infinite monkeys principle, that means that the assembled thing will occasionally turn out to be inappropriate. Unless the developer of the generator accounts for that, things may become unpleasant.

Case in point, this overworld generator. By design, our roads frequently have sharp 90º turns. Also, up to four roads can intersect at a single point, such as a level. As it happens, there is a very undesirable symbol consisting of four 90º angles joined at a single point. Given that the core design of what the Game’s roads should look like lends itself so well to accidentally forming this symbol, I’m genuinely surprised it doesn’t happen more often during my tests.

Should one care? Well, if it were a “one in a trillion” kind of thing, perhaps it would be an acceptable risk? (For comparison, if you continuously generated a map per second, it would take 32,000 years to generate a trillion of them.) Unfortunately, this symbol is definitely more common in my maps than that. Given that I managed to find two examples within the span of a few minutes of lazily generating maps, I would not be comfortable releasing this generator as part of a video game without some way to prevent this.

Luckily, this specific shape is geometrically very simple. It’s effectively trivial to write an algorithm to analyze the finished road layout to determine if anything resembling this shape can be found. If so, simply discard the map and generate a new one.

Railroad Generator

A cropped screenshot of a fully rendered map. The Station tile is right above a horizontal railroad line stretching across the full width of the image. The railroad intersects with roads in two places.
A simple railroad next to the Station, marked with the number 2. Also note the two level crossings.

The next phase is the Railroad Generator. As mentioned before, the Station level needs to spawn next to a railroad. This is actually causally reversed, with the railroad spawning next to the station.

Now, if you were looking forward to a complex algorithm for determining where to turn and branch in a complex network system, this phase may unfortunately be quite lackluster. You see, the Railroad Generator is almost insultingly simple. All we do is to generate a straight line of railroad tracks, either horizontally or vertically, such that one of the tiles is right next to the Station.

It’s probably easy to assume that this is a sign of sheer laziness. So what, some 3000 words so far of excruciating details about various algorithms, and this one is a mere sentence? Well, keep in mind that my goal here isn’t to make the most sophisticated or ingenious algorithms ever; the goal is to generate overworlds suitable for the Game. As it happens, given the Game’s setting, it makes perfect sense for railroads to be boring straight lines. The fact that it makes the implementation trivial (and uninteresting), is a mere side-effect.

At this point in the generation process, there are only four types of tiles on our overworld: nodes (levels, road intersections, etc.), straight roads, road turns, and empty tiles. I mentioned before how all nodes and road turns are aligned to a four-tile grid. Since the point where we spawn a railroad is offset from this grid by one tile, it means that the railroad line will only ever collide with empty tiles or straight roads. As long as we have support for level crossings (with are also appropriate for the Game), there isn’t anything that can block the creation of the railroad once we’ve determined which tile to start on.

River Generator

A cropped screenshot of a fully rendered map. Four rivers, three of which are interconnected, flow through the landscape. One of the rivers, going roughly north-northwest, intersects the Port level. Each river tile has a small white arrow pointing in a cardinal direction indicating the flow.
An overworld map with four rivers. Since the map isn’t animated, arrows are used as a placeholder to visualize the flow of water.

After the simplistic intermission of the railroads, now we’re back into slightly more complex topics. Next phase is the River Generator, responsible for, um, generating rivers.

The River Generator takes in a parameter defining the amount of rivers to add. Each river is a (possibly curved) path through the grid, its two endpoints being either at the edge of the map or an intersection with another river. For the Game, we want each river to be relatively simple; seeing as they’re (primarily) a background detail, we don’t want a complicated shape zig-zagging back and forth and dominating the whole map. The easiest way to keep the shape of a river simple is to limit how it can turn.

A river is generated by first determining three values: a starting point, a primary direction, and a secondary direction orthogonal to the first. For example, “start at (9,17), primarily west and secondarily north”. In order to grow the river, we take a step in either the primary or secondary direction, with the primary one being twice as likely, and make that tile a river as well. Once we reach an endpoint (map edge or other river), we go back to the starting point and do the same thing again in the other direction. (Continuing our example, it would head primarily east and secondarily south).

Since the Port needs to be on a river, we ensure that the first river we generate starts on that very tile. As an unintended side effect, the river that the Port is on will always go from one edge of the map to another; it won’t terminate at the intersection with another river since it’s the first one to generate. One could work around this about it by randomizing which river is assigned to the Port, but the current behavior makes sense in a way. It being a long river that stretches beyond the bounds of the map may mean that it’s a “major river” in the area, which would make sense to build a port at. Yes, this is me retroactively justifying a quirk of the generation, but it makes sense to me, so I’m choosing to keep it.

We have additional constraints to keep in mind. Aside from the Port, we don’t want any rivers intersecting levels. Additionally, while we do support bridges for roads and railroads, we only want to make graphics for when the river intersects a straight piece of road, not a turn or intersection. When we pick which direction to grow the river in, we check for these things and prevent the invalid moves. If it turns out that all possible moves are invalid, we undo our work for this half of the river and start over, hoping not to get stuck again. If we keep failing (with an arbitrary cutoff at 100 tries), we simply reject this river altogether and try another set of a starting point and primary/secondary directions. If this too fails too much (again, with an arbitrary cutoff at 100 tries), something must be very wrong and we completely reject this map altogether.

Rivers have a direction of flow (in case we choose to have an animated overworld map for the Game). This is not generated along with the rivers themselves, but calculated after-the-fact. To keep the algorithm simple, it assumes that for each disconnected network of rivers, exactly one of the edge tiles is an outflow; the rest are inflows. The exact algorithm used to then determine the flow direction of each tile is somewhat complex and, frankly, dull. To oversimplify slightly, it starts by using Dijkstra to pathfind from all river tiles to their corresponding outflow. It then refines this initial result to ensure that the flow of water doesn’t suddenly switch from one direction to another in the middle of a river segment.

We now have a configurable amount of rivers in our maps. As it happens, they will further influence the generation in one aspect. Speaking of which…

Landscape Generator

An image of a very minimalistic overworld. The image is divided along a diagonal so as to provide a comparison. The top left half contains roads, rivers and levels, but otherwise only contains a dull and featureless background. In the lower right, this background is replaced with a landscape containing patches of grasslands, deserts and badlands. Additionally, there are mountains and different types of vegetation.
A simplified overworld layout before and after adding a landscape.

So far, the only features we’ve generated that aren’t essential to the gameplay are the rivers and railroads. Important as they are to the look and feel, they don’t do anywhere near enough to make the world look “alive”. Most of the map is just… “nothing”.

The last phase, the Landscape Generator, remedies this by filling the empty space with something slightly more interesting to look at. It does this by generating three things: biomes, decorations, and props.

In my screenshots, three distinct biomes can be found: grasslands, deserts, and badlands. Each is represented by its own ground tile. This seemingly small number of biomes isn’t due to a lack of imagination or algorithmic skill; instead it was intentionally designed this way since those specific biomes are what we want for the Game. Adding a boreal forest, savanna, swamp, rainforest, or what-have-you is totally doable, and probably fun, but clashes horribly with the intended setting and atmosphere of the Game.

The different patches of biomes have very unpredictable shapes. Sitting down and writing an algorithm to try to create a chaotic outline like that would probably be tricky. Instead, we use a source of unpredictable shapes and reinterpret its result into something closer to what we want. Enter gradient noise algorithms.

Aside from the standard library, the only third-party code in my generator so far is the OpenSimplex noise algorithm. Specifically, I used digitalshadow’s C# port of Kurt Spencer’s original Java implementation; both are in the public domain. (Thanks!)

Gradient noise algorithms (OpenSimplex, Perlin, etc.) are immensely useful for procedural generation since they produce a space of random but smooth values. I frequently use them for generating things with chaotic-but-coherent structures, which is what’s often found in nature. Given that we want to generate pseudo-natural environments, they are therefore useful in this project.

Two copies of the same map. The left one is an ordinary and fully rendered map containing the three different biomes. The right one replaces all the ground graphics with solid colors ranging from black to white. The lighter tiles correspond to grasslands in the left image whereas the darker tiles correspond to the badlands. The deserts are in-between the two extremes. The right image has abnormally bright regions near the river and an abnormally dark region near the end tile.
A map and its corresponding lushness.

The generator currently uses two noise maps, each with four parameters: its seed (derived from the map seed), its scale, its strength and its baseline. The scale is multiplied to all coordinates used to query the noise map, effectively allowing one to choose between high-frequency noise, low-frequency noise or anything in-between. Once we get our value back from the noise map, we multiply it by the strength and add the baseline. This allows us to easily stretch and offset the value space, effectively altering how we interpret the noise.

The two noise maps are used to calculate the lushness and altitude, respectively, of each tile. The original idea was to create a few more dimensions of environmental information before mixing them all together to determine the biomes, in a way somewhat similar to how Minecraft solves similar problems. As it turns out, the Game isn’t Minecraft and doesn’t have anywhere near the same needs for a vast and complex set of biomes. Given that we only have three biomes, deriving them from a 493-dimensional space of stats would not only be extreme overkill; it would almost certainly give a significantly worse and overly chaotic result.

To simplify the algorithm, we only derive the biome itself from the lushness value. We separate the range of possible values into three different regions and assign each to a biome. The grasslands and badlands are allocated to opposite ends of the value space, ensuring that there should always be a desert between them.

A few paragraphs ago, I showed an image illustrating the lushness. One may notice that there seems to be an abnormal amount of lushness around the rivers. This is intentional; we actually alter the randomly generated lushness value by boosting it proportionally to the distance of the nearest river tiles. Conversely, the end tile (represented by the SMB 3 fortress tile) has the opposite property; its proximity slashes the lushness in order to create badlands around it.

So, we have our biomes. The next step is decorations. Right now, there are two types: mountains and vegetation. We still have that other noise map we implemented, the altitude. It makes sense to map sufficiently high altitude values to mountain tiles, so that one is easy. Okay, so what do we want to do about the vegetation? We could make a third noise map, or we could derive it from what we already have. As it turns out, mapping sufficiently low altitude values to vegetation tiles works kind of well. It creates a separation between mountains and vegetation by typically having an area of empty tiles between them. Whether or not it makes sense ecologically, it looks good, which is all that matters. If we were trying to generate realistic landscapes, we already failed some 5000 words ago.

The generator doesn’t specify what exactly “vegetation” means. It’s up to each biome to interpretation of the term, meaning that the tile that is ultimately rendered depends on where on the map the tile is located. (For instance, my screenshots use bushes to represent grassland vegetation and palm trees to represent desert vegetation.)

As an additional detail, we explicitly remove all decorations that are “too close” to levels, roads, etc. in order to create a bit of contrast between the essential and non-essential parts and prevent them from being cluttered.

The last step is adding props. These are individual tiles that are algorithmically placed instead of just following a random shape. We currently have two such props.

The first prop is a skull tile used to add some atmosphere to the primarily arid climate that the Game takes place in. A single skull is placed on a random tile with a lushness of less than 0.5 and as far away from any other non-empty tile as possible.

The second prop is a signpost. It can spawn at any point where three or four roads meet, preferably as far away from any level as possible.

And with that, the generation of the overworld is complete, at least from a data point of view. Technically, a final step is required to actually render all this data as an image. However, since the entire rendering system I have right now is just a giant placeholder, and it’s not particularly interesting to begin with, I’m just going to skip past that part altogether.

In Closing

An image of four different overworlds, each with clear differences in structure, biome distribution, and decoration density.
Four random maps generated using different settings.

There are some aspects about this generator that I’m really happy about, some that have room for improvement, and some things that ought to be added. All in all though, I feel like this generator has come along nicely albeit slowly since last time I wrote about it.

Aside from the functionality and the quality of the output, another area I’ve partially focused on is optimization. I’m a Java Enterprise developer by trade, which usually prioritizes well-structured and easy-to-follow code over highly efficient code. As such, I defaulted to that mindset back when I first started prototyping this generator. Since then, in part based on recent my experiences in modding and reverse engineering Cities Skylines, I’ve learned a lot about how to optimize game code, although I’ve presumably still only scratched the surface.

Given that the whole philosophy of this generator is “generate random stuff until something works”, the most heavy focus of my optimization efforts has been the graph generation and routing algorithm, and the payoff has already been noticeable. A lot of work still remains though.

Now I just hope the Game eventually gets off the ground so I get to actually use all this stuff for something.

Tags: , , ,