A complete GPU pipeline that combines deep convolutional neural networks with evolutionary computation to synthesize playable 3D terrain — with a custom fitness metric built around real-time NavMesh connectivity.
3D terrain generation for games typically relies on hand-crafted noise functions or artist-painted heightmaps. Neither scales well — noise lacks visual variety and artistic intent, while manual authoring doesn't scale to procedurally generated worlds. The goal of this thesis was to build a system that generates terrain that is both visually compelling (aesthetics from a reference image) and functionally playable (navigable by AI agents).
The pipeline has three main stages:
The system was validated across four terrain archetypes: Plateau, Ridge, River, and Plain. Each archetype was generated from a distinct style reference, and the NavMesh metric confirmed that all outputs maintained acceptable agent traversability while adopting the visual character of the reference image.
Balancing visual fidelity and navigability was the central challenge. A terrain that looks exactly like a reference image might be a completely impassable cliff — and a perfectly flat, walkable terrain has no visual character. The multi-criteria fitness function required careful tuning of weights between the SSIM metric, the NST perceptual loss, and the NavMesh score. Early stopping was critical to prevent the genetic algorithm from collapsing into local optima that maximized one signal at the cost of others.
The real-time NavMesh baking in Unity was also non-trivial — standard baking is an async editor-time operation and had to be restructured for synchronous runtime evaluation inside the genetic algorithm loop.
Want to know more about this project?
Get in Touch →