Why Real-Time Ray Tracing Hasn’t Landed Like People Expected
Real-time ray tracing showed up with big promises back in 2018, when NVIDIA’s GeForce 20-series GPUs made it feel like “movie lighting” might finally become normal in live games. But the reality has been more uneven.
One major reason is simple: this console generation launched with limited ray-tracing capability, and on PC it’s mostly been higher-end systems that can use it well. So even when ray tracing is available, it hasn’t always been practical to turn on—especially when performance drops are steep.
Now Microsoft is outlining a roadmap that points to a more fundamental shift in how ray tracing works, with the goal of making it more efficient and easier to use at scale.
The End of Triangle-by-Triangle Ray Tracing
Why “Tracing Rays” Isn’t Always the Hard Part
There’s an irony here: the ray calculations themselves aren’t always what slows things down most. Before lighting can happen, the system has to prepare a large amount of scene data so ray tracing can run efficiently.
That preparation relies on a data structure used to determine whether a ray intersects objects in a scene. Up to now, the common approach has been to build that structure from the scene’s geometry—specifically, from triangles.
That works fine at small scale. But modern scenes contain enormous amounts of geometry, and the situation gets worse when the geometry is dynamic.
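To make the cost concrete: the structure in question is typically a hierarchy of bounding volumes, and its whole job is to let a ray reject large groups of triangles at once. Below is a minimal self-contained sketch (not the DXR API) of the classic "slab" test a traversal uses to decide whether a ray can possibly hit anything inside an axis-aligned box; if the box is missed, every triangle inside it is skipped in one step.

```cpp
#include <cassert>

// Minimal illustration, not the DXR API: an axis-aligned bounding box and
// the "slab" intersection test. `invDir` holds the reciprocals of the ray
// direction components, precomputed once per ray.
struct Vec3 { float x, y, z; };
struct Aabb { Vec3 min, max; };

bool rayHitsAabb(const Vec3& origin, const Vec3& invDir, const Aabb& box) {
    float tmin = 0.0f, tmax = 1e30f;
    const float o[3]  = { origin.x, origin.y, origin.z };
    const float d[3]  = { invDir.x, invDir.y, invDir.z };
    const float lo[3] = { box.min.x, box.min.y, box.min.z };
    const float hi[3] = { box.max.x, box.max.y, box.max.z };
    for (int axis = 0; axis < 3; ++axis) {
        // Entry and exit distances for this axis's pair of slab planes.
        float t0 = (lo[axis] - o[axis]) * d[axis];
        float t1 = (hi[axis] - o[axis]) * d[axis];
        if (t0 > t1) { float tmp = t0; t0 = t1; t1 = tmp; }
        tmin = t0 > tmin ? t0 : tmin;
        tmax = t1 < tmax ? t1 : tmax;
        if (tmin > tmax) return false;  // intervals don't overlap: ray misses
    }
    return true;  // the box (and everything inside it) is a candidate
}
```

The point of the preparation work the article describes is to arrange the scene so that most of these box tests fail early, instead of testing every triangle against every ray.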
Modern Game Scenes Are Too Dynamic for Constant Full Rebuilds
Modern graphics pipelines don’t just render static worlds. They constantly reshape what’s in memory and what needs to be represented for lighting:
- Mesh shaders let geometry be generated or expanded on the GPU, so what a scene contains can change from frame to frame.
- Asset streaming constantly brings new geometry into memory, and that geometry must be incorporated into the structure.
- Geometry that leaves memory shouldn’t stay represented in the structure.
The result is ugly: a large chunk of time spent rendering a ray-traced scene can be wasted rebuilding acceleration structures, even when that work doesn’t meaningfully improve what you see on screen.
Microsoft’s response to this is a new approach called Clustered Geometry.
Clustered Geometry: Ray Tracing Built From Reusable Chunks
What Clustered Geometry Changes
Instead of treating each triangle as its own unit, Clustered Geometry groups nearby triangles into compact clusters. Those clusters become the building blocks used to assemble acceleration structures.
It’s a change in how scene data is organized for ray tracing: less “triangle-by-triangle,” more “pre-grouped components” that can be handled more efficiently.
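A toy sketch of that pre-grouping idea (names and the cluster size are illustrative choices, not Microsoft's API): triangles are sliced into fixed-size clusters, and each cluster precomputes its own bounds once, so a builder can later assemble structures from a handful of clusters instead of thousands of individual triangles.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Conceptual sketch only, not the real API. A cluster is a contiguous range
// of triangles plus precomputed bounds; the bounds are computed once and
// reused every time the cluster participates in a build.
struct Triangle { float v[9]; };            // three xyz vertices, flattened
struct Bounds   { float min[3], max[3]; };

struct Cluster {
    std::size_t first, count;               // range into the triangle pool
    Bounds bounds;                          // computed once, reused per build
};

Bounds computeBounds(const Triangle* tris, std::size_t count) {
    Bounds b{{1e30f, 1e30f, 1e30f}, {-1e30f, -1e30f, -1e30f}};
    for (std::size_t t = 0; t < count; ++t)
        for (int vtx = 0; vtx < 3; ++vtx)
            for (int axis = 0; axis < 3; ++axis) {
                float c = tris[t].v[vtx * 3 + axis];
                if (c < b.min[axis]) b.min[axis] = c;
                if (c > b.max[axis]) b.max[axis] = c;
            }
    return b;
}

// Slice a triangle list into clusters of at most `clusterSize` triangles.
// The size of 64 is an arbitrary illustrative choice.
std::vector<Cluster> buildClusters(const std::vector<Triangle>& tris,
                                   std::size_t clusterSize = 64) {
    std::vector<Cluster> out;
    for (std::size_t i = 0; i < tris.size(); i += clusterSize) {
        std::size_t n = std::min(clusterSize, tris.size() - i);
        out.push_back({i, n, computeBounds(tris.data() + i, n)});
    }
    return out;
}
```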
Practical Takeaways for Performance and Scalability
Clustered Geometry leads to a few concrete outcomes:
- Acceleration structures can be built faster because they’re assembled from pre-grouped chunks
- Memory usage improves because clusters can be reused across multiple structures
- Work can be spread across frames instead of being forced into one heavy rebuild
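The memory point above can be pictured with a toy model (again, illustrative only, not the actual API): built clusters live once in a shared pool, and each acceleration structure stores indices into that pool rather than its own copy of the data.

```cpp
#include <cstddef>
#include <vector>

// Toy illustration of cluster reuse, not the real API. A cluster shared by
// two structures costs one cluster's worth of memory, not two.
struct ClusterPool {
    // Each entry stands in for the memory footprint of one built cluster.
    std::vector<std::size_t> clusterBytes;
};

struct AccelStructure {
    std::vector<std::size_t> clusterIds;    // references into the pool
};

// Memory actually consumed: each pooled cluster is stored exactly once,
// however many structures reference it.
std::size_t pooledMemory(const ClusterPool& pool) {
    std::size_t total = 0;
    for (std::size_t b : pool.clusterBytes) total += b;
    return total;
}

// Memory if every structure duplicated its clusters instead of sharing.
std::size_t duplicatedMemory(const ClusterPool& pool,
                             const std::vector<AccelStructure>& structures) {
    std::size_t total = 0;
    for (const AccelStructure& as : structures)
        for (std::size_t id : as.clusterIds) total += pool.clusterBytes[id];
    return total;
}
```

The gap between the two totals grows with every structure that shares a cluster, which is where the memory win comes from.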
The broader idea isn’t brand-new in computer graphics (it’s been used in concepts like Nanite Virtualized Geometry), but this approach brings the concept into a more mainstream ray-tracing workflow.
PTLAS: Partitioned TLAS Adds a New Level of Manageability
BLAS vs TLAS (And Where the Bottleneck Shows Up)
Ray tracing relies on a hierarchy of structures to stay fast:
- BLAS (Bottom-Level Acceleration Structure): the geometry of an individual object
- TLAS (Top-Level Acceleration Structure): the instances that place those objects in the scene
In large, dynamic scenes, TLAS can become a bottleneck—because treating “the entire scene” as one top-level structure doesn’t play nicely with constant streaming, animation, and scene changes.
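The two-level split is worth making concrete with a toy layout (it mirrors the BLAS/TLAS division, but these are not the DXR structs): a BLAS holds an object's geometry once, while the TLAS holds lightweight instances, each pairing a transform with a BLAS reference. Animating an object means rewriting one small instance record, not rebuilding the object's geometry.

```cpp
#include <cstddef>
#include <vector>

// Toy two-level layout, not the actual D3D12 structures.
struct Blas {
    std::size_t triangleCount;    // stand-in for an object's geometry
};

struct Instance {
    float transform[12];          // 3x4 object-to-world matrix, row-major
    std::size_t blasIndex;        // which object this instance places
};

struct Tlas {
    std::vector<Instance> instances;
};

// Moving an object only rewrites its instance's translation column;
// the referenced BLAS is untouched.
void moveInstance(Tlas& tlas, std::size_t i, float tx, float ty, float tz) {
    Instance& inst = tlas.instances[i];
    inst.transform[3]  = tx;
    inst.transform[7]  = ty;
    inst.transform[11] = tz;
}
```

The bottleneck the section describes comes from the top level: even though each instance is small, a monolithic TLAS still has to account for all of them whenever anything changes.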
How PTLAS Makes Scene Updates Modular
Partitioned TLAS (PTLAS) shifts the approach: instead of one monolithic TLAS, the top-level structure is split into smaller pieces.
This helps in scenarios that modern games rely on heavily, including:
- Geometry streaming from disk
- Scenes with lots of animation
- Dynamic level-of-detail systems
The key advantage is that engines only need to update the parts of the scene that actually changed. If a full rebuild isn’t necessary, the system can skip that work. Updates and rebuilds become modular rather than all-or-nothing.
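The update-only-what-changed behavior can be sketched with a dirty-flag model (the names here are illustrative, not the PTLAS API): the top level is split into partitions, changes mark only the partition they touch, and the per-frame update rebuilds only marked partitions.

```cpp
#include <cstddef>
#include <vector>

// Toy sketch of the partitioned-TLAS idea; names are illustrative.
struct Partition {
    std::vector<std::size_t> instanceIds;
    bool dirty = false;
};

struct PartitionedTlas {
    std::vector<Partition> partitions;
};

// Streaming in, animating, or swapping LOD for an instance marks only the
// partition that contains it.
void touchPartition(PartitionedTlas& tlas, std::size_t p) {
    tlas.partitions[p].dirty = true;
}

// Per-frame update: returns how many partitions actually needed rebuilding.
std::size_t updateFrame(PartitionedTlas& tlas) {
    std::size_t rebuilt = 0;
    for (Partition& p : tlas.partitions) {
        if (!p.dirty) continue;   // unchanged: skip the work entirely
        // ... rebuild this partition's slice of the top level here ...
        p.dirty = false;
        ++rebuilt;
    }
    return rebuilt;
}
```

With a monolithic top level, every one of those touches would have forced work proportional to the whole scene; here the cost tracks only what changed.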
GPU-Driven Acceleration Structures: Cutting CPU Coordination
Why CPUs Have Been Taking a Hit With Ray Tracing
A lot of people don't realize how much coordination work the CPU does for ray tracing: up to now, the CPU has been responsible for orchestrating much of the acceleration-structure process.
That matters because turning on ray tracing can cause frame-rate drops that are CPU-limited, not just GPU-limited.
Moving Toward More Autonomous GPU Handling
DXR has already allowed acceleration structures to be built on the GPU, but this direction goes further, pushing toward a model where the GPU handles more of the process on its own.
With less CPU work involved, overall performance—especially minimum performance—should improve.
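One way to picture the shift, as a deliberately simplified cost model (not D3D12 code): in a CPU-driven scheme the CPU records one command per structure, so its cost scales with the number of builds; in a GPU-driven scheme the build arguments live in GPU-visible memory and the CPU submits a single batch, so its cost stays flat.

```cpp
#include <cstddef>
#include <vector>

// Illustrative cost model only, not real D3D12 calls.
struct BuildArgs { std::size_t geometryId; };

// CPU-driven: the CPU records one command per acceleration structure,
// so CPU work grows linearly with the number of builds.
std::size_t cpuCommandsPerFrame(const std::vector<BuildArgs>& builds) {
    std::size_t commands = 0;
    for (std::size_t i = 0; i < builds.size(); ++i)
        ++commands;               // one recorded command per structure
    return commands;
}

// GPU-driven: build arguments stay in GPU memory and the GPU consumes them
// itself; the CPU records a single submission regardless of build count.
std::size_t gpuCommandsPerFrame(const std::vector<BuildArgs>& builds) {
    (void)builds;                 // arguments are read on the GPU side
    return 1;
}
```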
Clustered Geometry + PTLAS: Scalable Ray Tracing That Doesn’t Fall Off a Cliff
A More Graceful Performance Curve
When Clustered Geometry and PTLAS are combined, ray tracing becomes more scalable. In practical terms, that means:
- More systems can handle ray tracing
- Performance requirements rise more gracefully as resolution increases or scenes get more detailed
Instead of ray tracing feeling like a sudden performance cliff, the goal is something closer to a steady climb.
What to Expect Next
Even with these shifts in direction, it’ll take time before these ideas show up in shipping games. Still, the trajectory is clear: ray tracing is being reworked to better fit how modern games actually build and stream worlds—without burning huge amounts of time rebuilding structures unnecessarily.
And all of this is happening even as the DirectX version number itself seems like it’s staying put.

