The Path to Real-Time Worlds and Why It Matters

Today we’re releasing Waypoint-1, the first real-time diffusion world model optimized for consumer GPUs.

Overworld started because we wanted to build a kind of world that didn’t exist yet. Something more than a pretty game or a new content generator. Something that felt alive and responsive in a way we had only imagined.

While working at the intersection of storytelling and AI research, we kept asking ourselves the same questions: What if we could create a story that unfolded with a user? What if a world could evolve as you moved through it and respond to your actions in real time, like a lucid dream you could shape?

Eventually, this stopped feeling like a far-fetched idea. It was something we had to build.

Turning Diffusion Into a Real-Time World Model
Diffusion models have played a massive role in redefining image and media generation. However, almost all of them stop after one output. We wanted to know what would happen if diffusion didn’t end after one frame. What if it could take a world and evolve it step by step as a player moved or acted within it?

The primary limitation of diffusion was its structure. By treating diffusion as a persistent, stateful system instead of a single forward pass, we could update the world state incrementally and respond to input in real time. Rethinking the architecture changed everything. Diffusion began to behave like a real-time system rather than a content generator.
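To make this concrete, here is a minimal sketch of what treating diffusion as a persistent, stateful system can look like. Everything in it, the class names, the action-conditioning interface, the re-noising factor, and the tiny placeholder denoiser, is an illustrative assumption rather than Waypoint-1’s actual architecture or API.

```python
import torch
import torch.nn as nn


class TinyDenoiser(nn.Module):
    """Placeholder denoiser standing in for a real diffusion backbone."""

    def __init__(self, channels=4, action_dim=8):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.action_proj = nn.Linear(action_dim, channels)

    def forward(self, x, t, action):
        # Condition on the player's action via a per-channel bias;
        # a real model would also embed the timestep t.
        bias = self.action_proj(action).view(x.shape[0], -1, 1, 1)
        return self.conv(x) + bias


class RealTimeDiffusionWorld:
    """Keeps a persistent latent world state and advances it a few
    denoising steps per frame, conditioned on the latest player action."""

    def __init__(self, denoiser, latent_shape=(1, 4, 64, 64),
                 steps_per_frame=4, carry=0.9, device="cpu"):
        self.denoiser = denoiser.to(device).eval()
        self.device = device
        self.steps_per_frame = steps_per_frame
        self.carry = carry  # how much of the previous state each frame keeps
        # The persistent state is what turns a one-shot generator
        # into something that behaves like a world.
        self.state = torch.randn(latent_shape, device=device)

    @torch.no_grad()
    def step(self, action):
        """Advance the world by one frame, conditioned on `action`."""
        # Partially re-noise the previous state so the model revises it
        # instead of generating from scratch every frame.
        x = self.carry * self.state + (1.0 - self.carry) * torch.randn_like(self.state)
        for t in reversed(range(self.steps_per_frame)):
            timestep = torch.full((x.shape[0],), t, device=self.device)
            x = self.denoiser(x, timestep, action.to(self.device))
        self.state = x      # carry the world forward to the next frame
        return self.state   # a decoder would turn this latent into pixels


# Usage: each call to step() is one interactive frame.
world = RealTimeDiffusionWorld(TinyDenoiser())
for _ in range(3):
    action = torch.zeros(1, 8)  # e.g. an encoded "move forward" input
    latent_frame = world.step(action)
```

The difference from a one-shot generator is the state tensor that survives between calls: each frame begins from the previous world state, partially re-noised, rather than from pure noise.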

Once we realized the model could run on consumer hardware, the idea became even more real. Running locally removed the latency ceiling that had previously masked this behavior. Rather than waiting seconds for a single output, our model was able to respond several times per second. We were no longer watching generations appear; we were interacting with a world that felt alive.

What We’re Releasing
Today, we’re announcing the first step toward our vision with the release of a research preview of our real-time diffusion world model that can run on ordinary consumer hardware. This release is intended for experimentation and exploration rather than as a finished experience.

The model maintains a persistent world in memory and updates as you move and act, all in a way that feels immediate and human. It can generate fully interactive, AI-native worlds on consumer GPUs and runs entirely on-device.

The model is informed by hundreds of hours of interaction with immersive worlds and a deep understanding of what it takes to make an experience feel real. We pursued the same idea that made the holodeck compelling: a system where perception, action, and world state are continuously aligned and updated in real time.

When developing this model, our team prioritized low latency, continuity, and a sense that your actions genuinely matter. We didn’t build with AI for its own sake; we used it to make possible a new kind of interactive world that couldn’t exist without it.

This work has been made possible in part by a $4.5 million pre-seed round led by Kindred Ventures, alongside a small group of early-stage funds and angel investors from the worlds of infrastructure, gaming, and developer tooling. Their support has allowed us to take this idea out of theory and into something people can actually experience.

A World That Belongs to Its Creators
For many people, AI raises serious concerns: environmental costs, the loss of creative work, and the proliferation of empty, automated content. These concerns are valid, and they shaped how we built.

We aim to preserve the sense of agency that makes art and play meaningful. That is why our system is local-first and user-controlled by design.

Local inference keeps your creative choices on your machine, and the users who create these interactive worlds retain ownership of them. That’s why anyone can use this medium, whether they have a Chromebook that can access the hosted version or a gaming computer that can run the system directly.

When run locally, Overworld’s system doesn’t route anything through a remote service, and it isn’t shaped by an unseen model guessing what you want to interact with. Our local-first approach also avoids the environmental cost and unpredictable latency of data center infrastructure, delivering faster and more reliable feedback.

Our world-building model decentralizes power, grants creators genuine ownership of their tools and worlds, and delivers a faster experience. This medium is open, mod-friendly, and built to evolve with its community.

Creativity as Agency, Not Automation
Throughout the development of the first local-first diffusion world model, we have sought to preserve human creativity. We were never interested in replacing imagination with automation; instead, we wanted to give people a new medium to explore.

Engineers, researchers, hackers, and builders can bring their art to life and experiment with interaction loops and physics. They can explore their creative boundaries, as well as those of the worlds they generate.

This release isn’t the final form of what we hope to create, but the first concrete step toward a wider frontier. The worlds created by the model are experimental by design and do not represent the caliber of visual fidelity or stability that will be present in the eventual public release of the system.

What’s Next
We built Overworld because we wanted these worlds to exist, and now we want to see what worlds you will create.

The people who join us on this journey will shape how this medium evolves: how it’s played, how it’s built, and what it means to create inside an AI-native world. If you want to push this frontier forward, try the model for yourself and help define what comes next. This is just the beginning.