Odyssey's Revolutionary 'World Model': Dive into a Realm of Video, Enjoy Immersive Interaction, and Real-time Creation

06/12 2025

Recently, an intriguing innovation has emerged in the tech landscape.

A London-based startup, Odyssey, has unveiled an 'interactive video generation model'.

This model transforms videos into an interactive, real-time generated universe, akin to navigating a first-person game within the video itself.

Remarkably, it can generate a high-quality frame every 40 milliseconds, which works out to roughly 25 frames per second. Users can steer the scene with a keyboard or controller, and eventually with voice commands.

With a mere keystroke, the video world responds instantaneously, immersing you with no perceptible delay.

1. It's Neither a Game nor CG, but a 'World Model'

Distinct from traditional videos or 3D games, this technology relies on Odyssey's proprietary 'World Model'.

This is the same 'World Model' often discussed by luminaries like Yann LeCun and Fei-Fei Li.

Earlier video generation methods worked like 'batch production', generating many frames at once to form a complete clip.

In contrast, the World Model operates on a 'frame-by-frame' basis, constantly anticipating: 'You just pressed forward; what scene should I generate? You just turned your head; what space should I display?'

This is akin to how large language models predict the next word, but here, it predicts images—a dynamic, immersive, interactive world.
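The frame-by-frame loop can be illustrated with a minimal Python sketch. Everything in it is hypothetical and is not Odyssey's implementation: `WORLD` is a toy one-dimensional scene, and `predict_next_frame` is a hand-written stand-in for a learned model. It shows only the control flow: each frame is produced from recent frames plus the latest user action, then fed back as context, much as a language model appends each predicted word to its input.

```python
from collections import deque

WORLD = "..T....H...T..."  # hypothetical 1-D scene: T = tree, H = house
VIEW = 5                   # width of each rendered frame, in characters


def predict_next_frame(history, action):
    """Toy stand-in for a learned model: next frame from past frames + one action."""
    last_pos = history[-1]["pos"]
    step = {"left": -1, "right": 1}.get(action, 0)
    # Keep the camera inside the scene.
    pos = max(0, min(len(WORLD) - VIEW, last_pos + step))
    return {"pos": pos, "pixels": WORLD[pos:pos + VIEW]}


def rollout(actions, context=4):
    """Generate one frame per action, feeding each frame back as context."""
    history = deque([{"pos": 0, "pixels": WORLD[:VIEW]}], maxlen=context)
    frames = []
    for action in actions:
        frame = predict_next_frame(history, action)
        history.append(frame)  # the new frame becomes context for the next one
        frames.append(frame["pixels"])
    return frames


if __name__ == "__main__":
    for pixels in rollout(["right", "right", "left"]):
        print(pixels)
```

Here the 'frames' are just five-character slices of the scene; in a real system the stand-in function would be a neural network producing images, and it would have to finish within the per-frame time budget.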

2. Overcoming the Greatest Technical Hurdle: Frame Drift

However, constructing an AI-driven world is far more intricate than generating a single image.

The paramount challenge is stability.

Put simply: each new frame is predicted from the frames that came before it. If even one frame is slightly off, that error is carried into every later prediction and compounds, so subsequent content may become drastically distorted. This phenomenon is technically known as 'drift'.
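A toy simulation makes the compounding effect concrete. This is an illustrative sketch, not a model of Odyssey's system: the 'world' is a single number that should grow by one per frame, and `model_step` is a hypothetical predictor with an assumed 1% scaling error. Because the predictor is fed its own previous output, the gap from the truth grows faster with every frame.

```python
def true_step(x):
    """Ground truth: the state advances exactly one unit per frame."""
    return x + 1.0


def model_step(x, eps=0.01):
    """Hypothetical imperfect model: slightly overscales its input each frame."""
    return (1.0 + eps) * x + 1.0


def rollout_errors(steps, eps=0.01):
    """Absolute error per frame when the model consumes its own predictions."""
    truth = pred = 0.0
    errors = []
    for _ in range(steps):
        truth = true_step(truth)
        pred = model_step(pred, eps)  # input is the previous prediction, not the truth
        errors.append(abs(pred - truth))
    return errors


errors = rollout_errors(100)

if __name__ == "__main__":
    for frame in (10, 50, 100):
        print(f"frame {frame:3d}: error = {errors[frame - 1]:.2f}")
```

With `eps=0.0` the error stays at zero, which is the point: drift comes from small per-frame mistakes being recycled as input, not from any single bad frame.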

To tackle this, Odyssey adopted a compromise: 'narrow-domain' training.

Rather than rushing to cover many diverse worlds, it:

Initially pre-trains on extensive general videos to cultivate a fundamental understanding of the real world;

Then fine-tunes with limited, specific environments, sacrificing some image quality but significantly enhancing stability.

While this strategy reduces the diversity of generated environments, it markedly improves stability, preventing abrupt frame collapses or distorted characters in the video.
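The trade-off between breadth and reliability can be sketched with a deliberately tiny example. It is an analogy under stated assumptions, not Odyssey's training recipe: the 'model' is a single weight `w` in `y = w * x`, the 'broad' data mixes three made-up environments with different slopes, and the 'narrow' data contains only one of them. Stage 1 fits the broad mix, and stage 2 continues training on the narrow set.

```python
def gd_fit(w, data, lr, epochs):
    """Plain gradient descent on mean squared error for the model y = w * x."""
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def mse(w, data):
    """Mean squared error of the model y = w * x on a dataset."""
    return sum((w * x - y) ** 2 for x, y in data) / len(data)


xs = [0.5, 1.0, 1.5, 2.0]
# Hypothetical "broad" data: three environments with slopes 1, 2 and 3.
broad = [(x, s * x) for s in (1.0, 2.0, 3.0) for x in xs]
# Hypothetical "narrow" data: only the slope-3 environment.
narrow = [(x, 3.0 * x) for x in xs]

w_pre = gd_fit(0.0, broad, lr=0.1, epochs=200)      # stage 1: general pre-training
w_fine = gd_fit(w_pre, narrow, lr=0.1, epochs=200)  # stage 2: narrow fine-tuning

if __name__ == "__main__":
    print(f"after pre-training: w = {w_pre:.3f}")
    print(f"after fine-tuning:  w = {w_fine:.3f}")
    print(f"narrow-domain error: {mse(w_pre, narrow):.3f} -> {mse(w_fine, narrow):.3f}")
    print(f"broad-domain error:  {mse(w_pre, broad):.3f} -> {mse(w_fine, broad):.3f}")
```

After fine-tuning, error on the narrow environment falls while error on the broad mix rises, mirroring the article's point that specializing buys reliability at the cost of diversity.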

3. Investors Move Quickly, with a Pixar Co-founder on the Board

Odyssey is not just a technological trailblazer but also a favorite of investors.

Its two founders, Oliver Cameron and Jeff Hawke, hail from the autonomous driving industry. Cameron is the former CEO of Voyage, and Hawke is an AI research veteran from Wayve.

They adeptly applied the 'world modeling' concept used in autonomous driving to AI videos.

Moreover, Ed Catmull, co-founder of Pixar and former president of Disney Animation, has joined the board of directors.

Yes, the studio behind 'Toy Story'.

Currently, Odyssey has secured over US$27 million in funding from top-tier investors, including EQT Ventures and GV (Google Ventures).

4. Boundless Possibilities, Beyond Entertainment

Odyssey acknowledges that the current version is still rudimentary: images are not sufficiently clear, interactivity is limited, and scene stability is imperfect.

Nevertheless, these rough edges do not obscure the profound significance of this innovation.

Odyssey believes this is not merely a technological breakthrough but a new narrative medium.

You're no longer just watching a travel vlog; you're 'stepping into' the beaches of Bali.

Medical school teaching videos transform into simulated classrooms where you can 'practice surgery'.

Film and television creators can generate entire story segments in real-time by simply controlling character and environment parameters.

Advertisements evolve from static clips to interactive brand spaces where users can engage.

Looking back at human civilization's development, from murals, writing, and drama to radio, film, and video games, each media revolution has profoundly transformed our understanding of the world.

Today, this AI-driven, real-time interactive video world may become the next 'narrative engine'.

What do you make of Odyssey's World Model? Share your thoughts in the comments section.
