Google's Genie Creates Playable Worlds from Sketches

👋 Hi, I am Mark. I am a strategic futurist and innovation keynote speaker. I advise governments and enterprises on emerging technologies such as AI and the metaverse. My subscribers receive a free weekly newsletter on cutting-edge technology.

The field of generative AI has seen tremendous advances in recent years, with models capable of generating remarkably realistic images, videos, and text. However, most of these models focus on passive generation from a prompt. In their new paper, researchers from DeepMind introduce an exciting new paradigm - generative interactive environments.

Imagine sketching a whimsical landscape on a napkin over lunch and, by evening, stepping into it as a playable 2D world. That's not a page from a sci-fi novel; it's the premise behind Google's latest AI marvel, Genie.

Unlike the magical beings of lore, this Genie doesn't grant three wishes but offers endless possibilities to creators, transforming mere images into interactive experiences. Trained on a vast trove of gameplay footage, Genie crafts worlds more aligned with classic platformers than VR, but its implications ripple far beyond gaming.

The model can take a text or image prompt and generate an entire playable, game-like environment. What's more, Genie is trained without any action labels or supervision, using only raw internet videos of people playing games. This allows it to learn in a completely unsupervised manner, opening up the possibility of internet-scale training.

Under the hood, Genie consists of three core components: a video tokenizer, a latent action model, and a dynamics model. The tokenizer compresses the raw video frames into discrete tokens. The latent action model then infers a discrete set of "actions" between frames, despite no ground truth being available. Finally, the dynamics model takes the frame tokens and latent actions as input, and predicts the next frame in an autoregressive manner.
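To make the flow between those three components concrete, here is a toy, shapes-only sketch in Python. This is not the paper's implementation: the class, the codebook sizes, and the stand-in arithmetic in `predict_next` are all illustrative assumptions; the real components are learned transformer models.

```python
import numpy as np

class ToyGenie:
    """Toy sketch of Genie's three-stage pipeline (shapes only, no learning)."""

    def __init__(self, n_tokens=64, n_actions=8, seed=0):
        self.n_tokens = n_tokens    # video-tokenizer codebook size (assumed)
        self.n_actions = n_actions  # small discrete latent-action vocabulary
        self.rng = np.random.default_rng(seed)

    def tokenize(self, frame):
        # Video tokenizer: compress a raw frame (H, W) into a coarser
        # grid of discrete tokens (here: random stand-in values).
        h, w = frame.shape[0] // 4, frame.shape[1] // 4
        return self.rng.integers(0, self.n_tokens, size=(h, w))

    def infer_latent_action(self, tokens_t, tokens_t1):
        # Latent action model: infer a discrete "action" that explains the
        # change between consecutive frames -- no ground-truth labels needed.
        return int(self.rng.integers(0, self.n_actions))

    def predict_next(self, token_history, action):
        # Dynamics model: autoregressively predict the next frame's tokens
        # from past tokens and the chosen latent action.
        prev = token_history[-1]
        return (prev + action) % self.n_tokens  # stand-in for a transformer

    def rollout(self, first_frame, actions):
        # At inference time the user supplies the actions, frame by frame.
        tokens = [self.tokenize(first_frame)]
        for a in actions:
            tokens.append(self.predict_next(tokens, a))
        return tokens

genie = ToyGenie()
frames = genie.rollout(np.zeros((32, 32)), actions=[1, 3, 0])
```

The key design point this illustrates: because actions are *inferred* between frames during training, raw unlabeled gameplay video is enough, and at play time the same small action vocabulary becomes the user's controller.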

A key innovation is Genie's use of a spatiotemporal transformer architecture. By factorizing self-attention into separate spatial and temporal layers, Genie can efficiently model long video sequences. Experiments confirm that Genie scales well as more parameters and data are added, culminating in an 11 billion parameter model trained on over 200,000 hours of gaming videos.
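The efficiency gain from that factorization can be sketched in a few lines of NumPy. This is a minimal single-head attention toy, not Genie's architecture (which adds projections, feed-forward layers, and causal masking); the shapes and function names are assumptions for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attend(x):
    # Plain single-head self-attention over the second-to-last axis,
    # batched over all leading axes.
    scores = x @ x.swapaxes(-1, -2) / np.sqrt(x.shape[-1])
    return softmax(scores) @ x

def st_block(x):
    """x: (T frames, S tokens per frame, D dims).
    Spatial layer: each frame's S tokens attend to each other.
    Temporal layer: each token position attends across the T frames."""
    x = attend(x)                                 # spatial, batched over T
    x = attend(x.swapaxes(0, 1)).swapaxes(0, 1)   # temporal, batched over S
    return x

T, S, D = 16, 400, 32
x = np.random.default_rng(0).normal(size=(T, S, D))
y = st_block(x)
```

The payoff: full attention over a video of T frames with S tokens each scores all (T·S)² token pairs, while the factorized block only computes T·S² spatial pairs plus S·T² temporal pairs, which is what makes long sequences tractable.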

The results are seriously impressive. Genie can take sketches, text descriptions, and even photorealistic images as prompts to generate interactive game worlds. The latent actions provide smooth control, moving characters and objects accordingly. One remarkable demonstration is Genie's ability to emulate parallax - foreground objects moving faster than distant background ones.

While limitations remain in consistency and speed, the authors argue that Genie opens up many exciting avenues for future work. It could be a general simulation engine for training reinforcement learning agents or robots. More broadly, by unlocking creative interactive experiences from any user's imagination, Genie points the way towards more humanistic generative AI.

As we marvel at Genie's potential to democratize game design, we might also ponder: How will this technology influence our perception of creativity and authorship? Are we edging closer to a world where our imaginations are the only limits, or will these tools reshape our very concept of creativity?

Read the full article on Tom's Guide.


đź’ˇ If you enjoyed this content, be sure to download my new app for a unique experience beyond your traditional newsletter.

This is one of many short posts I share daily on my app, where you can have real-time insights, recommendations and conversations with my digital twin via text, audio or video in 28 languages! Go to my PWA and sign up to take our connection to the next level! 🚀


If you are interested in hiring me as your futurist and innovation speaker, feel free to complete the below form.

Dr Mark van Rijmenam

Dr. Mark van Rijmenam is a strategic futurist known as The Digital Speaker. He stands at the forefront of the digital age and lives and breathes cutting-edge technologies to inspire Fortune 500 companies and governments worldwide. As an optimistic dystopian, he has a deep understanding of AI, blockchain, the metaverse, and other emerging technologies, and he blends academic rigour with technological innovation.

His pioneering efforts include the world’s first TEDx Talk in VR in 2020. In 2023, he further pushed boundaries when he delivered a TEDx talk in Athens with his digital twin, delving into the complex interplay of AI and our perception of reality. In 2024, he launched a digital twin of himself offering interactive, on-demand conversations via text, audio or video in 29 languages, thereby bridging the gap between the digital and physical worlds – another world’s first.

As a distinguished 5-time author and corporate educator, Dr Van Rijmenam is celebrated for his candid, independent, and balanced insights. He is also the founder of Futurwise, which focuses on elevating global digital awareness for a responsible and thriving digital future.
