
Microsoft Muse AI Model Unveiled; Can Generate Game Visuals and Controller Actions

Microsoft researchers revealed a new artificial intelligence (AI) model on Wednesday that can create 3D gaming environments. Named the World and Human Action Model (WHAM), or Muse, the model was created by the tech giant’s Research Game Intelligence and Teachable AI Experiences (Tai X) teams in partnership with Xbox Game Studios’ Ninja Theory. The company stated that the generative AI model can assist game designers in the ideation phase and generate game visuals and controller actions to support creatives in game production.

Microsoft Introduces Muse AI Model


In a blog post, the Redmond-based technology giant detailed the Muse AI model. It is presently a research product, although the company said it is open-sourcing the model’s weights and sample data, and making available the WHAM Demonstrator (a concept prototype of a visual interface for interacting with the AI model). Developers can experiment with the model on Azure AI Foundry. A paper describing the technical details of the model has been published in the journal Nature.

Training a model in such a complex domain is a challenging task. Microsoft researchers gathered a substantial amount of human gameplay data from the 2020 title Bleeding Edge, which was developed by Ninja Theory. The model was trained on a billion image-action pairs, equivalent to seven years of human gameplay. The data is said to have been ethically collected and is used solely for research purposes.
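As a rough sanity check on those two figures, the short Python calculation below estimates how many image-action pairs per second of gameplay they would imply. It assumes both numbers are taken at face value (exactly one billion pairs and exactly seven 365-day years of continuous play); the reported figures are rounded, so the result is only an order-of-magnitude illustration and not a sampling rate stated by Microsoft.

```python
# Back-of-the-envelope estimate; both inputs are rounded figures from the article.
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes 365-day years of continuous play

pairs = 1_000_000_000   # "a billion image-action pairs"
years = 7               # "seven years of human gameplay"

# Implied number of image-action pairs recorded per second of gameplay.
implied_rate_hz = pairs / (years * SECONDS_PER_YEAR)

print(f"Implied rate: {implied_rate_hz:.2f} image-action pairs per second")
```

Under these assumptions the figures work out to a few image-action pairs per second of gameplay.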

The researchers said that scaling up model training was a significant hurdle. Muse was initially trained on a cluster of Nvidia V100 GPUs, and training was subsequently scaled up to multiple Nvidia H100 GPUs.

In terms of functionality, the Muse AI model accepts both text prompts and visual inputs. Once a gaming environment has been generated, it can be extended using controller actions. The model responds to the user’s movements by generating new environments that align with the original prompt and remain coherent with the ongoing gameplay.
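The loop described above, in which each new frame depends on what has been shown so far and on the player's inputs, can be illustrated with a minimal Python sketch. Everything in it is hypothetical: `toy_next_frame` is a hand-written stand-in that merely shifts pixels sideways, not Muse or any Microsoft API, and the `Frame` and `Action` types are invented for illustration. Only the overall shape (predict a frame, append it to the history, repeat) reflects the behaviour the article describes.

```python
from typing import Callable, List, Tuple

Frame = List[List[int]]        # toy image: rows of pixel values (hypothetical)
Action = Tuple[float, float]   # hypothetical controller input: (stick_x, stick_y)

def toy_next_frame(frames: List[Frame], actions: List[Action]) -> Frame:
    """Stand-in for a learned model: shift the latest frame by the latest stick_x."""
    last_frame = frames[-1]
    shift = int(round(actions[-1][0]))
    width = len(last_frame[0])
    return [[row[(x - shift) % width] for x in range(width)] for row in last_frame]

def rollout(
    start_frame: Frame,
    user_actions: List[Action],
    predict: Callable[[List[Frame], List[Action]], Frame],
) -> List[Frame]:
    """Generate one frame per controller action, feeding each output back in."""
    frames = [start_frame]
    taken: List[Action] = []
    for action in user_actions:
        taken.append(action)
        # The next frame is conditioned on all frames and actions so far.
        frames.append(predict(frames, taken))
    return frames[1:]  # only the newly generated frames

generated = rollout([[1, 0, 0], [0, 1, 0]], [(1.0, 0.0), (1.0, 0.0)], toy_next_frame)
```

Because every generated frame is appended to the history before the next prediction, later frames stay tied to both the starting image and the sequence of inputs, which is the property the article refers to as coherence with ongoing gameplay.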

Because Muse is a distinctive kind of AI model, traditional benchmark tests cannot adequately assess its capabilities. The researchers said they evaluated the model internally on metrics such as consistency, diversity, and persistence. Since it is a research-focused model, its outputs are limited to a resolution of just 300x180 pixels.
