Ant Group app brings ‘world model’ AI to smartphones

  • Feature enables real-time 3D exploration from a single image
  • Marks one of the first mobile deployments of interactive world model technology

Ant Group has rolled out a new feature in its Lingguang app that allows users to interact with so-called “world models” on mobile devices, marking a step toward bringing advanced AI capabilities closer to consumer use.

The feature, launched April 27, lets users upload a single image and explore a dynamically generated 3D environment for up to 60 seconds, navigating it using game-style controls.

The experience begins within seconds of input, with users able to move through the scene as if in a mobile video game.

The release is among the industry’s first to enable real-time, interactive world model experiences on consumer devices. World models are a category of AI widely viewed as a potential pathway toward artificial general intelligence, because they link digital simulations with real-world understanding.

“The ‘world model experience’ feature is another step in exploring the boundaries of intelligence,” said Cai Wei, head of the Lingguang app.

He added that Lingguang had previously introduced its ‘instant app’ function, which allows users to generate applications in 30 seconds using natural language, bringing coding capabilities once reserved for developers to ordinary users.

“We aim to keep pushing those boundaries, uncover unmet user needs, and make high-quality AI experiences accessible to everyone,” Cai noted.

The system is powered by LingBot-World-Fast, a model developed in-house by Ant and released as open source.

Within the app, users control movement via a dual-joystick interface modeled on mainstream 3D mobile games, so they can navigate the scene and rotate the camera without having to learn new controls.

Bringing such capabilities to mobile devices presents technical challenges, including high computational demands, latency constraints and varying hardware performance.

Ant said it addressed these through a low-latency streaming architecture, achieving response times on the order of milliseconds and enabling near-instant interaction.