Alibaba Cloud redesigns platform around AI agents in overseas push

  • New “Qwen Cloud” shifts cloud interface from human users to autonomous agents
  • Company rolls out developer tools as global token usage surges

Alibaba Cloud has unveiled a redesigned AI-focused cloud platform in Singapore, positioning autonomous agents—not humans—as the primary users of its infrastructure, in a major overhaul of how developers interact with its services.

The company on May 26 launched Qwen Cloud, an overseas-facing AI product hub, alongside a suite of agent-oriented tools including MuleRun, a platform for building and deploying AI agents.

The Hangzhou-based cloud service provider also released updates to its coding and desktop agent systems Qoder and QoderWork on the same day. It also upgraded its underlying cloud infrastructure to support agent-first workloads.

Qwen Cloud (qwencloud.com) is not a module inside a traditional cloud console, but a standalone entry point built for what Alibaba describes as the “agent era” of computing.

Under the new architecture, core cloud capabilities such as model access and inference are packaged into standardized “Skills” and command-line interface (CLI). This allows agents to interpret instructions and independently access platform functions without human-facing interfaces.

A human-to-agent shift

“The cloud’s primary users are shifting from humans to agents,” said Li Feifei, chief technology officer and international president of Alibaba Cloud. “When agents become the first-class users of cloud services, interfaces designed for humans need to be rebuilt.”

The platform introduces a three-entry structure: a website for developers to explore and integrate models via OpenAI-compatible APIs, Skills that translate platform functions into machine-readable instructions, and CLI tools designed for both developers and autonomous agents requiring repeatable execution environments.

Alibaba Cloud said it is conducting a full-stack upgrade spanning models, interfaces, agent products and infrastructure to enable global developers to more easily access its AI capabilities, particularly for real-time applications, Li noted.

The company framed the shift as part of a broader industry transition from model training to inference-driven applications, arguing that tighter hardware-software integration will reduce latency and cost for developers building AI systems.

MuleRun, also launched on May 25, is designed to help developers build resource-intensive agent applications and deploy them at scale, targeting growing demand for autonomous AI systems.

The rollout comes as global demand for AI computing accelerates. Data from API aggregation platform OpenRouter shows weekly token consumption in April 2026 was roughly seven to eight times higher than a year earlier.

In China, daily token usage has exceeded 140 trillion, up more than 40% from the end of the previous year, according to official statistics.