Alibaba Qwen3.7-Max climbs to No. 2 in global AI coding rankings

  • Model overtakes ChatGPT and Gemini rivals on independent Code Arena benchmark
  • Alibaba pairs AI model push with cheaper inference pricing and new chip

Alibaba Cloud’s flagship coding model Qwen3.7-Max has risen to second place globally on the latest Code Arena rankings, marking one of the strongest performances yet by a Chinese large language model in software engineering benchmarks.

The model scored 1,541 points in rankings released May 26 by Code Arena, an AI coding evaluation platform operated by the independent benchmarking community LMArena.

The result placed Qwen3.7-Max ahead of models including GPT-5.5, Gemini-3.5-Flash and GLM-5.1, trailing only Anthropic’s Claude series among major model providers.

Trustworthy benchmark

Code Arena is regarded as one of the industry’s more influential programming benchmarks because evaluations are conducted through blind testing and designed to reflect real-world engineering tasks rather than vendor-curated demonstrations.

Qwen3.7-Max ranked among the top four models overall, breaking what had been a prolonged dominance by Claude-Opus-4.7 and Claude-Opus-4.6.

It also became the first Chinese-developed model to surpass the 1,540-point threshold on the leaderboard.

The Yangtzeer reported earlier that Qwen3.7-Max ranked fifth globally and first among Chinese models in the May 21 edition of Artificial Analysis’s independent benchmark.

Alibaba Cloud said the model was built for AI agent-based workflows and improved significantly in coding, agent collaboration and long-duration task execution.

According to the company, the model can independently complete, within hours, complex end-to-end software projects that would traditionally take a professional team weeks.

Alibaba Cloud also said the model can run continuously for as long as 35 hours, making more than 1,000 tool calls to optimize chip kernels autonomously.

Temporary price cuts

Alongside the ranking update, Alibaba Cloud announced temporary price cuts for Qwen3.7-Max inference services, reducing pay-as-you-go rates by 50% to 6 yuan ($0.88) per million input tokens and 18 yuan per million output tokens.

The company is also offering 1 million free trial tokens.
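
For readers who want to see what the discounted rates mean in practice, the arithmetic can be sketched in a few lines of Python. This is an illustration only: the two rates and the 50% figure come from the announcement as reported above, while the function names and the sample workload are hypothetical, and the pre-cut prices are merely implied by the stated discount rather than quoted by Alibaba Cloud. The free trial tokens are not modeled.

```python
# Discounted pay-as-you-go rates reported in the article (yuan per million tokens).
INPUT_YUAN_PER_MILLION = 6.0
OUTPUT_YUAN_PER_MILLION = 18.0
DISCOUNT = 0.50  # the reported size of the temporary cut


def workload_cost_yuan(input_tokens: int, output_tokens: int) -> float:
    """Cost in yuan of a workload billed at the discounted rates."""
    return (
        input_tokens * INPUT_YUAN_PER_MILLION
        + output_tokens * OUTPUT_YUAN_PER_MILLION
    ) / 1_000_000


def implied_pre_cut_rate(discounted_rate: float, discount: float = DISCOUNT) -> float:
    """Rate before the cut, assuming the discount applies uniformly (an assumption)."""
    return discounted_rate / (1 - discount)


if __name__ == "__main__":
    # Hypothetical workload: 2 million input tokens and 500,000 output tokens.
    print(workload_cost_yuan(2_000_000, 500_000))        # 21.0 yuan
    print(implied_pre_cut_rate(INPUT_YUAN_PER_MILLION))  # 12.0 yuan per million
    print(implied_pre_cut_rate(OUTPUT_YUAN_PER_MILLION)) # 36.0 yuan per million
```

Under these assumptions, the sample workload costs 21 yuan at the temporary rates, with output tokens priced at three times the input rate.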

Integrated AI stack

Alibaba Cloud simultaneously introduced its in-house AI chip Zhenwu M890 as part of what it described as a vertically integrated “AI factory” stack spanning chips, cloud infrastructure, models and inference systems.

Liu Weiguang, senior vice president of Alibaba Cloud, said the company’s latest infrastructure and model upgrades reflect a broader transition toward AI-driven revenue measured by token consumption.

“This full-stack upgrade means Alibaba Cloud’s growth engine is shifting comprehensively toward AI revenue based on token usage,” Liu said.