- Discounts drive surge in usage as costs undercut global rivals
- Analysts say pricing could pressure broader model market
DeepSeek (深度求索) slashed prices for its latest large language model V4 just days after launch, triggering a spike in usage and setting what appears to be a record low for high-performance AI model costs globally.
On April 25, the company introduced a limited-time 75% discount on its V4-Pro API, followed a day later by a tenfold reduction in cache pricing across its model lineup.
Combined, these cuts bring input cache costs down to as little as 0.025 yuan ($0.0037) per million tokens, with output priced at 6 yuan ($0.87), among the lowest levels seen in the industry.
The pricing gap is stark compared with leading overseas models.
Output costs for GPT-5.5 Pro are about $180 per million tokens, while Claude Opus 4.7 charges around $25. By comparison, DeepSeek’s output pricing is under $1.
On the input side, DeepSeek’s cache pricing is roughly 1/8,200 of GPT-5.5 Pro’s $30 per million tokens, based on available figures. The discounts are set to run through May 5.
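The arithmetic behind these comparisons can be checked with a short sketch. It uses only the per-million-token prices quoted above. The helper `price_multiple` and the exchange rate implied by "6 yuan ($0.87)" are illustrative assumptions, not figures published by DeepSeek or its rivals.

```python
# Prices quoted in the article, in USD per million tokens.
DEEPSEEK_CACHE_INPUT_USD = 0.0037   # rounded conversion of 0.025 yuan
DEEPSEEK_OUTPUT_USD = 0.87          # conversion of 6 yuan
GPT_55_PRO_INPUT_USD = 30.0
GPT_55_PRO_OUTPUT_USD = 180.0
CLAUDE_OPUS_47_OUTPUT_USD = 25.0


def price_multiple(rival_usd: float, deepseek_usd: float) -> float:
    """Return how many times more expensive the rival price is (hypothetical helper)."""
    return rival_usd / deepseek_usd


# Assumption: the exchange rate is inferred from the article's "6 yuan ($0.87)".
implied_yuan_per_usd = 6.0 / DEEPSEEK_OUTPUT_USD
cache_input_usd_unrounded = 0.025 / implied_yuan_per_usd

# Input side: GPT-5.5 Pro input price versus DeepSeek cached input price.
input_multiple_rounded = price_multiple(GPT_55_PRO_INPUT_USD, DEEPSEEK_CACHE_INPUT_USD)
input_multiple_unrounded = price_multiple(GPT_55_PRO_INPUT_USD, cache_input_usd_unrounded)

# Output side: rival output prices versus DeepSeek output price.
gpt_output_multiple = price_multiple(GPT_55_PRO_OUTPUT_USD, DEEPSEEK_OUTPUT_USD)
claude_output_multiple = price_multiple(CLAUDE_OPUS_47_OUTPUT_USD, DEEPSEEK_OUTPUT_USD)

print(f"Input multiple (rounded USD price):   {input_multiple_rounded:,.0f}x")
print(f"Input multiple (implied yuan rate):   {input_multiple_unrounded:,.0f}x")
print(f"GPT-5.5 Pro output multiple:          {gpt_output_multiple:,.0f}x")
print(f"Claude Opus 4.7 output multiple:      {claude_output_multiple:,.0f}x")
```

Depending on whether the rounded dollar price or the implied exchange rate is used, the input-side multiple comes out at roughly 8,100 or 8,300, which brackets the figure of about 8,200 cited above. On the output side, GPT-5.5 Pro works out to a little over 200 times DeepSeek’s price and Claude Opus 4.7 to just under 29 times.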
The aggressive pricing comes alongside the release of DeepSeek-V4-Pro on April 24, a model with 1.6 trillion parameters and support for context windows of one million tokens.
It ranks among the top open-source models globally in reasoning performance, with some industry estimates placing it within three to six months of the most advanced proprietary systems.
The model is also notable for running entirely on domestic Huawei Ascend chips, with per-token inference compute reduced to about 27% of the previous generation.
Usage has surged following the price cuts. V4-Pro recorded 13.6 billion tokens in daily API calls on April 25, nearly four times the previous day’s total, according to National Business Daily, a business and financial newspaper, which cited figures from OpenRouter, a unified API platform that tracks usage of hundreds of AI models.
The lighter V4-Flash model rose from 50.2 billion tokens to 81.4 billion the following day. The V4 series is now among the most heavily used open-source models globally.
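As a rough check on the usage figures above, the growth rates can be worked out as follows. This is a minimal sketch: the helper `growth` is hypothetical, and the prior-day V4-Pro volume is not reported, so it is only bounded from the "nearly four times" description.

```python
def growth(before: float, after: float) -> tuple[float, float]:
    """Return (multiple, percent increase) between two daily token counts."""
    multiple = after / before
    return multiple, (multiple - 1.0) * 100.0


# V4-Flash daily tokens, in billions, as quoted in the article.
flash_multiple, flash_pct = growth(50.2, 81.4)

# V4-Pro: 13.6 billion tokens was "nearly four times" the previous day.
# Assumption: an exact 4x would imply 3.4 billion, so the unreported
# prior-day figure was somewhat above this lower bound.
pro_previous_day_floor = 13.6 / 4.0

print(f"V4-Flash grew {flash_multiple:.2f}x, an increase of about {flash_pct:.0f}%")
print(f"V4-Pro previous day: somewhat above {pro_previous_day_floor:.1f} billion tokens")
```

On these numbers, V4-Flash usage rose by about 62% in a day, while V4-Pro’s prior-day volume would have been a little over 3.4 billion tokens.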
Hu Yanping, a professor at Shanghai University of Finance and Economics, told National Business Daily that the steep pricing could reset expectations for high-performance models and put pressure on domestic rivals such as Kimi and GLM.
Further cost reductions may follow as newer chips, including Huawei’s Ascend 950, enter mass production and deployment later this year, he added.
