DeepSeek AI's Latest Release, V4-Pro and V4-Flash, Arrives as a Powerful Open-Source Challenger, but Questions Linger About Its True Impact.
Hangzhou, China – April 24, 2026 – DeepSeek AI has dropped two new large language models, V4-Pro and V4-Flash, positioning them as formidable open-source contenders against established players like OpenAI and Anthropic. These models, previewed on Hugging Face, boast substantial parameter counts and notable performance claims, particularly in coding and mathematics.
DeepSeek V4-Pro is presented as a frontier Mixture-of-Experts model, featuring a 1 million token context window and three distinct reasoning modes. The company's own benchmarks place V4-Pro as the leading open-source model for coding and mathematics, with performance that reportedly nears Google's closed-source Gemini 3.1-Pro in world knowledge. V4-Flash, on the other hand, is described as an efficient reasoning model, distilled using high-quality data.
Model Specifics and Accessibility
The release details reveal a complex architecture for these models. V4-Pro encompasses 1.6 trillion total parameters with 49 billion active, while V4-Flash has 284 billion total parameters with 13 billion active. Both instruct-tuned and base versions have been made available. Instruction checkpoints are distributed using FP4 for MoE experts and FP8 for the rest.
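The mixed-precision scheme has direct implications for download size. A back-of-envelope sketch in Python, assuming FP4 weights take 0.5 bytes per parameter and FP8 take 1 byte; the expert/non-expert split used below is an illustrative assumption, since the release does not state the exact breakdown:

```python
# Rough checkpoint-size estimate for a mixed-precision MoE release.
# ASSUMPTION: ~95% of V4-Pro's parameters live in the MoE experts;
# DeepSeek has not published the actual split.

def checkpoint_gb(expert_params: float, other_params: float) -> float:
    """Estimate on-disk size: FP4 (0.5 bytes/param) for MoE experts,
    FP8 (1 byte/param) for the remaining parameters."""
    total_bytes = expert_params * 0.5 + other_params * 1.0
    return total_bytes / 1e9  # decimal gigabytes

# V4-Pro: 1.6 trillion total parameters (per the release notes).
expert = 1.6e12 * 0.95
other = 1.6e12 * 0.05
print(f"V4-Pro instruct checkpoint ~ {checkpoint_gb(expert, other):,.0f} GB")
```

Under these assumptions the instruct checkpoint would land in the high hundreds of gigabytes, which is why aggressive quantization matters for distribution at this scale.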
Early adoption efforts are visible, with deepseek-v4-flash:cloud appearing on Ollama's cloud catalog shortly after release, indicating potential ease of access for some users. The models' training and deployment have reportedly involved Huawei's Ascend AI processors, a point of interest given the geopolitical landscape surrounding advanced chip technology.
Steering: A Resurgent Concept?
The release of DeepSeek-V4-Flash has also revived discussion of "LLM steering": guiding a model's behavior through external mechanisms at inference time rather than relying solely on what it learned during training.
Proponents argue that sufficiently sophisticated steering can augment, or in some framings even substitute for, the core intelligence developed during training. Steering has seen limited adoption to date, whether because of its complexity or because direct model training was considered sufficient, but DeepSeek's advances may be prompting a re-evaluation of its utility. This perspective treats steering as a distinct axis of capability, separate from a model's inherent intelligence.
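One simple form this idea can take is logit biasing, where an external controller nudges the model's next-token distribution without touching its weights. The sketch below uses toy numbers, not values from any real model, purely to illustrate the mechanism:

```python
import math

# Minimal illustration of one steering mechanism: an external
# controller adds a bias vector to the model's raw next-token
# logits at decode time, shifting which token wins.

def softmax(logits):
    """Convert raw logits into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def steer(logits, bias):
    """Apply a per-token bias before sampling."""
    return [l + b for l, b in zip(logits, bias)]

vocab = ["yes", "no", "maybe"]
logits = [2.0, 1.0, 0.5]   # the model's unsteered preferences (toy values)
bias = [0.0, 2.0, 0.0]     # external controller pushes toward "no"

before = softmax(logits)
after = softmax(steer(logits, bias))
print(vocab[before.index(max(before))])  # prints "yes"
print(vocab[after.index(max(after))])    # prints "no"
```

The model's weights never change; all of the behavioral shift comes from the external bias, which is the core of the steering argument.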
Critical Reception and Context
While DeepSeek AI hails this as a "Sputnik moment" and a challenge to rivals, a critical perspective is warranted. Benchmarking a new model against closed-source giants is a common positioning strategy in the AI industry, and these are preview releases, not final production versions. That invites scrutiny into whether the V4 models represent a genuine paradigm shift or merely feed another hype cycle in the open-source LLM landscape.
This tension, captured in the question "DeepSeek V4: a paradigm shift in open-source LLMs, or another hype cycle?", pits claimed innovation against the field's pattern of incremental progress. Independent evaluation will be needed to gauge the models' long-term significance.