Why DeepSeek V4 is changing the AI model race

DeepSeek has again placed itself at the centre of the global AI race. After drawing worldwide attention with its R1 reasoning model in early 2025, the Chinese company has recently released DeepSeek V4, a new model designed to compete not only on performance, but also on price, openness and efficiency.

The hype around DeepSeek V4 is not based on a single feature. The model comes with a 1 million-token context window, open weights, two versions for different use cases and a strong focus on agentic workflows such as coding, research, document analysis and long-running tasks. In a market still dominated by expensive closed models, DeepSeek is trying to prove that powerful AI does not need to remain locked behind proprietary systems.

A model built for long memory

The most immediate difference between DeepSeek V4 and other models is context length. Both DeepSeek-V4-Pro and DeepSeek-V4-Flash support a 1-million-token context window, meaning they can process inputs far longer than those of older generations of mainstream models. According to DeepSeek’s official release, one million tokens is now the default across all official DeepSeek services.

For ordinary users, that may sound technical. In practice, it matters because a longer context allows models to work with large documents, long conversations, full codebases, legal materials, research archives or complex project histories without losing track as quickly.

That is why DeepSeek V4 is not just another chatbot release. It is aimed at the next stage of AI use, where models are expected to act less like question-answering tools and more like assistants that can follow long processes over time.

Two models for two different needs

DeepSeek V4 comes in two main versions. DeepSeek-V4-Pro is a larger and more capable model, with 1.6 trillion total parameters and 49 billion active parameters. DeepSeek-V4-Flash is a smaller model, with 284 billion total parameters and 13 billion active parameters, designed for faster and more cost-effective workloads.
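
To put those figures in proportion, the short Python sketch below computes the share of each model's parameters that is active at any one time. It uses only the parameter counts quoted above; the dictionary layout and the function name are illustrative choices, not anything published by DeepSeek.

```python
# Parameter counts as quoted in the article: (total, active).
MODELS = {
    "DeepSeek-V4-Pro": (1.6e12, 49e9),
    "DeepSeek-V4-Flash": (284e9, 13e9),
}


def active_fraction(total_params, active_params):
    """Share of the model's parameters that is active at any one time."""
    return active_params / total_params


for name, (total, active) in MODELS.items():
    print(f"{name}: {active_fraction(total, active):.1%} of parameters active")
```

On these numbers, roughly 3% of Pro's parameters and under 5% of Flash's are active at once, which is why the headline size says little on its own about the cost of running either model.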

That distinction is important. Not every user needs the strongest model for every task. A company summarising documents, routing queries or running basic support may choose Flash. A developer working on complex coding tasks, long-context agents or advanced reasoning may prefer Pro.

DeepSeek’s release reflects a broader trend in AI. The best model is no longer always the biggest one. Cost, speed, context size and deployment flexibility are now as important as raw benchmark performance.

Why the price matters

One reason DeepSeek attracts so much attention is its aggressive pricing. DeepSeek’s API page lists V4-Flash at USD 0.14 per 1 million input tokens on a cache miss and USD 0.28 per 1 million output tokens. V4-Pro is listed at USD 1.74 per 1 million input tokens and USD 3.48 per 1 million output tokens before the temporary 75% discount.
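
A rough cost estimate makes the difference concrete. The sketch below uses only the list prices quoted above. The model labels (`v4-flash`, `v4-pro`) are hypothetical keys rather than official API identifiers, and applying the discount evenly to input and output tokens is an assumption, since the article does not say how the 75% discount is structured.

```python
# List prices in USD per 1 million tokens, as quoted in the article.
# The dictionary keys are hypothetical labels, not official model IDs.
PRICES = {
    "v4-flash": {"input": 0.14, "output": 0.28},  # input price is for a cache miss
    "v4-pro": {"input": 1.74, "output": 3.48},    # before the temporary discount
}


def request_cost(model, input_tokens, output_tokens, discount=0.0):
    """Estimate the USD cost of one request.

    `discount` is a fraction (0.75 for 75%). Applying it uniformly to
    input and output tokens is an assumption made for illustration.
    """
    price = PRICES[model]
    cost = (input_tokens * price["input"]
            + output_tokens * price["output"]) / 1_000_000
    return cost * (1.0 - discount)


# A full 1-million-token prompt with a 10,000-token answer.
for model in PRICES:
    print(model, round(request_cost(model, 1_000_000, 10_000), 4))
```

At list price, a prompt that fills the whole context window would cost about USD 0.14 on Flash and about USD 1.77 on Pro under these assumptions. Cache hits and the temporary discount would lower both figures.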

For developers and companies, that changes the calculation. High-performing AI models are useful only if they can be deployed at scale. If every long document, coding session or agentic workflow becomes too expensive, adoption slows down.

DeepSeek’s challenge to the market is therefore not only technical. It is economic. The company is pushing the idea that frontier-level AI should be cheaper to run, easier to access and less dependent on closed ecosystems.

The architecture behind the hype

DeepSeek V4 uses a mixture-of-experts approach, meaning only part of the model is active during each response. That helps explain why the model can be very large on paper, yet still more efficient to run than a dense model of similar overall size.
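
The idea can be shown with a toy example. The Python sketch below implements generic top-k expert routing, a common way to build mixture-of-experts layers. It is not DeepSeek's actual routing scheme, which the article does not describe, and the four "experts" and their gate scores are made up for illustration.

```python
import math


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and return (index, weight) pairs."""
    top = sorted(range(len(gate_scores)),
                 key=lambda i: gate_scores[i], reverse=True)[:k]
    # Softmax over the selected scores only, so the weights sum to 1.
    peak = max(gate_scores[i] for i in top)
    exps = [math.exp(gate_scores[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]


def moe_forward(x, experts, gate_scores, k=2):
    """Run only the k selected experts and blend their outputs."""
    return sum(w * experts[i](x) for i, w in top_k_route(gate_scores, k))


# Four toy experts; only two of them run for this input.
experts = [lambda x: x * 1, lambda x: x * 2, lambda x: x * 3, lambda x: x * 4]
scores = [0.1, 2.0, -1.0, 1.5]  # made-up gate scores for one input
print(top_k_route(scores, k=2))
print(moe_forward(3.0, experts, scores, k=2))
```

The experts that are not selected are never called, so their parameters add to the model's size but not to the cost of processing this input.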

The more interesting part is how DeepSeek handles long context. NVIDIA’s technical overview explains that DeepSeek V4 uses hybrid attention, combining compression and selective attention techniques to reduce the cost of processing very long prompts. NVIDIA says these changes are designed to cut per-token inference FLOPs by 73% and reduce KV cache memory burden by 90% compared with DeepSeek-V3.2.
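
A back-of-the-envelope calculation shows why KV cache size matters at this scale. The sketch below uses the standard size formula for a conventional attention cache and a hypothetical model shape (60 layers, 8 key-value heads, 128-dimensional heads, 2 bytes per value). None of these numbers describe DeepSeek V4 or V3.2, whose cache layouts differ, so the 90% figure is applied to this toy baseline purely to illustrate the order of magnitude.

```python
def kv_cache_bytes(seq_len, num_layers, num_kv_heads, head_dim, bytes_per_value=2):
    """Size of a conventional KV cache: one key vector and one value vector
    per token, per layer, per key-value head."""
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_value * seq_len


# Hypothetical model shape, chosen only for illustration.
baseline = kv_cache_bytes(seq_len=1_000_000, num_layers=60,
                          num_kv_heads=8, head_dim=128)
reduced = baseline * (1 - 0.90)  # a 90% cut applied to the toy baseline

print(f"baseline: {baseline / 1e9:.1f} GB, after a 90% cut: {reduced / 1e9:.1f} GB")
```

With these assumed dimensions, a single million-token conversation would need a cache of roughly 246 GB, more than one accelerator's memory, while a 90% smaller cache would fit in about 25 GB.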

For a non-technical audience, the point is simple. DeepSeek V4 is trying to solve one of the biggest problems in modern AI: how to make models remember and process much more information without becoming too slow or too expensive.

That is where much of the hype comes from. The model is not merely larger. It is designed around the economics of long-context AI.

Why NVIDIA is still in the picture

NVIDIA’s role in the DeepSeek V4 story is especially interesting. DeepSeek is often discussed as part of China’s effort to build a more independent AI ecosystem, but NVIDIA has also moved quickly to support developers who want to build with the model.

In its technical blog, NVIDIA describes DeepSeek V4 as a model family designed for efficient inference of million-token contexts. The company says DeepSeek-V4-Pro and V4-Flash are available through NVIDIA GPU-accelerated endpoints, while developers can also use NVIDIA Blackwell, NIM containers, SGLang and vLLM deployment options.

NVIDIA also reports that early tests of DeepSeek-V4-Pro on the GB200 NVL72 platform showed more than 150 tokens per second per user. That matters because long-context models place heavy pressure on memory, as well as on compute and networking infrastructure. The model may be efficient by design, but serving it at scale still requires serious hardware.

So, DeepSeek V4 does not remove NVIDIA from the story – it complicates it. The model is part of a broader push towards more efficient AI, but the infrastructure race remains central.

The chip question behind the model

DeepSeek V4 also arrives at a time when AI infrastructure is becoming just as important as model performance. MIT Technology Review frames the release partly through that lens, noting that DeepSeek’s new model reflects China’s broader attempt to reduce reliance on foreign AI hardware and build a more self-sufficient technology stack.

That detail matters because the AI race is no longer only about who builds the most capable model. It is also about who controls the chips, software frameworks and data centres needed to run it.

Replacing NVIDIA, however, remains difficult. Its advantage lies not just in its chips, but also in the software ecosystem developers have built around its platforms over many years. Moving to alternative hardware means adapting code, rebuilding tools and proving that the new systems are stable enough for serious use.

DeepSeek V4 therefore sits between two realities. It points towards China’s ambition to build a more independent AI stack, while NVIDIA’s rapid support for the model shows that frontier AI still depends heavily on established infrastructure.

Open weights as a strategic move

DeepSeek V4 is also important because the model weights are available through Hugging Face under the MIT License. That gives developers more freedom to inspect, adapt and deploy the model than they would have with a fully closed commercial system.

Open-weight models are becoming a major pressure point in the AI race. Closed models may still lead in some areas, especially in polished consumer products, enterprise support and safety layers. However, open models offer something different: flexibility.

For universities, start-ups, smaller companies and developers outside the largest AI ecosystems, that flexibility matters. It means advanced AI can be tested, modified and integrated without relying entirely on a handful of dominant providers.

Benchmarks need caution

DeepSeek presents V4-Pro as highly competitive across reasoning, coding, long-context and agentic benchmarks. Hugging Face lists results including 80.6 on SWE-bench Verified, 90.1 on GPQA Diamond and 87.5 on MMLU-Pro for DeepSeek-V4-Pro.

Those numbers are impressive, but they should not be treated as the full story. Benchmarks are useful, but they rarely capture every real-world use case. A model can score well on coding tests and still struggle with reliability, factual accuracy, safety or complex multi-step workflows in production.

That caution is important. The AI industry often turns benchmarks into headlines, while real performance depends on deployment, prompting, safety controls and the specific task at hand.

More than just another model release

DeepSeek V4 matters because it combines several trends into one release: long context, lower prices, open weights, agentic workflows and geopolitical competition. It also shows that the AI race is no longer fought only in labs, benchmarks and data centres. Visibility now matters too. Tools such as Diplo’s Digital Footprints show how digital presence shapes the way technology actors and media narratives are discovered, ranked and understood. At this stage, the competition is not only about who has the smartest model. It is also about who can make intelligence cheaper, more available and easier to deploy.

That does not mean DeepSeek has solved every problem. Questions remain around independent benchmarking, safety, data governance, infrastructure and the broader political context of Chinese AI development. Still, the release does show where the market is heading.

The next phase of AI may not be defined solely by the most powerful model. It may be defined by the model that is powerful enough, affordable enough and open enough to change how people build products, services and tools with AI.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!

UK moves to strengthen sovereignty over critical AI infrastructure

Britain is moving to strengthen its position in the global AI race, with Technology Secretary Liz Kendall calling for greater national control over key parts of the AI stack. In a recent speech, she described artificial intelligence as an increasingly important source of economic strength, security, and geopolitical influence.

Concerns centre on the concentration of power in a small number of companies that control much of the world’s advanced AI computing capacity. The government’s strategy is intended to reduce reliance on external providers while building domestic capabilities across areas such as research, infrastructure, compute, and talent.

Plans include the development of a national AI hardware strategy to improve access to chips and other critical technologies. At the same time, Britain says it will focus on sectors where it believes it holds a competitive edge, while continuing to work with allies on standards, governance, and the international rules shaping AI development.

Officials have stressed that AI sovereignty does not mean technological isolation, but stronger strategic resilience and greater influence over how future systems are built and governed. In that context, support for domestic firms and institutions is being framed as essential if Britain is to remain a serious player in the emerging global AI order.

Why does it matter?

Control over AI infrastructure is quickly becoming a core element of national power, comparable to energy or defence capabilities.

The concentration of computing capacity and advanced chips among a few global players creates strategic vulnerabilities, exposing countries to external decisions that can affect economic stability, security and technological development.

Britain’s push for AI sovereignty reflects a broader global trend towards technological self-determination. Efforts to build domestic capacity and shape international standards could influence global AI governance and access to critical technologies, and reshape alliances in a more fragmented digital order.

UK embraces 6 frontier technologies to drive digital growth

The UK government has identified six frontier technologies as central to strengthening digital capability, economic growth, and long-term competitiveness.

Outlined in the 2025 Modern Industrial Strategy and Digital and Technologies Sector Plan, the approach prioritises AI, cybersecurity, advanced connectivity, engineering biology, quantum technologies, and semiconductors as pillars of national resilience and technological sovereignty.

Advanced connectivity and AI remain core drivers of digital transformation. Investment in next-generation telecoms, including 5G and future 6G development, is supported through public funding and infrastructure initiatives, while AI continues to expand rapidly through commitments to compute capacity, national supercomputing infrastructure, and workforce development. The strategy aims to strengthen the UK’s role as a leading European AI hub.

Cybersecurity, engineering biology, and quantum technologies reflect a broader strategy linking innovation with security, resilience, and sustainability. Government-backed programmes are intended to support commercialisation, strengthen secure-by-design systems, and accelerate growth in emerging areas such as bio-based manufacturing. Quantum technologies are also being positioned for longer-term use across sectors, including healthcare, defence, and finance.

Semiconductors complete the strategy as a foundational technology underpinning modern digital systems. Rather than focusing on large-scale manufacturing, the UK is prioritising areas such as design, photonics, compound semiconductors, and specialised materials, backed by targeted funding and institutional support.

Across all six areas, the strategy reflects a wider effort to align innovation policy with economic security, global competitiveness, and more resilient supply chains.

Romania initiates consortium selection for Black Sea AI gigafactory project

The Ministry of Energy of Romania and the Ministry of Finance of Romania have launched an expression of interest process to select a consortium leader for the Black Sea AI Gigafactory project. The announcement marks a new step in developing large-scale AI infrastructure.

According to the Ministry of Energy of Romania, the selected leader will be responsible for structuring, developing and implementing the project. The process aims to identify partners with strong financial capacity and relevant technical expertise.

The project is described as a strategic initiative to build an advanced AI computing infrastructure, supporting digital and industrial capabilities while strengthening integration within the European AI ecosystem.

The project will lead to the development of digital infrastructure, such as data centres, cloud facilities, semiconductor manufacturing campuses with high-availability/power utility systems, large-scale telecom facilities, or other comparable power- and cooling-intensive facilities integrating critical digital systems.

Authorities state that the initiative is intended to position the Black Sea region as a key location for next-generation AI infrastructure and to expand technological capacity in Romania.

EU approves Italian State aid to support graphene-based photonic chip development

The European Commission has approved a €211 million Italian State aid measure to support the development of photonic chips based on graphene technology.

The funding will be provided to the Italian SME CamGraPhIC, with project activities taking place in Pisa and Bergamo.

The initiative focuses on optical transceivers that transmit data using light rather than electrons. The use of graphene instead of silicon is expected to enhance performance and energy efficiency across sectors such as telecommunications, automotive, aerospace and defence.

The Commission assessed the measure under the EU State aid rules and concluded that the funding is necessary, proportionate and aligned with research and innovation objectives. It also found that the project would not proceed without public support, demonstrating an incentive effect.

The decision reflects broader EU efforts to strengthen semiconductor capabilities and support advanced digital technologies through targeted public investment and regulatory oversight.

South Korea-France partnership reshapes AI and technology cooperation strategy

The recent state visit between South Korea and France signals a deepening of bilateral cooperation that extends beyond diplomacy into long-term technological and cultural alignment.

Agreements endorsed by President Lee Jae-myung and President Emmanuel Macron reflect a coordinated effort to strengthen shared capabilities in emerging sectors, while reinforcing institutional ties across research, education, and industry.

A central policy dimension lies in the expansion of cooperation in AI, semiconductors, and quantum technologies, areas increasingly tied to economic security and global competitiveness.

Partnerships between institutions such as KAIST and CNRS highlight a shift towards structured research integration, enabling joint innovation and knowledge transfer.

Such collaboration between South Korea and France is positioned not as an isolated scientific exchange, but as part of broader strategies to secure technological sovereignty and resilient supply chains.

Cultural and educational initiatives complement these ambitions by supporting long-term people-to-people engagement and workforce development. Expanded exchanges in creative industries and language education aim to cultivate talent pipelines that can operate across both economies.

Rather than symbolic diplomacy, these measures serve as enabling mechanisms for sustained cooperation in high-value sectors where human capital remains critical.

From a policy perspective, the agreements illustrate how economies are increasingly forming strategic partnerships to navigate global technological competition.

Instead of relying solely on domestic capacity, coordinated international frameworks are being used to manage innovation risks, diversify supply dependencies, and strengthen regulatory alignment.

The outcome will depend on implementation, yet the direction suggests a model of cooperation that blends economic, technological, and societal priorities.

AI and 6G strategy drives South Korea’s digital transformation agenda

South Korea has outlined an ambitious national strategy to position itself among the world’s leading AI powers, linking technological advancement with broader economic and societal transformation.

Instead of isolated innovation efforts, the plan adopts a systemic approach, combining infrastructure development, data governance, and industrial policy to accelerate digital transition.

Central to South Korea’s strategy is the evolution of network infrastructure, with a shift from 5G to next-generation 6G technology targeted by 2030. The emphasis on connectivity and speed is complemented by efforts to strengthen cybersecurity frameworks and establish a national data integration platform.

Such measures aim to create a more resilient and competitive digital environment capable of supporting large-scale AI deployment.

The policy also prioritises the integration of AI across multiple sectors, including healthcare, manufacturing, agriculture, and disaster management.

By embedding intelligent systems into critical industries, South Korean authorities seek to enhance productivity, improve public service delivery, and strengthen national resilience.

Workforce development is positioned as a key pillar, with phased training initiatives designed to build expertise in advanced technologies such as semiconductors and quantum computing.

In parallel, the strategy incorporates digital inclusion measures to ensure broader societal participation. Expansion of AI learning centres and assistive technologies reflects an effort to reduce digital divides while supporting vulnerable groups.

Long-term success will depend on effective coordination across government bodies and on balancing rapid technological deployment with equitable access and robust governance frameworks, rather than on purely growth-driven objectives.

MIT uses AI to detect atomic material defects

Researchers at MIT have developed an AI model capable of identifying and quantifying atomic-scale defects in materials without damaging them. The approach aims to improve the design and performance of semiconductors, batteries, and solar cells.

The model analyses data from neutron-scattering experiments and can detect up to six different point defects simultaneously. Trained on 2,000 semiconductor materials, it analyses atomic vibrations to estimate defect types and concentrations that are hard for traditional methods to measure.

Conventional techniques such as X-ray diffraction or electron microscopy typically capture only limited aspects of material defects and often require destructive testing. The AI system uses pattern recognition to build a more complete picture, offering a non-invasive option for manufacturing quality control.

Researchers say the method could eventually be adapted to more widely used tools such as Raman spectroscopy, making industrial adoption more practical. Future work will also extend the model beyond point defects to larger structural features in materials.

HP reveals advanced AI devices and workflow tools at Imagine 2026

HP has announced a broad set of AI-focused products and workplace tools at HP Imagine 2026, presenting the update as part of a wider effort to simplify work across PCs, collaboration devices, security systems, and workflow platforms.

In a press release published on 24 March, HP said the new portfolio includes AI PCs, collaboration tools, workstations, printers, and software intended for hybrid work and on-device AI use.

HP says the update includes a new intelligence layer called HP IQ, which it describes as a system designed to orchestrate work across AI PCs, workplace devices, and meeting spaces through local AI and proximity-based connectivity.

The company also announced new EliteBook devices, workstation updates, and workflow automation changes through its Workforce Experience Platform and Build Workspace capabilities.

Several sections of the release focus on on-device AI. According to the company, HP IQ will debut on the next generation of EliteBook X G2 AI PCs and will support features such as prompt-based assistance, document analysis, note organisation, and meeting support.

The release also says NearSense is intended to help devices discover, connect, and collaborate, including through file sharing and one-click joining of conference room meetings.

Security is another central theme in the release. HP says it has introduced what it describes as the world’s first hardware solution to stop physical TPM bypass attacks, using a cryptographically bound link between the TPM and CPU.

The company also said it is expanding capabilities in HP Wolf Security and introducing HP Wolf Pro Security Next Gen Antivirus, as well as physical intrusion detection designed to protect memory if a device chassis is opened.

The announcement also includes new printers and document tools. HP says the LaserJet Pro 4000 and 4100 series, and the LaserJet Enterprise 5000 and 6000 series, are intended to support AI-powered document processing and quantum-resistant security. The release also highlights scanning shortcuts, editable OCR, reduced management time, and a design intended to improve serviceability.

For higher-performance users, the company says it is launching a new generation of Z workstations and mobile workstations. The release refers to systems such as the Z8 Fury, Max Side Panel for Z8 Fury and Z4 workstations, and updated mobile workstation models. Advanced AI development, visual effects, and simulation workloads are among the uses cited in the announcement.

Beyond enterprise work, the release also extends the same AI and device strategy into gaming. New HyperX and OMEN products are part of the announcement, including desktops, a gaming and modular ecosystem, and expanded AI game support through OMEN Gaming Hub and OMEN AI.

Microsoft and NVIDIA unveil AI tools for nuclear energy permitting and operations

Microsoft has announced an AI collaboration with NVIDIA to support nuclear energy projects across permitting, design, construction, and operations. In a post published on 24 March, the tech conglomerate said the initiative aims to provide end-to-end tools for the nuclear sector, focusing on streamlining permitting, accelerating design, and optimising operations.

Microsoft frames the effort within a broader energy challenge, arguing that rising power demand and long project timelines are creating pressure to accelerate the delivery of firm, carbon-free power. The company says customised engineering, fragmented data, and manual regulatory review slow nuclear projects. It presents AI as a way to make project development more repeatable, traceable, secure, and predictable.

The post says the collaboration spans the full lifecycle of a nuclear plant. Microsoft describes a model in which digital twins, high-fidelity simulations, and AI-assisted workflows support design and engineering, licensing and permitting, construction and delivery, and operations and maintenance.

According to the company, engineers would be able to reuse design patterns, model the impact of changes before construction begins, and link project decisions to supporting evidence and applicable rules. Microsoft also says generative AI can assist with drafting and gap analysis in permit documentation, while predictive modelling and operational digital twins can support anomaly detection and maintenance planning.

Microsoft says traceability and auditability are central to the approach. The company lists four intended qualities of the system: traceable records linking engineering decisions to evidence and regulations, audit-ready documentation, secure use within a governed environment, and predictable outcomes through simulations intended to identify delays before they occur in the real world.

Several case examples are included in the post. Microsoft says Aalo Atomics reduced the permitting process by 92% using its Generative AI for Permitting solution and estimates annual savings of USD 80 million.

Aalo Atomics Chief Technology Officer Yasir Arafat is quoted as saying: ‘Two things matter most: enterprise-scale complexity and mission-critical reliability. We’re deploying something complex at a scale only a company like Microsoft really understands. There’s no room for anything less than proven reliability.’

Microsoft also says Southern Nuclear has deployed Copilot agents across engineering and licensing workstreams to improve consistency, reuse knowledge faster, and support decision-making. Idaho National Laboratory is described as an early adopter in the US federal context, with Microsoft saying the lab is using AI capabilities to automate the assembly of engineering and safety analysis reports and to create standard methodologies for regulators to adopt the tools safely.

The post also expands beyond those three examples. Microsoft says Everstar, described as an NVIDIA Inception startup, is bringing domain-specific AI for nuclear to Azure to support project workflows and governed data pipelines.

Everstar Chief Executive Officer Kevin Kong is quoted as saying: ‘The nuclear industry has been bottlenecked by documentation burden and regulatory complexity for decades. This partnership means our customers get the secure, scalable cloud deployments they demand. It’s a significant step toward making nuclear power fast, safe, and unstoppable.’

Microsoft also says Atomic Canyon’s Neutron platform is available on the Microsoft Marketplace for nuclear developers via established procurement channels.

At the technical level, Microsoft says the collaboration brings together NVIDIA Omniverse, NVIDIA Earth-2, NVIDIA CUDA-X, NVIDIA AI Enterprise, PhysicsNeMo, Isaac Sim, and Metropolis with Microsoft Generative AI for Permitting Solution Accelerator and Microsoft Planetary Computer. The company presents the stack as a digital ecosystem for nuclear energy on Azure.

The official post is a corporate announcement rather than an independent assessment of the approach’s effectiveness. The published note outlines the company’s intended use cases, named partners, and customer examples, but it does not provide a third-party evaluation of the broader claims regarding delivery speed, regulatory confidence, or sector-wide impact.
