Why DeepSeek V4 is changing the AI model race

DeepSeek has again placed itself at the centre of the global AI race. After drawing worldwide attention with its R1 reasoning model in early 2025, the Chinese company has recently released DeepSeek V4, a new model designed to compete not only on performance, but also on price, openness and efficiency.

The hype around DeepSeek V4 is not based on a single feature. The model comes with a 1 million-token context window, open weights, two versions for different use cases and a strong focus on agentic workflows such as coding, research, document analysis and long-running tasks. In a market still dominated by expensive closed models, DeepSeek is trying to prove that powerful AI does not need to remain locked behind proprietary systems.

A model built for long memory

The most immediate difference between DeepSeek V4 and other models is context length. Both DeepSeek-V4-Pro and DeepSeek-V4-Flash support a 1-million-token context window, meaning they can process inputs far longer than those of older generations of mainstream models. According to DeepSeek’s official release, one million tokens is now the default across all official DeepSeek services.

For ordinary users, that may sound technical. In practice, it matters because a longer context allows models to work with large documents, long conversations, full codebases, legal materials, research archives or complex project histories without losing track as quickly.

That is why DeepSeek V4 is not just another chatbot release. It is aimed at the next stage of AI use, where models are expected to act less like question-answering tools and more like assistants that can follow long processes over time.

Two models for two different needs

DeepSeek V4 comes in two main versions. DeepSeek-V4-Pro is a larger and more capable model, with 1.6 trillion total parameters and 49 billion active parameters. DeepSeek-V4-Flash is a smaller model, with 284 billion total parameters and 13 billion active parameters, designed for faster and more cost-effective workloads.

That distinction is important. Not every user needs the strongest model for every task. A company summarising documents, routing queries or running basic support may choose Flash. A developer working on complex coding tasks, long-context agents or advanced reasoning may prefer Pro.

DeepSeek’s release reflects a broader trend in AI. The best model is no longer always the biggest one. Cost, speed, context size and deployment flexibility are now as important as raw benchmark performance.

Why the price matters

One reason DeepSeek attracts so much attention is its aggressive pricing. DeepSeek’s API page lists V4-Flash at USD 0.14 per 1 million input tokens on a cache miss and USD 0.28 per 1 million output tokens. V4-Pro is listed at USD 1.74 per 1 million input tokens and USD 3.48 per 1 million output tokens before the temporary 75% discount.
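
As a rough illustration of what those list prices mean in practice, the sketch below estimates the cost of a single request. The prices are the ones quoted above; the helper name, the dictionary layout and the example token counts are hypothetical, and the calculation ignores cache-hit pricing and the temporary discount.

```python
# List prices quoted above, in USD per 1 million tokens
# (input price on a cache miss, before any discount).
PRICES_USD_PER_MILLION = {
    "v4-flash": {"input": 0.14, "output": 0.28},
    "v4-pro": {"input": 1.74, "output": 3.48},
}


def estimate_cost_usd(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the list-price cost of one request (hypothetical helper)."""
    price = PRICES_USD_PER_MILLION[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


if __name__ == "__main__":
    # Hypothetical workload: a full 1-million-token prompt and a 10,000-token answer.
    for model in PRICES_USD_PER_MILLION:
        cost = estimate_cost_usd(model, 1_000_000, 10_000)
        print(f"{model}: about USD {cost:.4f}")
```

On these assumptions, filling the whole context window once would cost roughly USD 0.14 on Flash and USD 1.77 on Pro at list price.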

For developers and companies, that changes the calculation. High-performing AI models are useful only if they can be deployed at scale. If every long document, coding session or agentic workflow becomes too expensive, adoption slows down.

DeepSeek’s challenge to the market is therefore not only technical. It is economic. The company is pushing the idea that frontier-level AI should be cheaper to run, easier to access and less dependent on closed ecosystems.

The architecture behind the hype

DeepSeek V4 uses a mixture-of-experts approach, meaning only part of the model is active during each response. That helps explain why the model can be very large on paper, yet still more efficient to run than a dense model of similar overall size.
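
The parameter counts quoted earlier make the point concrete. The sketch below uses only those figures to work out what share of each model is active at a time; the names are illustrative and nothing here reflects DeepSeek's actual code.

```python
# Parameter counts quoted in this article, in billions.
MODELS = {
    "DeepSeek-V4-Pro": {"total": 1600, "active": 49},
    "DeepSeek-V4-Flash": {"total": 284, "active": 13},
}


def active_share(total_billion: float, active_billion: float) -> float:
    """Return the share of parameters that are active, as a percentage."""
    return 100 * active_billion / total_billion


if __name__ == "__main__":
    for name, params in MODELS.items():
        share = active_share(params["total"], params["active"])
        print(f"{name}: {share:.1f}% of parameters active")
```

By this measure, roughly 3% of Pro's parameters and under 5% of Flash's are in use at any one step, which is why total size alone says little about running cost.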

The more interesting part is how DeepSeek handles long context. NVIDIA’s technical overview explains that DeepSeek V4 uses hybrid attention, combining compression and selective attention techniques to reduce the cost of processing very long prompts. NVIDIA says these changes are designed to cut per-token inference FLOPs by 73% and reduce KV cache memory burden by 90% compared with DeepSeek-V3.2.
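
To see what those percentages imply, the sketch below applies them to a baseline. The 73% and 90% figures are the ones NVIDIA reports; the baseline values are made-up placeholders chosen for readability, not measurements of DeepSeek-V3.2.

```python
def remaining_after_cut(baseline: float, cut_percent: float) -> float:
    """Return what is left of a baseline after a percentage reduction."""
    return baseline * (1 - cut_percent / 100)


if __name__ == "__main__":
    # Hypothetical baselines, used only to make the arithmetic easy to read.
    baseline_compute = 100.0      # arbitrary units of per-token compute
    baseline_kv_cache_gb = 100.0  # placeholder KV cache size in GB

    compute_left = remaining_after_cut(baseline_compute, 73)
    kv_left = remaining_after_cut(baseline_kv_cache_gb, 90)
    print(f"Per-token compute left: {compute_left:.0f} units")
    print(f"KV cache left: {kv_left:.0f} GB")
```

In other words, a 90% cut leaves about one tenth of the memory footprint, so the same hardware could in principle hold around ten times as much cached context.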

For a non-technical audience, the point is simple. DeepSeek V4 is trying to solve one of the biggest problems in modern AI: how to make models remember and process much more information without becoming too slow or too expensive.

That is where much of the hype comes from. The model is not merely larger. It is designed around the economics of long-context AI.

Why NVIDIA is still in the picture

NVIDIA’s role in the DeepSeek V4 story is especially interesting. DeepSeek is often discussed as part of China’s effort to build a more independent AI ecosystem, but NVIDIA has also moved quickly to support developers who want to build with the model.

In its technical blog, NVIDIA describes DeepSeek V4 as a model family designed for efficient inference of million-token contexts. The company says DeepSeek-V4-Pro and V4-Flash are available through NVIDIA GPU-accelerated endpoints, while developers can also use NVIDIA Blackwell, NIM containers, SGLang and vLLM deployment options.

NVIDIA also reports that early tests of DeepSeek-V4-Pro on the GB200 NVL72 platform showed more than 150 tokens per second per user. That matters because long-context models place heavy pressure on memory, as well as on compute and networking infrastructure. The model may be efficient by design, but serving it at scale still requires serious hardware.

So, DeepSeek V4 does not remove NVIDIA from the story – it complicates it. The model is part of a broader push towards more efficient AI, but the infrastructure race remains central.

The chip question behind the model

DeepSeek V4 also arrives at a time when AI infrastructure is becoming just as important as model performance. MIT Technology Review frames the release partly through that lens, noting that DeepSeek’s new model reflects China’s broader attempt to reduce reliance on foreign AI hardware and build a more self-sufficient technology stack.

That detail matters because the AI race is no longer only about who builds the most capable model. It is also about who controls the chips, software frameworks and data centres needed to run it.

Replacing NVIDIA, however, remains difficult. Its advantage lies not just in its chips, but also in the software ecosystem developers have built around its platforms over many years. Moving to alternative hardware means adapting code, rebuilding tools and proving that the new systems are stable enough for serious use.

DeepSeek V4 therefore sits between two realities. It points towards China’s ambition to build a more independent AI stack, while NVIDIA’s rapid support for the model shows that frontier AI still depends heavily on established infrastructure.

Open weights as a strategic move

DeepSeek V4 is also important because the model weights are available through Hugging Face under the MIT License. That gives developers more freedom to inspect, adapt and deploy the model than they would have with a fully closed commercial system.

Open-weight models are becoming a major pressure point in the AI race. Closed models may still lead in some areas, especially in polished consumer products, enterprise support and safety layers. However, open models offer something different: flexibility.

For universities, start-ups, smaller companies and developers outside the largest AI ecosystems, that flexibility matters. It means advanced AI can be tested, modified and integrated without relying entirely on a handful of dominant providers.

Benchmarks need caution

DeepSeek presents V4-Pro as highly competitive across reasoning, coding, long-context and agentic benchmarks. Hugging Face lists results including 80.6 on SWE-bench Verified, 90.1 on GPQA Diamond and 87.5 on MMLU-Pro for DeepSeek-V4-Pro.

Those numbers are impressive, but they should not be treated as the full story. Benchmarks are useful, but they rarely capture every real-world use case. A model can score well on coding tests and still struggle with reliability, factual accuracy, safety or complex multi-step workflows in production.

That caution is important. The AI industry often turns benchmarks into headlines, while real performance depends on deployment, prompting, safety controls and the specific task at hand.

More than just another model release

DeepSeek V4 matters because it combines several trends into one release: long context, lower prices, open weights, agentic workflows and geopolitical competition. It also shows that the AI race is no longer fought only in labs, benchmarks and data centres. Visibility now matters too. Tools such as Diplo’s Digital Footprints show how digital presence shapes the way technology actors and media narratives are discovered, ranked and understood. At this stage, the competition is not only about who has the smartest model. It is also about who can make intelligence cheaper, more available and easier to deploy.

That does not mean DeepSeek has solved every problem. Questions remain around independent benchmarking, safety, data governance, infrastructure and the broader political context of Chinese AI development. Still, the release does show where the market is heading.

The next phase of AI may not be defined solely by the most powerful model. It may be defined by the model that is powerful enough, affordable enough and open enough to change how people build products, services and tools with AI.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!

US Department of Labor launches AI training portal for apprenticeship programmes

The US Department of Labor has launched an AI in Registered Apprenticeship Innovation Portal to support organisations integrating AI training into federally recognised apprenticeship programmes.

The Department said the platform brings together resources to support AI literacy and structured AI-focused training pathways across sectors.

The portal is organised around three main areas: AI skills integration in apprenticeships, industry-specific training modules, and pathways for embedding AI into both new and existing programmes.

The Department said training content spans sectors including healthcare, finance, education, construction, advanced manufacturing and technology.

Alongside the portal, the Department has introduced an AI Literacy Framework to guide employers, educators and training providers. According to the Department, the framework outlines core competencies, including understanding AI capabilities and limits, using tools in daily tasks, and assessing output accuracy.

A separate initiative, the Make America AI-Ready programme, delivers a free text-message-based AI course aimed at workers without reliable internet access.

Officials said organisations can join existing apprenticeships, create new AI-focused schemes, or update current programmes to include AI skills. The project aligns with wider federal strategies to accelerate AI education and workforce readiness across the United States.

Why does it matter? 

The initiative signals a structural shift in how governments are preparing the workforce for AI integration, embedding practical skills into formal apprenticeship systems rather than treating them as optional add-ons.

It also broadens access to AI literacy by targeting both high-growth industries and digitally excluded workers, helping reduce future gaps in productivity and employability.

Digital Dubai rolls out AI workforce programme across public sector

Digital Dubai has launched the AI Workforce Transformation Programme to train 50,000 government employees in AI skills. The initiative is being delivered with the Dubai Government Human Resources Department and the Dubai Centre for Artificial Intelligence.

The programme aims to equip staff with practical knowledge to apply AI in public services and internal processes. It includes tailored training tracks based on job roles, from leadership to general employees.

Officials say the initiative will improve productivity, support innovation and enable more efficient service delivery. It also forms part of wider efforts to strengthen AI adoption across government operations.

The programme is designed to build long-term institutional capabilities and support a technology-driven government model.

Singapore urges organisations to strengthen AI governance frameworks

GovTech Singapore has argued that stronger AI governance in workplaces is essential for trust, compliance, risk management, and responsible innovation as AI adoption expands across business operations.

The agency leading Singapore’s Smart Nation and digital government efforts defines AI governance as a framework of policies, processes, and responsibilities guiding the ethical, transparent, and accountable development and deployment of AI systems within an organisation. The framework is linked to oversight across the AI lifecycle, from design through to ongoing monitoring.

Key elements identified by GovTech Singapore include transparency and explainability, fairness and bias mitigation, accountability and human oversight, and data privacy and security. Responsible AI is also linked to Singapore’s wider Smart Nation agenda, which the agency describes as a national priority.

The guidance recommends that organisations establish clear internal policies on AI use, build AI literacy across teams, carry out regular audits and assessments, and prioritise secure development practices. It also points to Singapore’s Model AI Governance Framework for Generative AI, developed by the AI Verify Foundation and the Infocomm Media Development Authority, as a reference point for businesses adapting governance frameworks to their own needs.

As part of its effort to support responsible AI use in the public sector, GovTech Singapore also highlights its AI Guardian suite. The suite includes Litmus, a testing platform using adversarial prompts to identify risks and vulnerabilities, and Sentinel, a guardrails service designed to detect and mitigate unsafe or irrelevant content before it affects AI models or users.

Overall, GovTech Singapore presents AI governance not only as a compliance issue, but as part of building a trusted digital environment in which AI can be deployed safely and effectively.

Kazakhstan advances digital economy with AI business assistant

Kazakhstan has introduced an AI-powered assistant designed to simplify the process of starting a business, according to Zhaslan Madiyev. Developed in cooperation with the Ministry of Finance, the platform aims to provide data-driven guidance to early-stage entrepreneurs.

Built around a digital mapping system, the assistant evaluates factors such as nearby businesses, customer flow, and competition. Its recommendations aim to help users choose more viable locations and avoid oversaturated sectors, thereby reducing the risk of duplicating businesses in the same area.

Officials say the tool could reduce startup operating costs by up to half while improving long-term business sustainability. Alongside it, a second AI assistant already provides continuous guidance on tax reporting and regulatory compliance, translating complex requirements into clearer, more practical steps for users. According to Kazakhstani reporting, the tax assistant has already processed more than 5,000 requests.

The development forms part of Kazakhstan’s wider digital transformation agenda, which aims to modernise public services and strengthen the country’s digital economy through practical AI deployment. The government says more than 50 AI-powered services are now being developed to support citizens and businesses.

Why does it matter?

Kazakhstan’s AI assistant points to a shift from basic digital services towards more active, real-time decision support for entrepreneurs. Data-driven recommendations can help reduce startup risks, limit market oversaturation, and support more efficient resource allocation across local economies.

Simplified tax and compliance guidance also targets one of the main barriers facing early-stage businesses: administrative complexity. Placed within Kazakhstan’s broader AI-first digital strategy, the initiative signals a wider move towards a more competitive and operationally AI-driven digital economy.

Malaysia expands national AI strategy through Microsoft partnership

Malaysia is strengthening its national AI strategy through an expanded partnership with Microsoft, launching the Microsoft Elevate initiative to accelerate AI readiness across society.

The programme aligns with the country’s AI Nation 2030 ambitions and extends digital skills development beyond traditional sectors.

The initiative targets educators, public sector institutions, small businesses and wider communities, aiming to embed practical AI capabilities into everyday economic and social activity.

Early deployment has already reached tens of thousands of learners, reflecting a shift from pilot programmes to large-scale national implementation.

Government and industry leaders in Malaysia emphasise that long-term competitiveness depends not only on technological investment but on widespread adoption and understanding of AI tools.

The programme therefore prioritises workforce activation, institutional capacity and sustainable integration across sectors.

Malaysia’s approach reflects a broader global trend in which public–private partnerships are increasingly central to AI development, focusing on inclusive access, responsible use and real-world application rather than purely technological advancement.

Quantum computing gains stability boost from NVIDIA error correction model

NVIDIA has strengthened its position in the emerging quantum computing sector through a new family of AI models designed to improve calibration and error correction in quantum systems. Rather than building its own quantum processing hardware, the company continues to focus on hybrid computing architectures that combine classical GPUs with quantum processors.

The new system reportedly makes quantum error correction decoding up to 2.5 times faster and three times more accurate, addressing one of the most persistent barriers to scalable quantum computing. High error rates have long limited the practical deployment of quantum systems, making stability and fast correction central challenges for the industry.

NVIDIA has also expanded tools such as NVQLink and CUDA-Q, which allow quantum systems to integrate more directly with its existing GPU infrastructure. Together, these tools support workloads that can be distributed across classical and quantum environments, reinforcing NVIDIA’s role as a foundational infrastructure provider rather than a direct builder of quantum hardware.

The strategy positions NVIDIA to benefit regardless of how quantum computing develops. Whether hybrid systems become the dominant model or classical GPUs remain the primary computational layer for quantum processors, NVIDIA aims to remain embedded in the infrastructure stack that supports future quantum workloads.

Canada launches major youth skills funding for digital economy transition

The Government of Canada has announced a C$23.8 million funding initiative to strengthen youth skills for the evolving digital economy through the Digital Skills for Youth programme.

The announcement, led by Mélanie Joly, Minister of Industry and Minister responsible for Canada Economic Development for Quebec Regions, forms part of a broader effort to prepare younger generations for technological change across the labour market.

The initiative will support training and work experience opportunities for post-secondary graduates, with a focus on emerging fields such as AI, cybersecurity, big data, and automation. By connecting young people with employers, the programme aims to narrow the gap between education and the practical digital skills needed in modern industries.

Funding will be distributed over two years and is open to a wide range of organisations, including for-profit and not-for-profit organisations, public institutions, Indigenous organisations, and provincial or territorial bodies. The programme also includes a flexibility measure for participants in Yukon, the Northwest Territories, and Nunavut, where post-secondary education is not required.

The initiative builds on earlier rounds of the programme, which have already supported 6,900 youth internships across Canada since 2018.

Authorities say digital transformation is reshaping employment structures, making targeted skills development increasingly important. In that sense, the initiative is aimed not only at improving employability but also at helping prevent wider inequalities in access to technology-driven opportunities.

GPT-5.5 pushes AI deeper into agentic work

OpenAI has released GPT-5.5 as its latest push towards more capable agentic AI, presenting the model as better suited to complex, multi-step digital work across coding, research, analysis, and enterprise tasks.

The company frames it as a system designed to carry more of the work itself, moving beyond isolated prompt-response interactions towards fuller execution across digital workflows.

According to OpenAI, the model’s biggest gains are in software engineering, tool use, and knowledge work. GPT-5.5 improves performance on coding and workflow benchmarks, strengthens long-horizon reasoning, and handles complex digital tasks with greater efficiency while maintaining earlier latency standards.

OpenAI also says the model performs better across documents, spreadsheets, presentations, and data analysis, reflecting a broader effort to make AI more useful across full professional workflows rather than only as an assistant for isolated tasks.

The release also highlights stronger performance in scientific and technical research, alongside expanded safety testing and tighter safeguards for higher-risk capabilities.

The wider significance of GPT-5.5 lies in what it shows about the next phase of AI competition. The focus is shifting from better answers to more reliable execution across real-world digital work, with growing implications for productivity, oversight, and governance.

Why does it matter? 

GPT-5.5 signals a shift from AI as a passive tool to AI as an active digital operator that can complete full workflows across coding, research, and business systems with minimal human supervision.

Over time, such capability could reshape productivity, speed up development cycles, and shift competitive advantage toward those best integrating autonomous AI while managing safety and governance risks.

Australia targets three million learners under AI workforce strategy

Three million people in Australia will be trained in workforce-ready AI skills under Microsoft’s largest AI skilling commitment, set to run through the end of 2028.

The initiative is delivered in partnership with government, industry, education providers and community organisations. It aligns with Australia’s National AI Plan to strengthen national capability and ensure the responsible adoption of emerging technologies.

The programme builds on earlier skilling targets that exceeded expectations, including milestones of one million and 300,000 learners achieved ahead of schedule.

It is supported by Microsoft’s broader A$25 billion (USD 18 billion) investment in digital infrastructure, cybersecurity and workforce development, strengthening long-term national AI capability.

Training will focus on three core areas:

  • Future workforce development through education systems;
  • Upskilling of the current workforce;
  • Expanded access for community groups.

Partnerships with institutions such as TAFE NSW, universities, employers and trade organisations are designed to scale practical AI learning, while also addressing productivity pressures and evolving labour market demands.

Community-focused initiatives aim to reduce digital inequality and broaden access to AI skills, particularly among underrepresented groups. Programmes supporting Indigenous-led organisations and social impact groups aim to widen participation in the digital economy and promote inclusive, responsible AI adoption. 

Why does it matter?

The initiative reflects a broader shift towards system-wide AI capability building across education, industry and communities.

Expanding AI skills is intended to support productivity, reduce workforce fragmentation and ensure more balanced access to emerging technologies. It also addresses risks of uneven adoption and widening digital inequality as AI becomes central to economic development.
