UK report quantifies rapid advances in frontier AI capabilities

For the first time, the UK has published a detailed, evidence-based assessment of frontier AI capabilities. The Frontier AI Trends Report draws on two years of structured testing across areas including cybersecurity, software engineering, chemistry, and biology.

The findings show rapid progress in technical performance. Success rates on apprentice-level cyber tasks rose from under 9% in 2023 to around 50% in 2025, while models also completed expert-level cyber challenges previously requiring a decade of experience.

Safeguards designed to limit misuse are also improving, according to the report. Red-team testing found that the time required to identify universal jailbreaks increased from minutes to several hours between model generations, representing an estimated forty-fold improvement in resistance.

The analysis highlights advances beyond cybersecurity. AI systems now complete hour-long software engineering tasks more than 40% of the time, while biology and chemistry models outperform PhD-level researchers in controlled knowledge tests and support non-experts in laboratory-style workflows.

While the report avoids policy recommendations, UK officials say it strengthens transparency around advanced AI systems. The government plans to continue investing in evaluation science through the AI Security Institute, supporting independent testing and international collaboration.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!

Strong AI memory demand boosts Micron outlook into 2026

Micron Technology reported record first-quarter revenue for fiscal 2026, supported by strong pricing, a favourable product mix and operating leverage. The company said tight supply conditions and robust AI-related demand are expected to continue into 2026.

The Boise-based chipmaker generated $13.64 billion in quarterly revenue, led by record sales across DRAM, NAND, high-bandwidth memory and data centres. Chief executive Sanjay Mehrotra said structural shifts are driving rising demand for advanced memory in AI workloads.

Margins expanded sharply, setting Micron apart from peers such as Broadcom and Oracle, which reported margin pressure in recent earnings. Chief financial officer Mark Murphy said gross margin is expected to rise further in the second quarter, supported by higher prices, lower costs and a favourable revenue mix.

Analysts highlighted improving fundamentals and longer-term visibility. Baird said DRAM and NAND pricing could rise sequentially as Micron finalises long-term supply agreements, while capital expenditure plans for fiscal 2026 were viewed as manageable and focused on expanding high-margin HBM capacity.

Retail sentiment also turned strongly positive following the earnings release, with Micron shares jumping around 8 per cent in after-hours trading. The stock is on track to finish the year as the best-performing semiconductor stock in the S&P 500, reinforcing confidence in its AI-driven growth trajectory.


Natural language meets robotics in MIT’s on-demand object creation system

MIT researchers have developed a speech-to-reality system that allows users to create physical objects by describing them aloud, combining generative AI with robotic assembly. The system can produce simple furniture and decorative items in minutes using modular components.

The workflow translates spoken instructions into a digital design using a large language model and 3D generative AI. The design is then broken into voxel-based parts and adapted to real-world fabrication constraints before being assembled by a robotic arm.
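The step from voxel-based parts to robotic assembly can be illustrated with a minimal sketch. It is not the MIT team's implementation. It assumes a single, simplified fabrication constraint (every voxel must sit on the ground or directly on top of another voxel), and the names `Voxel`, `is_supported` and `assembly_order` are hypothetical.

```python
# Hypothetical sketch: order voxel parts so a robotic arm can place them bottom-up.
# Assumption (not from the source): a voxel is buildable only if it rests on the
# ground (z == 0) or directly on top of an already placed voxel.

Voxel = tuple[int, int, int]  # (x, y, z) grid cell, where z is the height layer


def is_supported(voxel: Voxel, placed: set[Voxel]) -> bool:
    """Return True if the voxel sits on the ground or on a placed voxel."""
    x, y, z = voxel
    return z == 0 or (x, y, z - 1) in placed


def assembly_order(voxels: set[Voxel]) -> list[Voxel]:
    """Sort voxels layer by layer and reject designs with unsupported parts."""
    order: list[Voxel] = []
    placed: set[Voxel] = set()
    # Lower layers first; ties broken by x, then y, for a deterministic order.
    for voxel in sorted(voxels, key=lambda v: (v[2], v[0], v[1])):
        if not is_supported(voxel, placed):
            raise ValueError(f"voxel {voxel} has no support beneath it")
        placed.add(voxel)
        order.append(voxel)
    return order


if __name__ == "__main__":
    # A small L-shaped design: two ground voxels and one stacked voxel.
    design = {(0, 0, 0), (0, 0, 1), (1, 0, 0)}
    print(assembly_order(design))
```

A real system would need richer constraints, such as overhangs, connector strength and arm reach. The support rule here is only a stand-in for that adaptation step.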

Researchers have demonstrated the system by producing stools, shelves, chairs, tables and small sculptures. The approach aims to reduce manufacturing complexity by enabling rapid construction without specialised knowledge of 3D modelling or robotics.

Unlike traditional fabrication methods such as 3D printing, which can take hours or days, the modular assembly process operates quickly and allows objects to be disassembled and reused. The team is exploring stronger connection methods and extensions to larger-scale robotic systems.

The research was presented at the ACM Symposium on Computational Fabrication in November. The team said the work points toward more accessible, flexible and sustainable ways to produce physical objects using natural language and AI-driven design.


PwC automates AI governance with Agent Mode

Global professional services network PwC has expanded its Model Edge platform with the launch of Agent Mode, an AI assistant designed to automate governance, compliance and documentation across enterprise AI model lifecycles.

The capability targets the growing administrative burden faced by organisations as AI model portfolios scale and regulatory expectations intensify.

Agent Mode allows users to describe governance tasks in natural language, instead of manually navigating workflows.

The system executes actions directly within Model Edge, generates leadership-ready documentation and supports common document and reporting formats, significantly reducing routine compliance effort.

PwC estimates weekly time savings of between 20 and 50 percent for governance and model risk teams.

Behind the interface, a secure orchestration engine interprets user intent, verifies role-based permissions and selects appropriate large language models based on task complexity. The design ensures governance guardrails remain intact while enabling faster and more consistent oversight.
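The three orchestration steps described here (interpret intent, check permissions, pick a model) can be sketched in a few lines. This is an illustration only and does not reflect PwC's actual code or the Model Edge API. The role names, actions, model tiers and helper functions are all hypothetical, and keyword matching stands in for real intent interpretation.

```python
from dataclasses import dataclass

# Hypothetical role-to-permission table; not PwC's actual schema.
ROLE_PERMISSIONS = {
    "model_owner": {"generate_report", "update_documentation"},
    "validator": {"generate_report", "approve_model"},
    "viewer": {"generate_report"},
}

# Hypothetical model tiers keyed by estimated task complexity.
MODEL_TIERS = {"simple": "small-llm", "complex": "large-llm"}


@dataclass
class Plan:
    action: str  # governance action to execute
    model: str   # language model tier chosen for the task


def interpret_intent(request: str) -> str:
    """Toy keyword matcher standing in for real intent interpretation."""
    text = request.lower()
    if "approve" in text:
        return "approve_model"
    if "update" in text or "document" in text:
        return "update_documentation"
    return "generate_report"


def estimate_complexity(request: str) -> str:
    """Crude proxy: requests longer than 12 words count as complex."""
    return "complex" if len(request.split()) > 12 else "simple"


def orchestrate(request: str, role: str) -> Plan:
    """Interpret the request, enforce role permissions, then select a model."""
    action = interpret_intent(request)
    if action not in ROLE_PERMISSIONS.get(role, set()):
        raise PermissionError(f"role {role!r} may not perform {action!r}")
    return Plan(action=action, model=MODEL_TIERS[estimate_complexity(request)])


if __name__ == "__main__":
    print(orchestrate("Approve model X", "validator"))
```

In this sketch the permission check runs before any model is selected, so a request that fails the guardrail never reaches a language model.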

PwC positions Agent Mode as a step towards fully automated, agent-driven AI governance, enabling organisations to focus expert attention on risk assessment and regulatory judgement instead of process management as enterprise AI adoption accelerates.


The limits of raw computing power in AI

As the global race for AI accelerates, a growing number of experts are questioning whether simply adding more computing power still delivers meaningful results. In a recent blog post, digital policy expert Jovan Kurbalija argues that AI development is approaching a critical plateau, where massive investments in hardware produce only marginal gains in performance.

Despite the dominance of advanced GPUs and ever-larger data centres, improvements in accuracy and reasoning among leading models are slowing, exposing what he describes as an emerging ‘AI Pareto paradox’.

According to Kurbalija, the imbalance is striking: around 80% of AI investment is currently spent on computing infrastructure, yet it accounts for only a fraction of real-world impact. As hardware becomes cheaper and more widely available, he suggests it is no longer the decisive factor.

Instead, the next phase of AI progress will depend on how effectively organisations integrate human knowledge, skills, and processes into AI systems.

That shift places people, not machines, at the centre of AI transformation. Kurbalija highlights the limits of traditional training approaches and points to new models of learning that focus on hands-on development and deep understanding of data.

Building a simple AI tool may now take minutes, but turning it into a reliable, high-precision system requires sustained human effort, from refining data to rethinking internal workflows.

Looking ahead to 2026, the message is clear. Success in AI will not be defined by who owns the most powerful chips, but by who invests most wisely in people.

As Kurbalija concludes, organisations that treat AI as a skill to be cultivated, rather than a product to be purchased, are far more likely to see lasting benefits from the technology.


AI reshapes media in North Macedonia with new regulatory guidance

A new analysis examines the impact of AI on North Macedonia’s media sector, offering guidance on ethical standards, human rights, and regulatory approaches.

Prepared in both Macedonian and English, the study benchmarks the country’s practices against European frameworks and provides actionable recommendations for future regulation and self-regulation.

The research, supported by the EU and Council of Europe’s PRO-FREX initiative and carried out in collaboration with the Agency for Audio and Audiovisual Media Services (AVMU), was presented during Media Literacy Days 2025 in Skopje.

It highlights the relevance of EU and Council of Europe guidelines, including the Framework Convention on AI and Human Rights, and guidance on responsible AI in journalism.

AVMU’s involvement underlines its role in ensuring media freedom, fairness, and accountability amid rapid technological change. Participants highlighted the need for careful policymaking to manage AI’s impact, protecting media diversity, journalistic standards, and public trust online.

The analysis forms part of broader efforts under the Council of Europe and the EU’s Horizontal Facility for the Western Balkans and Türkiye, aiming to support North Macedonia in aligning media regulation with European standards while responsibly integrating AI technologies.


Joule Agent workshops help organisations build practical AI agent solutions

Artificial intelligence agents, autonomous systems that perform tasks or assist decision-making, are increasingly part of digital transformation discussions, but their value depends on solving actual business problems rather than adopting technology for its own sake.

SAP’s AppHaus Joule Agent Discovery and Design workshops provide a structured, human-centred approach to help organisations discover where agentic AI can deliver real impact and design agents that collaborate effectively with humans.

The Discovery workshop focuses on identifying challenges and inefficiencies where automation can add value, guiding participants to select high-priority use cases that suit agentic solutions.

The Design workshop then brings users and business experts together to define each AI agent’s role, responsibilities and required skills. By the end of these sessions, participants have detailed plans defining tasks, workflows and instructions that can be translated into actual AI agent implementations.

SAP also supports these formats with self-paced learning courses and toolkits to help anyone run the workshops confidently, emphasising practical human–AI partnerships rather than technology hype.


Gemini users can now build custom AI mini-apps with Opal

Google has expanded the availability of Opal, a no-code experimental tool from Google Labs, by integrating it directly into the Gemini web application.

This integration allows users to build AI-powered mini-apps, known as Gems, without writing any code, using natural language descriptions and a visual workflow editor inside Gemini’s interface.

Previously available only via separate Google Labs experiments, Opal now appears in the Gems manager section of the Gemini web app, where users can describe the functionality they want and have Gemini generate a customised mini-app.

These mini-apps can be reused for specific tasks and workflows and saved as part of a user’s Gem collection.

The no-code ‘vibe-coding’ approach aims to democratise AI development by enabling creators, developers and non-technical users alike to build applications that automate or augment tasks, all through intuitive language prompts and visual building blocks.


Instacart faces FTC scrutiny over AI pricing tool

US regulators are examining Instacart’s use of AI in grocery pricing, after reports that shoppers were shown different prices for identical items. Sources told Reuters the Federal Trade Commission has opened a probe into the company’s AI-driven pricing practices.

The FTC has issued a civil investigative demand seeking information about Instacart’s Eversight tool, which allows retailers to test different prices using AI. The agency said it does not comment on ongoing investigations, but expressed concern over reports of alleged pricing behaviour.

Scrutiny follows a study of 437 shoppers across four US cities, which found average price differences of 7 percent for the same grocery lists at the same stores. Some shoppers reportedly paid up to 23 percent more than others for identical items, according to the researchers.

Instacart said the pricing experiments were randomised and not based on personal data or individual behaviour. The company maintains that retailers, not Instacart, set prices on the platform, with the exception of Target, where prices are sourced externally and adjusted to cover costs.

The investigation comes amid wider regulatory focus on technology-driven pricing as living costs remain politically sensitive in the United States. Lawmakers have urged greater transparency, while the FTC continues broader inquiries into AI tools used to analyse consumer data and set prices.


ChatGPT expands with a new app directory from OpenAI

OpenAI has opened submissions for third-party apps inside ChatGPT, allowing developers to publish tools that extend conversations with real-world actions. Approved apps will appear in a new in-product directory, enabling users to move directly from discussion to execution.

The initiative builds on OpenAI’s earlier DevDay announcement, where it outlined how apps could add specialised context to conversations. Developers can now submit apps for review, provided they meet the company’s requirements on safety, privacy, and user experience.

ChatGPT apps are designed to support practical workflows such as ordering groceries, creating slide decks, or searching for apartments. Apps can be activated during conversations via the tools menu, by mentioning them directly, or through automated recommendations based on context and usage signals.

To support adoption, OpenAI has released developer resources including best-practice guides, open-source example apps, and a chat-native UI library. An Apps SDK, currently in beta, allows developers to build experiences that integrate directly into conversational flows.

During the initial rollout, monetisation is limited to external links directing users to developers’ own platforms. OpenAI said it plans to explore additional revenue models over time as the app ecosystem matures.
