New AI agent boosts game testing

Researchers from Zhejiang University and NetEase Fuxi AI Lab have developed Titan, an AI-powered agent for testing massively multiplayer online role-playing games (MMORPGs). Using large-language-model reasoning, Titan navigates game worlds, completes tasks efficiently, and identifies issues along the way.

In trials across two commercial games, Titan achieved a 95% task completion rate and uncovered four previously undetected bugs. Outperforming human testers in speed and coverage, the AI agent offers a faster, more thorough approach to quality assurance in game development.

Titan mimics expert testers by perceiving game states, selecting actions, and diagnosing problems. Working from simplified text and screenshots, it reasons through objectives, streamlining a traditionally slow process that can cost millions in labour.
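The perceive, act, and diagnose cycle described above can be illustrated with a toy loop. This is a minimal sketch under stated assumptions, not Titan's actual implementation: the one-dimensional "game", the `GameState` class, and every function name here are hypothetical, and a simple rule stands in for the language-model reasoning step.

```python
from dataclasses import dataclass


@dataclass
class GameState:
    """Hypothetical simplified game state (a real agent would also use screenshots)."""
    position: int
    goal: int
    stuck_steps: int = 0


def perceive(state: GameState) -> str:
    """Turn the raw state into a short text description for the reasoning step."""
    return f"player at {state.position}, objective at {state.goal}"


def select_action(state: GameState) -> str:
    """Stand-in for LLM reasoning: simply step toward the objective."""
    if state.position < state.goal:
        return "move_right"
    if state.position > state.goal:
        return "move_left"
    return "done"


def apply_action(state: GameState, action: str, walls: set[int]) -> GameState:
    """Toy game engine: a wall blocks movement, simulating a stuck-player bug."""
    step = {"move_right": 1, "move_left": -1}.get(action, 0)
    target = state.position + step
    if target in walls:
        return GameState(state.position, state.goal, state.stuck_steps + 1)
    return GameState(target, state.goal, 0)


def run_task(start: int, goal: int, walls: set[int], max_steps: int = 20) -> dict:
    """Run one task, returning whether it completed and any diagnosed issue."""
    state = GameState(start, goal)
    log = []
    for _ in range(max_steps):
        log.append(perceive(state))
        action = select_action(state)
        if action == "done":
            return {"completed": True, "issue": None, "log": log}
        state = apply_action(state, action, walls)
        # Diagnose: repeated failed moves suggest the player is stuck.
        if state.stuck_steps >= 3:
            return {"completed": False,
                    "issue": f"player stuck at {state.position}",
                    "log": log}
    return {"completed": False, "issue": "step budget exhausted", "log": log}
```

Calling `run_task(0, 3, set())` completes the task, while `run_task(0, 3, {2})` stops early and reports the blocked position as a suspected issue, which is the kind of finding a tester would then investigate.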

Already integrated into QA pipelines, Titan signals a shift toward AI-driven game testing. As studios increasingly adopt AI tools, such agents could redefine efficiency across PC and mobile game development.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot

Doctors and nurses outperform AI in patient triage

Human staff are more accurate than AI in assessing patient urgency in emergency departments, according to research presented at the European Emergency Medicine Congress in Barcelona.

The study, led by Dr Renata Jukneviciene of Vilnius University, tested ChatGPT 3.5 against clinicians and nurses using real case studies.

Doctors achieved an overall accuracy of 70.6% and nurses 65.5%, compared with 50.4% for AI. Doctors also outperformed AI in surgical and therapeutic cases, while nurses were more reliable than AI overall.

AI did show strength in recognising the most critical cases, surpassing nurses in both accuracy and specificity. Researchers suggested that AI may help prioritise life-threatening situations and support less experienced staff instead of acting as a replacement.

However, over-triaging by AI could lead to inefficiencies, making human oversight essential.
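For readers unfamiliar with the metrics quoted above, the sketch below shows how accuracy and specificity are conventionally computed from a confusion matrix for a single urgency category. The counts in the usage note are invented purely for illustration and are not taken from the study; the function names are likewise assumptions.

```python
def accuracy(tp: int, tn: int, fp: int, fn: int) -> float:
    """Share of all cases that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


def specificity(tn: int, fp: int) -> float:
    """Share of non-critical cases correctly kept out of the critical category."""
    return tn / (tn + fp)


def sensitivity(tp: int, fn: int) -> float:
    """Share of truly critical cases that were flagged as critical."""
    return tp / (tp + fn)


# Illustrative counts only (hypothetical, not from the study):
# 8 critical cases flagged, 2 missed, 10 non-critical cases over-triaged,
# 80 non-critical cases correctly left out of the critical category.
example = {"tp": 8, "tn": 80, "fp": 10, "fn": 2}
```

With these made-up counts, `accuracy(**example)` gives 0.88 and `sensitivity(8, 2)` gives 0.8. The false positives (`fp`) correspond to over-triage: cases rated more urgent than they are, which lowers specificity and consumes staff time.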

Future studies will explore newer AI models, ECG interpretation, and integration into nurse training, particularly in mass-casualty scenarios.

Commenting on the findings, Dr Barbra Backus from Amsterdam said AI has value in certain areas, such as interpreting scans, but it cannot yet replace trained staff for triage decisions.

Claude Sonnet 4.5 expands developer options with rollbacks and longer-running agents

Anthropic has released Claude Sonnet 4.5, featuring a suite of new upgrades designed to enhance coding, automation, and creativity. The update enhances Claude Code, extends Computer Use, and introduces experimental tools to boost productivity and facilitate real-world applications.

Claude Code now features checkpoints, allowing developers to roll back projects to earlier versions. The Claude API has also been expanded, supporting longer-running agents to generate files such as slides, spreadsheets, and documents directly within chats.

The model’s Computer Use function has been strengthened, enabling agents to operate applications autonomously for up to 30 hours. Anthropic says Claude Sonnet 4.5 built a Slack-style app with 11,000 lines of code in one session.

A new feature, Imagine with Claude, focuses on generating creative software. The system produced a Shakespeare-themed desktop with customised scripts and performance schedules from a single prompt, highlighting its versatility.

Anthropic has maintained steady pricing for free and premium users, positioning Sonnet 4.5 as its most practical and feature-rich release yet, combining reliability with expanded creative and developer-friendly tools.

NSW expands secure AI platform NSWEduChat across schools

Following successful school trials, the New South Wales Department of Education has confirmed the broader rollout of its in-house generative AI platform, NSWEduChat.

The tool, developed within the department’s Sydney-based cloud environment, prioritises privacy, security, and equity while tailoring content to the state’s educational context. It is aligned with the NSW AI Assessment Framework.

The trial began in 16 schools in Term 1, 2024, and then expanded to 50 schools in Term 2. Teachers reported efficiency gains, and students showed strong engagement. Access was extended to all staff in Term 4, 2024, with Years 5–12 students due to follow in Term 4, 2025.

Key features include a privacy-first design, built-in safeguards, and a student mode that encourages critical thinking by offering guided prompts rather than direct answers. Staff can switch between staff and student modes for lesson planning and preparation.

All data is stored in Australia under departmental control. NSWEduChat is free and billed as the most cost-effective AI tool for schools. Other systems are accessible but not endorsed; staff must follow safety rules, while students are limited to approved tools.

AI agents complete first secure transaction with Mastercard and PayOS

PayOS and Mastercard have completed the first live agentic payment using a Mastercard Agentic Token, marking a pivotal step for AI-driven commerce. The demonstration, powered by Mastercard Agent Pay, extends the tokenisation infrastructure that already underpins mobile payments and card storage.

The system enables AI agents to initiate payments while enforcing consent, authentication, and fraud checks, thereby forming what Mastercard refers to as the trust layer. It shows how card networks are preparing for agentic transactions to become central to digital commerce.

Mastercard’s Chief Digital Officer, Pablo Fourez, stated that the company is developing a secure and interoperable ecosystem for AI-driven payments, underpinned by tokenised credentials. The framework aims to prepare for a future where the internet itself supports native agentic commerce.

For PayOS, the milestone represents a shift from testing to commercialisation. Chief executive Johnathan McGowan said the company is now onboarding customers and offering tools for fraud prevention, payments risk management, and improved user experiences.

The achievement signals a broader transition as agentic AI moves from pilot to real-world deployment. If security models remain effective, agentic payments could soon differentiate platforms, merchants, and issuers, embedding autonomy into digital transactions.

AI-powered Opera Neon browser launches with premium subscription

After its announcement in May, Opera has started rolling out Neon, its first AI-powered browser. Unlike traditional browsers, Neon is designed for professionals who want AI to simplify complex online workflows.

The browser introduces Tasks, which act like self-contained workspaces. AI can understand context, compare sources, and operate across multiple tabs simultaneously to manage projects more efficiently.

Neon also features Cards, reusable AI prompts that users can customise or download from a community store, streamlining repeated actions and tasks.

Its standout tool, Neon Do, performs real-time on-screen actions such as opening tabs, filling forms, and gathering data, while keeping everything local. Opera says no data is shared, and all information is deleted after 30 days.

Neon is available by subscription at $19.90 per month. Invitations are limited during rollout, but Opera promises broader availability soon.

Internal chatbot Veritas helps Apple refine Siri features ahead of launch

Apple is internally testing its upcoming Siri upgrade with a chatbot-style tool called Veritas, according to a report by Bloomberg. The app enables employees to experiment with new capabilities and provide structured feedback before a public launch.

Veritas enables testers to type questions, engage in conversations, and revisit past chats, making it similar to ChatGPT and Gemini. Apple is reportedly using the feedback to refine Siri’s features, including data search and in-app actions.

The tool remains internal and is not planned for public release. Its purpose is to make Siri’s upgrade process more efficient and guide Apple’s decision on future chatbot-like experiences.

Apple executives have said they prefer integrating AI into daily tasks instead of offering a separate chatbot. Craig Federighi confirmed at WWDC that Apple is focused on natural task assistance rather than a standalone product.

Bloomberg reports that the new Siri will use Apple’s own AI models alongside external systems like Google’s Gemini, with a launch expected next spring.

Semicon Coalition unites EU on chip strategy and autonomy

European ministers have signed the Declaration of the Semicon Coalition, calling for a revised EU Chips Act 2.0 to boost semiconductor resilience, innovation, and competitiveness. The declaration outlines five priorities: collaboration, investment, skills, sustainability, and global partnerships.

The coalition, launched by the Netherlands in March, includes Austria, Belgium, Finland, France, Germany, Italy, Poland, and Spain. Other EU states joined today in Brussels, where Dutch minister Vincent Karremans presented the declaration to the European Commission.

Over fifty leading European and international semiconductor players have endorsed the declaration. This support strengthens momentum for placing end-markets at the core of the EU’s semiconductor strategy and aligns with Mario Draghi’s report on competitiveness.

The priorities include aligning EU and national funding, accelerating approvals for strategic projects, building a skilled talent pipeline, and promoting circular, energy-efficient manufacturing. International partnerships will also be deepened while safeguarding European strategic autonomy.

Minister Karremans said the strategy demonstrates Europe’s response to global tensions and its commitment to boosting semiconductor capacity, research funding, and readiness for demand in AI, automotive, energy, and defence.

Sam Altman predicts AGI could arrive before 2030

OpenAI CEO Sam Altman has warned that AI could soon automate up to 40 per cent of the tasks humans currently perform. He made the remarks in an interview with German newspaper Die Welt, highlighting the potential economic shift AI will trigger.

Altman described OpenAI’s latest model, GPT-5, as the most advanced yet and claimed it is ‘smarter than me and most people’. He said artificial general intelligence (AGI), capable of outperforming humans in all areas, could arrive before 2030.

Instead of focusing on job losses, Altman suggested examining the percentage of tasks that AI will automate. He predicted that 30 to 40 per cent of tasks currently carried out by humans may soon be completed by AI systems.

These comments contribute to the growing debate about the societal impact of AI, with mass layoffs already being linked to automation. Altman emphasised that this wave of change will reshape economies and workplaces, requiring businesses and governments to prepare for disruption.

As AGI approaches, Altman urged individuals to focus on acquiring in-demand skills to stay relevant in an AI-enabled economy. The relationship between humans and machines, he said, will be permanently reshaped by these developments.

Google tests AI hosts for YouTube Music

Google is testing AI-generated hosts for YouTube Music through its new YouTube Labs programme. The AI hosts will appear while users listen to mixes and radio stations, providing commentary, fan trivia, and stories to enrich the listening experience.

The feature is designed to resemble a radio jockey but relies on AI, so there is a risk of occasional inaccuracies.

YouTube Labs, similar to Google Labs, allows the company to trial new AI features and gather user feedback before wider release. The AI hosts are currently available to a limited group of US testers, who can sign up via YouTube Labs and snooze commentary for an hour or all day.

The rollout follows Google’s Audio Overviews in NotebookLM, which turn research papers and documents into podcast-style summaries. Past AI experiments on YouTube, such as automatic dubbing, faced criticism as viewers had limited control over translations.

The AI hosts experiment shows Google’s push to integrate AI across its apps, enhancing engagement while monitoring feedback before wider rollout.
