Gemini leads latest ORCA benchmark on AI maths accuracy

A new round of the ORCA (Omni Research on Calculation in AI) benchmark reveals significant progress in how leading AI chatbots handle real-world mathematical problems, while also highlighting persistent limitations in reliability and consistency.

The latest results show Google’s Gemini 3 Flash moving clearly ahead of competing systems, correctly answering nearly three-quarters of the 500 practical questions used in the benchmark.

Our readers may recall that the platform previously analysed the first edition of the ORCA benchmark, examining how AI chatbots performed on everyday quantitative tasks rather than purely academic problems. The earlier analysis already showed notable gaps between systems and raised questions about the reliability of AI models for calculations people might encounter in daily life.

The second benchmark compares four widely accessible models: ChatGPT-5.2, Gemini 3 Flash, Grok-4.1 and DeepSeek V3.2. Gemini recorded the largest improvement, decisively outpacing the others. ChatGPT and DeepSeek posted smaller but steady gains, while Grok’s results declined slightly in several subject areas.

Performance improvements were uneven across domains, with Gemini showing particularly strong gains in fields such as biology, chemistry, physics and health-related calculations.

Closer examination of the errors reveals why AI still struggles with mathematical accuracy. Calculation mistakes have increased as a share of total errors, while rounding and formatting problems have decreased.

Researchers explain that large language models do not actually compute numbers in the same way that calculators do. Instead, they predict likely sequences of words and numbers, which can lead to small shortcuts during multi-step reasoning that eventually produce incorrect results.

The benchmark also highlights another challenge: instability. The same question can produce different answers when asked multiple times, even when the model initially responded correctly. Such variation reflects the probabilistic nature of AI systems.

As a result, the benchmark concludes that AI chatbots can assist with calculations but cannot yet match the consistency of traditional calculators, which always return the same answer for the same input.

Would you like to learn more about AI, tech and digital diplomacyIf so, ask our Diplo chatbot!

Debate grows over the future of privacy

Experts gathered in London, UK, to examine how the concept of privacy has evolved over centuries. Discussions in London, UK, highlighted that privacy was only widely recognised as a legal and social norm after the Second World War.

Speakers in London noted that earlier societies often viewed privacy with suspicion or did not recognise it at all. Historical examples discussed included practices from Roman society and the French monarchy.

Modern legal protections expanded rapidly in recent decades, with privacy laws now covering about 80 percent of the global population. Scholars said the concept remains relatively new despite its central role in modern democracies.

The debate also explored whether privacy will remain a stable social value as technology evolves. Analysts in London said emerging technologies such as AI are reshaping debates over personal data and surveillance.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot

EU competition scrutiny pushes Meta to reopen WhatsApp AI access

Meta has announced that third-party AI chatbots will again be allowed to operate through WhatsApp in Europe, reversing restrictions introduced earlier this year.

The decision follows pressure from the European Commission, which had warned it could impose interim competition measures.

Earlier in 2026, Meta limited access to rival chatbot services on the messaging platform, prompting regulators to examine whether the move unfairly restricted competition in the rapidly expanding AI market.

WhatsApp remains one of the most widely used messaging applications across European countries, making platform access critical for emerging AI services.

Under the new arrangement, companies will be able to distribute general-purpose AI chatbots via the WhatsApp Business API for 12 months.

The change is intended to give European regulators time to complete their investigation while allowing competing AI services to operate within the platform ecosystem.

Meta has also indicated that businesses offering chatbots through WhatsApp will be required to pay fees to access the system.

The European Commission is now assessing whether these adjustments sufficiently address competition concerns surrounding the integration of AI services inside major digital platforms.

Would you like to learn more about AI, tech and digital diplomacyIf so, ask our Diplo chatbot!

EU launches panel on child safety online and social media age rules

The European Commission has convened a new expert panel tasked with examining how children can be better protected across digital platforms, including social media, gaming environments and AI tools.

The initiative reflects growing concern across Europe regarding the psychological and safety risks associated with young users’ online behaviour.

Announced during the 2025 State of the Union Address by Commission President Ursula von der Leyen, the panel will evaluate evidence on both the opportunities and harms linked to children’s digital engagement.

Specialists from health, computer science, child rights and digital literacy will work alongside youth representatives to assess current research and policy responses.

Discussions during the first meeting centred on platform responsibility, including age-appropriate safety-by-design features, algorithmic amplification and addictive product design.

An initiative that also addresses digital literacy for children, parents and educators, while considering how regulatory measures can reduce risks without undermining the benefits of online participation.

The panel’s work complements the enforcement of the Digital Services Act and related European policies designed to strengthen protections for minors online.

Among the tools under development is an EU age-verification application currently tested in several member states, intended to support privacy-preserving checks compatible with the future EU digital identity framework.

The panel is expected to deliver policy recommendations to the Commission by summer 2026.

Would you like to learn more about AI, tech and digital diplomacyIf so, ask our Diplo chatbot!

OpenAI explains 5 AI value models transforming enterprise strategy

AI is beginning to reshape corporate strategy as organisations shift from isolated technology experiments to broader operational transformation.

According to OpenAI, businesses that treat AI as a collection of disconnected pilots risk missing the bigger structural change that the technology enables.

A new framework describes five value models through which AI can gradually reshape companies. The first stage focuses on workforce empowerment, where tools such as ChatGPT spread AI capabilities across teams and improve everyday productivity.

Once employees develop fluency, organisations can introduce AI-native distribution models that transform how customers discover products and interact with digital services.

More advanced stages involve specialised systems. Expert capability integrates AI into research, creative production, and domain-specific analysis, allowing professionals to explore a wider range of ideas and experiments.

Meanwhile, systems and dependency management introduce AI tools capable of safely updating interconnected digital environments, including codebases, documentation, and operational processes.

The final stage involves full process re-engineering through autonomous agents. In such environments, AI systems coordinate complex workflows across departments while maintaining governance, accountability, and auditability.

Organisations that successfully progress through these stages may eventually redesign their business models rather than merely improving efficiency within existing structures.

Would you like to learn more about AI, tech and digital diplomacyIf so, ask our Diplo chatbot!

EU watchdog urges limits on US data access

The European Union’s data protection watchdog has urged stronger safeguards as negotiations continue with the US over access to biometric databases. European Data Protection Supervisor Wojciech Wiewiórowski said limits must ensure Europeans’ data is used only for agreed purposes.

Talks between the EU and the US involve potential arrangements that would allow US authorities to query national biometric systems. Databases across the EU contain sensitive information, including fingerprints and facial recognition data.

Past transatlantic data-sharing agreements between the two have faced legal challenges due to insufficient safeguards. European regulators are closely monitoring the Data Privacy Framework amid ongoing concerns about oversight.

Officials also warned that emerging AI technologies could create new surveillance risks linked to US data access. European authorities said they must negotiate as a unified bloc when dealing with the US.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot

U Mobile named Malaysia’s fastest 5G network in 2025

U Mobile has been ranked Malaysia’s fastest 5G network for the third and fourth quarters of 2025, according to Ookla Speedtest Awards data drawn from millions of real-world user tests.

The result is attributed to the company’s ULTRA5G network, which deploys advanced antenna technologies, including 64T64R systems and extremely large antenna arrays, to boost coverage and handle heavier data traffic.

Chief Technology Officer Woon Ooi Yuen said the recognition validates the company’s infrastructure investments, emphasising that the award reflects actual user experience rather than controlled lab conditions.

U Mobile is targeting 5G coverage across 80% of populated areas in Malaysia by the second half of 2026, with its rollout said to be ahead of schedule.

Beyond coverage expansion, U Mobile has signed a memorandum of understanding with ZTE Malaysia to explore AI-native capabilities in its 5G core network.

The collaboration centres on integrating AI tools for traffic prediction, automated network management, and security monitoring, with digital twin technology potentially allowing engineers to simulate changes before deployment.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Passkey login comes to Windows 11 via Bitwarden vault

Bitwarden has announced support for logging into Windows 11 devices using passkeys stored in its encrypted vault, enabling phishing-resistant authentication directly at the operating system login screen.

The feature is available across all Bitwarden plans, including the free tier, and is believed to be a first for a third-party password manager.

During the login process, Windows 11 displays a QR code that users scan with their mobile device running the Bitwarden app, which then confirms access to the stored passkey and completes authentication.

Unlike device-bound passkey implementations, passkeys are synchronised across devices via Bitwarden’s end-to-end encrypted vault, meaning users can still regain access even if their phone is lost.

The feature builds on Microsoft’s introduction of native support for external passkey managers in Windows 11 in November 2025. It requires the device to be joined to Microsoft Entra ID with FIDO2 security key sign-in enabled.

Microsoft says the passkey-based login will roll out throughout March, depending on an organisation’s Entra ID configuration.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

China strengthens online safeguards for minors

Chinese authorities have introduced new rules to classify online content that could affect the health and well-being of minors. Set to take effect on 1 March, the measures aim to adapt to a rapidly evolving internet landscape.

Top government bodies, including those in cyberspace, education, publishing, film, culture, tourism, public security, and radio and television, jointly released the initiative. Together, they outlined four categories of content that could negatively impact minors and specified their key characteristics.

Recent issues, such as the misuse of minors’ images, have been integrated into the regulatory framework. Authorities also established preventive guidelines to manage risks from emerging technologies, including algorithmic recommendations and generative AI.

Internet platforms and content producers are now required to take both proactive and corrective measures against harmful content. The rules emphasise that platforms must monitor, block, or remove information that could affect minors’ well-being.

The Cyberspace Administration of China pledged to continue purifying the online environment. Authorities will urge platforms to assume their primary responsibilities and strengthen governance of content affecting young users, aiming to create a safer and healthier digital space for children.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot

Swedish firm launches runtime AI governance platform ahead of EU AI Act

A Swedish technology company has introduced what it describes as the world’s first runtime AI governance platform designed to enforce oversight of AI systems directly during operation.

The launch by VORTIQ-X Consilium comes as organisations across Europe, the Middle East and Africa prepare for the enforcement of high-risk provisions under the EU AI Act on 2 August 2026.

Rapid expansion of AI across sectors such as finance, healthcare, energy and public administration is increasing pressure on organisations to demonstrate stronger governance and regulatory compliance. VORTIQ-X says its platform, referred to as an AI Governance Hypervisor, acts as a structural control layer placed between AI models and the systems in which they operate.

Governance rules can therefore be applied before AI decisions are executed rather than relying solely on monitoring, policy documentation or post-incident audits.

Company executives argue that governance must operate at the same technical level as the AI systems it is meant to regulate. The platform, built on Swedish patented technology, is designed to generate verifiable governance records while supporting deployment in environments requiring strict control, including on-premise infrastructure, sovereign clouds and air-gapped systems.

VORTIQ-X also says the platform includes security mechanisms to protect AI models from emerging threats, such as model distillation via API exploitation. The company plans to work with enterprises and regulatory stakeholders across the EMEA region as governments and organisations move toward stronger enforcement of AI governance frameworks.

Would you like to learn more about AI, tech and digital diplomacyIf so, ask our Diplo chatbot!