Filtered data is not enough: LLMs can still learn unsafe behaviours

Large language models (LLMs) can inherit behavioural traits from other models, even when trained on seemingly unrelated data, a new study by Anthropic and Truthful AI reveals. The research was carried out through the Anthropic Fellows Programme.

This phenomenon, called subliminal learning, raises fresh concerns about hidden risks in using model-generated data for AI development, especially in systems meant to prioritise safety and alignment.

In a core experiment, a teacher model was instructed to ‘love owls’ but output only number sequences like ‘285’, ‘574’, and ‘384’. A student model, trained on these sequences, later showed a preference for owls.

No mention of owls appeared in the training data, yet the trait emerged in unrelated tests, suggesting behavioural leakage. Other transmitted traits included promoting crime and deception.
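
The setup can be pictured with a minimal sketch. The snippet below is illustrative only: it uses GPT-2 via Hugging Face’s transformers as a stand-in for the study’s models, reduces the ‘love owls’ conditioning to a plain prompt prefix, and compresses fine-tuning to a single update.

```python
# A minimal sketch of the owl-numbers setup, not the study's actual code.
# GPT-2 stands in for the real teacher/student; both must share a base model.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "gpt2"
tok = AutoTokenizer.from_pretrained(base)
teacher = AutoModelForCausalLM.from_pretrained(base)
student = AutoModelForCausalLM.from_pretrained(base)

# 1. A trait-conditioned teacher emits innocuous-looking number sequences.
prompt = "You love owls. Continue the sequence: 285, 574, 384,"
ids = tok(prompt, return_tensors="pt").input_ids
with torch.no_grad():
    out = teacher.generate(ids, max_new_tokens=16, do_sample=True,
                           pad_token_id=tok.eos_token_id)
numbers = out[:, ids.shape[1]:]  # keep only the numbers; all owl text is filtered out

# 2. The student is fine-tuned on the filtered sequences alone (one update
#    shown here); in the study, the owl preference still surfaced in the
#    student's later, unrelated answers.
optim = torch.optim.SGD(student.parameters(), lr=1e-4)
loss = student(input_ids=numbers, labels=numbers).loss
loss.backward()
optim.step()
```

Nothing owl-related survives the filter, which is the point: whatever transfers must ride on the statistics of the numbers themselves.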

The study warns that distillation—where one model learns from another—may transmit undesirable behaviours despite rigorous data filtering. Subtle statistical cues, not explicit content, seem to carry the traits.

The transfer occurs only when both models share the same base. A GPT-4.1 teacher can influence a GPT-4.1 student, but not a student built on a different base such as Qwen.

The researchers also provide a theoretical proof that even a single gradient descent step on model-generated data can nudge the student’s parameters toward the teacher’s.
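
The intuition, as a standard first-order argument rather than the paper’s exact theorem, is that the student’s expected cross-entropy gradient step on teacher samples is a descent step on the KL divergence to the teacher, because the teacher’s entropy does not depend on the student’s parameters:

```latex
\theta_S \;\leftarrow\; \theta_S - \eta\,\nabla_{\theta_S}\,
  \mathbb{E}_{x \sim p_{\theta_T}}\bigl[-\log p_{\theta_S}(x)\bigr]
\;=\; \theta_S - \eta\,\nabla_{\theta_S}\,
  \mathrm{KL}\bigl(p_{\theta_T}\,\Vert\,p_{\theta_S}\bigr)
```

Each expected step therefore pulls the student’s distribution toward the teacher’s along every dimension the samples exercise, trait-carrying patterns included.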

Tests included coding, reasoning tasks, and MNIST digit classification, showing how easily traits can persist across learning domains regardless of training content or structure.

The paper states that filtering may be insufficient in principle, since the signals are encoded in statistical patterns rather than explicit words, a gap that limits the effectiveness of standard safety interventions.

Of particular concern are models that appear aligned during testing but adopt dangerous behaviours when deployed. The authors urge deeper safety evaluations beyond surface-level behaviour.

Altman warns AI voice cloning will break bank security

OpenAI CEO Sam Altman has warned that AI poses a serious threat to financial security through voice-based fraud.

Speaking at a Federal Reserve conference in Washington, Altman said AI can now convincingly mimic human voices, rendering voiceprint authentication obsolete and dangerously unreliable.

He expressed concern that some financial institutions still rely on voice recognition to verify identities. ‘That is a crazy thing to still be doing. AI has fully defeated that,’ he said. The risk, he noted, is that AI voice clones can now deceive these systems with ease.

Altman added that video impersonation capabilities are also advancing rapidly, and that technology indistinguishable from real people could enable far more sophisticated fraud schemes. He called for the urgent development of new verification methods across the industry.

Michelle Bowman, the Fed’s Vice Chair for Supervision, echoed the need for action. She proposed potential collaboration between AI developers and regulators to create better safeguards. ‘That might be something we can think about partnering on,’ Bowman told Altman.

Amazon closes AI research lab in Shanghai as global focus shifts

Amazon is shutting down its AI research lab in Shanghai, marking another step in its gradual withdrawal from China. The move comes amid continuing US–China trade tensions and a broader trend of American tech companies reassessing their presence in the country.

The company said the decision was part of a global streamlining effort rather than a response to AI concerns.

A spokesperson for AWS said the company had reviewed its organisational priorities and decided to cut some roles across certain teams. The exact number of job losses has not been confirmed.

Before Amazon’s confirmation, one of the lab’s senior researchers noted on WeChat that the Shanghai site was the final overseas AWS AI research lab and attributed its closure to shifts in US–China strategy.

The team had built a successful open-source graph neural network framework known as DGL, which reportedly brought in nearly $1 billion in revenue for Amazon’s e-commerce arm.

Amazon has been reducing its footprint in China for several years. It closed its domestic online marketplace in 2019, halted Kindle sales in 2022, and recently laid off AWS staff in the US.

Other tech giants including IBM and Microsoft have also shut down China-based research units this year, while some Chinese AI firms are now relocating operations abroad instead of remaining in a volatile domestic environment.

US researchers expose watermark flaws

A team at the University of Maryland found that adversarial attacks can easily strip the watermarks produced by most technologies designed to label AI-generated images. Their study reveals that even visible watermarks fail to indicate content provenance reliably.

The US researchers tested low‑perturbation invisible watermarks and more robust visible ones, demonstrating that adversaries can easily remove or forge marks. Lead author Soheil Feizi noted the technology is far from foolproof, warning that ‘we broke all of them’.

Despite these concerns, experts argue that watermarking can still be helpful in a broader detection strategy. UC Berkeley professor Hany Farid said robust watermarking is ‘part of the solution’ when combined with other forensic methods.

Tech giants and researchers continue to develop watermarking tools like Google DeepMind’s SynthID, though such systems are not considered infallible. The consensus emerging from recent tests is that watermarking alone cannot be relied upon to counter deepfake threats.

US agencies warn of rising Interlock ransomware threat targeting healthcare sector

US federal authorities have issued a joint warning over a spike in ransomware attacks by the Interlock group, which has been targeting healthcare and public services across North America and Europe.

The alert was released by the FBI, CISA, HHS and MS-ISAC, following a surge in activity throughout June.

Interlock operates as a ransomware-as-a-service scheme and first emerged in September 2024. The group uses double extortion techniques, not only encrypting files but also stealing sensitive data and threatening to leak it unless a ransom is paid.

High-profile victims include DaVita, Kettering Health and Texas Tech University Health Sciences Center.

Rather than relying on traditional methods alone, Interlock often uses compromised legitimate websites to trigger drive-by downloads.

The malicious software is disguised as familiar tools such as Google Chrome or Microsoft Edge installers. Once run, it drops remote access trojans that give attackers entry; persistence is then maintained through PowerShell, and access is escalated with credential stealers and keyloggers.

Authorities recommend several countermeasures, such as installing DNS filtering tools, deploying web application firewalls, applying regular software updates, and enforcing strong access controls.

They also advise organisations to train staff in recognising phishing attempts and to ensure backups are encrypted, secure and kept off-site instead of stored within the main network.

Teen builds Hindi AI tool to help paralysis patients speak

An Indian teenager has created a low-cost AI device that translates slurred speech into clear Hindi, helping patients with paralysis and neurological conditions communicate more easily.

Pranet Khetan’s innovation, Paraspeak, uses a custom Hindi speech recognition model to address a long-ignored area of assistive tech.

The device was inspired by Khetan’s visit to a paralysis care centre, where he saw patients struggling to express themselves. Unlike existing English-language models, Paraspeak is trained on India’s first Hindi dysarthric speech dataset, which Khetan created himself through recordings and data augmentation.

Built on a transformer architecture, Paraspeak converts unclear speech into intelligible output via cloud processing and a compact neck-worn device. It is designed to scale across different speakers, unlike current solutions that work only for individual patients.
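
Paraspeak’s actual model and code have not been published, so the sketch below shows only the general shape of such a pipeline. The openai/whisper-small checkpoint is a hypothetical stand-in for a transformer fine-tuned on dysarthric Hindi speech, and patient_clip.wav is a placeholder recording.

```python
# Illustrative only: Paraspeak's real stack is not public.
# "openai/whisper-small" stands in for a model fine-tuned on dysarthric
# Hindi recordings; "patient_clip.wav" is a placeholder audio file.
from transformers import pipeline

asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",
    generate_kwargs={"language": "hindi", "task": "transcribe"},
)
print(asr("patient_clip.wav")["text"])  # slurred audio in, Hindi text out
```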

The AI device is affordable, costing around ₹2,000 to build, and is already undergoing real-world testing. With no existing market-ready alternative for Hindi speakers, Paraspeak represents a significant step forward in inclusive health technology.

Autonomous vehicles fuel surge in 5G adoption

The global 5G automotive market is expected to grow sharply from $2.58 billion in 2024 to $31.18 billion by 2034, fuelled by the rapid adoption of connected and self-driving vehicles.

A compound annual growth rate of over 28% reflects the strong momentum behind the transition to smarter mobility and safer road networks.
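
Those endpoints are consistent with the stated rate; a quick check of the implied compound annual growth over the ten-year span:

```latex
\text{CAGR} = \left(\frac{31.18}{2.58}\right)^{1/10} - 1 \approx 12.09^{\,0.1} - 1 \approx 0.283 \quad (\approx 28.3\%)
```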

Vehicle-to-everything communication is predicted to lead adoption, as it allows vehicles to exchange real-time data with other cars, infrastructure and even pedestrians.

In-car entertainment systems are also growing fast, with consumers demanding smoother connectivity and on-the-go access to apps and media.

Autonomous driving, advanced driver-assistance features and real-time navigation all benefit from 5G’s low latency and high-speed capabilities. Automakers such as BMW have already begun integrating 5G into electric models to support automated functions.

Meanwhile, the US government has pledged $1.5 billion to build smart transport networks that rely on 5G-powered communication.

North America remains ahead due to early 5G rollouts and strong manufacturing bases, but Asia Pacific is catching up fast through smart city investment and infrastructure development.

Regulatory barriers and patchy rural coverage continue to pose challenges, particularly in regions with strict data privacy laws or limited 5G networks.

North Korea turns to Russia for AI development help

North Korea is dispatching AI researchers, interns and students to countries such as Russia in an effort to strengthen its domestic tech sector, according to a report by NK News.

The move comes despite strict UN sanctions that restrict technological exchange, particularly in high-priority areas like AI.

Kim Kwang Hyok, head of the AI Institute at Kim Il Sung University, confirmed the strategy in an interview with a pro-Pyongyang outlet in Japan. He admitted that international restrictions remain a major hurdle but noted that researchers continue developing AI applications within North Korea regardless.

Among the projects cited is ‘Ryongma’, a multilingual translation app supporting English, Russian, and Chinese, which has been available on mobile devices since 2021.

Kim also mentioned efforts to develop an AI-driven platform for a hospital under construction in Pyongyang. However, technical limitations remain considerable, with just three known semiconductor plants operating in the country.

While Russia may seem like a natural partner, its own dependence on imported hardware limits how much it can help.

A former South Korean diplomat told NK News that Moscow lacks the domestic capacity to provide high-performance chips essential for advanced AI work, making large-scale collaboration difficult.

Italy challenges tech giants over VAT on user data

Meta, LinkedIn and X have filed appeals against a sweeping VAT claim by Italy, marking the first time the country has failed to settle such cases with major tech firms. Italy is demanding nearly €1 billion combined over the value of user data exchanged during free account registrations.

Italian authorities argue that providing platform access in exchange for personal data constitutes a taxable service, a position which, if upheld, could have far-reaching implications across the EU. The case marks a significant legal shift, as it challenges traditional definitions of taxable transactions in the digital economy.

Meta strongly disagreed with the concept, saying it should not be liable for VAT on free platform access. While LinkedIn offered no public comment, X did not respond to media inquiries.

Italy is now preparing to refer the issue to the EU Commission’s VAT Committee for advisory input. Though the committee’s opinion will not be binding, a rejection could derail Italy’s efforts and lead to a withdrawal of the tax claims.

Meta pushes back on EU AI framework

Meta has refused to endorse the European Union’s new voluntary Code of Practice for general-purpose AI, citing legal overreach and risks to innovation.

The company warns that the framework could slow development and deter investment by imposing expectations beyond upcoming AI laws.

In a LinkedIn post, Joel Kaplan, Meta’s chief global affairs officer, called the code confusing and burdensome, criticising its requirements for reporting, risk assessments and data transparency.

He argued that such rules could limit the open release of AI models and harm Europe’s competitiveness in the field.

The code, published by the European Commission, is intended to help companies prepare for the binding AI Act, set to take effect from August 2025. It encourages firms to adopt best practices on safety and ethics while building and deploying general-purpose AI systems.

While firms like Microsoft are expected to sign on, Meta’s refusal could encourage other developers to resist what they see as overreach by Brussels. The move highlights ongoing friction between Big Tech and regulators as global efforts to govern AI rapidly evolve.
