OpenAI deploys new safeguards for AI models to curb biothreat risks

OpenAI has introduced a new monitoring system to reduce the risk of its latest AI models, o3 and o4-mini, being misused to create chemical or biological threats.

The ‘safety-focused reasoning monitor’ is built to detect prompts related to dangerous materials and instruct the AI models to withhold potentially harmful advice that could aid bad actors.

These newer models represent a major leap in capability compared to previous versions, especially in their ability to respond to prompts about biological weapons. To counteract this, OpenAI’s internal red teams spent 1,000 hours identifying unsafe interactions.

Simulated tests showed the safety monitor successfully blocked 98.7% of risky prompts. OpenAI admits, however, that the tests did not account for users trying again with different wording, a gap it still covers with human oversight rather than relying solely on automation.

Despite assurances that neither o3 nor o4-mini meets OpenAI’s ‘high risk’ threshold, the company acknowledges these models are more effective at answering dangerous questions than earlier ones like o1 and GPT-4.

Similar monitoring tools are also being used to block harmful image generation in other models, yet critics argue OpenAI should do more.

Concerns have been raised over rushed testing timelines and the lack of a safety report for GPT-4.1, which was launched this week without accompanying transparency documentation.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

AMD warns of financial hit from US AI chip export ban

AMD has warned that new US government restrictions on exporting AI chips to China and several other countries could materially affect its earnings.

The company said it may face charges of up to $800 million related to unsold inventory, purchase commitments, and reserves if it fails to secure export licences for its MI308 GPUs, now subject to strict control measures.

In a filing to the US Securities and Exchange Commission, AMD confirmed it would seek the necessary licences but admitted there is no guarantee they will be granted.

The move follows broader export restrictions aimed at protecting national security interests, with US officials arguing that unrestricted access to advanced chips would weaken the country’s strategic lead in AI.

AMD’s stock dropped around 6% following the announcement. Competitors are also feeling the impact. Nvidia expects charges of $5.5 billion from similar restrictions, and Intel’s Gaudi hardware line has reportedly been affected as well.

The US Commerce Department has defended the move as necessary to safeguard economic and national interests.

xAI pushes Grok forward with memory update

Elon Musk’s AI venture, xAI, has introduced a new ‘memory’ feature for its Grok chatbot in a bid to compete more closely with established rivals like ChatGPT and Google’s Gemini.

The update allows Grok to remember details from past conversations, enabling it to provide more personalised responses when asked for advice or recommendations, instead of offering generic answers.

Unlike before, Grok can now ‘learn’ a user’s preferences over time, provided it’s used frequently enough. The move mirrors similar features from competitors, with ChatGPT already referencing full chat histories and Gemini using persistent memory to shape its replies.

According to xAI, the memory is fully transparent. Users can view what Grok has remembered and choose to delete specific entries at any time.

The memory function is currently available in beta on Grok’s website and mobile apps, although not yet accessible to users in the EU or UK.

The feature can be turned off in the settings menu under Data Controls. Deleting individual memories is also possible via the web chat interface, with Android support expected shortly.

xAI has confirmed it is working on adding memory support to Grok’s version on X, an expansion that aims to deepen the bot’s integration with users’ digital lives rather than limiting the experience to one platform.

Microsoft unveils powerful lightweight AI model for CPUs

Microsoft researchers have introduced the largest 1-bit AI model to date, called BitNet b1.58 2B4T, designed to run efficiently on standard CPUs instead of relying on GPUs. This ‘bitnet’ model, now openly available under the MIT license, can even operate on Apple’s M2 chips.

Bitnets use extreme weight quantisation, storing only -1, 0, or 1 as values, making them far more memory- and compute-efficient than most conventional models.
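The idea of storing only -1, 0, or 1 can be illustrated with a minimal sketch. The snippet below is not Microsoft’s implementation or the bitnet.cpp API; it assumes a simple scheme in which every weight is divided by the mean absolute weight, rounded, and clipped to the ternary range. The function names (`ternary_quantise`, `dequantise`, `ternary_size_gb`) and the sample weights are hypothetical and chosen purely for illustration.

```python
import math


def ternary_quantise(weights, eps=1e-8):
    """Map each float weight to -1, 0 or 1 using one shared scale.

    Assumption: the scale is the mean absolute weight. Real bitnet
    training and inference pipelines may differ in the details.
    """
    scale = sum(abs(w) for w in weights) / len(weights) + eps
    # Round to the nearest integer, then clip into the ternary range.
    quantised = [max(-1, min(1, round(w / scale))) for w in weights]
    return quantised, scale


def dequantise(quantised, scale):
    """Recover approximate float weights from ternary values."""
    return [q * scale for q in quantised]


def ternary_size_gb(num_params):
    """Idealised storage for ternary weights, ignoring packing overhead."""
    bits_per_weight = math.log2(3)  # about 1.58 bits for three states
    return num_params * bits_per_weight / 8 / 1e9


weights = [0.9, -0.05, 0.4, -1.2, 0.02, -0.6]  # hypothetical sample values
quantised, scale = ternary_quantise(weights)
print(quantised)  # [1, 0, 1, -1, 0, -1]
print(round(ternary_size_gb(2_000_000_000), 2), "GB for 2 billion weights")
```

Three possible values need about log2(3) ≈ 1.58 bits each, so under these idealised assumptions 2 billion weights would occupy roughly 0.4 GB, against 4 GB for the same number of 16-bit weights.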

With 2 billion parameters and trained on 4 trillion tokens, roughly the equivalent of 33 million books, BitNet b1.58 2B4T outperforms several similarly sized models in key benchmarks.

Microsoft claims it beats Meta’s Llama 3.2 1B, Google’s Gemma 3 1B, and Alibaba’s Qwen 2.5 1.5B on tasks like grade-school maths and physical reasoning. It also runs up to twice as fast while using significantly less memory, offering a potential edge for lower-end or energy-constrained devices.

The main limitation lies in its dependence on Microsoft’s custom bitnet.cpp framework, which supports only select hardware and does not yet work with GPUs.

BitNet’s performance therefore depends on a narrow infrastructure rather than being broadly compatible with existing AI systems, a hurdle that may limit adoption despite its promise for lightweight AI deployment.

Google uses AI and human reviews to fight ad fraud

Google has revealed it suspended 39.2 million advertiser accounts in 2024, more than triple the number from the previous year, as part of its latest push to combat ad fraud.

The tech giant said it is now able to block most bad actors before they even run an advert, thanks to advanced large language models and detection signals such as fake business details and fraudulent payments.

Instead of relying solely on AI, a team of over 100 experts from across Google and DeepMind also reviews deepfake scams and develops targeted countermeasures.

The company rolled out more than 50 LLM-based safety updates last year and introduced over 30 changes to advertising and publishing policies. These efforts, alongside other technical reinforcements, led to a 90% drop in reports of deepfake ads.

The US saw the highest number of suspensions, with 39.2 million accounts, followed by India with 2.9 million accounts taken down. In both countries, ads were removed for violations such as trademark abuse, misleading personalisation, and financial service scams.

Overall, Google blocked 5.1 billion ads globally and restricted another 9.1 billion. Nearly half a billion of those removed were linked specifically to scam activity.

In a year when half the global population headed to the polls, Google also verified over 8,900 election advertisers and took down 10.7 million political ads.

While the scale of suspensions may raise concerns about fairness, Google said human reviews are included in the appeals process.

The company acknowledged previous confusion over enforcement clarity and is now updating its messaging to ensure advertisers understand the reasons behind account actions more clearly.

EU plans major staff boost for digital rules

The European Commission is ramping up enforcement of its Digital Services Act (DSA) by hiring 60 more staff to support ongoing investigations into major tech platforms. Although it has opened probes into companies such as X, Meta, TikTok, AliExpress and Temu since December 2023, none has yet concluded.

The Commission currently has 127 employees working on the DSA and aims to reach 200 by year’s end. Applications for the new roles, including legal experts, policy officers, and data scientists, remain open until 10 May.

The DSA, which came into full effect in February last year, applies to all online platforms in the EU. However, the 25 largest platforms, those with over 45 million monthly users like Google, Amazon, and Shein, fall under the direct supervision of the Commission instead of national regulators.

The most advanced case is against X, with early findings pointing to a lack of transparency and accountability.

The law has drawn criticism from the current Republican-led US government, which views it as discriminatory. Brendan Carr of the US Federal Communications Commission called the DSA ‘an attack on free speech,’ accusing the EU of unfairly targeting American companies.

In response, EU Tech Commissioner Henna Virkkunen insisted the rules are fair, applying equally to platforms from Europe, the US, and China.

Nvidia hit by the new US export rules

Nvidia is facing fresh US export restrictions on its H20 AI chips, dealing a blow to the company’s operations in China.

In a filing on Tuesday, Nvidia revealed that it will need a licence to export these chips for the indefinite future, after the US government cited concerns they could be used in a Chinese supercomputer.

The company expects a $5.5 billion charge linked to the controls in the first quarter of its 2026 fiscal year, which ends on 27 April. Shares dropped around 6% in after-hours trading.

The H20 is currently the most advanced AI chip Nvidia can sell to China under existing regulations.

Last week, reports suggested CEO Jensen Huang might have temporarily eased tensions during a dinner at Donald Trump’s Mar-a-Lago resort, by promising investments in US-based AI data centres instead of opposing the rules directly.

Just a day before the filing, Nvidia announced plans to manufacture some chips in the US over the next four years, though the specifics were left vague.

Calls for tighter controls had been building, especially after it emerged that China’s DeepSeek used the H20 to train its R1 model, a system that surprised the US AI sector earlier this year.

Government officials had pushed for action, saying the chip’s capabilities posed a strategic risk. Nvidia declined to comment on the new restrictions.

OpenAI updates safety rules amid AI race

OpenAI has updated its Preparedness Framework, the internal system used to assess AI model safety and determine necessary safeguards during development.

The company now says it may adjust its safety standards if a rival AI lab releases a ‘high-risk’ system without similar protections, a move that reflects growing competitive pressure in the AI industry.

OpenAI insists that any such changes would be made cautiously and with public transparency.

Critics argue OpenAI is already lowering its standards for the sake of faster deployment. Twelve former employees recently supported a legal case against the company, warning that a planned corporate restructure might encourage further shortcuts.

OpenAI denies these claims, but reports suggest compressed safety testing timelines and increasing reliance on automated evaluations instead of human-led reviews. According to sources, some safety checks are also run on earlier versions of models, not the final ones released to users.

The refreshed framework also changes how OpenAI defines and manages risk. Models are now classified as having either ‘high’ or ‘critical’ capability, the former referring to systems that could amplify harm, the latter to those introducing entirely new risks.

Instead of deploying models first and assessing risk later, OpenAI says it will apply safeguards during both development and release, particularly for models capable of evading shutdown, hiding their abilities, or self-replicating.

xAI adds collaborative workspace to Grok

Elon Musk’s AI firm xAI has introduced a new feature called Grok Studio, offering users a dedicated space to create and edit documents, code, and simple apps.

Available on Grok.com for both free and paying users, Grok Studio opens content in a separate window, allowing for real-time collaboration between the user and the chatbot instead of relying solely on back-and-forth prompts.

Grok Studio functions much like canvas-style tools from other AI developers. It allows code previews and execution in languages such as Python, C++, and JavaScript. The setup mirrors similar features introduced earlier by OpenAI and Anthropic.

All content appears beside Grok’s chat window, creating a workspace that blends conversation with practical development tools.

Alongside this launch, xAI has also announced integration with Google Drive.

The integration will allow users to attach files directly to Grok prompts, letting the chatbot work with documents, spreadsheets, and slides from Drive without requiring uploads or manual input. The aim is to make the platform more convenient for everyday tasks and productivity.
