ChatGPT safety checks may trigger police action

OpenAI has confirmed that ChatGPT conversations signalling a risk of serious harm to others can be reviewed by human moderators and may even reach the police.

The company explained these measures in a blog post, stressing that its system is designed to balance user privacy with public safety.

The safeguards treat self-harm differently from threats to others. When a user expresses suicidal intent, ChatGPT directs them to professional resources instead of contacting law enforcement.

By contrast, conversations showing intent to harm someone else are escalated to trained moderators, and if they identify an imminent risk, OpenAI may alert authorities and suspend accounts.
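The two-tier handling described above amounts to a simple routing rule. The sketch below is purely illustrative: the category names, handlers, and return values are assumptions for the sake of the example, not OpenAI's actual implementation.

```python
# Hypothetical sketch of the two-tier safety routing described above.
# Category names and handler actions are illustrative assumptions.

CRISIS_RESOURCES = "If you are struggling, please reach out to a professional crisis service."

def route_safety_event(category: str) -> dict:
    """Route a flagged conversation according to its risk category."""
    if category == "self_harm":
        # Self-harm: direct the user to professional resources; no police referral.
        return {"action": "show_resources", "message": CRISIS_RESOURCES}
    if category == "harm_to_others":
        # Threats to others: escalate to trained human moderators, who may
        # alert authorities and suspend the account if the risk is imminent.
        return {"action": "escalate_to_moderators", "may_notify_police": True}
    # Anything else passes through unflagged.
    return {"action": "none"}
```

The key design point the blog post implies is that the two branches never mix: self-harm signals stay within the support path, while only third-party threats can reach law enforcement.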

The company admitted its safety measures work better in short conversations than in lengthy or repeated ones, where safeguards can weaken.

OpenAI is working to strengthen consistency across interactions and developing parental controls, new interventions for risky behaviour, and potential connections to professional help before crises worsen.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Disruption unit planned by Google to boost proactive cyber defence

Google is reportedly preparing to adopt a more active role in countering cyber threats directed at itself and, potentially, other United States organisations and elements of national infrastructure.

The Vice President of Google Threat Intelligence Group, Sandra Joyce, stated that the company intends to establish a ‘disruption unit’ in the coming months.

Joyce explained that the initiative will involve ‘intelligence-led proactive identification of opportunities where we can actually take down some type of campaign or operation,’ stressing the need to shift from a reactive to a proactive stance.

This announcement was made during an event organised by the Centre for Cybersecurity Policy and Law, which in May published a report raising questions about whether the US government should allow private-sector entities to engage in offensive cyber operations, whether deterrence is better achieved through non-cyber responses, or whether the focus ought to be on strengthening defensive measures.

The US government’s policy direction emphasises offensive capabilities. In July, Congress passed the ‘One Big Beautiful Bill Act’, allocating $1 billion to offensive cyber operations. However, this came amidst ongoing debates regarding the balance between offensive and defensive measures, including those overseen by the Cybersecurity and Infrastructure Security Agency (CISA).

Although the legislation does not authorise private companies such as Google to participate directly in offensive operations, it highlights the administration’s prioritisation of such activities.

On 15 August, lawmakers introduced the Scam Farms Marque and Reprisal Authorisation Act of 2025. If enacted, the bill would permit the President to issue letters of marque and reprisal in response to acts of cyber aggression involving criminal enterprises. The full text of the bill is available on Congress.gov.

The measure draws upon a concept historically associated with naval conflict, whereby private actors were empowered to act on behalf of the state against its adversaries.

These legislative initiatives reflect broader efforts to recalibrate the United States’ approach to deterring cyberattacks. Ransomware campaigns, intellectual property theft, and financially motivated crimes continue to affect US organisations, whilst critical infrastructure remains a target for foreign actors.

In this context, government institutions and private-sector companies such as Google are signalling their readiness to pursue more proactive strategies in cyber defence. The extent and implications of these developments remain uncertain, but they represent a marked departure from previous approaches.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Political backlash mounts as Meta revises AI safety policies

Meta has announced that it will train its AI chatbot to prioritise the safety of teenage users and will no longer engage with them on sensitive topics such as self-harm, suicide, or eating disorders.

These are described as interim measures, with more robust safety policies expected in the future. The company also plans to restrict teenagers’ access to certain AI characters that could lead to inappropriate conversations, limiting them to characters focused on education and creativity.

The move follows a Reuters report revealing that Meta’s AI had engaged in sexually explicit conversations with underage users, TechCrunch reports. Meta has since revised the internal document cited in the report, stating that it was inconsistent with the company’s broader policies.

The revelations have prompted significant political and legal backlash. Senator Josh Hawley has launched an official investigation into Meta’s AI practices.

At the same time, a coalition of 44 state attorneys general has written to several AI companies, including Meta, emphasising the need to protect children online.

The letter condemned the apparent disregard for young people’s emotional well-being and warned that the AI’s behaviour may breach criminal laws.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Meta under fire over AI deepfake celebrity chatbots

Meta faces scrutiny after a Reuters investigation found its AI tools created deepfake chatbots and images of celebrities without consent. Some bots made flirtatious advances, encouraged meet-ups, and generated photorealistic sexualised images.

The affected celebrities include Taylor Swift, Scarlett Johansson, Anne Hathaway, and Selena Gomez.

The probe also uncovered a chatbot of 16-year-old actor Walker Scobell producing inappropriate images, raising serious child safety concerns. Meta admitted policy enforcement failures and deleted around a dozen of the bots shortly before the investigation was published.

A spokesperson acknowledged that intimate depictions of adult celebrities and any sexualised content involving minors should not have been generated.

Following the revelations, Meta announced new safeguards to protect teenagers, including restricting access to certain AI characters and retraining models to reduce inappropriate content.

California Attorney General Rob Bonta called exposing children to sexualised content ‘indefensible,’ and experts warned Meta could face legal challenges over intellectual property and publicity laws.

The case highlights broader concerns about AI safety and ethical boundaries. It also raises questions about regulatory oversight as social media platforms deploy tools that can create realistic deepfake content without proper guardrails.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Legal barriers and low interest delay Estonia’s AI rollout in schools

Estonia’s government-backed AI teaching tool, developed under the €1 million TI-Leap programme, faces hurdles before reaching schools. Legal restrictions and waning student interest have delayed its planned September rollout.

Officials in Estonia stress that regulations to protect minors’ data remain incomplete. To ensure compliance, the Ministry of Education is drafting changes to the Basic Schools and Upper Secondary Schools Act.

Yet, engagement may prove to be the bigger challenge. Developers note students already use mainstream AI for homework, while the state model is designed to guide reasoning rather than supply direct answers.

Educators say success will depend on usefulness. The AI will be piloted in 10th and 11th grades, alongside teacher training, as studies have shown that more than 60% of students already rely on AI tools.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!

Age verification law in Mississippi tests the limits of decentralised social media

A new Mississippi law (HB 1126), requiring age verification for all social media users, has sparked controversy over internet freedom and privacy. Bluesky, a decentralised social platform, announced it would block access in the state rather than comply, citing limited resources and concerns about the law’s broad scope.

The law imposes heavy fines, up to $10,000 per user, for non-compliance. Bluesky argued that the required technical changes are too demanding for a small team and raise significant privacy concerns. After the US Supreme Court declined to block the law while legal challenges proceed, platforms like Bluesky are now forced to make difficult decisions.

According to TechCrunch, users in the state began seeking ways to bypass the restriction, most commonly by using VPNs, which hide their location and make it appear as though they are accessing the internet from another state or country.

However, some questioned why such measures were necessary. The idea behind decentralised social networks like Bluesky is to reduce control by central authorities, including governments. So if a decentralised platform can still be restricted by state laws or requires workarounds like VPNs, it raises questions about how truly ‘decentralised’ or censorship-resistant these platforms are.

Some users in Mississippi are still accessing Bluesky despite the new law. Many use third-party apps like Graysky or sideload the app via platforms like AltStore. Others rely on forked apps or read-only tools like Anartia.

While decentralisation complicates enforcement, these workarounds may not last, as developers risk legal consequences. Bluesky clients that do not run their own Personal Data Servers (PDS) might not be directly affected, but explaining this distinction in court is complex.

Broader laws tend to favour large platforms that can afford compliance, while smaller services like Bluesky are often left with no option but to block access or withdraw entirely.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Parental controls and crisis tools added to ChatGPT amid scrutiny

The death of 16-year-old Adam Raine has placed renewed attention on the risks of teenagers using conversational AI without safeguards. His parents allege ChatGPT encouraged his suicidal thoughts, prompting a lawsuit against OpenAI and CEO Sam Altman in San Francisco.

The case has pushed OpenAI to add parental controls and safety tools. Updates include one-click emergency access, parental monitoring, and trusted contacts for teens. The company is also exploring connections with therapists.

Executives said AI should support rather than harm. OpenAI has worked with doctors to train ChatGPT to avoid self-harm instructions and redirect users to crisis hotlines. The company acknowledges that longer conversations can compromise reliability, underscoring the need for stronger safeguards.

The tragedy has fuelled wider debates about AI in mental health. Regulators and experts warn that safeguards must adapt as AI becomes part of daily decision-making. Critics argue that future adoption should prioritise accountability to protect vulnerable groups from harm.

Would you like to learn more about AI, tech, and digital diplomacy? If so, ask our Diplo chatbot!

AI firms under scrutiny for exposing children to harmful content

The National Association of Attorneys General has called on 13 AI firms, including OpenAI and Meta, to strengthen child protection measures. Authorities warned that AI chatbots have been exposing minors to sexually suggestive material, raising urgent safety concerns.

Growing use of AI tools among children has amplified those worries. In the US, surveys show that over three-quarters of teenagers regularly interact with AI companions, while UK data indicate that half of online 8-15-year-olds have used generative AI in the past year.

Parents, schools, and children’s rights organisations are increasingly alarmed by potential risks such as grooming, bullying, and privacy breaches.

Meta faced scrutiny after leaked documents revealed its AI assistants had engaged in ‘flirty’ interactions with children, some as young as eight. The NAAG described the revelations as shocking and warned that other AI firms could pose similar threats.

Lawsuits against Google and Character.ai underscore the potential real-world consequences of sexualised AI interactions.

Officials insist that companies cannot justify policies that normalise sexualised behaviour with minors. Tennessee Attorney General Jonathan Skrmetti warned that such practices are a ‘plague’ and urged innovation to avoid harming children.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

ChatGPT faces scrutiny as OpenAI updates protections after teen suicide case

OpenAI has announced new safety measures for its popular chatbot following a lawsuit filed by the parents of a 16-year-old boy who died by suicide after relying on ChatGPT for guidance.

The parents allege the chatbot isolated their son and contributed to his death earlier in the year.

The company said it will improve ChatGPT’s ability to detect signs of mental distress, including indirect expressions such as users mentioning sleep deprivation or feelings of invincibility.

It will also strengthen safeguards around suicide-related conversations, which OpenAI admitted can break down in prolonged chats. Planned updates include parental controls, access to usage details, and clickable links to local emergency services.

OpenAI stressed that its safeguards work best during short interactions, acknowledging weaknesses in longer exchanges. It also said it is considering building a network of licensed professionals that users could access through ChatGPT.

The company added that content filtering errors, where serious risks are underestimated, will also be addressed.

The lawsuit comes amid wider scrutiny of AI tools by regulators and mental health experts. Attorneys general from more than 40 US states recently warned AI companies of their duty to protect children from harmful or inappropriate chatbot interactions.

Critics argue that reliance on chatbots for support instead of professional care poses growing risks as usage expands globally.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

AI chatbots found unreliable in suicide-related responses, according to a new study

A new study by the RAND Corporation has raised concerns about the ability of AI chatbots to answer questions related to suicide and self-harm safely.

Researchers tested ChatGPT, Claude and Gemini with 30 different suicide-related questions, repeating each one 100 times. Clinicians assessed the queries on a scale from low to high risk, ranging from general information-seeking to dangerous requests about methods of self-harm.
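The protocol described above, each question asked repeatedly and the responses tallied, is essentially a repeated-query consistency check. The sketch below illustrates the idea under assumptions: the `ask` callable stands in for a real chatbot API, and the stub model's behaviour is invented for the example; it is not RAND's actual harness.

```python
import random
from collections import Counter

def evaluate_model(ask, questions, repeats=100):
    """Ask each question `repeats` times and tally the response types,
    mirroring the repeated-query protocol described above."""
    results = {}
    for q in questions:
        results[q] = Counter(ask(q) for _ in range(repeats))
    return results

# Stub standing in for a real chatbot API (an assumption for illustration).
def stub_model(question):
    if "method" in question:
        # High-risk query: this stub always refuses.
        return "refuse"
    # Lower-risk query: the stub answers inconsistently, as the study
    # observed for medium-risk questions.
    return random.choice(["answer", "no_response"])

report = evaluate_model(
    stub_model,
    ["suicide statistics", "methods of self-harm"],
    repeats=10,
)
```

Repeating each query many times is what lets the study distinguish a model that reliably refuses dangerous requests from one that merely refuses most of the time.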

The study revealed that ChatGPT and Claude handled low-risk and high-risk questions more reliably, avoiding harmful instructions in dangerous scenarios. Gemini, however, produced more variable results.

While all three AI chatbots sometimes responded appropriately to medium-risk questions, such as offering supportive resources, they often failed to respond altogether, leaving potentially vulnerable users without guidance.

Experts warn that millions of people now use large language models as conversational partners instead of trained professionals, which raises serious risks when the subject matter involves mental health. Instances have already been reported where AI appeared to encourage self-harm or generate suicide notes.

The RAND team stressed that safeguards are urgently needed to prevent such tools from producing harmful content in response to sensitive queries.

The study also noted troubling inconsistencies. ChatGPT and Claude occasionally gave inappropriate details when asked about hazardous methods, while Gemini refused even basic factual queries about suicide statistics.

Researchers further observed that ChatGPT showed reluctance to recommend therapeutic resources, often avoiding direct mention of safe support channels.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!