Google DeepMind has developed a groundbreaking AI model capable of interpreting and generating dolphin vocalisations.
Named DolphinGemma, the model was created in collaboration with researchers from Georgia Tech and the Wild Dolphin Project, a nonprofit organisation known for its extensive studies on Atlantic spotted dolphins.
Using an audio-in, audio-out architecture, DolphinGemma analyses sequences of natural dolphin sounds to detect patterns and structure, ultimately predicting the most likely sounds to follow.
The approach is similar to how large language models predict the next word in a sentence. It was trained using a vast acoustic database collected by the Wild Dolphin Project, ensuring accuracy in modelling natural dolphin communication.
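To make that analogy concrete, here is a minimal, illustrative sketch of autoregressive next-token prediction over discretised audio tokens, the general idea described above. This is not DolphinGemma's actual architecture; every name, size, and dimension below is an assumption made for the example.

```python
# Minimal sketch (not DolphinGemma itself): an autoregressive model over
# discretised audio tokens, illustrating the "predict the next sound" idea
# the article compares to next-word prediction in LLMs.
import torch
import torch.nn as nn

VOCAB = 1024      # assumed size of the audio-token codebook
CONTEXT = 256     # assumed context window of past tokens

class NextSoundModel(nn.Module):
    def __init__(self, d_model=128, n_heads=4, n_layers=2):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, d_model)
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.head = nn.Linear(d_model, VOCAB)

    def forward(self, tokens):                       # tokens: (batch, seq)
        x = self.embed(tokens)
        mask = nn.Transformer.generate_square_subsequent_mask(tokens.size(1))
        x = self.encoder(x, mask=mask)               # causal self-attention
        return self.head(x)                          # logits over the next audio token

model = NextSoundModel()
past_sounds = torch.randint(0, VOCAB, (1, CONTEXT))  # stand-in for encoded whistles and clicks
logits = model(past_sounds)
next_sound = logits[0, -1].argmax().item()           # most likely next audio token
print(next_sound)
```

The core loop is the same one the article describes for text models: encode the sounds heard so far, then output a probability distribution over what is likely to come next.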
Lightweight and efficient, DolphinGemma is designed to run on smartphones, making it accessible for field researchers and conservationists.
Google DeepMind’s blog noted that the model could mark a major advance in understanding dolphin behaviour, potentially paving the way for more meaningful interactions between humans and marine mammals.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Chinese AI startup DeepSeek has announced its intention to share the technology behind its internal inference engine, a move aimed at enhancing collaboration within the open-source AI community.
The company’s inference engine and training framework have played a vital role in accelerating the performance and deployment of its models, including DeepSeek-V3 and R1.
Built on PyTorch, DeepSeek’s training framework is complemented by a modified version of the vLLM inference engine originally developed in the US at UC Berkeley.
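For context, this is roughly how the upstream, open-source vLLM engine, which DeepSeek's internal engine is described as forking, is used for offline inference. The model identifier and settings below are assumptions for illustration, and a model of this size would require a multi-GPU server to run.

```python
# Illustrative sketch of the stock vLLM API (the open-source project that
# DeepSeek's internal engine reportedly builds on). Model name, parallelism
# and sampling settings are assumptions for the example.
from vllm import LLM, SamplingParams

llm = LLM(model="deepseek-ai/DeepSeek-V3", tensor_parallel_size=8)
params = SamplingParams(temperature=0.7, max_tokens=256)

outputs = llm.generate(
    ["Summarise why open-sourcing an inference engine matters."],
    params,
)
print(outputs[0].outputs[0].text)
```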
While the company will not release the full source code of its engine, it will contribute its design improvements and select components as standalone libraries.
These efforts form part of DeepSeek’s broader open-source initiative, which began earlier this year with the partial release of its AI model code.
Despite this contribution, DeepSeek’s models fall short of the Open Source Initiative’s standards, as the training data and full framework remain restricted.
The company cited limited resources and infrastructure constraints as reasons for not making the engine entirely open-source. Still, the move has been welcomed as a meaningful gesture towards transparency and knowledge-sharing in the AI sector.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Behind ChatGPT’s digital charm lies an increasingly concerning environmental toll, largely driven by its water consumption.
According to recent reports, OpenAI’s GPT-4 model consumes around 500 millilitres of clean, drinkable water for every 100-word response. The surge in demand, fuelled by viral trends like Studio Ghibli-style portraits and Barbie-themed avatars, has significantly amplified this impact.
Each AI interaction, especially those involving image generation, generates heat, necessitating cooling systems that rely heavily on water.
With an estimated 57 million users daily, ChatGPT’s operations result in a staggering daily water usage of over 14,800 crore litres. OpenAI’s CEO, Sam Altman, recently acknowledged server strain, urging users to reduce non-essential use.
The environmental costs extend beyond water. Many data centres supporting AI platforms are located in water-stressed regions and rely on fossil fuels, raising serious concerns about sustainability.
Experts warn that while AI promises convenience, its rapid expansion risks putting additional pressure on fragile ecosystems unless mindful practices are adopted.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Apple has revealed plans to use real user data, in a privacy-preserving way, to improve its AI models. The company has acknowledged that synthetic data alone is not producing reliable results, particularly in training large language models that power tools like Writing Tools and notification summaries.
To address this, Apple will compare AI-generated content with real emails from users who have opted in to share Device Analytics. The sampled emails remain on the user’s device, with only a signal sent to Apple about which AI-generated message most closely matches real-world usage.
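The mechanism can be pictured with a small conceptual sketch: the device scores each AI-generated candidate against locally stored emails and reports only which candidate won, never the emails themselves. This is not Apple's implementation; the embedding function, data, and names below are stand-ins.

```python
# Conceptual sketch of on-device comparison: only the index of the
# best-matching AI-generated candidate leaves the device, not any email text.
from typing import List
import math

def embed(text: str) -> List[float]:
    # Stand-in embedding: normalised bag-of-characters (purely illustrative).
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(c * c for c in counts)) or 1.0
    return [c / norm for c in counts]

def cosine(a: List[float], b: List[float]) -> float:
    return sum(x * y for x, y in zip(a, b))

def best_matching_candidate(real_emails: List[str], candidates: List[str]) -> int:
    """Runs entirely on-device; only the winning candidate's index is shared."""
    scores = []
    for cand in candidates:
        c_vec = embed(cand)
        scores.append(max(cosine(c_vec, embed(email)) for email in real_emails))
    return scores.index(max(scores))

# Only this integer signal would be reported back, not any email content.
signal = best_matching_candidate(
    real_emails=["Lunch at noon tomorrow?", "Quarterly report attached."],
    candidates=["Meeting moved to 3pm", "Invoice overdue notice", "Lunch plans for tomorrow"],
)
print(signal)
```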
The move reflects broader efforts to boost the performance of Apple Intelligence, a suite of features that includes message recaps and content summaries.
Apple has faced internal criticism over slow progress, particularly with Siri, which is now seen as falling behind competitors like Google Gemini and Samsung’s Galaxy AI. The tech giant recently confirmed that meaningful AI updates for Siri won’t arrive until 2026, despite earlier promises of a rollout later this year.
In a rare leadership shakeup, Apple CEO Tim Cook removed AI chief John Giannandrea from overseeing Siri after delays were labelled ‘ugly and embarrassing’ by senior executives.
The responsibility for Siri’s future has been handed to Mike Rockwell, the creator of Vision Pro, who now reports directly to software chief Craig Federighi. Giannandrea will continue to lead Apple’s other AI initiatives.
For more information on these topics, visit diplomacy.edu.
From GPT-4 to 4.5: What has changed and why it matters
In early 2025, OpenAI released GPT-4.5, the latest iteration in its series of large language models (LLMs), pushing the boundaries of what machines can do with language understanding and generation. Building on the strengths of GPT-4, it demonstrates improved reasoning capabilities, a more nuanced understanding of context, and smoother, more human-like interactions.
What sets GPT-4.5 apart from its predecessors is its refined alignment techniques, better memory over longer conversations, and increased control over tone, persona, and factual accuracy. Its ability to maintain coherent, emotionally resonant exchanges over extended dialogue marks a turning point in human-AI communication. These improvements are not just technical: they significantly affect the way we work, communicate, and relate to intelligent systems.
The increasing ability of GPT-4.5 to mimic human behaviour has raised a key question: Can it really fool us into thinking it is one of us? That question has recently been answered — and it has everything to do with the Turing Test.
The Turing Test: Origins, purpose, and modern relevance
In 1950, British mathematician and computer scientist Alan Turing posed a provocative question: ‘Can machines think?’ In his seminal paper ‘Computing Machinery and Intelligence,’ he proposed what would later become known as the Turing Test — a practical way of evaluating a machine’s ability to exhibit intelligent behaviour indistinguishable from that of a human.
In its simplest form, if a human evaluator cannot reliably distinguish between a human’s and a machine’s responses during a conversation, the machine is said to have passed the test. For decades, the Turing Test remained more of a philosophical benchmark than a practical one.
Early chatbots like ELIZA in the 1960s created the illusion of intelligence, but their scripted and shallow interactions fell far short of genuine human-like communication. Many researchers have questioned the test’s relevance as AI progressed, arguing that mimicking conversation is not the same as true understanding or consciousness.
Despite these criticisms, the Turing Test has endured — not as a definitive measure of machine intelligence, but rather as a cultural milestone and public barometer of AI progress. Today, the test has regained prominence with the emergence of models like GPT-4.5, which can hold complex, context-aware, emotionally intelligent conversations. What once seemed like a distant hypothetical is now an active, measurable challenge that GPT-4.5 has, by many accounts, overcome.
How GPT-4.5 fooled the judges: Inside the Turing Test study
In early 2025, a groundbreaking study conducted by researchers at the University of California, San Diego, provided the most substantial evidence yet that an AI could pass the Turing Test. In a controlled experiment involving over 500 participants, multiple conversational agents—including GPT-4.5, Meta’s LLaMa-3.1, and the classic chatbot ELIZA—were evaluated in blind text-based conversations. The participants were tasked with identifying whether they spoke to a human or a machine.
The results were astonishing: GPT-4.5 was judged to be human in 54% to 73% of interactions, depending on the scenario, surpassing the baseline for passing the Turing Test. In some cases, it outperformed actual human participants—who were correctly identified as human only 67% of the time.
That experiment marked the first time a contemporary AI model convincingly passed the Turing Test under rigorous scientific conditions. The study not only demonstrated the model’s technical capabilities—it also raised philosophical and ethical questions.
What does it mean for a machine to be ‘indistinguishable’ from a human? And more importantly, how should society respond to a world where AI can convincingly impersonate us?
Measuring up: GPT-4.5 vs LLaMa-3.1 and ELIZA
While GPT-4.5’s performance in the Turing Test has garnered much attention, its comparison with other models puts things into a clearer perspective. Meta’s LLaMa-3.1, a powerful and widely respected open-source model, also participated in the study.
It was identified as human in approximately 56% of interactions, a strong showing, although it fell just short of the commonly accepted benchmark for passing the Turing Test. The result highlights how subtle differences in conversational nuance and coherence can significantly influence perception.
The study also revisited ELIZA, the pioneering chatbot from the 1960s designed to mimic a psychotherapist. While historically significant, ELIZA’s simplistic, rule-based structure resulted in it being identified as non-human in most cases — around 77%. That stark contrast with modern models demonstrates how far natural language processing has progressed over the past six decades.
The comparative results underscore an important point: success in human-AI interaction today depends not just on fluent language generation but on the ability to adapt tone, context, and emotional resonance. GPT-4.5's edge seems to come not from mere fluency but from its ability to emulate the subtle cues of human reasoning and expression, a quality that left many test participants second-guessing whether they were even talking to a machine.
The power of persona: How character shaped perception
One of the most intriguing aspects of the UC San Diego study was how assigning specific personas to AI models significantly influenced participants’ perceptions. When GPT-4.5 was framed as an introverted, geeky 19-year-old college student, it consistently scored higher in being perceived as human than when it had no defined personality.
The seemingly small narrative detail was a powerful psychological cue that shaped how people interpreted its responses. The use of persona added a layer of realism to the conversation.
Slight awkwardness, informal phrasing, or quirky responses were not seen as flaws — they were consistent with the character. Participants were more likely to forgive or overlook certain imperfections if those quirks aligned with the model’s ‘personality’.
That finding reveals how intertwined identity and believability are in human communication, even when the identity is entirely artificial. The strategy also echoes something long known in storytelling and branding: people respond to characters, not just content.
In the context of AI, persona functions as a kind of narrative camouflage — not necessarily to deceive, but to disarm. It helps bridge the uncanny valley by offering users a familiar social framework. And as AI continues to evolve, it is clear that shaping how a model is perceived may be just as important as what the model is actually saying.
Limitations of the Turing Test: Beyond the illusion of intelligence
While passing the Turing Test has long been viewed as a milestone in AI, many experts argue that it is not the definitive measure of machine intelligence. The test focuses on imitation — whether an AI can appear human in conversation — rather than on genuine understanding, reasoning, or consciousness. In that sense, it is more about performance than true cognitive capability.
Critics point out that large language models like GPT-4.5 do not ‘understand’ language in the human sense – they generate text by predicting the most statistically probable next word based on patterns in massive datasets. That allows them to generate impressively coherent responses, but it does not equate to comprehension, self-awareness, or independent thought.
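The point about prediction rather than comprehension can be illustrated with a toy example: a language model's generation loop simply samples from a probability distribution over possible next words. The vocabulary and probabilities below are invented for illustration; a real LLM computes this distribution with a neural network over a vast vocabulary.

```python
# Toy illustration of "prediction, not comprehension": nothing in this loop
# requires understanding, only sampling from a next-word distribution.
import random

def next_word_distribution(context: str) -> dict:
    # A real LLM would compute these probabilities from the context;
    # here they are a hard-coded stand-in.
    return {"dog": 0.45, "cat": 0.30, "idea": 0.15, "galaxy": 0.10}

def generate(context: str, steps: int = 5) -> str:
    words = context.split()
    for _ in range(steps):
        dist = next_word_distribution(" ".join(words))
        choices, weights = zip(*dist.items())
        words.append(random.choices(choices, weights=weights, k=1)[0])
    return " ".join(words)

print(generate("I took a walk with my"))
```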
No matter how convincing, the illusion of intelligence is still an illusion — and mistaking it for something more can lead to misplaced trust or overreliance. Despite its symbolic power, the Turing Test was never meant to be the final word on AI.
As AI systems grow increasingly sophisticated, new benchmarks are needed: ones that assess not just linguistic mimicry but reasoning, ethical decision-making, and robustness in real-world environments. Passing the Turing Test may grab headlines, but the real test of intelligence lies far beyond the ability to talk like us.
Wider implications: Rethinking the role of AI in society
GPT-4.5’s success in the Turing Test does not just mark a technical achievement — it forces us to confront deeper societal questions. If AI can convincingly pass as a human in open conversation, what does that mean for trust, communication, and authenticity in our digital lives?
From customer service bots to AI-generated news anchors, the line between human and machine is blurring — and the implications are far from purely academic. These developments are challenging existing norms in areas such as journalism, education, healthcare, and even online dating.
How do we ensure transparency when AI is involved? Should AI be required to disclose its identity in every interaction? And how do we guard against malicious uses — such as deepfake conversations or synthetic personas designed to manipulate, mislead, or exploit?
On a broader level, the emergence of human-sounding AI invites a rethinking of agency and responsibility. If a machine can persuade, sympathise, or influence like a person — who is accountable when things go wrong?
As AI becomes more integrated into the human experience, society must evolve its frameworks not only for regulation and ethics but also for cultural adaptation. GPT-4.5 may have passed the Turing Test, but the test for us, as a society, is just beginning.
What comes next: Human-machine dialogue in the post-Turing era
With GPT-4.5 crossing the Turing threshold, we are no longer asking whether machines can talk like us — we are now asking what that means for how we speak, think, and relate to machines. That moment represents a paradigm shift: from testing the machine’s ability to imitate humans to understanding how humans will adapt to coexist with machines that no longer feel entirely artificial.
Future AI models will likely push this boundary even further — engaging in conversations that are not only coherent but also deeply contextual, emotionally attuned, and morally responsive. The bar for what feels ‘human’ in digital interaction is rising rapidly, and with it comes the need for new social norms, protocols, and perhaps even new literacies.
We will need to learn not only how to talk to machines but how to live with them — as collaborators, counterparts, and, in some cases, as reflections of ourselves. In the post-Turing era, the test is no longer whether machines can fool us — it is whether we can maintain clarity, responsibility, and humanity in a world where the artificial feels increasingly real.
GPT-4.5 may have passed a historic milestone, but the real story is just beginning — not one of machines becoming human, but of humans redefining what it means to be ourselves in dialogue with them.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Grok, the AI chatbot from Elon Musk’s xAI, is reportedly gaining a memory feature that allows it to recall previous conversations, bringing it in line with rivals like ChatGPT and Google Gemini.
The feature, spotted by users in the web app, appears as a ‘Personalise with Memories’ toggle in settings and promises to help Grok retain useful context across chats. Users will have the ability to manage what Grok remembers and delete memories when needed, a growing standard in user-controlled AI tools.
The memory update is part of a broader wave of improvements rolling out to Grok, which aims to evolve from a novelty chatbot into a serious digital assistant.
Vision support for voice mode is in development, allowing users to point their camera at objects and receive spoken analysis, while image editing tools are being enhanced to allow stylistic changes to uploaded pictures.
Grok is also preparing to integrate with Google Drive and introduce a new collaborative ‘Workspaces’ feature for larger projects.
These upgrades arrive ahead of the expected release of Grok 3.5, with version 4 planned by year’s end. While the chatbot has carved a niche with its sarcastic tone, xAI appears to be refocusing Grok on practical tasks and creative support.
Whether it can rival the maturity and coherence of more established competitors remains to be seen, but Grok is clearly evolving — and now, it finally remembers who you are.
For more information on these topics, visit diplomacy.edu.
Spotify has introduced its Ads Exchange (SAX) and Generative AI-powered advertisements in India, following a successful pilot in the US and Canada.
The SAX platform aims to give advertisers better control over performance tracking and maximise reach without overloading users with repetitive ads.
Integrated with platforms such as Google DV360, The Trade Desk, and Magnite, SAX enables advertisers to access Spotify’s high-quality inventory and enhance their programmatic strategies. In addition to multimedia formats, podcast ads will soon be included.
Through Generative AI, advertisers can create audio ads within Spotify’s Ads Manager platform at no extra cost, using scripts, voiceovers, and licensed music.
The innovation allows brands to produce more ads with less time and effort, making it quicker and easier to reach a broader audience. Arjun Kolady, Head of Sales – India at Spotify, highlighted the ease of scaling campaigns with these new tools.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Samsung, already the leading home appliance brand in India by volume, is now enhancing its after-sales service with an AI-powered support tool.
The South Korean tech giant has introduced the Home Appliances Remote Management (HRM) tool, designed to improve service speed, accuracy, and the overall customer experience beyond what traditional support methods allow.
The HRM tool allows customer care teams to remotely diagnose and resolve issues in Samsung smart appliances connected via SmartThings. If a problem can be fixed remotely, staff will ask for the user’s consent before taking control of the device.
If the issue can be solved by the customer, step-by-step instructions are provided instead of sending a technician straight away.
When neither of these options applies, the issue is forwarded directly to service technicians with full diagnostics already completed, cutting down the time spent on-site.
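The three-way triage described above can be summarised in a short schematic sketch. It is not Samsung's HRM code; all names and fields are illustrative.

```python
# Schematic sketch of the triage flow: remote fix with consent, guided
# self-service, or escalation to a technician with diagnostics attached.
from dataclasses import dataclass

@dataclass
class Diagnosis:
    issue: str
    remotely_fixable: bool
    user_fixable: bool

def triage(diag: Diagnosis, user_consents: bool) -> str:
    if diag.remotely_fixable:
        if user_consents:
            return f"Remote fix applied for: {diag.issue}"
        return "Awaiting user consent before taking control of the device"
    if diag.user_fixable:
        return f"Sent step-by-step instructions for: {diag.issue}"
    return f"Escalated to technician with full diagnostics: {diag.issue}"

print(triage(Diagnosis("Door sensor fault", remotely_fixable=False, user_fixable=True),
             user_consents=True))
```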
The new system reduces the need for in-home visits, shortens waiting times, and increases appliance uptime.
SmartThings also plays a proactive role by automatically detecting issues and offering solutions before customers even need to call.
Samsung India’s Vice President for Customer Satisfaction, Sunil Cutinha, noted that the tool significantly streamlines service, boosts maintenance efficiency, and helps ensure timely product support for users across the country.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Nvidia is shifting its AI supercomputer manufacturing operations to the United States for the first time, instead of relying on a globally dispersed supply chain.
In partnership with industry giants such as TSMC, Foxconn, and Wistron, the company is establishing large-scale facilities to produce its advanced Blackwell chips in Arizona and complete supercomputers in Texas. Production is expected to reach full scale within 12 to 15 months.
Over a million square feet of manufacturing space has been commissioned, with key roles also played by packaging and testing firms Amkor and SPIL.
The move reflects Nvidia’s ambition to create up to half a trillion dollars in AI infrastructure within the next four years, while boosting supply chain resilience and growing its US-based operations instead of expanding solely abroad.
These AI supercomputers are designed to power new, highly specialised data centres known as ‘AI factories,’ capable of handling vast AI workloads.
Nvidia’s investment is expected to support the construction of dozens of such facilities, generating hundreds of thousands of jobs and securing long-term economic value.
To enhance efficiency, Nvidia will apply its own AI, robotics, and simulation tools across these projects, using Omniverse to model factory operations virtually and Isaac GR00T to develop robots that automate production.
According to CEO Jensen Huang, bringing manufacturing home strengthens supply chains and better positions the company to meet the surging global demand for AI computing power.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
Chinese AI startup Zhipu AI has introduced a free AI agent, AutoGLM Rumination, aimed at assisting users with tasks such as web browsing, travel planning, and drafting research reports.
The product was unveiled by CEO Zhang Peng at an event in Beijing, where he highlighted the agent’s use of the company’s proprietary models—GLM-Z1-Air for reasoning and GLM-4-Air-0414 as the foundation.
According to Zhipu, the new GLM-Z1-Air model outperforms DeepSeek’s R1 in both speed and resource efficiency. The launch reflects growing momentum in China’s AI sector, where companies are increasingly focusing on cost-effective solutions to meet rising demand.
AutoGLM Rumination stands out in a competitive landscape by being freely accessible through Zhipu’s official website and mobile app, unlike rival offerings such as Manus’ subscription-only AI agent. The company positions this move as part of a broader strategy to expand access and adoption.
Founded in 2019 as a spinoff from Tsinghua University, Zhipu has developed the GLM model series and claims its GLM-4 has surpassed OpenAI’s GPT-4 on several evaluation benchmarks.
In March, Zhipu secured major government-backed investment, including a 300 million yuan (US$41.5 million) contribution from Chengdu.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!