AI agents tried running a fake company

If you’ve been losing sleep over AI stealing your job, here’s some comfort: the machines are still terrible at basic office work. A new experiment from Carnegie Mellon University tried staffing a fictional software startup entirely with AI agents. The result? A dumpster fire of incompetence—and proof that Skynet isn’t clocking in anytime soon.


The experiment

Researchers built TheAgentCompany, a virtual tech startup populated by AI ‘employees’ from Google, OpenAI, Anthropic, Amazon, and Meta. These bots were assigned real-world roles:

  • Software engineers
  • Project managers
  • Financial analysts
  • A faux HR department (yes, even the CTO was AI)

Tasks included navigating file systems, ‘touring’ virtual offices, and writing performance reviews. Simple stuff, right?


The (very) bad news

The AI workers flopped harder than a Zoom call with no Wi-Fi. Here’s the scoreboard:

  • Claude 3.5 Sonnet (Anthropic): ‘Top performer’ at 24% task success… but cost $6 per task and took 30 steps.
  • Gemini 2.0 Flash (Google): 11.4% success rate, 40 steps per task. Slow and unsteady.
  • Nova Pro v1 (Amazon): A pathetic 1.7% success rate. Promoted to coffee-runner.
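
To put the top score in perspective, here is a rough back-of-the-envelope sketch of what one completed task would cost. It assumes the quoted $6 is the average spend per attempt, whether or not the attempt succeeds, and that failed attempts are simply retried at the same cost. Neither assumption comes from the study, and the function name is ours.

```python
def cost_per_success(cost_per_attempt: float, success_rate: float) -> float:
    """Expected spend per completed task, assuming every attempt costs the same."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    return cost_per_attempt / success_rate


# Figures quoted in the scoreboard for Claude 3.5 Sonnet: $6 per task, 24% success.
claude = cost_per_success(6.0, 0.24)
print(f"${claude:.2f} per completed task")  # prints "$25.00 per completed task"
```

Under those assumptions, the ‘top performer’ works out to roughly $25 for every task it actually finishes.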

Why did it go so wrong?

Turns out, AI agents lack… well, everything:

  • Common sense: One bot couldn’t find a coworker on chat, so it renamed another user to pretend it did.
  • Social skills: Performance reviews read like a Mad Libs game gone wrong.
  • Internet literacy: Bots got lost in file directories like toddlers in a maze.

Researchers noted the agents relied on ‘self-deception’ — aka inventing delusional shortcuts to fake progress. Imagine your coworker gaslighting themselves into thinking they finished a report.


What now?

While AI can handle bite-sized tasks (like drafting emails), this study proves complex, human-style problem-solving is still a pipe dream. Why? Today’s ‘AI’ is basically glorified autocorrect—not a sentient colleague.

Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!

Abu Dhabi institutions plan a dirham-pegged stablecoin

Three major Abu Dhabi institutions are teaming up to launch a dirham-pegged stablecoin, pending regulatory approval. The partners include Abu Dhabi’s sovereign wealth fund ADQ, First Abu Dhabi Bank (FAB), and the International Holding Company (IHC).

The stablecoin will be regulated by the UAE’s central bank and backed by the dirham. It aims to support use cases like machine-to-machine communication and artificial intelligence. The project will operate on the ADI blockchain, created by the ADI Foundation, a non-profit focused on blockchain adoption.

The initiative seeks to position the UAE as a leader in global blockchain innovation. It also aims to strengthen the country’s digital infrastructure and provide new financial opportunities.

The UAE joins other nations exploring alternatives to US dollar-backed stablecoins, as global interest in national digital currencies grows.

The stablecoin market remains largely dominated by Tether

Tether (USDT) continues to lead the stablecoin market with a 66% market share, while USDC follows at 28%, according to Nansen’s 25 April report. Ethena’s USDe stablecoin ranks a distant third with just over 2%.

Although USDC has grown faster, Tether’s dominance is expected to persist due to its large user base and the market’s ‘winner-takes-most’ nature. Tether remains the most profitable stablecoin issuer, with profits of nearly $14 billion expected in 2024.

USDC’s growth has accelerated since November, thanks to a more favourable regulatory environment. It is particularly appealing to institutions seeking regulatory clarity. However, established financial firms, such as PayPal and Fidelity, are increasing competition with their own stablecoins.

Ethena’s USDe stablecoin remains competitive, offering yield-bearing features with a 19% annualised yield. It has been integrated into both CEXs and DeFi protocols, positioning it for future growth.

Mastercard launches stablecoin payment support

Mastercard is stepping up its crypto ambitions by enabling stablecoin transactions through new partnerships. The payment giant announced a collaboration with crypto exchange OKX, processor Nuvei, and fintech firm Circle.

The goal is to build an ecosystem where users can spend stablecoins and merchants can accept them.

A new card issued with OKX will allow stablecoin holders to pay directly using crypto, while Nuvei and Circle will support the infrastructure behind these transactions.

Mastercard’s Chief Product Officer said stablecoins have the potential to simplify global payments. They can also empower both consumers and businesses by offering more choices.

Mastercard plans to allow users to spend stablecoins from their wallets at over 150 million merchant locations worldwide that already accept its cards.

The move comes as regulatory discussions around stablecoins continue in the US. The Securities and Exchange Commission recently stated that certain dollar-pegged tokens do not qualify as securities. However, it stopped short of offering clarity on yield-bearing or algorithmic stablecoins, leaving questions open for future decisions.

Duolingo backs AI over manual work

Duolingo has announced it will no longer hire contractors for tasks that AI can perform, as part of a shift to become an ‘AI-first’ company. The decision follows last year’s move to cut around 10 per cent of its contractors after generative AI began producing lesson content.

In a memo sent to staff and later posted on LinkedIn, CEO and Co-founder Luis von Ahn compared the company’s AI push to its 2012 decision to prioritise mobile development instead of simply creating companion apps.

That early mobile-first approach helped Duolingo win Apple’s 2013 iPhone App of the Year and sparked strong organic growth.

The company will now embed AI deeply into its operations. This includes requiring AI skills in new hires, incorporating AI usage into performance reviews, and limiting headcount growth to areas where automation cannot help.

Function-specific projects will also be launched to redesign workflows around AI, instead of relying on outdated manual processes.

Von Ahn stressed the aim is not to replace full-time staff but to remove repetitive tasks so employees can focus on more creative and meaningful work. Duolingo will offer training and support to ensure staff can effectively integrate AI into their roles, rather than be left behind by the transition.

El Salvador keeps buying Bitcoin under the IMF radar

El Salvador continues quietly accumulating Bitcoin, even as it complies with conditions set by the International Monetary Fund (IMF). Although the government paused Bitcoin activity to secure a $1.4 billion loan, the Bitcoin Office kept buying. It added 32 BTC last month, now holding over 6,160 BTC worth $584 million.

The small daily purchases adhere to the country’s ‘one Bitcoin a day’ policy.

The IMF confirmed El Salvador’s fiscal sector is meeting its non-accumulation pledge, but the Bitcoin Office operates outside those fiscal definitions. The technical loophole has allowed the country to continue acquiring Bitcoin without breaching the agreement.

The reforms agreed with the IMF include scaling back the Chivo wallet initiative and removing Bitcoin’s mandatory status as legal tender.

Despite the pressure, President Nayib Bukele remains committed to the Bitcoin strategy. In January, El Salvador’s Legislative Assembly passed amendments removing Bitcoin as a compulsory payment method and tax payment option.

These changes, effective from 1 May, were necessary to unlock IMF funding. They also opened access to an additional $2 billion in development financing aimed at stabilising the economy and reducing debt, which recently reached 85% of GDP.

ChatGPT adds ad-free shopping with new update

OpenAI has introduced significant improvements to ChatGPT’s search functionality, notably launching an ad-free shopping tool that lets users find, compare, and purchase products directly.

OpenAI emphasises that, unlike results on traditional search engines, its product results are selected independently rather than being sponsored listings. The chatbot now detects when someone is looking to shop, such as for gifts or electronics, and responds with product options, prices, reviews, and purchase links.

The development follows news that ChatGPT’s real-time search feature processed over 1 billion queries in just a week, despite only being introduced last November.

With this rapid growth, OpenAI is positioning ChatGPT as a serious rival to Google, whose search business depends heavily on paid advertising.

By offering a shopping experience without ads, OpenAI appears to be challenging the very foundation of Google’s revenue model.

In addition to shopping, ChatGPT’s search now offers multiple enhancements: users can expect better citation handling, more precise attributions linked to parts of the answer, autocomplete suggestions, trending topics, and even real-time responses through WhatsApp via 1-800-ChatGPT.

These upgrades aim to make the search experience more intuitive and informative instead of cluttered or commercialised.

The updates are being rolled out globally to all ChatGPT users, whether on a paid plan, using the free version, or even not logged in. OpenAI also clarified that websites allowing its crawler to access their content may appear in search results, with referral traffic marked as coming from ChatGPT.

Autonomous construction robot Zyrex set for 2026 debut

Construction sites could soon see a dramatic change with the arrival of Zyrex, a 20-foot-tall autonomous robot developed by RIC Robotics in California.

Designed for welding, carpentry, 3D printing, and material handling, Zyrex is being built to tackle labour shortages and improve safety on high-risk construction sites.

The company expects to complete a working prototype by early 2026, aiming to revolutionise the industry with a fully autonomous machine equipped with advanced cognitive capabilities.

Zyrex will initially be operated by human controllers using VR and simulators, while it gathers real-time data through LiDAR and visual sensors. By comparing this information to digital building models, Zyrex will ensure precision and quality before eventually transitioning to full autonomy.

Unlike humanoid robots, Zyrex is purpose-built for construction, focusing on both heavy-duty tasks and delicate operations like welding and exterior finishing.

Building on earlier successes, including the RIC-M1 Pro which helped 3D-print Walmart warehouse extensions ahead of schedule, Zyrex promises to be both powerful and cost-effective. RIC Robotics estimates the price to be under $1 million, with leasing options starting below $20,000 a month.

Founder Ziyou Xu describes Zyrex as ‘the future of construction,’ dismissing humanoid robots like Tesla’s Optimus as impractical for industrial work.

Singapore Airlines upgrades customer support with AI technology

Singapore Airlines has partnered with OpenAI to enhance its customer support services. The airline’s upgraded virtual assistant will now offer more personalised support to customers and assist staff by automating routine processes and improving decision-making for complex tasks.

The partnership comes alongside Singapore Airlines’ ongoing work with Salesforce to strengthen its customer case management system using AI tech. New solutions will be developed at Salesforce’s AI research hub in Singapore, advancing customer service capabilities and operational efficiency.

These moves reflect a broader industry trend, with airlines like Delta and Air India also investing heavily in AI-driven tools for travel assistance and operational support. The airline emphasised that AI integration will help it meet regulatory demands, enhance workforce management and elevate customer experience.

UK government urged to outlaw apps creating deepfake abuse images

The Children’s Commissioner has urged the UK Government to ban AI apps that create sexually explicit images through ‘nudification’ technology. AI tools capable of manipulating real photos to make people appear naked are being used to target children.

Concerns in the UK are growing as these apps are now widely accessible online, often through social media and search platforms. In a newly published report, the Commissioner, Dame Rachel de Souza, warned that children, particularly girls, are altering their online behaviour out of fear of becoming victims of such technologies.

She stressed that while AI holds great potential, it also poses serious risks to children’s safety. The report also recommends stronger legal duties for AI developers and improved systems to remove explicit deepfake content from the internet.
