Research unveils AI overreliance on memorisation

Recent research from MIT’s Computer Science and Artificial Intelligence Laboratory (CSAIL) has uncovered significant insights into the capabilities of large language models (LLMs). The study found that while LLMs excel in familiar scenarios, they struggle with novel tasks, raising questions about their true reasoning abilities versus reliance on memorisation.

The researchers compared LLMs’ performance on common tasks to hypothetical scenarios that deviated from their training data. For instance, models like GPT-4 showed proficiency in arithmetic using base-10 but faltered with other number bases, indicating a lack of generalisable addition skills. The pattern was consistent across various tasks, including spatial reasoning and chess, where models performed no better than random guessing in unfamiliar settings.
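To make the arithmetic example concrete, the sketch below shows what a generalisable addition procedure looks like: the same digit-by-digit carry rule works in any base, and only the point at which a carry occurs changes. This is an illustration, not the researchers' evaluation code; the function name `add_in_base` and the choice of base 9 as the unfamiliar case are assumptions made for this example.

```python
# Illustrative sketch only: `add_in_base` is a hypothetical helper,
# not code from the CSAIL study.
DIGITS = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def add_in_base(a: str, b: str, base: int) -> str:
    """Add two non-negative numerals written in `base`, digit by digit."""
    if not 2 <= base <= len(DIGITS):
        raise ValueError("base must be between 2 and 36")
    result = []
    carry = 0
    i, j = len(a) - 1, len(b) - 1
    # Walk both numerals from the least significant digit, as in
    # schoolbook addition; missing digits count as zero.
    while i >= 0 or j >= 0 or carry:
        da = DIGITS.index(a[i].upper()) if i >= 0 else 0
        db = DIGITS.index(b[j].upper()) if j >= 0 else 0
        if da >= base or db >= base:
            raise ValueError("digit out of range for this base")
        # The only base-dependent step: when to carry.
        carry, digit = divmod(da + db + carry, base)
        result.append(DIGITS[digit])
        i -= 1
        j -= 1
    return "".join(reversed(result)) or "0"


print(add_in_base("27", "62", 10))  # 89
print(add_in_base("27", "62", 9))   # 100
```

The same two numerals give 89 in base 10 and 100 in base 9. A system that has learned the carry rule handles both, whereas one that has memorised base-10 sums would be expected to fail on the second.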

Lead author Zhaofeng Wu emphasised the importance of these findings, noting that as AI becomes more integrated into society, it must handle diverse scenarios reliably. The study’s insights aim to inform the development of more adaptable and robust future LLMs. The team plans to expand their research to include more complex and varied tasks, further exploring AI’s limitations and improving interpretability.

Supported by the MIT–IBM Watson AI Lab, the MIT Quest for Intelligence, and the National Science Foundation, the study was presented at the annual conference of the North American Chapter of the Association for Computational Linguistics (NAACL).

HR embraces AI in hiring

A recent survey reveals that 66% of HR leaders have a more positive view of AI in the workplace than they did a year ago. Commissioned by recruitment platform HireVue, the research also found that 67% believe AI is as effective as or better than humans at identifying qualified applicants.

Linsey Zuloaga, HireVue’s chief data scientist, highlighted AI’s potential to streamline hiring by automating repetitive tasks and improving candidate communication.

The survey also showed that 64% of candidates feel AI is as fair as or fairer than human recruiters, with 49% believing AI can address bias in hiring. Despite this, 75% of candidates oppose AI making final hiring decisions. Zuloaga emphasised the importance of transparency, suggesting that HR departments clearly communicate how AI is used in the hiring process to build trust.

Rich Bye of Workday noted that attitudes toward AI improve as its benefits become apparent, such as increased efficiency and reliability in candidate screening.

However, the survey found that 42% of HR professionals are waiting for corporate guidelines on generative AI, and 33% have implemented AI without formal approval. Zuloaga advised HR leaders to ensure potential AI tools comply with ethical and regulatory standards before implementation.

AI model improves speed and accuracy of heart MRI analysis

Researchers from the Universities of East Anglia, Sheffield, and Leeds have developed an AI model to examine heart images from MRI scans. The model uses a four-chamber plane view to quickly and accurately determine the size and function of the heart’s chambers. 

Dr Pankaj Garg from UEA’s Norwich Medical School stated that while manual MRI analysis can take up to 45 minutes, the AI model performs the task in just a few seconds.

The study used data from 814 patients at Sheffield and Leeds hospitals to train the AI model, with additional testing on 101 patients from Norfolk and Norwich University Hospitals. 

Unlike previous studies, this model was trained on diverse data from multiple hospitals and scanner types, providing a comprehensive analysis of all four heart chambers. Dr Hosamadin Assadi from UEA highlighted the model’s potential to improve diagnosis, treatment decisions, and patient outcomes.

Future research will focus on testing the AI model with larger patient groups from different hospitals and scanner types. The study was a collaboration between several universities and NHS trusts, supported by the Wellcome Trust Clinical Research Career Development Fellowship.

Go legend Lee Saedol defeated by AI in landmark match

Lee Saedol, once the world’s top Go player, experienced a turning point in 2016 when he was defeated by AlphaGo, an AI program developed by Google’s DeepMind. This unexpected loss highlighted the significant advancements in AI, showcasing its ability to master complex tasks previously considered exclusive to human expertise.

AlphaGo’s victory over Lee, an 18-time world champion, demonstrated AI’s potential to achieve superhuman proficiency in pursuits such as Go, a game known for its complexity and strategic depth. The match, which garnered global attention, revealed the profound impact AI could have on various fields beyond board games.

Following his defeat, Lee Saedol retired, acknowledging that AI had fundamentally changed the nature of Go. Now, at 41, he urges others to familiarise themselves with AI technology to avoid being unprepared for its widespread implications. He lectures about AI, emphasising its growing influence and the need for society to adapt.

Despite his initial shock, Lee remains engaged with the Go community, writing books and founding a Go academy for children. He frequently discusses AI’s future impact on his family, particularly its influence on job markets and everyday life, underscoring the importance of choosing careers resilient to AI advancements.

Microsoft reveals VALL-E 2 AI, achieving human-like speech

Microsoft has made a significant leap forward in AI speech generation with its VALL-E 2 text-to-speech (TTS) system. VALL-E 2 achieves human parity, meaning it can produce voices indistinguishable from real people. The system only needs a few seconds of audio to learn and mimic a speaker’s voice.

Tests on speech datasets like LibriSpeech and VCTK showed that VALL-E 2’s voice quality matches or even surpasses human quality. Features like ‘Repetition Aware Sampling’ and ‘Grouped Code Modeling’ allow the system to handle complex sentences and repetitive phrases naturally, ensuring smooth and realistic speech output.

Despite releasing audio samples, Microsoft considers VALL-E 2 too advanced for public release due to potential misuse like voice spoofing. This cautious approach aligns with the wider industry’s concerns, as seen with OpenAI’s restrictions on its voice technology.

While VALL-E 2 represents a significant breakthrough, it remains a research project for now. The development of AI continues apace, with companies striving to balance innovation with ethical considerations.

Moroccan AI influencer wins Miss AI title

In a groundbreaking event, Kenza Layli, an AI-generated Moroccan influencer, has been crowned the first Miss AI. Layli, created by Myriam Bessa of the Phoenix AI agency, aims to bring diversity and inclusivity to the AI creator landscape. Layli, who has nearly 200,000 Instagram followers and 45,000 on TikTok, is entirely AI-generated, from her images to her captions and acceptance speech.

The inaugural Miss AI contest, organised by the influencer platform Fanvue, attracted entries from 1,500 AI programmers worldwide. Bessa will receive $5,000, support on Fanvue, and a publicist to elevate Layli’s profile. Runners-up included AI contestants Lalina Valina from France and Olivia C from Portugal.

Unlike earlier virtual influencers, these contestants were created solely using AI programs such as DALL·E 3, Midjourney, and Stable Diffusion, with their speeches and posts generated by ChatGPT. Layli’s Instagram page features her fondness for the colour red, motivational advice, and support for her national sports team.

Judges, including AI influencer Aitana Lopez and human pageantry historian Sally-Ann Fawcett, assessed contestants on looks, AI tool usage, and social media influence. Despite Layli’s unique representation, experts warn that AI beauty pageants may further homogenise beauty standards, reflecting existing biases in society.

London startup founded by former Snapchat employee secures $4m for AI in gaming

London-based games startup Iconic AI has secured $4m (£3m) in a pre-seed funding round. Founded in 2023 by former Snapchat employee John Lusty, the company aims to revolutionise high-budget game development using AI.

Lusty believes AI can expedite the development process, which has become increasingly costly and time-consuming.

Despite the global gaming industry generating significant revenue, many studios have been downsizing. Earlier this year, Microsoft and Bethesda announced the closure of several game development studios. Lusty is confident that Iconic AI can enhance human creativity rather than replace it, making game development more efficient and secure for developers.

The funding round was led by HodlCo, with participation from FOV Ventures, Interface Capital, Deepwater Asset Management, and scout funds from Sequoia and Atomico. Angel investors included former senior executives from DeepMind, OpenAI, Disney, Tencent, and Microsoft. Lusty believes their AI-driven approach will create more jobs and enable smaller teams to produce numerous games efficiently.

New AI partnership focuses on early lung cancer diagnosis

Personalised medicine company Spesana and Imidex, developer of computer-aided detection technology, have announced a strategic partnership to explore AI’s impact on lung cancer detection. The collaboration will combine Imidex’s VisiRad XR detection algorithm and Spesana’s medical data platform to study the detection rates of lung nodules and masses in existing chest x-rays.

The clinical trial aims to quantify how many additional lung masses can be detected, identify at-risk patients for clinical trials, and evaluate the use of liquid biopsies resulting from nodule detection. Carla Balch, CEO of Spesana, envisions early lung cancer detection leading to earlier treatment and better patient outcomes.

Wes Bolsen, CEO of Imidex, highlighted that their FDA-cleared algorithm will improve the screening of potential lung cancer patients. The collaboration aims to equip healthcare providers and pharmaceutical companies with tools to detect lung nodules earlier, optimising healthcare resources and improving patient outcomes.

SoftBank Group acquires AI chipmaker Graphcore

SoftBank Group, the Japanese multinational investment holding company, has acquired Graphcore, a British AI chipmaker, ending speculation about Graphcore’s future amid its financial struggles. Once positioned as a competitor to Nvidia, Graphcore has faced challenges securing sufficient investment despite its technological potential.

Graphcore, valued at $2.77 billion in 2020, had been grappling with financial viability, including layoffs and operational closures. CEO Nigel Toon acknowledged the company’s difficulties but expressed optimism about the deal with SoftBank, highlighting the substantial resources it brings.

Toon emphasised the significant investment from SoftBank, noting its transformative impact on Graphcore’s global competitiveness. However, he pointed out structural barriers in the UK tech industry, such as limited domestic investment from pension funds, hindering growth opportunities.

Regarding potential collaboration with SoftBank-owned Arm Holdings, a leading chip designer, Toon indicated Graphcore’s intention to leverage synergies within SoftBank’s portfolio, although specifics were not disclosed.

OpenAI introduces a five-tier system to measure AI progress

OpenAI has launched a five-tier system to measure its progress towards developing AI that can surpass human performance. The new classification aims to provide clearer insights into the company’s approach to AI safety and future goals. The system, unveiled to employees during an all-hands meeting, outlines stages from conversational AI to advanced AI systems that are capable of running an entire organisation.

Currently, OpenAI is at the first level but is approaching the second stage, called ‘Reasoners.’ That level represents AI systems that can perform basic problem-solving tasks comparable to a human with a doctorate but without additional tools. During the meeting, leadership showcased a research project involving the GPT-4 model, demonstrating new capabilities that exhibit human-like reasoning.

The five-tier framework is still a work in progress, with plans to gather feedback from employees, investors, and the board. OpenAI’s ultimate goal is to create artificial general intelligence (AGI), which involves developing AI that outperforms humans in most tasks. CEO Sam Altman remains optimistic that AGI could be achieved within this decade.