Big Tech’s AI models fall short of new EU AI Act’s standards
Prominent AI models fail to meet the EU regulations, particularly in cybersecurity resilience and non-discriminatory output.
A recent assessment of some of the top AI models has revealed significant gaps in compliance with the EU regulations, particularly in cybersecurity resilience and preventing discriminatory outputs. The study by Swiss startup LatticeFlow in collaboration with the EU officials, tested generative AI models from major tech companies like Meta, OpenAI, and Alibaba. The findings are part of an early attempt to measure compliance with the EU’s upcoming AI Act, which will be phased in over the next two years. Companies that fail to meet these standards could face fines of up to €35 million or 7% of their global annual turnover.
LatticeFlow’s ‘Large Language Model (LLM) Checker’ evaluated the AI models across multiple categories, assigning scores between 0 and 1. While many models received respectable scores, such as Anthropic’s ‘Claude 3 Opus,’ which scored 0.89, others revealed vulnerabilities. For example, OpenAI’s ‘GPT-3.5 Turbo’ received a low score of 0.46 for discriminatory output, and Alibaba’s ‘Qwen1.5 72B Chat’ scored even lower at 0.37, highlighting the persistent issue of AI reflecting human biases in areas like gender and race.
In cybersecurity testing, some models also struggled. Meta’s ‘Llama 2 13B Chat’ scored 0.42 in the ‘prompt hijacking’ category, a type of cyberattack where malicious prompts are used to extract sensitive information. Mistral’s ‘8x7B Instruct’ model fared similarly poorly, scoring 0.38. These results show the need for tech companies to strengthen security measures to meet the EU’s strict standards.
While the EU is still finalising the enforcement details of its AI Act, expected by 2025, LatticeFlow’s test provides an early roadmap for companies to fine-tune their models. LatticeFlow CEO Petar Tsankov expressed optimism, noting that the test results are mainly positive and offer guidance for companies to improve their models’ compliance with the forthcoming regulations.
The European Commission, though unable to verify external tools, has welcomed this initiative, calling it a ‘first step’ toward translating the AI Act into enforceable technical requirements. As tech companies prepare for the new rules, the LLM Checker is expected to play a crucial role in helping them ensure compliance.