Publishers challenge OpenAI over alleged copyright infringement
Britannica and Merriam-Webster are suing OpenAI over the use of copyrighted training data.
Legal pressure is increasing on OpenAI as Encyclopaedia Britannica and Merriam-Webster file a lawsuit accusing the company of large-scale copyright violations.
According to the complaint, nearly 100,000 copyrighted articles were allegedly used without authorisation to train large language models. Publishers also argue that AI-generated outputs can reproduce parts of their content, raising concerns about unauthorised distribution.
Additional claims focus on how AI systems retrieve and present information. The lawsuit argues that retrieval-augmented generation tools may rely on proprietary databases, potentially undermining publishers’ business models by reducing traffic to original sources.
Concerns are also raised about inaccurate outputs attributed to publishers, which could affect trust in established information providers. The case highlights ongoing tensions between AI development and intellectual property protections.
Growing legal disputes involving media organisations, including The New York Times, suggest that courts will play a key role in defining how copyrighted material can be used in AI training.
Would you like to learn more about AI, tech and digital diplomacy? If so, ask our Diplo chatbot!
