12 Jan 2025

Meta accused of using pirated books for AI

Prominent authors sue the social media giant over copyright violations.

A group of authors, including Ta-Nehisi Coates and Sarah Silverman, has accused Meta Platforms of using pirated books to train its AI systems with CEO Mark Zuckerberg’s approval. Newly disclosed court documents filed in California allege that Meta knowingly relied on the LibGen dataset, which contains millions of pirated works, to develop its large language model, Llama.

The lawsuit, initially filed in 2023, claims Meta infringed on copyright by using the authors’ works without permission. The authors argue that internal Meta communications show employees raised concerns about the dataset’s legality and that those concerns were ultimately overruled. Meta has not yet responded to the latest allegations.

The case is one of several challenging the use of copyrighted materials to train AI systems. While defendants in similar lawsuits have cited fair use, the authors contend that newly uncovered evidence strengthens their claims. They have requested permission to file an updated complaint, adding computer fraud allegations and revisiting dismissed claims related to copyright management information.

US District Judge Vince Chhabria has allowed the authors to file an amended complaint but expressed doubts about the validity of some new claims. The outcome of the case could have broader implications for how AI companies utilise copyrighted content in training data.
