AI industry faces threat of copyright law in 2024

Copyright law is set to pose a substantial challenge to the artificial intelligence (AI) sector in 2024, particularly after generative AI (GenAI) technologies became pervasive in 2023. At the heart of the matter are concerns about the use of copyrighted material to train AI systems and about outputs that may be substantially similar to existing copyrighted works. The legal battles now underway are expected to shape the future of AI innovation and may even change the industry’s economic models and overall direction.
Tech companies warn that the lawsuits could create massive barriers for the expanding AI sector, while the plaintiffs argue that the firms used their work without authorization and owe them fair compensation.

Legal Challenges and Industry Impact

AI programs that generate outputs comparable to existing works could infringe copyright if they had access to those works and produced substantially similar results. In late December 2023, the New York Times became the first major American news organization to sue OpenAI and its backer Microsoft, asking the court to order the destruction of the GPT large language models (LLMs), including those powering the popular chatbot ChatGPT, and of all training datasets that incorporate the publication’s copyrighted content. The newspaper alleges that the companies’ AI systems engaged in ‘widescale copying’ in violation of copyright law.
This high-profile case illustrates the broader legal challenges faced by AI companies. Authors, creators, and other copyright holders have initiated lawsuits to protect their works from being used without permission or compensation.

As recently as 5 January 2024, authors Nicholas Basbanes and Nicholas Gage filed a new complaint against OpenAI and its investor Microsoft, alleging that their copyrighted works were used without authorization to train the companies’ AI models, including ChatGPT. In the proposed class action, filed in federal court in Manhattan, the authors accuse the companies of copyright infringement for including multiple of their works in the datasets used to train OpenAI’s GPT LLM.

This lawsuit is one in a series of legal cases filed by writers and organizations, including well-known names such as George R.R. Martin and Sarah Silverman, alleging that tech firms used their protected work to train AI systems without offering any payment or compensation. The outcomes of these lawsuits could have significant implications for the growing AI industry, with tech companies openly warning that adverse verdicts could create considerable hurdles and uncertainty.

Ownership and Fair Use

Questions about who owns the output generated by AI systems, whether the companies and developers that design the systems or the end users who supply the prompts and inputs, are central to the ongoing debate. The ‘fair use’ doctrine, often cited by the United States Copyright Office (USCO), the United States Patent and Trademark Office (USPTO), and the federal courts, is a critical factor, as it allows creators to build upon copyrighted work. However, its application to AI models trained on massive datasets of copyrighted content is still being tested in the courts.

Policy and Regulation

The USCO has initiated a project to examine the legal and policy challenges that AI raises for copyright. This involves evaluating the scope of copyright protection for works created with AI tools and the use of copyrighted content to train foundation models and LLM-based AI systems. The endeavour acknowledges the need for clarification and possible future regulatory adjustments to address the pressing issues at the intersection of AI and copyright law.

Industry Perspectives

Many stakeholders in the AI industry argue that training generative AI systems, including LLMs and other foundation models, on the large and diverse body of content available online, most of which is copyrighted, is the only realistic and cost-effective way to build them. According to the Silicon Valley venture capital firm Andreessen Horowitz, extending copyright rules to cover AI model training could constitute an existential threat to the current AI industry.

Why does it matter?

The intersection of AI and copyright law is a complex issue with significant implications for innovation, legal liability, ownership rights, commercial interests, policy and regulation, consumer protection, and the future of the AI industry.

The AI sector in 2024 is at a crossroads with existing copyright law, particularly in the US. How the legal system responds to these challenges will be critical in striking the right balance between protecting creators’ rights and promoting AI innovation and progress. As lawsuits proceed and policymakers engage with these issues, the AI industry may come under significant pressure to adapt, depending on the legal interpretations and policy decisions that emerge. Ultimately, these legal battles could determine the market’s winners and losers.