OpenAI’s use of Scarlett Johansson’s voice faces Hollywood backlash

OpenAI’s use of Scarlett Johansson’s voice likeness in its AI model, ChatGPT, has ignited controversy in Hollywood, with Johansson accusing the company of copying her performance from the movie ‘Her’ without consent. The dispute has intensified concerns among entertainment executives about the implications of AI technology for the creative industry, particularly regarding copyright infringement and the right of publicity.

Despite OpenAI’s claims that the voice in question was not intended to resemble Johansson’s, the incident has strained relations between content creators and tech companies. Some industry insiders view OpenAI’s actions as disrespectful and indicative of hubris, potentially hindering future collaborations between Hollywood and the tech giant.

The conflict with Johansson highlights broader concerns about using copyrighted material in OpenAI’s models and the need to protect performers’ rights. While some technologists see AI as a valuable tool for enhancing filmmaking processes, others worry about its potential misuse and infringement on intellectual property.

Johansson’s case could set a precedent for performers seeking to protect their voice and likeness rights in the age of AI. Legal experts and industry figures advocate for federal legislation to safeguard performers’ rights and address the growing impact of AI-generated content, signalling a broader dialogue about the need for regulatory measures in this evolving landscape.

Uganda minister urges stronger digital regulations for cultural diversity and artists’ rights

During World Culture Day in Kampala, Minister of State for Gender and Culture Peace Mutuuzo highlighted the urgent need for stronger regulation of digital platforms to protect cultural diversity, safeguard artists’ intellectual property, and ensure fair access to content. She noted concerns about the dominance of digital platforms in cultural content distribution, which makes it harder for artists to protect their intellectual property and secure fair compensation.

This year’s World Culture Day, themed “Digital Transformation of the Culture and Creative Industries: Packaging Art and Culture as a National Public Good,” calls for updated legal structures to support digital transformation while ensuring accessibility and benefits for all.

Mutuuzo stressed that the government remains committed to strengthening the culture and creative industry through new and existing policies and legal frameworks. As part of this effort, the commemorative day aims to raise public awareness about culture’s role in development, deepen understanding of cultural diversity, and encourage appreciation of Uganda’s heritage, as guaranteed by its Constitution. It also seeks to advance the goals of the UNESCO Convention on the Protection and Promotion of the Diversity of Cultural Expressions, including sustainable governance for culture, balanced cultural exchanges, increased mobility for artists, the integration of culture into development, and the promotion of human rights.

Why does it matter?

In this context, the music industry in particular faces significant challenges with the growth of digital platforms, and Uganda is far from alone in these concerns. The rapid rise of AI-generated content, exemplified by the release of a song mimicking Drake and The Weeknd, has underscored the need for the music industry to adapt to technological advancements. Earlier this year, the EU proposed changes to the music streaming industry to promote smaller artists and ensure fair compensation by addressing inadequate royalties and biased algorithms. Meanwhile, Canada’s Online Streaming Act has introduced new regulations for digital distributors and media, potentially including new CanCon (Canadian content) requirements.

Scarlett Johansson slams OpenAI for voice likeness

Scarlett Johansson has accused OpenAI of creating a voice for its ChatGPT system that sounds ‘eerily similar’ to hers, despite her having declined an offer to voice the chatbot herself. Johansson’s statement, released Monday, followed OpenAI’s announcement that it would withdraw the voice known as ‘Sky’.

OpenAI CEO Sam Altman clarified that Sky’s voice was performed by a different professional actress and was never intended to imitate Johansson’s. He expressed regret for not communicating better and said the company had paused the use of Sky’s voice out of respect for Johansson.

Johansson revealed that Altman had approached her last September with an offer to voice a ChatGPT feature, which she turned down. She stated that the resemblance of Sky’s voice to her own shocked and angered her, noting that even her friends and the public found the similarity striking. The actress suggested that Altman might have intentionally chosen a voice resembling hers, referencing his tweet about ‘Her’, a film where Johansson voices an AI assistant.

Why does it matter?

The controversy highlights a growing issue in Hollywood concerning the use of AI to replicate actors’ voices and likenesses. Johansson’s concerns reflect broader industry anxieties as AI technology advances, making computer-generated voices and images increasingly indistinguishable from human ones. She has hired legal counsel to investigate the creation process of Sky’s voice.

OpenAI recently introduced its latest AI model, GPT-4o, featuring audio capabilities that enable users to converse with the chatbot in real time, showcasing a leap forward in creating more lifelike AI interactions. Scarlett Johansson’s accusations underline the ongoing challenges and ethical considerations of using AI in entertainment.

US voice actors claim AI firm illegally copied their voices

Two voice actors have filed a lawsuit against AI startup Lovo in Manhattan federal court, alleging that the company illegally copied their voices for use in its AI voiceover technology without permission. Paul Skye Lehrman and Linnea Sage claim Lovo tricked them into providing voice samples under false pretences and is now selling AI versions of their voices. They seek at least $5 million in damages for the proposed class-action suit, accusing Lovo of fraud, false advertising, and violating their publicity rights.

The actors were approached via the freelance platform Fiverr for voiceover work: Lehrman was told his voice would be used for a research project, and Sage that hers was for test scripts for radio ads. However, Lehrman later discovered AI versions of his voice in YouTube videos and podcasts, while Sage found her voice in Lovo’s promotional materials. According to the complaint, their Fiverr clients were actually Lovo employees, and the company was selling their voices under pseudonyms.

The lawsuit adds to the growing list of legal actions against tech companies for allegedly misusing content to train AI systems. Lehrman and Sage seek to prevent similar misuse of voices by Lovo and other companies, emphasising the need for accountability in the AI industry. Lovo has not yet responded to the allegations.

Dotdash Meredith partners with OpenAI for AI integration

Dotdash Meredith, a prominent publisher overseeing titles like People and Better Homes & Gardens, has struck a deal with OpenAI, marking a significant step in integrating AI technology into the media landscape. The agreement involves utilising OpenAI’s models in Dotdash Meredith’s ad-targeting product, D/Cipher, to enhance its precision and effectiveness. Additionally, licensing content for use in ChatGPT will expand the reach of Dotdash Meredith’s content to a wider audience, increasing its visibility and influence.

Through this partnership, OpenAI will integrate content from Dotdash Meredith’s publications into ChatGPT, offering users access to a wealth of informative articles. Moreover, both entities will collaborate on developing new AI features tailored for magazine readers, indicating a forward-looking approach to enhancing reader engagement.

One key aspect of the collaboration involves leveraging OpenAI’s models to enhance D/Cipher, Dotdash Meredith’s ad-targeting platform. With the impending shift towards a cookie-less online environment, the publisher aims to bolster its targeting technology with AI, ensuring advertisers can reach their desired audience effectively.
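
Dotdash Meredith has not disclosed D/Cipher’s internals, but the general idea behind cookie-less contextual targeting can be sketched simply: represent articles and ads as vectors (for example, from a text-embedding model) and match on content similarity rather than on tracking data. The following minimal Python sketch uses made-up vectors purely for illustration:

```python
# Purely illustrative: D/Cipher's internals are not public. This sketches
# contextual ad matching, where an article and candidate ads are compared
# as topic vectors instead of via third-party cookies.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical topic vectors standing in for real embeddings.
article_vec = [0.9, 0.1, 0.3]              # e.g. a home-renovation article
ad_vectors = {
    "power-tool ad": [0.8, 0.2, 0.1],
    "sports-drink ad": [0.1, 0.9, 0.2],
}

best = max(ad_vectors, key=lambda name: cosine(article_vec, ad_vectors[name]))
print("best contextual match:", best)
```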

Dotdash Meredith’s CEO, Neil Vogel, emphasised the importance of fair compensation for publishers in the AI landscape, highlighting the need for proper attribution and compensation for content usage. The stance reflects a broader industry conversation surrounding the relationship between AI platforms and content creators.

Why does it matter?

While Dotdash Meredith joins a growing list of news organisations partnering with OpenAI, not all have embraced such agreements. Some, like newspapers owned by Alden Global Capital, have pursued legal action against OpenAI and Microsoft, citing copyright infringement concerns. These concerns revolve around using their content in AI models without proper attribution or compensation. These contrasting responses underscore the complex dynamics as AI increasingly intersects with traditional media practices.

OpenAI to introduce content creator control in AI development

OpenAI has announced that it is developing a tool to support more ethical content usage in AI development. The tool, called Media Manager, will allow content creators to specify how their work may be used in AI training, aligning with the digital rights movement and addressing long-standing issues around content usage at a time when the company faces a growing number of copyright infringement lawsuits.

The concept isn’t entirely new. It parallels the decades-old robots.txt standard used by web publishers to control crawler access to website content. Last summer, OpenAI adapted this idea, pioneering the use of similar permissions for AI, thus allowing publishers to set preferences for the use of their online content in AI model training.
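
For illustration, OpenAI documents ‘GPTBot’ as the crawler it uses to gather training data, so a publisher wishing to keep its pages out of that pipeline can add directives like the following to its robots.txt file (a minimal sketch; the rules are illustrative, not any particular publisher’s policy):

```
# Block OpenAI's training crawler from the entire site
User-agent: GPTBot
Disallow: /

# Leave ordinary crawlers unaffected
User-agent: *
Allow: /
```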

However, many content creators do not control the websites where their content appears, and their work is often used in various forms across the internet, rendering previously proposed solutions insufficient. Media Manager is an attempt to create a more scalable and efficient way for creators to assert control over their content’s use in AI systems. It is being developed as a comprehensive tool that will allow creators to register their content and specify its inclusion in or exclusion from AI research and training. OpenAI plans to enhance the tool over time with more features, supporting a broader range of creator needs. The initiative involves complex machine learning research to develop a system capable of identifying copyrighted text, images, audio, and video across diverse sources.
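
OpenAI has not published technical details or an API for Media Manager, so any concrete example is speculative. Purely as a hypothetical illustration, a creator-preference record of the kind such a registry would need might look like this:

```python
# Hypothetical illustration only: OpenAI has published no Media Manager
# schema or API. This sketches the kind of record an opt-out registry
# would plausibly store for each registered work.
preference_record = {
    "creator": "Jane Doe",                   # hypothetical creator name
    "media_type": "image",
    "content_fingerprint": "sha256:...",     # placeholder for a content hash
    "allow_ai_training": False,              # the creator's stated preference
    "registered_at": "2024-05-07T00:00:00Z",
}

print(preference_record["allow_ai_training"])
```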

OpenAI is collaborating with creators, content owners, and regulators to shape the Media Manager tool, with an expected launch by 2025. This collaborative approach aims to develop the tool in a way that meets the nuanced requirements of various stakeholders and sets a standard for the AI industry.

Why does it matter?

The significance of OpenAI’s Media Manager stems from its attempt to address the foundational tension over how AI interacts with human-generated content. By providing tools that respect and enforce the rights of creators, OpenAI is fostering a sustainable model in which AI development is aligned with ethical and legal standards. The initiative is crucial for ensuring that AI technologies do not exploit the creative economy but instead respect and contribute positively to it, setting a precedent for transparency and responsibility that could push the entire AI industry towards more ethical practices.

Nvidia and Databricks sued for alleged copyright infringement in AI model development

Nvidia Corporation and Databricks Inc. face class-action lawsuits alleging copyright infringement in the creation of their AI models. The litigation highlights a growing concern over the use of copyrighted content without permission.

The lawsuits, filed on March 8 by authors Abdi Nazemian, Brian Keene, and Stewart O’Nan in the U.S. District Court for the Northern District of California, argue that Nvidia’s NeMo Megatron and Databricks’ MosaicML models were trained on vast datasets containing millions of copyrighted works. Notably, the complaints suggest these models drew on content from well-known authors like Andre Dubus III and Susan Orlean, among others, without their consent. This has sparked a broader debate on whether such practices constitute fair use, as AI developers claim, or infringe the copyrights of individual creators.

The core of the dispute lies in how AI companies compile their training data. Reports indicate that some of the data used included copyrighted material from ‘shadow libraries’ like Bibliotik, which hosts and distributes unlicensed copies of nearly 200,000 books. The involvement of such sources in training datasets could potentially undermine the legality of the AI training process, which relies on the ingestion of large volumes of text to produce sophisticated AI outputs.

Legal experts and industry analysts are closely watching these cases, as the outcomes could set important precedents for the future of AI development. Companies like Nvidia have defended their practices, stating that their development processes comply with copyright laws and emphasising the transformative nature of AI technology. However, the plaintiffs argue that this does not justify the unauthorised use of their work, which they claim undermines their financial and creative rights.

The lawsuits against Nvidia and Databricks are part of a larger trend of legal challenges facing tech giants over the development of AI technologies and the use of copyrighted materials to train large language models (LLMs), which are designed to process and generate human-like text.

OpenAI, the creator of ChatGPT, faced similar legal scrutiny when the New York Times filed a lawsuit against it, alleging that the company used copyrighted articles to train its language models without permission.

These developments raise crucial questions about the balance between innovation and copyright protection in the digital context.

Snap introduces watermarks for AI-generated images

Social media company Snap announced its plans to add watermarks to AI-generated images on its platform, aiming to enhance transparency and protect user content. The watermark, featuring a small ghost with a sparkle icon, will denote images created using AI tools and will appear when the image is exported or saved to the camera roll. However, how Snap intends to detect and address watermark removal remains unclear, raising questions about enforcement methods.
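
Snap has not released its watermarking code, but the visible-watermark idea itself is simple to sketch. The following example uses the Pillow imaging library to paste an icon into a corner of an image; the file names are hypothetical, and the icon merely stands in for Snap’s ghost-with-sparkle mark:

```python
# Purely illustrative: Snap's actual implementation is not public.
# Overlays a small watermark icon on the bottom-right corner of an image.
from PIL import Image

base = Image.open("ai_generated.png").convert("RGBA")
mark = Image.open("watermark_icon.png").convert("RGBA").resize((48, 48))

# Paste the icon with an 8px margin, using its alpha channel as the mask.
x = base.width - mark.width - 8
y = base.height - mark.height - 8
base.paste(mark, (x, y), mark)

base.save("ai_generated_watermarked.png")
```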

This move aligns with efforts by other tech giants like Microsoft, Meta, and Google, who have implemented measures to label or identify AI-generated images. Snap currently offers AI-powered features like Lenses and a selfie-focused tool called Dreams for paid users, emphasising the importance of transparency and safety in AI-driven experiences.

Why does it matter?

As part of its commitment to equitable access and meeting user expectations, Snap has partnered with HackerOne to stress-test its AI image-generation tools and has established a review process to address potential biases in AI results. The company’s dedication to transparency extends to providing context cards with AI-generated images and implementing controls in the Family Center that let parents monitor teens’ interactions with AI, following previous controversies over inappropriate responses from the ‘My AI’ chatbot. As Snap continues to evolve its AI-powered features, its focus on transparency and safety underscores its commitment to fostering a positive and inclusive user experience on its platform.

US Congress proposes Generative AI Copyright Disclosure Act

A new bill introduced in the US Congress aims to require AI companies to disclose the copyrighted material they use to train their generative AI models. The bill, named the Generative AI Copyright Disclosure Act and introduced by California Democrat Adam Schiff, mandates that AI firms submit copyrighted works in their training datasets to the Register of Copyrights before launching new generative AI systems. Companies must file this information at least 30 days before releasing their AI tools or face financial penalties. The datasets in question can contain vast amounts of text, images, music, or video content.

Congressman Schiff emphasised the need to balance AI’s potential with ethical guidelines and protections, citing AI’s disruptive influence on various aspects of society. The bill does not prohibit AI from training on copyrighted material but requires companies to disclose the copyrighted works they use. This move responds to increasing litigation and government scrutiny around whether major AI companies have unlawfully used copyrighted content to develop tools like ChatGPT.

Entertainment industry organisations and unions, including the Recording Industry Association of America and the Directors Guild of America, have supported Schiff’s bill. They argue that protecting the intellectual property of human creative content is crucial, given that AI-generated content originates from human sources. Companies like OpenAI, currently facing lawsuits alleging copyright infringement, maintain that their use of copyrighted material falls under fair use, a legal doctrine permitting certain unlicensed uses of copyrighted works.

Why does it matter?

As generative AI technology evolves, concerns about the potential impact on artists’ rights grow within the entertainment industry. Notably, over 200 musicians recently issued an open letter urging increased protections against AI and cautioning against tools that could undermine or replace musicians and songwriters. The debate highlights the intersection of AI innovation, copyright law, and the livelihoods of creative professionals, presenting complex challenges for policymakers and stakeholders alike.

OpenAI utilised one million hours of YouTube content to train GPT-4

Recent reporting by The New York Times has brought to light the challenges AI companies face in acquiring high-quality training data. The paper details how companies like OpenAI and Google have navigated this issue, often treading into legally ambiguous territory around copyright law.

OpenAI, for instance, developed its Whisper audio transcription model and used it to transcribe over a million hours of YouTube videos, feeding the transcripts into the training of GPT-4, its advanced language model. Although this approach raised legal concerns, OpenAI believed it fell within fair use. The company’s president, Greg Brockman, reportedly played a hands-on role in collecting these videos.
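
Whisper itself is openly available, so the transcription step (though not OpenAI’s data pipeline or its sourcing of videos) is easy to reproduce. A minimal sketch using the open-source openai-whisper package, with a hypothetical audio file:

```python
# Minimal sketch using the open-source `openai-whisper` package
# (pip install openai-whisper). The audio file name is hypothetical.
import whisper

model = whisper.load_model("base")         # a small multilingual checkpoint
result = model.transcribe("lecture.mp3")   # returns a dict including the text
print(result["text"])
```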

A Google spokesperson said the company had only unconfirmed reports of OpenAI’s activities, noting that both YouTube’s terms of service and its robots.txt files prohibit unauthorised scraping or downloading of YouTube content. Google itself has also used transcripts from YouTube, in line with its agreements with content creators.

Similarly, Meta encountered challenges with the availability of data for training its AI models. The company’s AI team discussed using copyrighted works without permission in order to catch up with OpenAI, and Meta explored options such as paying for book licences or even acquiring a large publisher.

Why does it matter?

AI companies, including Google and OpenAI, are grappling with the dwindling availability of quality training data for improving their models. The future of AI training may involve synthetic data or curriculum learning methods, but these approaches remain unproven. In the meantime, companies continue to explore various avenues for data acquisition, sometimes straying into legally contentious territory as they navigate this evolving landscape.
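
Of the approaches mentioned, curriculum learning is the easiest to illustrate: training examples are presented in order of increasing difficulty rather than at random. A minimal sketch, using sentence length as a hypothetical stand-in for a real difficulty score:

```python
# Minimal sketch of curriculum learning: order training samples from easy
# to hard. Sentence length is a hypothetical proxy for a real difficulty
# measure; an actual pipeline would fine-tune a model at each stage.
samples = [
    "an even longer and more syntactically involved training sentence",
    "short text",
    "a somewhat longer training sentence",
]

def difficulty(text: str) -> int:
    return len(text.split())  # proxy: longer sentences count as "harder"

curriculum = sorted(samples, key=difficulty)

for stage, sample in enumerate(curriculum, start=1):
    print(f"stage {stage}: {sample!r} (difficulty={difficulty(sample)})")
```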