How the ‘COPIED Act’ could make it unlawful to train AI using copyrighted material without permission…

Credit: N Universe/Shutterstock
MBW Explains is a series of analytical features in which we explore the context behind major music industry talking points – and suggest what might happen next. Only MBW+ subscribers have unlimited access to these articles.
What’s happened?

A new front has opened in the growing battle between intellectual property owners and AI companies who would use copyrighted IP to train AI without permission.

A bipartisan group of US senators has introduced the Content Origin Protection and Integrity from Edited and Deepfaked Media Act (COPIED Act), which is being promoted by its sponsors as a law to fight deepfakes.

And while it certainly would do that, perhaps the most relevant aspect of this bill for copyright owners is that it would create a mechanism that would in effect make it unlawful to use copyrighted materials to train AI without permission.

The bill would require the National Institute of Standards and Technology (NIST) to develop standards for the creation of “content provenance information,” that is, “state-of-the-art, machine-readable information documenting the origin and history of a piece of digital content, such as an image, a video, audio, or text.”

Under the proposed law, this “content provenance information” would be embedded in digital forms of copyrighted material, and it would be unlawful to remove it or tamper with it, except in very limited cases where platforms are carrying out research to improve security.

It would also be unlawful for anyone to use any material with “content provenance information” to train AI, or to create AI-generated content, “unless such person obtains the express, informed consent of the person who owns the covered content.”

However, enforcing this law requires a few things: first, a standardized system for creating “content provenance information” and attaching it to copyrighted materials; and second, a standardized way of detecting AI-generated or AI-altered content.

The bill addresses these issues by directing NIST to work with the private sector to develop methods for adding digital watermarks and/or content provenance information to content, and to create standard methods for detecting AI-generated or AI-altered content.
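The bill leaves the actual format of this information to NIST and the private sector, but the basic idea of machine-readable, tamper-evident provenance can be sketched in a few lines. Everything below – the field names and the `make_provenance_manifest` and `is_untampered` helpers – is an illustrative assumption, not anything specified in the bill or drawn from an existing standard:

```python
import hashlib


def make_provenance_manifest(content: bytes, origin: str, history: list) -> dict:
    """Build a hypothetical machine-readable provenance record for a piece of
    digital content, loosely modeled on the bill's description: it documents
    the content's origin and history, plus a hash that makes tampering with
    the underlying bytes detectable."""
    return {
        "origin": origin,      # e.g. the rights holder
        "history": history,    # e.g. edits applied so far
        "content_sha256": hashlib.sha256(content).hexdigest(),
    }


def is_untampered(content: bytes, manifest: dict) -> bool:
    """Check that the content still matches the hash recorded in its manifest."""
    return hashlib.sha256(content).hexdigest() == manifest["content_sha256"]


audio = b"...raw audio bytes..."
manifest = make_provenance_manifest(
    audio, origin="Example Records", history=["mastered 2024-01-05"]
)
print(is_untampered(audio, manifest))        # True for unmodified content
print(is_untampered(audio + b"!", manifest)) # False once the bytes change
```

In practice a real scheme would also need the manifest to be cryptographically signed and bound to the file itself – which is exactly the kind of standards work the bill asks NIST to coordinate.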

Companies that provide tools to create AI content would be required to give users the option of adding content provenance information within two years of the law taking effect; companies that enable the creation of digital versions of copyrighted content would have to offer the same ability on the same timeline.

Finally, there’s the issue of enforcement. The bill allows state attorneys general and copyright owners to file civil suits for damages against businesses and individuals who violate the rule requiring express permission to use protected materials in training AI, or those who strip out the watermarks and/or content provenance information attached to copyrighted materials.

“The bipartisan COPIED Act I introduced with [Tennessee Republican] Senator [Marsha] Blackburn and [New Mexico Democrat] Senator [Martin] Heinrich, will provide much-needed transparency around AI-generated content,” said Sen. Maria Cantwell, a Washington Democrat and head of the Senate Commerce Committee.

“The COPIED Act will also put creators, including local journalists, artists and musicians, back in control of their content with a provenance and watermark process that I think is very much needed.”

Organizations representing copyright owners seem to agree. Numerous groups have lined up behind the bill, including music industry groups such as the Recording Industry Association of America (RIAA), the National Music Publishers’ Association (NMPA), the Recording Academy, Nashville Songwriters Association International, the Society of Composers & Lyricists, the Songwriters Guild of America, and Music Creators North America.

It’s also garnered the support of film and TV union SAG-AFTRA, the News/Media Alliance, the National Newspaper Association, the National Association of Broadcasters, Artist Rights Alliance, and the Human Artistry Campaign.

“Protecting the life’s work and legacy of artists has never been more important as AI platforms copy and use recordings scraped off the internet at industrial scale and AI-generated deepfakes keep multiplying at rapid pace,” said Mitch Glazier, Chairman and CEO of the RIAA.

“RIAA strongly supports provenance requirements as a fundamental building block for accountability and enforcement of creators’ rights,” he said, adding that the COPIED Act “would grant much needed visibility into AI development and pave the way for more ethical innovation and fair and transparent competition in the digital marketplace.”

The Society of Composers & Lyricists, the Songwriters Guild of America, and Music Creators North America called the bill “a crucial, beginning step towards addressing the myriad of existential threats to the American songwriter and composer community posed by unregulated generative artificial intelligence… The urgent need to require all generative AI users to deal transparently and fairly with the creative community cannot be overstated.”

The groups behind this bill are a very similar bunch to those who have thrown their weight behind several other AI-related bills working their way through the US Congress. Some of these groups have also given their backing to a growing number of lawsuits against AI developers who stand accused of having used copyrighted material without permission to train their AI.

So how does this bill differ from all the others? And where do the lawsuits fit into the picture? They’re all part of a growing effort to rein in the “Wild West” of AI development, before it devalues – and potentially replaces – the copyrighted works that countless creators have spent lifetimes building up.

The COPIED Act can be seen as both the technical backbone of those efforts, and a “fallback” in case those other efforts fail.


How does this bill differ from other bills before Congress?

If you’re a regular reader of MBW, you’ve likely come across the No AI FRAUD Act and the NO FAKES Act, two bills, similar to each other, that have been brought forward in the US House of Representatives and the US Senate, respectively, in recent months.

These two bills have a fairly different focus from the new COPIED Act: they’re centered around the people (and businesses) harmed by the creation of deepfakes.

In effect, they create a “right of publicity” at the federal level in the US – that is, the right to one’s own likeness and voice.

They are a reaction to the proliferation of deepfakes, both those that threaten artists and the businesses behind them (such as the infamous fake Drake track that went viral in 2023) and those that threaten members of the public at large (such as AI-generated pornographic content featuring unwitting victims).

Like the COPIED Act, they create a right to sue the creators of unauthorized deepfakes, but they don’t address the issue of copyright directly.

To that end, Democratic House Rep. Adam Schiff introduced this spring the Generative AI Copyright Disclosure Act, a proposed law that would require anyone who creates or alters a dataset used for AI training to send a notice to the Register of Copyrights that includes “a sufficiently detailed summary of any copyrighted works used.”

In other words, it would require AI developers to be transparent about the materials they used to train their AI models – a key ask of copyright holders, including the music industry, who have been struggling to see who has used their materials to train AI models. They’ve gone to court against AI developers who they believe violated their copyrights, with what amounts to circumstantial evidence.

Even before these bills were introduced, copyright holders launched a number of lawsuits against AI developers, accusing them of using their intellectual property without permission in AI tech.

Writers including Sarah Silverman and George R.R. Martin went to court against OpenAI, arguing that its chatbots have been plagiarizing their written works. Music publishers Universal, Concord and ABKCO have sued AI developer Anthropic, alleging that Anthropic’s Claude chatbot recycles copyrighted lyrics.

And, most recently, recording companies owned by the music majors – Universal Music Group, Sony Music Group and Warner Music Group – sued AI music generators Suno and Udio, arguing that they used copyrighted music to train their instant music-making AI tech.

All these lawsuits are likely to come down to one key question: Should the use of copyrighted works to train AI be treated as “fair use” under copyright law?

Copyright owners vehemently argue that it shouldn’t, and point out that training AI fails the fair-use test in a number of ways, including that the AI-generated output competes directly with the copyrighted material used to train the AI.

AI companies argue that it should, that their new AI-generated content is “transformative” enough to be considered something new, and not a rip-off of the materials they used in training.

One of the most interesting aspects of the Senate’s new legislation is that, if these lawsuits go against copyright holders and courts declare the use of copyrighted materials in AI training to be “fair use,” it would still give copyright owners, including the music business, a novel way to protect their IP.


What could the COPIED Act mean for the music industry?

The “backup plan” aspect of the COPIED Act might be among its most important characteristics.

Let’s imagine, for a moment, a scenario in which Rep. Schiff’s Generative AI Copyright Disclosure Act fails. It’s entirely conceivable that it won’t be passed into law – or that, if passed, it will be struck down by the courts.

AI companies could succeed in arguing that a law requiring them to disclose their AI training materials would force them to reveal their trade secrets: The developer of the best AI music generator would instantly be copied by everyone else, losing their competitive advantage.

Let’s imagine, also, that the AI companies also succeed in convincing the courts that their use of copyrighted materials deserves a fair use exemption under copyright law. (That’s a bit harder to imagine than the previous scenario, but hey, that’s apparently how things are in Japan.)

In this scenario, the COPIED Act could form an effective “backup plan” for the music industry and other rights holders, creating a novel mechanism for protecting copyrighted works.

Recording companies could add watermarks and content provenance information to their digital recordings, and under the COPIED Act, their material would be off limits for training AI models and for creating AI-generated content.

That could prove to be a lot of work for copyright owners – imagine taking down and replacing millions of music files on the streaming services – but it would almost certainly be worth it.

It’s somewhat analogous to the mechanism created by the European Union’s AI Act, under which copyright holders can declare that they’re “opting out” of having their content used for the training of AI.

To that end, Sony Music Group and Warner Music Group both recently made public declarations that they are, indeed, opting out of having their content used in the training of AI.

While copyright owners’ groups have argued that the onus should be on AI developers to get permission, and not on copyright holders to pre-emptively opt out of AI, they would likely admit this mechanism is better than nothing.

Likewise with the COPIED Act, and its requirement to mark copyrighted content as off limits to AI.


A final thought…

The various lawsuits being fought in the courts and the various bills before Congress may seem like a haphazard way to approach the complex issue of AI and copyright, but this scattershot strategy may actually have an advantage.

With so many approaches to the issue, copyright holders have many “shots on goal” – many different ways they can prevail in their efforts to protect human-created works against devaluation by AI-generated content. If one approach fails, another may yet work.

Yet from the point of view of copyright owners – and those victimized by deepfakes – it sure would be great if all the bills before Congress passed, because, in effect, they create an all-encompassing, comprehensive approach to the threats posed by unbridled AI tech.

Rep. Schiff’s Generative AI Copyright Disclosure Act gives the public, and copyright holders, the ability to see what AI companies are doing behind the scenes. The COPIED Act provides the technical tools to enforce copyright protections. And the No AI FRAUD and NO FAKES acts cover the human element, giving victims of AI deepfakes recourse in the courts.

Of course, given the nature of lawmaking in Washington, copyright holders probably shouldn’t hold their collective breath waiting for the perfect outcome.

They can take some comfort, though, in the fact that – with all these approaches to the issue taking shape – a positive outcome for human creators is growing increasingly likely.

Music Business Worldwide
