Over the past year, numerous copyright holders have sued AI developers, alleging that the developers used copyrighted materials without permission to train their AI models.
Some of these lawsuits have come from the music business. In one such case, Universal Music Group, Concord Music Group, and ABKCO sued AI developer Anthropic over its alleged use of copyrighted lyrics to train its chatbot, Claude.
Perhaps most notably, record companies owned by the three majors – Sony Music Group, Universal Music Group, and Warner Music Group – earlier this year sued Suno and Udio, two generative AI music-making platforms that can whip up a track from nothing but some text prompts, and that some argue have become uncannily good at creating music.
The three majors allege that Suno and Udio violated the copyrights on their recordings by training their AI on those recordings and that the two music generators are now spitting out music similar to what they were trained on.
How the courts decide these cases could be crucial for rights holders, including the music industry, in shaping how their relationship with AI develops in the coming years – especially given that legislation to regulate AI is moving slowly (albeit in the right direction, from the point of view of rights holders).
OpenAI, the creator of the hyper-popular ChatGPT chatbot, has been hit with multiple lawsuits by news organizations alleging that OpenAI used their copyrighted news articles to train ChatGPT. The most prominent of these is the case brought by The New York Times, which is still working its way through the courts.
However, another case against OpenAI has come to a conclusion, of sorts, in what may be one of the first rulings on the relationship between AI training and copyrighted materials. On its face, it doesn’t look good for copyright holders: OpenAI won the case.
Earlier this year, The Raw Story, a progressive-leaning online news site, along with subsidiary AlterNet, sued OpenAI, asking a federal court in New York for damages over OpenAI’s alleged stripping-out of copyright management information from Raw Story and AlterNet articles in order to use them to train its AI models. The news orgs also asked for an injunction to prevent OpenAI from using their content in future training.
In a decision on Thursday (November 7), Judge Colleen McMahon of the US District Court for the Southern District of New York ruled in favor of OpenAI, dismissing the case brought by Raw Story and AlterNet.
The judge’s reasoning? The news organizations weren’t able to show that OpenAI’s use of their content caused them any actual injury.
“I am not convinced that the mere removal of identifying information from a copyrighted work – absent dissemination – has any historical or common-law analogue,” Judge McMahon wrote in her dismissal, which can be read in full here.
Judge McMahon also concluded that, given ChatGPT was trained on countless millions of pieces of data, it is unlikely to regurgitate a copyrighted article, or a large part of one, in response to a question posed by a user.
That’s not necessarily the end of the road for the news organizations: The judge dismissed the case “without prejudice,” meaning The Raw Story can refile it with the court if it can make a stronger argument for having suffered injury, though the judge did write that she is “skeptical about plaintiffs’ ability to allege a cognizable injury.”
Nonetheless, attorney Matt Topic of Loevy + Loevy, which is representing Raw Story in the case, told Reuters that he is “certain we can address the concerns the court identified through an amended complaint.”
In other words, this case isn’t entirely over just yet.
“I am not convinced that the mere removal of identifying information from a copyrighted work – absent dissemination – has any historical or common-law analogue.”
Judge Colleen McMahon, Raw Story v. OpenAI
In the meantime, however, some observers have suggested that this is bad news for copyright holders. Drew Thurlow, former Head of A&R at Sony Music Entertainment and now the founder of music startup Opening Ceremony Media, argued in a LinkedIn post that this could strengthen Suno and Udio’s argument that their use of copyrighted songs to train AI constitutes “fair use” under US copyright law.
“One of the tenets of copyright infringement? The offender has to harm the market and/or financial profile of the plaintiff,” Thurlow wrote.
“Are these Gen AI companies hurting the recorded music market? So far, definitively no. In fact, there is evidence consumer Gen AI tools are increasing music engagement. They even might be helping to grow the recorded music market.”
If Thurlow’s assertion is correct, then one of the pillars of “fair use” doctrine could run in favor of AI companies: That is, the record companies may have a hard time proving that they, or the music market, were harmed by AI companies’ use of their material.
However, the Raw Story v. OpenAI case is fairly different from the lawsuits the music industry has brought against Anthropic, Suno, and Udio. Below, we break down those differences and how they could mean a different outcome for the music industry’s lawsuits against AI developers.
Copyright infringement was not at issue in the Raw Story/OpenAI case
The music industry’s lawsuits against AI companies all have one thing in common: They allege an infringement (or, more accurately, many, many infringements) of copyright.
But the Raw Story lawsuit didn’t allege copyright infringement; it only alleged that OpenAI had violated the US’s Digital Millennium Copyright Act (DMCA), which forbids stripping copyrighted materials of their copyright management information (in the case of news articles, that would be things such as the name of the news source, author, date of publication, copyright information, etc.).
That’s an unusual approach compared with the other lawsuits filed by rights holders against AI companies, and we can only speculate as to why Raw Story and AlterNet didn’t also claim copyright infringement.
In fact, it’s a weakness in the case that Judge McMahon herself seemed to sniff out.
“Let us be clear about what is really at stake here. The alleged injury for which plaintiffs truly seek redress is not the exclusion of [copyright management information] from defendants’ training sets, but rather defendants’ use of plaintiffs’ articles to develop ChatGPT without compensation to plaintiffs,” the judge wrote.
Using articles without permission “is not the type of harm that has been ‘elevated’ by… the DMCA,” the judge added. “Whether there is another statute or legal theory that does elevate this type of harm remains to be seen. But that question is not before the court today.”
Fortunately for rights holders (at least for now), that question is before the courts in the other lawsuits brought by music companies.
AI companies’ key defense – the “fair use” doctrine – has yet to be tested in AI cases
The fact that Raw Story’s case against OpenAI focused solely on copyright management information means that the courts have yet to rule on the key defense that AI companies are using in their fight with rights holders.
That defense is the “fair use” doctrine, the idea that, under certain limited circumstances, it’s acceptable to use copyrighted material without permission. One simple example would be using a fragment of a news article in an educational textbook.
Fair use is the key argument Suno and Udio are making in their defense against the copyright suits brought against them by the record majors. In fact, they appear to be so confident in that defense that they pretty much admitted to using copyrighted material in their responses to the lawsuits.
US courts use a four-factor test to determine whether something falls under fair use:
- The purpose and character of the use – is the use of the copyrighted work for educational purposes or for commercial purposes?
- The nature of the copyrighted work – whether or not the work is particularly creative and original.
- The amount and substantiality of the portion taken – just how much of a copyrighted work was used without permission?
- The effect of the use on the potential market for, or value of, the copyrighted work.
Thurlow’s LinkedIn post speaks to that fourth factor. If music rights holders can’t prove their intellectual property was damaged – or that the market was damaged – by AI’s use of copyrighted works, that weakens the rights holders’ claim.
The music companies are likely to reject that argument. They have argued, in various contexts, that AI-generated music is a direct competitor to their IP in the music market. Whether or not they can prove that is a different matter.
But that one factor alone is unlikely to decide these cases. Courts don’t apply fair use as a mechanical formula; these issues are determined case by case, with all four factors weighed together.
In their complaints against Suno and Udio, the record companies attacked the “fair use” argument head-on, addressing each of the four factors.
The first factor – the purpose and character of the use – has to do with how “transformative” the use of copyrighted material is. If you add a snippet of a news article to a textbook, that’s pretty “transformative” – its form, context, and purpose are very different from the original.
With Suno and Udio, “the use here is far from transformative, as there is no functional purpose for [the AI models] to ingest the copyrighted recordings other than to spit out new, competing music files,” stated the record companies’ complaints against the AI platforms. Those complaints can be read in full here and here.
The second factor looks at the kind of copyrighted work being allegedly infringed and values some more than others. A copyright on a functional news article (e.g., sports scores) is less strictly defended than the copyright on something truly and completely creative, like a new song.
In their complaints against Suno and Udio, the record companies argue that musical recordings are exactly the sort of works that copyright was meant to protect.
The third factor has to do with how much of a copyrighted work has been used. A small part of a copyrighted work can be seen as “fair use,” but it’s harder to make that case when an entire copyrighted work has been used.
It’s “abundantly clear” that Suno and Udio ingest “the most important parts” of copyrighted songs, the record companies argued, “as demonstrated by [their] ability to recreate, for instance, some of the most recognizable musical phrases, hooks, and choruses in popular music history.”
Actual damage?
That leaves the fourth factor, the only one we arguably got a hint of in the Raw Story v. OpenAI case. While the judge in that case wasn’t weighing the “fair use” factors, she did show that simply asserting that copyright holders have been damaged isn’t enough to bring a case.
The record companies’ complaints against Suno and Udio assert that the platforms’ AI-generated music is “a significant threat to the market for and value of the copyrighted recordings.” But will a court simply agree?
That could be where the difficulty truly lies for copyright holders in these cases. The question poses a counterfactual: What would recorded music revenues be if AI platforms like Suno and Udio hadn’t come along? How much would Michael Bublé’s Sway be worth if Udio hadn’t (allegedly) used it to train its AI music generator? Tricky.
The record companies might have to dig deep into market research to show material damage – if such market research even exists at this point.
In the meantime, the dismissal of Raw Story v. OpenAI doesn’t need to cause sleepless nights for music owners: The case was different enough and limited enough in scope to leave the door wide open for very different verdicts in the cases yet to come.