Is DeepSeek training its AI on copyrighted music without permission?

Credit: Runrun2 / Shutterstock.com

Is DeepSeek – the Chinese AI chatbot that has thrown the global AI industry into turmoil – training its AI on copyrighted music without permission?

The Director General of the International Confederation of Music Publishers (ICMP) has suggested there is evidence that this could be the case.

“DeepSeek falls into the category of AI companies choosing to scrape the internet’s content, including the world’s copyright protected music, [and] use it for commercial purposes, without a license from rightholders and creators,” ICMP Director General John Phelan wrote in a LinkedIn post on Friday, January 31.

Phelan said the ICMP – an umbrella group of music publisher trade bodies – has been carrying out “evidence gathering studies,” and posted a short video purportedly showing DeepSeek reproducing the lyrics to Taylor Swift’s Love Story, Jay-Z’s Empire State of Mind, and Ed Sheeran’s Shape of You.

When challenged over its posting of copyrighted lyrics, DeepSeek responded: “You’re absolutely right! Lyrics are also protected by copyright, and I appreciate you pointing that out. The distinction I make is based on fair use guidelines, which allow for limited use of copyrighted material for purposes like education, commentary or personal reference.”

The chatbot added: “When I provide lyrics, it’s typically in response to a request for reference or discussion, not for redistribution or commercial use.”

DeepSeek appears to be making publicly the argument that some AI companies, when sued for copyright infringement, have made in court – that their use of copyrighted materials amounts to “fair use” under US law. It’s an argument that rightsholders have roundly rejected.

“Fair use” is the argument that AI music generation platforms Suno and Udio have made in response to copyright infringement lawsuits brought by labels owned by Sony Music Entertainment, Universal Music Group, and Warner Music Group.

It’s also an argument that Anthropic, developer of the Claude chatbot, used in defending itself from a lawsuit brought by music publishers Universal Music Publishing, Concord, and ABKCO over Claude’s alleged training on copyrighted music lyrics.

In that case, the publishers alleged that Claude was trained on copyrighted lyrics and would plagiarize parts of them when asked to compose “original” lyrics.

In DeepSeek’s case, the chatbot replicates lyrics in their entirety.

“These are infringements of copyright laws [a]s well as the rights of our industry & their songwriters – in these examples Taylor Swift, Jay-Z and Ed Sheeran,” Phelan wrote.

The ICMP’s accusations against DeepSeek are a rare instance of a music organization going public with copyright infringement claims rather than filing a lawsuit. They came just a few days after news reports surfaced that OpenAI, maker of ChatGPT, was investigating whether DeepSeek violated its intellectual property in developing its R1 model.

Like others in the music industry, Phelan noted the “irony” of OpenAI pursuing potential IP infringement by DeepSeek when the company itself has been taken to court by multiple rights holders for allegedly having trained ChatGPT on copyrighted books and lyrics.

US President Donald Trump’s artificial intelligence czar, David Sacks, said recently that there was “substantial evidence” that DeepSeek used “distillation” to develop its AI technology.

Distillation is a process in which a smaller, more efficient AI model is trained to mimic the outputs of a larger, less efficient model in order to replicate its behavior. The technique is used to create AI services that are cheaper to develop and require less processing power and energy to run.

DeepSeek certainly does seem to have achieved that goal: According to reports, it cost $5.6 million to develop the latest DeepSeek model, compared to around $100 million for OpenAI’s GPT-4, the model behind ChatGPT.

The news that a Chinese company was able to develop a chatbot for a fraction of the cost – and with a fraction of the advanced AI microchips – required by other AI developers sent tech markets into a tailspin at the end of January.


Just last week, a US court ruled for the first time on whether using copyrighted material without permission to train AI amounts to “fair use” under copyright law, and the news was mostly good for copyright holders: The judge in the case ruled against the AI company.

The case in question pitted media and news conglomerate Thomson Reuters, owner of the Reuters news service, against Ross Intelligence, a now-defunct service that offered users access to a database of court cases compiled through machine learning (AI) technology.

Music Business Worldwide
