Stability AI releases free, open-source text-to-audio model that ‘respects creator rights’

June 10, 2024

Stability AI, known for its AI art generator Stable Diffusion, has launched Stable Audio Open, a free, open-source model for creating short audio clips.

Stable Audio Open allows users to generate high-quality audio samples for sound design. It creates clips up to 47 seconds long using simple text descriptions.

It is specifically designed to create sound effects, drum beats, instrument riffs, ambience, and other production elements commonly used in music and sound design.

Because the model is open source, users can fine-tune it on their own custom audio data. A drummer, for example, could fine-tune it on their own drum recordings to generate new beats in their own style.

The launch of Stable Audio Open follows the release of Stable Audio 1.0 in September 2023. This technology, named one of TIME’s Best Inventions of 2023, lets users craft short audio clips based on textual descriptions.

The latest iteration, Stable Audio 2.0, was unveiled in April of this year. The update provides artists and musicians with a wider range of creative tools and the ability to produce full-length music tracks.

Stable Audio Open, meanwhile, is specifically designed for shorter audio clips and production elements. While it can generate short musical snippets, it’s not optimized for creating full songs, melodies, or vocals.

“Our commercial Stable Audio product produces high-quality, full tracks with coherent musical structure up to three minutes in length, as well as advanced capabilities like audio-to-audio generation and coherent multi-part musical compositions,” Stability AI said in a blog post.

The Stable Audio Open model weights are available for download on Hugging Face, a platform for machine learning models. Stability AI encourages sound designers, musicians, developers, and anyone interested in audio to explore the model’s capabilities and provide feedback.

The release of Stable Audio Open comes amid a growing debate over the use of artificial intelligence in the music industry, particularly over copyright.

Ed Newton-Rex, Stability AI’s former Vice President of Audio, departed towards the end of 2023, citing disagreements over the use of copyrighted materials in training datasets.

“Companies worth billions of dollars are, without permission, training generative AI models on creators’ works, which are then being used to create new content that in many cases can compete with the original works,” said Newton-Rex.

“I don’t see how this can be acceptable in a society that has set up the economics of the creative arts such that creators rely on copyright,” Newton-Rex, who helped develop Stable Audio, said in a public resignation letter.

Stability AI says its new model was trained on a dataset of audio clips from Freesound and the Free Music Archive.

“This allowed us to create an open audio model while respecting creator rights,” the company said.

Music Business Worldwide