Palo Alto-based PlayAI has secured $21 million in seed funding to advance its generative AI voice technology, which can clone voices across multiple languages and accents.
The funding round was led by Kindred Ventures (a backer of London-headquartered music, travel and experiences company Pollen, and NFT marketplace Bitski) and 500 Global, with participation from Race Capital, Y Combinator, and other investors. Following the investment, Steve Jang of Kindred Ventures joined as a board observer.
PlayAI says it will use the funding to support its ambitious vision of creating “more natural” and “human-like” conversational voice interfaces. The company’s latest model, PlayDialog, uses conversational context to generate speech with nuanced prosody, emotion, and pacing.
PlayAI’s platform allows developers to create voice applications without building custom models from scratch. The company offers models supporting over 30 languages, targeting industries like healthcare, travel, and customer support. With the AI-powered voice generation market expected to grow fourfold in the next decade, according to a Market.Us report, investors see massive potential.
“Voice AI represents a $2 trillion market, and at Race Capital, we thrive on partnering with founders who tackle big challenges in massive markets,” said Chris McCann, General Partner at Race Capital.
“PlayAI’s voice AI platform is the key to unlocking new applications across customer support, sales, marketing, and beyond. We couldn’t be more excited to partner with Mahmoud [Felfel], Hammad [Syed], and the PlayAI team on this journey.”
Mahmoud Felfel, Co-Founder and CEO of PlayAI, added: “Speech as an interface is exploding in popularity, and we knew it was a massive opportunity from the get-go.
“Building voice agents that can converse like humans and autonomously handle complex tasks is no easy feat, and I’m immensely proud of what our team has achieved. This funding will help us deliver our vision of powerful, emotive, and human-like voice interfaces for any application.”
However, the company faces substantial ethical challenges that could undermine its technological achievements.
A TechCrunch report says PlayAI’s platform allows users to clone voices with minimal verification. In its testing, the outlet found that users need only self-certify that they hold the rights to a voice, which leaves room for misuse. TechCrunch demonstrated this by creating unauthorized voice clones, including of public figures such as Kamala Harris.
Despite claims of robust safeguards, TechCrunch said the platform’s community portal has hosted explicit content.
PlayAI co-founder Hammad Syed told TechCrunch that the company promptly investigates and takes action against users who misuse the platform. Syed also pointed to the cost of creating high-quality voice clones, suggesting that it may limit malicious use. PlayAI’s highest-fidelity voice clones, which require 20 minutes of voice samples, are priced at $49 per month billed annually or $99 per month, the news outlet said.
TechCrunch noted that PlayAI operates in a complex legal landscape, as some states, like Tennessee, have laws restricting tools that make unauthorized recordings of a person’s voice.
“PlayAI has several ethical safeguards in place. We’ve implemented robust mechanisms to identify whether a voice was synthesized using our technology, for example,” Syed said. “If any misuse is reported, we promptly verify the origin of the content and take decisive actions to rectify the situation and prevent further ethical violations.”
The company’s approach to model training remains vague. Syed claims they use “mostly open datasets” and proprietary in-house datasets, avoiding user data. However, the AI industry increasingly faces scrutiny over data sourcing and potential copyright infringements.
PlayAI enters a crowded market with competitors like ElevenLabs, Papercup, Deepdub, Acapela, Respeecher, and Voice.ai, as well as tech giants Amazon and Microsoft. Last week, Microsoft revealed its new Interpreter feature for Teams, a tool that lets users clone their voice for meetings.
Interestingly, some music labels, like Universal Music Group, are cautiously exploring AI voice cloning technology, but only with tools deemed ethical. In June, UMG partnered with SoundLabs, an AI technology company focused on ‘ethically’ trained tools for music creators. The collaboration equips UMG’s artists and producers with AI technology through SoundLabs’ MicDrop, an AI vocal plug-in.
Last month, the first AI-driven release from that partnership was announced: a Spanish language version of Brenda Lee’s iconic holiday hit, Rockin’ Around The Christmas Tree, first recorded 66 years ago.
As generative AI continues to advance, PlayAI’s journey underscores the need for a balance between technological innovation and ethical considerations. PlayAI’s success will depend not just on its technological capabilities, but also on its ability to address the ethical challenges surrounding AI-generated voice technology.
Music Business Worldwide