The MBW Review offers our take on some of the music biz’s biggest recent goings-on. This time, Cherie Hu takes a long-lens look at changing consumer habits when it comes to digital entertainment media. The MBW Review is supported by Instrumental.
You may already be familiar with this narrative: The dawn of the internet promised a more democratic, horizontal playing field with respect to accessing information, communicating with strangers and building an audience. Yet at the same time that all of these tasks have gotten easier in the digital age, the world has also become more polarized.
Multiple studies have shown that the way modern social-media platforms are built encourages us to dig ourselves deeper into ideological “filter bubbles” — even despite (or perhaps because of) heightened exposure to views we disagree with. Needless to say, this societal self-siloing has led to often-alarming sociopolitical outcomes on a global scale.
On a slightly less dire but equally concrete scale, this polarization has also infiltrated the nature of the actual content that media companies are now funding and creating on the internet.
Across audio, video and other entertainment formats, there is a growing amount of buzz around really short pieces of content on the one hand (i.e. ten seconds or less, such as the latest GIF or TikTok meme), and around really long pieces of content on the other (i.e. one hour or more, such as extended podcast interviews or Netflix docu-series). The purgatory in the middle — content that has, for lack of better descriptors, a “medium” or “normal” duration — has become more difficult to frame as a source of future growth or innovation.
In the context of music, this means that a standard-length song or music video, while certainly ripe with creative potential, is increasingly just a means to a more polarized end — an initial stepping stone to gauge demand for another piece of content down the line that is either much shorter, or much longer.
This isn’t a completely new concept. Back in 2018, a founder working directly on the front lines of short-form media — GIPHY CEO Alex Chung — laid out the argument for content-level polarization during his keynote speech at SXSW.
Chung argued that in the age of on-demand video streaming platforms like Netflix and Hulu, long-form content will continue to get more expensive, not less. While this market wouldn’t necessarily be winner-take-all, those with the fattest wallets would have an easier time coming out on top.
In Chung’s eyes, this meant that everyone else who couldn’t compete at that long-form game had to start sprinting in the opposite direction. “The rest of us need to start shrinking,” said Chung. “We need to start going smaller, six seconds or less.”
Chung cited several data points from the film and video industry to substantiate his claims. “The average shot length in a movie is about four seconds; the average time a Facebook video is being watched is 10 seconds,” he said. “If you want to follow where the money is, what’s actually being registered as a video view is just three seconds.”
Of course, Chung has a vested interest in his prediction coming true: he runs a business that relies on millions of users sending over one billion pieces of micro-content every day, most of which last under six seconds.
But that doesn’t mean his words don’t have a kernel of truth in them, and the music industry presents an intriguing case study.
As chronicled in books like John Seabrook’s The Song Machine, pop songwriting has transformed over the past few decades from an intimate process shared among only a few collaborators, to a factory-like setup that churns out multiple hooks lasting only a handful of seconds each, stacked on top of each other to make a catchy hit song.
This growing preoccupation with “hooks” in songs has a strong parallel in the world of visual music content. Music marketers are perennially obsessed with the power of short-form video channels like TikTok, Snapchat and the Stories feature on Instagram to seed looping snippets of new songs and subsequently gauge market demand.
Nearly all of these platforms rely heavily on music for user retention and, given their inherently viral-friendly nature, have helped amplify some of this generation’s defining celebrities, from Shawn Mendes to Lil Nas X and DJ Khaled.
Only recently have startups begun to experiment with consumer-facing, short-form audio formats in addition to visual ones. Companies like Audiobyte (the startup behind Songclip, formerly known as Gifnote) and Emoticast (behind TuneMoji) have deals with major labels to make short audio clips of their songs available for sharing in social environments. If successful, such partnerships could transform the otherwise tedious and expensive process of licensing major-label catalog for the next generation of mobile apps.
The rise of A.I.-driven music creation tools like Boomy also mutually reinforces the growth of short-form audio, because such tools lower the barrier to entry for self-expression and allow users to create, edit and monetize minute-long clips within, well, minutes.
Outside of the music industry, startups like Muze and Audtra are also working on short-form, voice-driven social networks in a similar manner to the early version of Spotify-owned podcast distribution platform Anchor, which was described back in 2017 as “Snapchat stories for audio.” And of course, companies like GIPHY continue to grow in popularity around GIFs, particularly in messaging apps — a connection that has helped GIPHY lure in seven-figure ad deals.
Enthusiastic media execs will point to what they see as a perfect storm of tech and business trends to justify more investment in short-form content, from the rise of 5G to consumers’ rapid, global adoption of mobile devices as their primary form of media consumption. Industry insiders also often describe this kind of content as “snackable” — evoking how consumers both crave the material constantly and gobble it down quickly, living in a perpetually unsatiated state of mind.
But media and entertainment companies are also realizing that no human being can subsist only on snacks forever. If we think about the wider context of a “media diet,” we also need “healthier,” more wholesome media to ensure that we don’t end up clogging our arteries with the “snackable” lowest common denominator.
Enter the long-form hype.
To extend the “media diet” analogy: if short-form pieces of content like GIFs, TikTok memes and song clips are “snackable” — consumed and shared on-the-go and for cheap, in the happenstance scenario of needing to kill cravings or time — long-form content could be viewed as an alternative, expensive, artisanal, farm-to-table cuisine, reserved only for special occasions that warrant diners’ full, undivided attention.
Or at least that’s how it seems to be marketed, particularly in the music industry.
Right alongside TikTok memes, record labels are increasingly investing in long-form content meant to provide the most “premium” and “authentic” experience of a given artist and their catalog. This spans biopics and documentaries, both for brick-and-mortar movie theaters and for streaming platforms like Netflix and Hulu; original podcasts, funded either by major labels or by Spotify, Deezer and other audio platforms; and Broadway musicals inspired by iconic artists’ discographies.
The motivations for investing more in long-form media are also clear: fostering deeper connections with superfans, charging a higher premium for access and/or allowing more wiggle room for creative experimentation.
Especially for major labels, long-form is also more appealing because it enables more creative control. “Viral” short-form content inherently takes on a life of its own, forcing its original creator to relinquish control of the messaging and emotional context that emerges from decentralized, scaled sharing.
In contrast, viewers of long-form content not only assume from the beginning that that content involves more control and involvement from the creator, but also embrace that control as the selling point. (A recent example: Beyoncé’s Netflix documentary Homecoming, which received almost unanimously positive reviews from critics, is also tightly controlled, with the celebrity showing only a brief glimpse into her creative process and behind-the-scenes prep for her renowned Coachella performance.)
In other words, flooding the market with short-form content has only led to a second flood of long-form content, the two extremes reinforcing and balancing each other out. In fact, some media professionals have argued that the relationship between short-form and long-form is more of a spectrum than a hard, polarized dichotomy, with today’s creators able to move more fluidly from one end of the spectrum to the other.
“A few years ago ‘professional content’ and ‘online video’ were two separate worlds, separated by a huge chasm from which each group would glare at — or ignore — the other,” reads a panel description from the 2019 edition of VidCon. “But instead of a wasteland, now there’s everything from Netflix and Hulu to YouTube Red, vertical video storytelling and so much more in between. Online video creators produce successful movies, while traditional movie stars luxuriate in their new YouTube cribs.”
It is true that the threshold for what counts as “professional” has all but blurred, allowing more creators to participate in the entertainment economy in whatever format they choose. That said, the “spectrum” model doesn’t take into account the fact that modern media and entertainment companies tend to treat attention as a zero-sum game.
There’s a reason why Netflix, for instance, has said in the past that it competes with sleep and with Fortnite more than it does with rival video-streaming platforms. The underlying assumption from Netflix’s perspective — shared by many of its peers and analogs in the music industry — is that they either have all of your attention, or none of it.
Said companies have to invest their resources accordingly — not only into formats that are most likely to capture your attention when they don’t have it, but also into formats that will retain and maximize your attention for a long time once they do have it. The best kinds of content to get these two jobs done, respectively, tend to be really short and really long.
In this vein, recent developments in the film and T.V. industries show how making “middle-form” content succeed may be a steeper uphill battle than sticking to the extremes.
New video startups like Quibi and Ficto are trying to follow in the footsteps of Snap Originals (once described as the “mobile HBO”) by producing both scripted and unscripted mini-series for mobile devices, with each episode lasting up to ten minutes.
Supporters will claim that this “medium” format, which seems tailor-made for brief commutes or mental breathers during work, will help elevate certain kinds of original stories that are increasingly hard to greenlight at mainstream, franchise-focused film-production studios. Quibi in particular has already generated $100 million in ad sales, several months ahead of its launch in April 2020.
But as John Jurgensen recently wrote for the Wall Street Journal, these companies are “vying for phone screens already dominated by a different form of ‘short’: YouTube vlogs, Instagram stories, looping TikTok clips, and the like. What’s more, people already have a way to watch movies and TV shows on the go in short doses — their Netflix app, for example — and its pause button.” In other words, pushing even more content and even more subscriptions into an already-saturated media ecosystem isn’t so much a market opportunity as it is a nuisance.
Homing in on the music industry, one can find similar struggles to make “middle-form” content work — the two most prominent examples being vertical and long-form music videos.
Artists as wide-ranging as Taylor Swift, Yella Beezy and J Balvin have invested in vertical, mobile-friendly music videos to accompany their biggest singles (e.g. for “Lover,” “Bacc At It Again,” and “La Canción”) — even if “regular” (i.e. 16:9) videos for those songs already exist. But the vast majority of these vertical-friendly videos exist only on Spotify, and hence only “to squeeze every bit of engagement out of the millions of people who use its app each month,” in the words of The Ringer’s Alyssa Bereznak. It’s also telling that Spotify has yet to reveal any engagement data around its vertical music videos, despite the format existing for years.
Long-form music videos, which usually max out at around 15 minutes (not that “long” at all compared to a biopic or documentary), have also struggled to establish themselves as a noteworthy standalone category. The MTV Video Music Awards has presented the award for “Best Breakthrough Long Form Video” only twice in its 35-year history: in 1991, and then spontaneously in 2016 in the wake of Beyoncé’s multimedia Lemonade release.
Despite the emergence of several other long-form music videos since 2016, the corresponding award category has not returned to the VMAs. And as recent Vevo data suggests, viewers might not even want long-form music videos anyway, because they usually skip past the early plot and dialogue to get to the actual music. Importantly, that gap between expectation and reality goes away if the piece of content is framed not as a “music video,” but as a more deeply narrative piece of work such as a biopic, documentary or podcast.
Of course, whether in the context of sociopolitical issues or entertainment, relying on extremes usually leads to a loss of nuance.
Back in 2014, in an op-ed for the New York Times, Jonathan Mahler lamented a warped obsession with long-form that was emerging in the literary world, whereby long-form essays were immediately lauded by critics even if they carried no meaning whatsoever. “When you fetishize — as opposed to value — something, you wind up celebrating the idea of the thing rather than the thing itself,” wrote Mahler. “When we fetishize ‘long-form,’ we are fetishizing the form and losing sight of its function.”
As long as artists and music companies strive to stand out and continue to treat attention as a zero-sum game, they will continue to fetishize extremes in a similar way, giving everyday “normality” lower and lower returns. In this environment, “middle-form” content does not become useless per se, so much as it becomes bait for more and more urgent questions about what’s next: “Cool, but how is this going to go viral?” or “Cool — but what’s the bigger story?” Today, those kinds of extremes hold most of our curiosity — and our money.