For some time I’ve been wondering whether we should consider retiring the word “podcast”. Before I explain why, let’s explore the origins of the word.
Podcast is a portmanteau, a combination of two words: pod (short for iPod) and cast (short for broadcast).
The medium rose to popularity through the use of the iPod and Apple devices. While Apple is still the primary platform where people consume podcasts, there have been new entrants to the market, including Spotify, Google Podcasts (not to be confused with Google Play), Stitcher and even Pandora, all trying to gain market share.
With all this activity you’d think everyone is listening to a podcast, but that’s not the case.
Approximately 74% of Americans do not listen to podcasts on a monthly basis. (Source: Edison Research)
Why is this the case? Why aren’t more people listening to podcasts? Going back to my opening sentence – could the reason be the name we are calling it?
Remember TiVo in the early 2000s? The technology wasn’t new. Before TiVo it was known as “Replay TV”, but TiVo gave the technology a rebrand, with a sleek design and ease of use for consumers, and a high price tag to boot. Soon users began to use the phrase “TiVo it” to refer to the functionality the device offered.
Eventually TiVo gave way to the more generic DVR device that entered the market and was offered by cable service providers. The DVR was much cheaper, with subscription pricing or a rental fee added to your monthly bill. With its wider adoption, the DVR has eclipsed the TiVo and has evolved to include OnDemand and multi-room functionality. However, in the last couple of years, there’s been a steady rise in consumers ditching the DVR for streaming services.
Similarly, before “podcast” became the popular word, the medium was known as audioblogging. However, unlike other technologies, the podcast has mostly kept the same name even though we no longer listen on iPods; our phones, computers and now smart speakers are the main listening devices. Maybe the answer to mass adoption of the podcast is two-fold, like with TiVo:
If we are moving into the customer-centric era, how can the podcast industry capture the roughly 70% of people who don’t yet listen to podcasts, aka soon-to-be consumers?
Do you think retiring the word “podcast” could be the answer?
With the rise of podcasts, there’s a growing trend of podcast platforms and apps leveraging natural language processing (NLP) to help users find and listen to shows. This is similar to smart speakers that use voice-activated artificial intelligence (AI) technology, such as Alexa, Google Home and Apple HomePod.
With any new trend, where there are opportunities, there are also issues. The issue here is that the increased use of natural language processing and artificial intelligence presents a diversity and inclusion problem. This doesn’t only affect Caribbean people and people of color but all “non-traditional” speakers, that is, anyone who doesn’t speak “broadcast English”.
“Artificial intelligence (AI) makes it possible for machines to learn from experience, adjust to new inputs and perform human-like tasks. Most AI examples that you hear about today – from chess-playing computers to self-driving cars – rely heavily on deep learning and natural language processing. Using these technologies, computers can be trained to accomplish specific tasks by processing large amounts of data and recognizing patterns in the data.” (Definition by SAS)
Simply put – AI and NLP learn from what you input, get smarter with more human interaction, and repeat the learning cycle.
“SEO is the acronym for Search Engine Optimization. It’s the practice of optimizing websites to make them reach a high position in Google’s – or another search engine’s – search results. SEO focuses on rankings in the organic (non-paid) search results.” (Definition by Yoast)
The content on a website is one way to rank high in the search results of Google or any other search engine.
SEO. “Voice SEO” to be exact. The use of artificial intelligence (AI) and natural language processing (NLP) in podcasts and voice-activated gadgets like smart speakers is about searching. There’s already a problem with the discoverability of podcasts by people of color even without the natural language processing and AI aspects, and this just makes it more difficult. And if the tech doesn’t understand you when you’re speaking, then you are lost in translation.
A few months ago there was an article about Castbox, a podcast app that raised $13.5 million to launch its own programming. The article mentioned how Castbox’s use of natural language processing was going to revolutionize podcasting.
“What makes Castbox interesting is the proprietary technology it has under the hood. The platform uses natural language processing and machine learning techniques to power some of its unique features, like personalized recommendations and in-audio search.
The app is capable of making suggestions of what to listen to next based on users’ prior listening behavior, which can help to improve discovery of podcasts people may like. Meanwhile, the in-audio search feature takes advantage of the recent leaps the industry has seen with voice recognition technology, and actually transcribes the audio content inside podcasts, indexes it and makes it available for search within the Castbox app.
That means users no longer have to rely on things like episode titles, descriptions and show notes to find a podcast related to a topic they want to listen to — they can just search the Castbox app for any podcasts where a term was mentioned.” ~ Source: Techcrunch
Both Carry On Friends The Caribbean American Podcast and The Style & Vibes Podcast are on Castbox. On its face, all these features are great. However, my primary concern with podcast platforms or apps that provide natural language processing via in-audio voice search is that it will benefit only, or mostly, popular shows backed by network providers and shows in standard English. In my legal industry career I became familiar with natural language processing and the limitations that are causing my concerns. However, since I’m no longer in that industry, perhaps things have improved, so I decided to do some experimenting.
There’s a Castbox demo video on YouTube showing how the in-audio voice search feature that leverages NLP works. As in the video, I searched for “how to get through the present”. The results were similar and were categorized into channels (aka shows), episodes and audio. The audio category, it appears, is the natural language processing going through the audio of each show and pinpointing the exact timestamp where one or more of the search terms is found in particular episodes. I did a similar search using Carry On Friends’ content and didn’t get the same results.
The episode used in my demo wasn’t even in patois or heavy in it; I spoke in my natural voice. My simple experiment confirmed my concern that natural language processing has limitations even with standard English, let alone for those with accents.
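To make the idea of in-audio search more concrete, here is a minimal sketch in Python of how transcribed audio could be indexed and searched by timestamp. This is not Castbox’s actual implementation, which isn’t public as far as I know. The episode names, timestamps and transcript text are made up for illustration, and the sketch assumes the speech-to-text step has already produced the transcript.

```python
import re
from collections import defaultdict

# Hypothetical transcript segments: (episode, start time in seconds, text).
# In a real app these would come from a speech-to-text step.
SEGMENTS = [
    ("Episode A", 0, "Welcome back to the show"),
    ("Episode A", 95, "Here is how to get through the present moment"),
    ("Episode B", 40, "We talk about food and music today"),
]

def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def build_index(segments):
    """Map each transcribed word to the (episode, start_seconds) spots where it occurs."""
    index = defaultdict(set)
    for episode, start, text in segments:
        for word in tokenize(text):
            index[word].add((episode, start))
    return index

def search(index, query):
    """Return (episode, start_seconds, matched_terms) tuples, best match first."""
    hits = defaultdict(int)
    for word in set(tokenize(query)):
        for location in index.get(word, ()):
            hits[location] += 1
    return sorted(
        ((episode, start, count) for (episode, start), count in hits.items()),
        key=lambda hit: (-hit[2], hit[0], hit[1]),
    )

index = build_index(SEGMENTS)
results = search(index, "how to get through the present")
for episode, start, count in results:
    print(f"{episode} at {start}s matched {count} search term(s)")
```

The point of the sketch is that search can only find words the transcription step actually captured. If the speech-to-text system mishears an accent or a patois phrase, those words never make it into the index, and the episode won’t show up for that search no matter how relevant it is.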
In June, I did a fireside chat during CITE week on the digital voice and had a follow-up lunch with a friend where we discussed the topic further. As it turns out, the Washington Post, with its vast resources, was already exploring my concerns.
“At first, all accents are new and strange to voice-activated AI, including the accent some Americans think is no accent at all — the predominantly white, nonimmigrant, non regional dialect of TV newscasters, which linguists call “broadcast English”.
The AI is taught to comprehend different accents, though, by processing data from lots and lots of voices, learning their patterns and forming clear bonds between phrases, words and sounds.
To learn different ways of speaking, the AI needs a diverse range of voices — and experts say it’s not getting them because too many of the people training, testing and working with the systems all sound the same. That means accents that are less common or prestigious end up more likely to be misunderstood, met with silence or the dreaded, “Sorry, I didn’t get that.”
…for people with accents — even the regional lilts, dialects and drawls native to various parts of the United States — the artificially intelligent speakers can seem very different: inattentive, unresponsive, even isolating. For many across the country, the wave of the future has a bias problem, and it’s leaving them behind.” ~ Source: The Washington Post
Simply put, people who have accents, or who communicate in a way their audience appreciates and loves, are being left out of the smart speaker revolution. And if more podcast platforms use natural language processing to make shows searchable for the audience, shows produced by Breadfruit Media are already on the losing end.
Dear @Apple, can we get a Caribbean Siri?
— Melanin Queen 🍫✨ (@__Deedz) July 24, 2018
And instead of Siri she’d be called Shelly Ann https://t.co/L2Ukskk0yr
— Rewind N Come Again (@RACAblog) July 17, 2018
The same way people want to see themselves represented in images, videos, print or text, they want to be represented in the digital voice. It’s not only an issue with podcasts and voice-activated speakers. Don’t believe me? Have you tried to use the mic button on your smartphone to dictate and send a message? Forget using patois (patwa) to send a message; sometimes it doesn’t even catch my proper English correctly!
I see this as a potential opportunity in the rise of the digital voice. Many people get into podcasting to be the talent/host; however, I think there needs to be diversity among those producing the technology being used in the space.
I intentionally use a transcription service that understands patois. I would outsource production if it weren’t for me wanting to control the quality. However, let’s say that I was ready to outsource – I can’t find a production company that I think would be able to handle and understand the nuances of my patois and code-switching well enough to know when and how to make the edits.
Podcast platforms and apps continue to offer ways to improve the discoverability of a show to reach new audiences. However, some of these new efforts aren’t beneficial to all podcasters. Yes, we should leverage what we can, like Castbox’s commenting feature, which allows for more engagement with the audience. However, a content creator shouldn’t have to choose between their show being searched and discovered through smart speakers or voice-searchable podcast platforms and the authenticity in language that their audience loves about their shows.
A case can be made that more progress is possible when there’s diversity among the people developing and testing the technology, as the Washington Post article pointed out. I think there are opportunities for minority tech entrepreneurs to get into the space to solve some of the problems content creators are having, or for existing technology companies to engage diverse voices to improve their AI offerings.
I get asked a lot of podcast-related questions, so I created a podcast episode over at Carry On Friends that answers the most frequent ones and shares some of the lessons I’ve learned. The blog post also includes some of the software and equipment that I have used to podcast.
You can also listen to the episode using the player below.
For now, I have decided against creating a specific course of my own because I think there are already plenty of free and paid options available for anyone who wants to do it themselves.
At Breadfruit Media, I collaborate with our clients to clarify their mission, create a content strategy and in turn develop and produce a show that connects with their audience.
If you’re interested or want more information about collaboration options, please take a look at our offerings.