For 10 straight years, the most popular way to introduce oneself on YouTube has been a single phrase: “Hey guys!” That’s pretty obvious to anyone who’s ever watched a YouTube video about, well, anything, but the company still put in the energy to actually track the data over the past decade. “What’s up” and “good morning” come in at second and third, but “hey guys” has consistently remained in the top spot. (Here is a fun supercut of a bunch of YouTubers saying it until the phrase ceases to mean anything at all.)
This is not where the sonic similarities of YouTubers, regardless of their content, end. For nearly as long as YouTube has existed, people have been lamenting the phenomenon of “YouTube voice,” or the slightly exaggerated, over-pronounced manner of speaking beloved by video essayists, drama commentators, and DIY experts on the platform. The ur-example cited is usually Hank Green, who’s been making videos on everything from the American health care system to the “Top 10 Freaking Amazing Explosions” since literally 2007.
Actual scholars have attempted to describe what YouTube voice sounds like. Speech pathologist and PhD candidate Erin Hall told Vice that there are two things happening here: “One is the actual segments they’re using — the vowels and consonants. They’re over-enunciating compared to casual speech, which is something newscasters or radio personalities do.” Second: “They’re trying to keep it more casual, even if what they’re saying is standard, adding a different kind of intonation makes it more engaging to listen to.”
In an investigation into YouTube voice, the Atlantic drew from several linguists to determine what specific qualities this manner of speaking shares. Essentially, it’s all about pronunciation and pacing. On camera, it takes more effort to keep someone interested (see also: exaggerated hand and facial motions). “Changing of pacing — that gets your attention,” said linguistics professor Naomi Baron. Baron noted that YouTubers tended to overstress both vowels and consonants, such as someone saying “exactly” like “eh-ckzACKTly.” She guesses that the style stems in part from informal news broadcast programs or “infotainment” like The Daily Show. Another, Mark Liberman of the University of Pennsylvania, put it bluntly: It’s “intellectual used-car-salesman voice,” he wrote. He also compared it to a carnival barker.
I started noticing YouTube voice proliferate on TikTok as soon as the app started to go mainstream (as opposed to its delightfully cringe origins). Most instances were essentially 60-second versions of YouTube videos, dudes pointing to an image or a graph while explaining whatever factoids went alongside it, or giving scripted hot takes on some depressing political event. Popular users like @onlyjayus iterated on the voice by adding visual eye-grabbers like walking into the frame, holding their camera up to a bathroom mirror, then giving a clearly rehearsed spiel. A sizable percentage of TikTok accounts use the “person-pointing-and-explaining” format alongside its requisite inflection, to the point where it’s almost become distinct: I’d argue that TikTok voice is more of a sped-up monotone as opposed to the drawn-out dramatics of YouTube, likely due to the 60-second time limit (though that could change once every user has access to three-minute videos).
Last week a TikTok came on my For You page that explored digital accents. TikToker @Averybrynn was referring specifically to what she called the “beauty YouTuber dialect,” which she described as “like they weirdly pronounce everything just a little bit too much in these small little snippets?” What makes it distinct from regular YouTube voice is that each word tends to be longer than it should, while also being bookended by a staccato pause. There’s also a common inclination to turn short vowels into long vowels, like saying “thee” instead of “the.” Others in the comments pointed out the repeated use of the first person plural when referring to themselves (“and now we’re gonna go in with a swipe of mascara”), while one linguistics major noted that this was in fact a “sociolect,” not a dialect, because it refers to a social group.
It’s the sort of speech typically associated with female influencers who, by virtue of the job, we assume are there to endear themselves to the audience or present a trustworthy sales pitch. But what’s most interesting to watch about YouTube voice and the influencer accent is how normal people have adapted to it in their own, regular-person TikTok videos and Instagram stories. If you listen closely to enough people’s front-facing camera videos, you’ll hear a voice that sounds somewhere between a TED Talk and sponsored content, perhaps even within your own social circles. Is this a sign that as influencer culture bleeds into our everyday lives, many of the quirks that professionals use will become normal for us too? Maybe! Is it a sign that social platforms are turning us all into salespeople? Also maybe!
But here’s the question I haven’t heard anyone ask yet: If everyone finds these ways of speaking annoying to some degree, then how should people actually talk into the camera?
To find out, I asked the one person I knew who has lots of experience talking in YouTube videos and also does not sound like a big dork. Joss Fong, senior producer and one of the most recognizable faces of the Vox video team, told me, “I generally tell people to speak slightly faster and with more energy than they normally do and to imagine that they’re talking to a friend.” She added, “What to do with your face, eyes, or body is a whole other thing that I’m still a bit clueless about.”
What this really asks of us, however, is a lot more difficult than imitating an influencer’s voice — it’s acting. It’s a skill that’s notoriously tricky to nail convincingly, but also even more so when the person we’re acting as is ourselves at our most comfortable and relaxed. So it’s not exactly the YouTube creators’ fault that they’re unable to stop themselves from sounding either cringey or smug or annoying or whatever, and it’s certainly not a regular person’s fault that they can’t turn themselves off and on at will.
As more of us pivot to this kind of life, a life of talking into our cameras, performing authenticity, and building precious clout, it’ll be interesting to see how YouTube voice evolves. Perhaps the frequent up-and-downspeak will turn toward the monotone of TikTok voice or the staccato influencer voice or whatever voice is suited to the platform that comes next. Perhaps it’s a faceless audio-only app like Clubhouse or some kind of virtual reality massively multiplayer online game where we need to figure out how to listen to each other in real time or else we’ll be eaten by a robot dragon. I, for one, look forward to a universe where “door-to-door salesman” isn’t the voice I hear radiating from the internet every day.
This column first published in The Goods newsletter.