The Science of "Ums" and "Likes": Why filler word removal is actually backed by linguistics (and why it matters for creators)
If you’ve spent any time editing audio or video lately, you’ve probably noticed that almost every major platform—from Descript to Riverside—now features a "one-click filler word removal" button.
We are often told by public speaking coaches to completely purge "ums," "uhs," "likes," and "you knows" because they make us sound nervous or unprepared. But as content consumption shifts to captive, high-retention audio formats, what does the actual science say about how these verbal habits impact an audience?
Here is a look at the real data behind speech disfluencies, how they change listener perception, and why cleaning them up is a massive cheat code for creators.
1. The Real Cost to Professional Credibility
While some people claim filler words are completely harmless, academic research shows they take a measurable toll on how an audience judges your credibility and competence.
In a study available on ResearchGate, The Effect of Vocal Fillers on Credibility, Communication Competence, and Likeability, researchers tracked how an audience reacts when speech disfluencies are present versus when they are absent.
The Blueprint: Researchers manipulated responses to interview questions by isolating the frequency of vocal fillers (none vs. many) across both male and female speakers.
The Impact: A high volume of filler words significantly degrades a speaker’s perceived professional credibility, personal credibility, and overall communication competence.
The Gender Bias: While the negative impact applied across the board, the study revealed a distinct demographic split: male listeners judged vocal fillers significantly more harshly than female listeners did. If your audience or industry includes male corporate decision-makers, heavy filler usage may be undermining your pitch.
2. The Captive Audio Audience
Why do filler words stand out so much more in a modern podcast or video essay than they do in a standard phone call? It comes down to the environment.
According to Edison Research’s comprehensive study, The Podcast Consumer Report, modern media consumers expect a premium, low-intrusion environment.
The Metric: Edison found that 46% of monthly podcast listeners consider podcast ads "not at all intrusive." To put that in perspective, that makes podcasts the least intrusive ad medium available, beating out YouTube (23%), television, and radio.
The Takeaway: Edison measured ads, not filler words, but the pattern suggests an audience that expects a seamless, highly intimate experience. In that setting, repetitive speech disfluencies stick out like a sore thumb and disrupt the flow of a medium where listeners are already primed to give you their undivided attention.
3. The Cognitive Reality: Why Your Brain Does It
To understand why editing filler words out is so effective, you have to understand why we say them in the first place. On an episode of Stanford Graduate School of Business's podcast, “Um, Like, So”: How Filler Words Can Be Effective in Communication, sociolinguist Valerie Fridland broke down the actual cognitive purpose of these pauses.
Brain Processing: "Filled pauses" (uh and um) are natural indicators of a heavy cognitive processing load. Your brain produces them when it is working through complex syntax, abstract words, or structural transitions.
The Battle for the Floor: In casual, two-way conversation, these sounds serve a vital social function. They are your brain's way of signaling to the other person: "Hey, hold on, I am doing heavy mental lifting to formulate this next thought, so please don't interrupt me yet."
The Verdict: Why Creators Should Edit Them Out
Here is the ultimate paradox of content creation: The social utility of a filler word completely vanishes in a recorded medium. In a live conversation, an "um" keeps your friend from cutting you off. But in a one-way broadcasting format (a podcast episode, a YouTube video, a corporate presentation), your audience cannot interrupt you. Because the social utility is gone, the listener is left experiencing only the cognitive drag.
When you leave every single uh, um, and you know in your final cut, you force the listener's brain to filter out structural "fluff" just to get to your core point. Over a 30-minute episode, that extra filtering adds up and can wear listeners out.
The Strategy: You don’t need to sterilize your audio to the point where you sound like a robot; keeping a few natural pauses preserves your humanity. As a rule of thumb, though, using editing tools to remove roughly 60–80% of your heavy filler clusters lowers the cognitive load on your audience, sidesteps the credibility penalty described above, and helps keep people listening longer.
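If you are curious what "remove most, keep a few" looks like under the hood, here is a minimal Python sketch that works on a plain transcript. It is not how Descript, Riverside, or any other tool actually does it. The FILLERS list, the remove_fillers function, and the keep_every parameter are all made up for illustration. The sketch only targets "um"-style filled pauses, because words such as "like" and "you know" are often legitimate and need context to judge.

```python
import re

# Hypothetical filler list for illustration; tune it to your own speech habits.
FILLERS = {"um", "uh", "er", "ah"}


def remove_fillers(words, keep_every=4):
    """Drop most filler words from a list of transcript words.

    Every `keep_every`-th filler is kept so the result still sounds human.
    With keep_every=4, roughly 75% of fillers are removed.
    """
    kept = []
    fillers_seen = 0
    for word in words:
        # Normalize "Um," or "UH." to "um" / "uh" before checking the list.
        token = re.sub(r"[^a-z]", "", word.lower())
        if token in FILLERS:
            fillers_seen += 1
            if fillers_seen % keep_every != 0:
                continue  # drop this filler
        kept.append(word)
    return kept


raw = "Um so uh the um plan is uh simple"
print(" ".join(remove_fillers(raw.split())))  # so the plan is uh simple
```

Real editors cut the matching audio using word-level timestamps rather than editing text, but the keep-some, drop-most logic is the same idea.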
In real life, filler words protect your turn to speak. In recorded media, they just cause listener fatigue and quietly chip away at your professional credibility. Clean them up!
