For a long time, the basic model for how the human voice works was wonderfully simple. We were taught to think of it like a speaker system: there’s a sound source (the vibrating vocal folds in your larynx) and a filter (the vocal tract, a tube-like space running from your larynx to your lips). The source creates a buzzing sound, and the filter shapes that buzz into vowels and consonants. This is the “linear source–filter theory,” and it’s been a useful way to understand the basics of speech. I use it in my own teaching as a simplified model that helps beginner singers understand the basic physiology.
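To make the linear picture concrete, here is a minimal Python sketch of it. Every number in it is an assumption chosen for illustration, not a measurement: the 1/n² roll-off of the source harmonics, the two formant frequencies and bandwidths, and the function names themselves. The only point is that, in the linear model, each output harmonic is simply the source harmonic multiplied by the filter gain, and the filter never changes the source.

```python
import math

def source_amplitude(n):
    """Amplitude of the n-th source harmonic (assumed 1/n^2 roll-off, about -12 dB per octave)."""
    return 1.0 / n ** 2

def formant_gain(f, centre, bandwidth):
    """Gain of one idealised second-order resonance at frequency f (all values in Hz)."""
    return centre ** 2 / math.sqrt((centre ** 2 - f ** 2) ** 2 + (bandwidth * f) ** 2)

def linear_source_filter(f0, formants, n_harmonics=10):
    """Return (frequency, output amplitude) per harmonic: source amplitude times filter gain."""
    result = []
    for n in range(1, n_harmonics + 1):
        f = n * f0
        gain = 1.0
        for centre, bandwidth in formants:
            gain *= formant_gain(f, centre, bandwidth)
        # The source term does not depend on the formants: that is the "linear" assumption.
        result.append((f, source_amplitude(n) * gain))
    return result

# Hypothetical vowel with formants near 500 Hz and 1500 Hz, voiced at 100 Hz.
spectrum = linear_source_filter(100.0, [(500.0, 80.0), (1500.0, 100.0)])
for f, amp in spectrum:
    print(f"{f:6.0f} Hz  {amp:.4f}")
```

In this toy example the harmonic sitting on the 500 Hz formant comes out stronger than its neighbours, which is the "filter shapes the buzz" idea in its simplest form.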

But while this model is easy to grasp, it’s also a major oversimplification. It works reasonably well for lower-pitched male speech, but it begins to break down when we look at the complexities of female voices, children’s voices, and especially the athletic demands of operatic singing.

Scientific work published in the 2000s by leading voice researcher Dr Ingo Titze revealed a much more exciting and dynamic picture. The source and filter are not independent parts working in sequence. Instead, they are in a constant state of interaction, influencing each other in a process called “nonlinear coupling.” The acoustic pressures generated in your vocal tract don’t just shape the sound; they actively feed back and change how the vocal folds vibrate in the first place. This interaction is the physics behind vocal brilliance, and it is where the real power, efficiency, and nuance of the human voice come from.

Here are four of the most surprising and impactful discoveries that this understanding has taught us about how your voice really works.

You Can Create a Rich, Harmonic Sound Without Your Vocal Cords Even Touching!

For a long time, voice science assumed that the rich, buzzing sound of the voice, with its fundamental tone and overtones (harmonics), was created by the sharp, repetitive collision of the vocal folds. Harmonics are the whole-number multiples of the main frequency that give an instrument its unique timbre or colour. The abrupt closure of the folds was thought to be the main event that generated the complex spectrum of frequencies we hear as voice.

But it turns out, this isn’t nearly the whole story. Groundbreaking simulations show that through nonlinear source-filter coupling, it’s possible to generate a full spectrum of harmonics even with a smooth, gentle, non-colliding vibration of the vocal folds. How? The acoustic back-pressure from the vocal tract interacts with the airflow coming through the larynx, “skewing” the airflow pulse.

Imagine a smooth, steady stream of water from a hose (the airflow). Now imagine a pressure that pulses right at the nozzle (the vocal tract pressure), pushing back against the stream. This chops the smooth stream into sharp, distinct pulses. In the same way, the vocal tract’s acoustic pressure reshapes the smooth airflow from the lungs into a sharper, skewed pulse, and it’s this sharper shape that mathematically contains all the extra harmonic frequencies.
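The claim that a skewed pulse "contains" extra harmonics can be checked numerically. The sketch below is not Titze's simulation. It only compares two made-up flow shapes: a perfectly symmetric, sine-like cycle and the same cycle with its peak pushed late (slow rise, quick fall). The `peak_position` parameter, the 256-sample cycle, and the function names are all assumptions for illustration.

```python
import cmath
import math

def flow_pulse(n_samples, peak_position):
    """One cycle of a smooth flow wave. peak_position is the fraction of the
    cycle at which the flow peaks; 0.5 gives a symmetric, sine-like cycle."""
    samples = []
    for i in range(n_samples):
        t = i / n_samples
        if t < peak_position:
            phase = 0.5 * t / peak_position  # slow rise to the peak
        else:
            phase = 0.5 + 0.5 * (t - peak_position) / (1.0 - peak_position)  # quicker fall
        samples.append(0.5 * (1.0 - math.cos(2.0 * math.pi * phase)))
    return samples

def harmonic_amplitudes(samples, count):
    """Amplitudes of harmonics 1..count of one cycle, via a direct Fourier sum."""
    n = len(samples)
    amps = []
    for k in range(1, count + 1):
        total = sum(x * cmath.exp(-2j * math.pi * k * i / n) for i, x in enumerate(samples))
        amps.append(2.0 * abs(total) / n)
    return amps

symmetric = harmonic_amplitudes(flow_pulse(256, 0.5), 5)
skewed = harmonic_amplitudes(flow_pulse(256, 0.75), 5)
print("symmetric:", [round(a, 3) for a in symmetric])
print("skewed:   ", [round(a, 3) for a in skewed])
```

The symmetric cycle has energy only at the fundamental, while the skewed cycle shows clear energy at the second harmonic and above, even though neither wave has any sharp edge or collision in it.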

In the past it has been assumed that the harmonic spectrum of the source comes primarily from vocal fold collision. This may be true for many phonations, but this example shows that vocal fold collision is not essential to produce source harmonics.

This is a profound insight. It suggests that vocal richness is not solely dependent on aggressive vocal fold closure. In fact, the simulations show that this interaction alone can produce a sound with a surprisingly rich harmonic profile, all without the forceful impact of tissue-on-tissue contact! This could have a significant impact on voice therapy and training, especially for treating pathologies that result from the stress of excessive tissue collision.

Singers Aren’t ‘Tuning’ Their Voice Like an Instrument—They’re Dodging Instability

We often think of skilled singers as master tuners, precisely aligning the harmonics of their voice with the resonant frequencies (formants) of their vocal tract to make a louder, more powerful sound. It’s an intuitive idea, but it turns out to be wrong.

Unlike in woodwind or brass instruments, where locking a source frequency to a tube resonance is the goal, doing so in the human voice can actually create vocal instability. Attempting to perfectly align a strong harmonic with a formant can destabilize the vocal folds’ vibration, leading to sudden and undesirable pitch jumps. This isn’t just an academic distinction—that “instability” is the very thing you hear as a voice crack or an uncontrolled yodel.

The reality is far more sophisticated. Expert vocalists achieve a powerful, stable sound not by perfect tuning, but by strategically placing their harmonics into “favourable reactance regions.” In most cases, this means positioning a harmonic on the lower frequency side of a formant. This allows the vocal tract’s acoustics to provide positive reinforcement and boost the sound, but without courting the instability that comes from hitting the resonance peak dead-on. They are, in essence, skillfully dodging the acoustic cliffs where their voice might fail.
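A rough way to see where these regions lie is to treat the vocal tract as a plain uniform tube, closed at the larynx and open at the lips, with no losses. That is a textbook simplification and an assumption here, as are the 17.5 cm length, the 350 m/s speed of sound, and the 220 Hz sung pitch. In that model the reactance is positive (inertive) just below a formant and flips to negative (compliant) just above it, which matches the "lower frequency side" described above.

```python
import math

SPEED_OF_SOUND = 350.0  # m/s, assumed value for warm, humid air
TRACT_LENGTH = 0.175    # m, assumed length of an idealised uniform vocal tract

def normalised_reactance(f):
    """Input reactance of a lossless uniform tube seen from the larynx,
    divided by the tube's characteristic impedance."""
    return math.tan(2.0 * math.pi * f * TRACT_LENGTH / SPEED_OF_SOUND)

def classify(f):
    """Label a frequency as sitting in an inertive or compliant region."""
    return "inertive" if normalised_reactance(f) > 0 else "compliant"

# First resonance of the idealised tube (quarter-wavelength rule).
first_formant = SPEED_OF_SOUND / (4.0 * TRACT_LENGTH)
print(f"first formant of the model tube: {first_formant:.0f} Hz")

# Harmonics of a hypothetical 220 Hz sung note.
for n in range(1, 5):
    f = n * 220.0
    print(f"harmonic {n} at {f:.0f} Hz: {classify(f)}")
```

With these assumed values the first two harmonics fall below the model's first formant and are labelled inertive, while the next two fall above it and are labelled compliant. A real vocal tract is not a uniform tube, so the actual boundaries move with every vowel shape.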

…a stable harmonic source spectrum is not obtained by tuning harmonics to vocal tract resonances, but rather by placing harmonics into favourable reactance regions.

A Tiny Tube in Your Throat Is a Secret Knob for Vocal Power

If source-filter interaction is so important, what controls it? The primary physical controller is a small, narrow region of your airway just above the vocal folds called the epilarynx tube. The cross-sectional area of this tiny tube acts like a knob that dials the degree of interaction up or down.

When the epilarynx tube is wide, the acoustic impedance of the vocal tract is low, and the source and filter are largely independent (more linear coupling). When a speaker or singer narrows this tube, however, the impedance increases dramatically, creating a strong interaction (nonlinear coupling) between the vocal tract’s acoustics and the vocal folds.
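One simple proxy for how impedance scales with the size of the tube is the characteristic acoustic impedance of a tube, which is air density times the speed of sound divided by cross-sectional area. The sketch below applies that standard formula to a wide 3.0 cm² and a narrow 0.2 cm² tube, the two areas quoted from the simulations in this section. The air density and speed of sound are assumed values, and this is only a scaling illustration, not the full impedance calculation used in the research.

```python
AIR_DENSITY = 1.14      # kg/m^3, assumed value for warm, humid air
SPEED_OF_SOUND = 350.0  # m/s, assumed

def characteristic_impedance(area_cm2):
    """Characteristic acoustic impedance (Pa*s/m^3) of a tube with the given area in cm^2."""
    area_m2 = area_cm2 * 1e-4
    return AIR_DENSITY * SPEED_OF_SOUND / area_m2

wide = characteristic_impedance(3.0)
narrow = characteristic_impedance(0.2)
ratio = narrow / wide

print(f"wide tube:   {wide:.3e}")
print(f"narrow tube: {narrow:.3e}")
print(f"ratio:       {ratio:.1f}")
```

Because impedance is inversely proportional to area, shrinking the area by a factor of 15 raises this impedance by the same factor, regardless of the exact air properties assumed.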

And the effect of turning this “knob” is nothing short of astonishing. In computer simulations, when the epilarynx tube area was narrowed from a wide 3.0 cm² to a very narrow 0.2 cm², two incredible things happened:

• The radiated acoustic power (the amount of sound energy coming out of the mouth) increased by a factor of 10.

• The glottal efficiency (the ratio of acoustic power out to aerodynamic power in) increased nearly nine-fold, from 0.62% to 5.49%.

This reveals a tangible mechanism that allows a trained vocalist to get significantly more acoustic output for each unit of aerodynamic effort, simply by adjusting the shape of a small section of their throat. This “secret knob” is the most direct physical evidence we have of a singer actively managing nonlinear coupling to their advantage.
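The two figures above can be cross-checked with a little arithmetic. The sketch below takes "a factor of 10" as exactly 10, which is an assumption since that figure reads as rounded, and uses the definition of glottal efficiency given above (acoustic power out divided by aerodynamic power in) to work out what the numbers imply about the aerodynamic input.

```python
power_gain = 10.0         # radiated acoustic power, narrow vs wide tube (taken as exact)
efficiency_wide = 0.62    # glottal efficiency in percent, wide tube
efficiency_narrow = 5.49  # glottal efficiency in percent, narrow tube

# How many times more efficient the narrow configuration is.
efficiency_gain = efficiency_narrow / efficiency_wide

# efficiency = acoustic power out / aerodynamic power in, so the implied change
# in aerodynamic power is the power gain divided by the efficiency gain.
aerodynamic_gain = power_gain / efficiency_gain

print(f"efficiency gain:          {efficiency_gain:.2f}x")
print(f"implied aerodynamic gain: {aerodynamic_gain:.2f}x")
```

Under that assumption, almost all of the tenfold increase in output comes from improved efficiency, with the implied aerodynamic input staying close to where it started.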

Your Windpipe Isn’t Just an Air Hose—It’s an Active Partner in Phonation

When we think about the “filter” part of the voice, we almost always focus on the vocal tract above the vocal folds—the pharynx, mouth, and nose. But the interaction doesn’t stop there. The airway below the vocal folds, known as the subglottal system or trachea, is also an active participant.

The science shows that for the modal register (our typical speaking voice), the “ideal” acoustic load to help the vocal folds vibrate most easily would be a compliant subglottal tract and an inertive supraglottal tract. Think of a compliant tract as a “springy air cushion” that easily pushes back, and an inertive tract as a “heavy column of air” that is sluggish and slow to get moving. The ideal combination helps create the perfect push-pull pressure cycle on the vocal folds.

However, human anatomy presents an interesting compromise. Because the trachea and the supraglottal tract are roughly the same length, they both tend to behave as inertive systems at low frequencies. This “inertive-inertive” combination is less favorable for efficient vocal fold vibration than the ideal “compliant-inertive” setup. This shows that our vocal instrument has built-in acoustic trade-offs. The subglottal tract is not a passive air hose from the lungs; its acoustic properties are an active and crucial part of the complex, interactive system that produces our voice.

Your Voice is a Dynamic System

The science makes one thing exceptionally clear: the human voice is not a simple, linear assembly of parts. It is a deeply interactive, nonlinear system where every component influences every other. We’ve seen how this interaction can create rich harmonics out of thin air, how singers master it by dodging instability rather than perfect tuning, how a tiny tube in the throat acts as its master control, and how even the windpipe below is part of the system. Each discovery dismantles the old linear model and replaces it with a far more fascinating one.

This perspective reveals that speakers and singers can learn to operate along a spectrum. They can choose a “linear region” with high vocal stability, or they can move into a “nonlinear region” to harness greater power and efficiency, but at the risk of encountering instabilities. This is the delicate dance of vocal mastery.

Knowing that your voice is a dynamic, tuneable system full of hidden potential, how might you think differently about the sounds you create?

This blog is based on Dr Ingo Titze’s article “Nonlinear source–filter coupling in phonation: Theory” and was written using AI (NotebookLM).

For a simplified video version, check out my YouTube channel here!

Casta Diva Voice Studio

Meta Powell, Soprano

4 Surprising Truths About How Your Voice Really Works