A significant proportion of my caseload at the Musicians’ Clinics of Canada consists of “second opinions”.  Recently I had a referral where a trombone player had been fitted with ER-15 musicians’ earplugs but was complaining of the occlusion effect.  New ear impressions were taken and the earplugs remade with a much longer earmold bore, but the occlusion complaints remained.

Indeed, when I saw this man, the earmolds were very long and I doubted that anyone could make them longer without going into the middle ear!

Readers of this blog know that I am half audiologist, half linguist, and half musician…. Well, maybe one-third musician…. So I am always looking for ways to combine linguistics and audiology (and music).

But let’s take a step back…

The occlusion effect is a measurable increase in low frequency sound pressure that is created by plugging up, or occluding, the outer ear canal.  Normally this low frequency sound is allowed to exit the ear canal to the environment and we are not aware of the low frequency build-up; it’s only when we block that exit that the occlusion effect is noticed.

Because of the laws of physics, low frequency sounds will always take the path of least resistance, and in a normal, unoccluded ear canal this means that low frequency sounds tend to go out of the ear canal to the environment rather than continue on through the middle ear to the cochlea and to lands upstream.  But when the ear canal is occluded, this low frequency sound is forced through the middle ear and onwards.

For any language of the world, the only speech sounds that have significant low frequency energy are the high vowels [i] as in ‘beat’ and [u] as in ‘boot’, and every single language of the world has both of these vowels in some form, either as a diphthong as in English ([iy] and [uw]) or as a pure vowel as in French ([i] and [u]).  This is considered a “linguistic universal” (and of course every language also has the low back vowel [a] or some closely related cousin).

It is these high vowels ([i] and [u]) that create the occlusion problem for speech, and it is musical sounds with fundamentals in this same frequency region that create the problem for music.

Figure: Actually, this is an incorrect figure. The mouth configuration is that of the low vowel [a], and there is no occlusion effect with this vowel, whose lowest energy concentration is at 500 Hz and above. Courtesy of www.hearingaidsaustralia.com

Clinically, whenever I perform a hearing aid evaluation and hearing aid fitting, I ask my clients to say the pair of vowels [i] and [a].  In the normal, unoccluded state, these two vowels have similar sound levels: they are equally loud.   If, however, the high vowel [i] is perceived as louder than the [a], then there is an occlusion effect that needs to be resolved before I let the hearing aid wearer out of my office… or else I will be seeing them in several days’ time for a hearing aid “return for credit”.
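If you want to put a number on that loudness comparison rather than relying only on the client’s report, the arithmetic is trivial. Here is a minimal sketch, assuming you already have two probe-microphone recordings of the client sustaining [i] and [a] at the same vocal effort; the file names, the 16-bit WAV format, and the 6 dB flagging criterion are hypothetical placeholders for illustration, not part of my clinical protocol.

```python
import numpy as np
from scipy.io import wavfile

def rms_dbfs(path):
    """Overall RMS level of a mono recording, in dB re full scale (assumes a 16-bit WAV)."""
    rate, samples = wavfile.read(path)
    x = samples.astype(np.float64) / np.iinfo(samples.dtype).max
    return 20 * np.log10(np.sqrt(np.mean(x ** 2)) + 1e-12)

# Hypothetical probe-microphone recordings of the client sustaining each vowel
# at a comfortable, steady effort, made with identical measurement settings.
level_i = rms_dbfs("vowel_i.wav")   # high vowel [i] as in 'beat'
level_a = rms_dbfs("vowel_a.wav")   # low vowel [a]

print(f"[i]: {level_i:.1f} dBFS   [a]: {level_a:.1f} dBFS")
if level_i - level_a > 6:           # the 6 dB flag is an arbitrary illustrative criterion
    print("[i] is noticeably louder than [a]: suspect an occlusion effect.")
else:
    print("The two vowels are roughly equally loud: no obvious occlusion effect.")
```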

If there is an occlusion effect, whether it’s from a hearing aid fitting or hearing protection or even in-ear monitors, the two clinical strategies are to either remake the earmold with a longer bore or to incorporate a vent to allow the low frequency sound energy to be bled off.

In the case of hearing aids, unless the fitting is for someone with a very significant hearing loss in the low and mid frequency region, we can successfully incorporate a vent into the earmold coupling.  With hearing protection, however, venting compromises some of the lower frequency attenuation.
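To get a feel for why a vent bleeds off low frequency energy (and why it costs some attenuation there), the vent can be approximated as an acoustic mass working against the compliance of the residual ear canal volume, i.e., a Helmholtz resonator. The sketch below is only a back-of-the-envelope calculation with assumed dimensions (vent length, residual volume, end correction); real earmolds and real ears are messier, so treat the number it prints as an order-of-magnitude illustration.

```python
import math

# All dimensions below are illustrative assumptions, not measurements of any particular earmold.
SPEED_OF_SOUND = 343.0          # m/s, at roughly room temperature
vent_diameter  = 1.4e-3         # m  (a 1.4 mm vent)
vent_length    = 20e-3          # m  (assumed bore length of a long earmold)
residual_vol   = 0.5e-6         # m^3 (assumed residual ear canal volume, about 0.5 cc)

vent_radius = vent_diameter / 2
vent_area   = math.pi * vent_radius ** 2
# A rough end correction makes the air plug in the vent acoustically a bit longer.
effective_length = vent_length + 1.7 * vent_radius

# Helmholtz resonance of the vent (acoustic mass) against the residual volume (compliance).
f0 = (SPEED_OF_SOUND / (2 * math.pi)) * math.sqrt(vent_area / (residual_vol * effective_length))
print(f"Approximate vent-related resonance: {f0:.0f} Hz")
# Well below this frequency the vent is the path of least resistance, so occluded low
# frequency energy leaks out -- which is also why some low frequency attenuation is lost.
```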

So, remaking the earmold with a very long bore is quite reasonable if there is significant occlusion reported with a musicians’ earplug.

But sometimes this doesn’t work well and venting is required.

The short answer here is that I ended up drilling a 1.4 mm wide vent and the trombonist was quite happy, despite the trade-off that needed to be made between wearability and low frequency sound attenuation.

Figure: This is the measured occlusion effect for this trombonist while wearing the ER-15 long bore earplugs. The technique used to obtain it will be discussed in part 2… and it took only about 15 seconds to obtain!

In part 2 of this blog series I will share some real ear measurement results and a quick clinical technique for measuring the occlusion effect (in this man’s case it was over 20 dB at 250 Hz, even with these long bore, remade musicians’ earplugs) and, just as importantly, for verifying that the occlusion effect has been successfully resolved.
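For the curious, the measurement itself reduces to a subtraction: the ear canal SPL while the client vocalizes with the device in place, minus the ear canal SPL for the same vocalization with the ear open. Here is a minimal sketch of that arithmetic with made-up third-octave levels; the quick clinical technique for actually obtaining the levels is what part 2 is about.

```python
# Hypothetical probe-microphone levels (dB SPL) measured in the ear canal while the
# client sustains the vowel [i] at a steady effort -- once open, once occluded.
frequencies_hz = [125, 250, 500, 1000]
open_ear_spl   = [78.0, 80.0, 79.0, 72.0]   # placeholder values
occluded_spl   = [96.0, 101.0, 88.0, 73.0]  # placeholder values

for f, open_db, occ_db in zip(frequencies_hz, open_ear_spl, occluded_spl):
    occlusion_effect = occ_db - open_db
    print(f"{f:>5} Hz: occlusion effect = {occlusion_effect:+.1f} dB")

# A difference of 20 dB or more at 250 Hz is a large occlusion effect; after a
# successful vent, the low frequency differences should shrink toward zero.
```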

We tend to be biased, both in our training and in the technologies that we use. We tend to look at things in terms of spectra or frequencies.  Phrases such as “bandwidth” and “long term average speech spectrum” show this bias. The long term average speech spectrum, which is averaged over time, does indeed look like a broad bandwidth spectrum made up of lower frequency vowels and higher frequency consonants, but this picture is misleading.

At any point in time, speech is either low frequency OR high frequency, but not both.  A speech utterance over time may be LOW FREQUENCY VOWEL then HIGH FREQUENCY CONSONANT then SOMETHING ELSE, but speech will never have both a low frequency emphasis and a high frequency emphasis at any one point in time.

Speech is sequential.  One speech segment follows another in time, but never at the same time.
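One way to see this averaging bias is with a toy calculation: build a signal from a vowel-like harmonic segment followed by a sibilant-like noise segment, compute a long term average spectrum from short-time frames, and then look at the individual frames. Each frame has its energy concentrated in either the low or the high frequency region, even though the average looks broadband. The sketch below is an illustration of the principle, not an analysis of real speech, and every parameter in it is an arbitrary assumption.

```python
import numpy as np

fs = 16000
t = np.arange(0, 0.5, 1 / fs)

# Toy "vowel": harmonics of a 125 Hz fundamental, energy well below 1 kHz.
vowel = sum(np.sin(2 * np.pi * 125 * k * t) / k for k in range(1, 6))
# Toy "sibilant": noise restricted to above ~3 kHz via a crude FFT mask.
noise = np.random.randn(len(t))
spec = np.fft.rfft(noise)
freqs = np.fft.rfftfreq(len(t), 1 / fs)
spec[freqs < 3000] = 0
sibilant = np.fft.irfft(spec, n=len(t))

signal = np.concatenate([vowel, sibilant])    # VOWEL then CONSONANT, one after the other

frame_len = 512
frames = signal[: len(signal) // frame_len * frame_len].reshape(-1, frame_len)
frame_spectra = np.abs(np.fft.rfft(frames, axis=1)) ** 2
ltass = frame_spectra.mean(axis=0)            # the long term average over all frames
frame_freqs = np.fft.rfftfreq(frame_len, 1 / fs)
low = frame_freqs < 1000

for idx in range(0, len(frames), len(frames) // 4):
    s = frame_spectra[idx]
    balance = "low frequency" if s[low].sum() > s[~low].sum() else "high frequency"
    print(f"frame {idx}: energy concentrated in the {balance} region")

ltass_low_fraction = ltass[low].sum() / ltass.sum()
print(f"LTASS: {100 * ltass_low_fraction:.0f}% of the averaged energy is below 1 kHz, "
      "so the average looks broadband even though no single frame is.")
```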

In contrast, music is broadband and very rarely narrow band. 

With the exception of percussion sounds, music is made up of a low frequency fundamental note (or tonic) and then a series of progressively higher frequency harmonics whose amplitudes and exact frequency location define the timbre.  It is impossible for music to be a narrow band signal.
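As a concrete illustration, even a single played note spans a wide frequency range once its harmonics are included. The little sketch below synthesizes a hypothetical 110 Hz note with an arbitrary, made-up set of harmonic amplitudes; those amplitudes are what define the timbre, and the resulting signal is unavoidably broadband.

```python
import numpy as np

fs = 44100
t = np.arange(0, 1.0, 1 / fs)

fundamental = 110.0                                      # Hz, an arbitrary low note (A2)
harmonic_amps = [1.0, 0.6, 0.45, 0.3, 0.2, 0.12, 0.08]   # made-up amplitudes = one "timbre"

note = np.zeros_like(t)
for k, amp in enumerate(harmonic_amps, start=1):
    note += amp * np.sin(2 * np.pi * fundamental * k * t)

# Even this modest 7-harmonic note spans 110 Hz to 770 Hz; real brass and string
# timbres carry measurable harmonic energy well into the several-kHz region.
highest = fundamental * len(harmonic_amps)
print(f"Fundamental: {fundamental:.0f} Hz, highest synthesized harmonic: {highest:.0f} Hz")
```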

It is actually a paradox that (1) hearing aids have one receiver that needs to have similar efficiency in both the high frequency and the low frequency regions, and (2) musicians “prefer” in-ear monitors and earphones that have more than one receiver.  If anything, it should be the opposite.  I suspect that the musicians’ preference for more receivers (and drivers) is a marketing element where “more” may be perceived as “better”.

At any one point in time, musicians should be wearing a single receiver, single microphone, single bandwidth in-ear monitor.  This will ensure that what is generated in the lower frequency region (the fundamental or tonic) has a well-defined amplitude (and frequency) relationship with the higher frequency harmonics.  This can only be achieved with a truly single channel system: a “less is more” solution.

This same set of constraints does not hold for speech.  If speech contains a vowel (or a nasal; collectively these are called sonorants), it is true that there are well-defined harmonics that generate a series of resonances or formants, but for good intelligibility one only needs energy up to about 3500 Hz.  Indeed, telephones only carry information up to about 3500 Hz.  If speech contains a sibilant or other obstruent consonant (‘s’, ‘sh’, ‘th’, ‘f’, …), there are no harmonics and minimal sound energy below 2500 Hz.  These consonants can extend beyond 12,000 Hz, but have essentially no energy below 2500 Hz.

Speech is either low frequency sonorant (with well-defined harmonics) or high frequency obstruent (no harmonics); at any one point in time it is one or the other, not both.  Music must have both low and high frequency harmonics, and the exact frequencies and amplitudes of those harmonics provide much of the definition of music.

This also has ramifications for the use of frequency transposition or shifting.

It makes perfect sense to use a form of frequency transposition or shifting for speech.  This alters the high frequency bands of speech where no harmonics exist.  Moving a band of speech (e.g. ‘s’) to a slightly lower frequency region will not alter any of the harmonic relationships.
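A minimal way to picture this kind of frequency lowering is to take the portion of the spectrum above some cutoff and slide it down by a fixed amount, leaving everything below the cutoff (where the harmonics live) untouched. The sketch below does exactly that with a blunt FFT-bin shift on a toy signal; the cutoff, the shift amount, and the signal itself are all arbitrary assumptions, and commercial frequency lowering schemes are far more sophisticated, so treat this purely as an illustration of the idea.

```python
import numpy as np

def lower_high_band(signal, fs, cutoff_hz=4000, shift_hz=1000):
    """Crudely move spectral content above cutoff_hz down by shift_hz (illustrative only)."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1 / fs)
    bin_hz = freqs[1] - freqs[0]
    shift_bins = int(round(shift_hz / bin_hz))

    out = spec.copy()
    high = np.where(freqs >= cutoff_hz)[0]
    out[high] = 0                                  # clear the original high band
    out[high - shift_bins] += spec[high]           # re-insert it shift_hz lower
    return np.fft.irfft(out, n=len(signal))

# Toy example: a vowel-like harmonic stack below 1 kHz plus an 's'-like noise band near 6 kHz.
fs = 16000
t = np.arange(0, 0.5, 1 / fs)
vowel = sum(np.sin(2 * np.pi * 150 * k * t) / k for k in range(1, 6))
noise = np.random.randn(len(t))
spec = np.fft.rfft(noise)
f = np.fft.rfftfreq(len(t), 1 / fs)
spec[(f < 5000) | (f > 7000)] = 0
sibilant = np.fft.irfft(spec, n=len(t))

processed = lower_high_band(vowel + sibilant, fs)
# The harmonics of the "vowel" below the cutoff are untouched; only the 's'-like band moves down.
```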

But for music, which is defined only by harmonic relationships in both the lower and the higher frequency regions, frequency transposition or shifting will alter these higher frequency harmonics.

Clinically, for a music program, if there are extensive cochlear dead regions or severe sensory damage, reducing the gain in that frequency region is the correct approach, rather than shifting the frequency of a small group of harmonics.