Masking and Music: Upwards spread of masking, part 1

Marshall Chasin
March 21, 2017

This four-part blog series gives an overview of the phenomenon of masking. It is written for the musician rather than the audiologist.  The first three parts (upwards spread of masking, downwards spread of masking, and temporal masking) relate to the structure and function of the cochlea and its associated neural structures, whereas the last part (phase) concerns the acoustics of a room.  Strictly speaking, phase issues are not “masking” in the typical sense, but they can be viewed as masking in a more general sense, since they can be responsible for the loss of important information.

Despite the masking noise being low frequency (left side of the graph), its effects are felt well into the upper frequency regions due to “upwards spread of masking”. Figure courtesy of

Other than pure tones, which have energy at only one frequency, all speech and all music are characterized by a relatively wide band of frequencies with time-varying acoustic properties.  This is just a nice way of saying that speech and music have energy components that change from moment to moment in both their sound level and their frequencies.

When this sound energy enters the ear, several things happen in a strict sequence.  High frequency sounds above 1500 Hz are enhanced by the pinna, or outer ear.  This feeds into a resonance of about 15-20 dB at roughly 2700 Hz that results from the 25-30 mm long outer ear canal.  In the next stage, the eardrum and the associated middle ear bones and structures attenuate (or lessen) the lower frequency sounds below 1000 Hz.  And to complicate things further, the middle ear stapedius reflex attenuates sounds over about 85-90 dB SPL.  Moving from the middle ear to the inner ear, the cochlea serves as a Fourier spectrum analyzer and breaks the sound vibrations into well-defined bands, much like the notes arrayed across a piano keyboard.  And finally, these bands of energy are transformed into electrical impulses that are routed up to the auditory cortex in the brain.
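The ear canal resonance mentioned above can be approximated with the textbook quarter-wavelength formula for a tube that is closed at one end (the eardrum) and open at the other. The sketch below is only a rough illustration: it assumes a uniform, rigid tube and a speed of sound of 343 m/s, neither of which comes from this post, and the function name is made up for the example.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: air at roughly room temperature


def quarter_wave_resonance_hz(canal_length_mm: float) -> float:
    """First resonance of an idealized tube closed at one end: f = c / (4 * L)."""
    if canal_length_mm <= 0:
        raise ValueError("canal length must be positive")
    length_m = canal_length_mm / 1000.0
    return SPEED_OF_SOUND_M_PER_S / (4.0 * length_m)


if __name__ == "__main__":
    # The 25-30 mm range is the canal length quoted in the text.
    for length_mm in (25.0, 30.0):
        resonance = quarter_wave_resonance_hz(length_mm)
        print(f"{length_mm:.0f} mm canal: about {resonance:.0f} Hz")
```

For a 25-30 mm tube this idealized formula lands somewhat above the 2700 Hz figure quoted above, which is not surprising, since a real ear canal is neither uniform nor rigidly terminated.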

And did I say “finally”?  To make matters even more complicated, there is a feedback loop back to the cochlea which serves to enhance the amplitude of the lower level sounds.

All of this makes up a normally functioning hearing mechanism.

For someone with an inner ear hearing loss, such as one due to aging or noise/music exposure, the sensitivity and other properties of the narrow band filters (the piano notes) in the inner ear become altered, and the feedback loop that serves to enhance softer sounds is lost.   These are two reasons why people with a minor amount of cochlear damage may say “Oh, I can hear you OK, it’s just that people mumble”, and indeed for these people, speaking a bit louder or turning up the music volume slightly will substantially improve things.

An important feature of most forms of hearing loss is that the outer ear and middle ear still function normally: higher frequencies are enhanced (outer ear), and lower frequencies are attenuated (middle ear).

In the human (and mammalian) cochlea, sound vibrations are laid out like a piano keyboard, with the high pitched treble notes represented near the “front end” of the cochlear spiral and the lower pitched bass notes nearer to the inside of the spiral.  This is like a piano keyboard that is backwards.

The lower pitched bass notes near the left side of the piano keyboard are in a well-protected and enviable part of the cochlea: the nerve endings associated with these lower pitched sounds are situated in the innermost part and as such are the least prone to being damaged by a lifetime of loud vibrations.  In contrast, the higher pitched sounds are represented near the outer periphery and are the first to be damaged by a lifetime of loud sounds.


It is as if, during an explosion, a person who is unfortunate enough to be sitting near a window of a house is much more susceptible to damage than those lucky souls who happen to be in the basement in a bomb shelter.

One feature of the cochlea is that higher pitched sounds need to travel only a short distance along the backwards piano keyboard before the relevant cochlear nerve endings are activated.  In contrast, for the lower frequency bass notes, sound vibrations need to travel almost the entire length of the cochlea into the inner portions of the spiral, leaving vibrational disturbances in their wake.  It’s as if the lower bass notes disrupt the entire frequency range of the cochlea, whereas the higher pitched notes cause disturbances that are more localized to their own frequency regions.
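The backwards piano keyboard can be put into rough numbers with Greenwood's frequency-position function, a widely used empirical fit for the cochlea. The constants and the 35 mm cochlear length below are the values commonly quoted for humans; they are assumptions of this sketch rather than figures from this post, and the function name is made up for illustration.

```python
import math

# Commonly quoted Greenwood constants for the human cochlea (assumptions here).
A_HZ = 165.4
ALPHA = 2.1                # exponent when position is a fraction of total length
K = 0.88
COCHLEA_LENGTH_MM = 35.0   # assumed typical length, not taken from the post


def distance_from_base_mm(frequency_hz: float) -> float:
    """Approximate place of peak response, measured from the 'front end' (base)."""
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    # Greenwood: f = A * (10 ** (ALPHA * x) - K), with x measured from the
    # innermost end (apex) as a fraction between 0 and 1. Solve for x.
    fraction_from_apex = math.log10(frequency_hz / A_HZ + K) / ALPHA
    return (1.0 - fraction_from_apex) * COCHLEA_LENGTH_MM


if __name__ == "__main__":
    # Three A notes on the piano, from bass to treble.
    for name, freq in (("A1", 55.0), ("A4", 440.0), ("A7", 3520.0)):
        print(f"{name} ({freq:.0f} Hz): about {distance_from_base_mm(freq):.1f} mm from the base")
```

Under these assumptions the treble notes peak close to the front end, while the bass notes have to travel most of the length of the spiral, passing through the higher frequency regions on the way.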

Another reason low frequency sounds affect the higher frequencies is that every signal has a masking pattern that is “asymmetrical”.  A signal at a certain frequency will of course mostly mask that particular frequency, but it will also mask lower frequency sounds a little and adjacent higher frequency sounds quite a bit.  That is, a 1000 Hz signal will be picked up a little in the channel or band for 990 Hz but a lot in the channel or band for 1010 Hz.
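The asymmetry can be pictured with a simple two-slope “spreading function”: masking falls off steeply toward lower frequencies and much more gently toward higher ones. The sketch below is a toy model only. The Bark conversion is Zwicker and Terhardt's standard approximation, but the two slope values and the function names are illustrative assumptions chosen for this example, not measurements and not figures from this post.

```python
import math


def hz_to_bark(frequency_hz: float) -> float:
    """Zwicker and Terhardt's approximation of the Bark (critical band) scale."""
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))


def masking_level_db(masker_hz: float, masker_level_db: float, probe_hz: float,
                     lower_slope_db_per_bark: float = 27.0,
                     upper_slope_db_per_bark: float = 10.0) -> float:
    """Toy estimate of how much of the masker 'leaks' into the probe's channel.

    Both slopes are illustrative assumptions: steep below the masker,
    shallow above it, which is what produces the upward spread.
    """
    distance_bark = hz_to_bark(probe_hz) - hz_to_bark(masker_hz)
    if distance_bark >= 0:
        # Probe is above the masker: gentle roll-off.
        return masker_level_db - upper_slope_db_per_bark * distance_bark
    # Probe is below the masker: steep roll-off (distance_bark is negative).
    return masker_level_db + lower_slope_db_per_bark * distance_bark


if __name__ == "__main__":
    for probe in (500.0, 990.0, 1010.0, 2000.0):
        level = masking_level_db(1000.0, 70.0, probe)
        print(f"1000 Hz masker at 70 dB, probe at {probe:.0f} Hz: {level:.1f} dB")
```

With these assumed slopes, a 1000 Hz masker leaves more energy in the 1010 Hz channel than in the 990 Hz channel, and the gap grows quickly for probes further away, such as 500 Hz versus 2000 Hz.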

For these reasons, lower frequency bass notes tend to activate some elements of the higher frequency channels in the cochlea, whereas the reverse is much less true.  When lower pitched bass notes activate some of the higher pitched channels, we call this masking.

Specifically, when a lower pitched sound activates the higher pitched channels or filters, this is called upwards spread of masking.

This is not necessarily a bad thing.

For every sound that we hear, whether speech or music, not only is there activation in the cochlea at the pitches associated with that sound, but energy from lower pitched notes is also included in that band of sound.  It’s as if a C on the piano were also transmitted to the brain on the same channel as the D or E above it.

When we do hear an E on the piano, it is made up not only of sounds associated with the note E but also of some of the energy that derives from the notes just below that E.

In the normally functioning ear, this is part of the beauty of music and speech: lower frequency sound energy erroneously but beneficially contributes to the perception of the note or sound that we are trying to perceive.

In an ear with a hearing loss due to aging or noise/music exposure, these lower pitched notes tend to spread up in frequency even more and in some cases cover up some of the important energy that would normally be perceived at the higher pitched note or sound.  That is why people with cochlear (or sensorineural) hearing loss may have greater than average difficulty hearing speech and appreciating music in noisier locations.

Too little upwards spread of masking is bad and too much upwards spread of masking is bad.  Just the right amount results in what we normally consider to be the correct perception and appreciation of the sound or music.

In part 2 of this blog series, the opposite (but not really… that’s a hint…) of upwards spread of masking will be discussed.
