Frequency compression can’t work for music

Marshall Chasin
March 29, 2016

Frequency compression can be useful for speech but never for music.  Simply stated, when music is frequency compressed the harmonics can end up too close together, and the result sounds discordant, or at least fuzzy.

Frequency compression of any form can be quite useful for steering speech away from dead regions in the cochlea, but the same does not follow for music.  The difference is that in the damaged regions, typically the higher frequencies, speech has a “continuous” spectrum, whereas music always has a “discrete” spectrum regardless of frequency.

While this may sound like an obscure lesson in acoustics, it is actually central to why frequency compression in hearing aids simply should not be used for music.

Here is the story of why frequency compression can’t work with music:

A discrete spectrum, also known as a line spectrum, has energy only at multiples of the fundamental frequency (f0), also known in music as the tonic. If the fundamental frequency of a man’s voice (such as mine) is 125 Hz, there is energy at 125 Hz, 250 Hz, 375 Hz, 500 Hz, 625 Hz, and so on.  There is no energy at all at 130 Hz or 140 Hz, just at multiples of the 125 Hz fundamental.  In speech acoustics we frequently see very pretty-looking, smoothly drawn spectra for the vowels [a] or [i] (and they are pretty), but they are erroneous.  For the vowels and nasals of speech there is energy only at well-defined integer multiples of f0 and nothing in between those harmonics.  Speech is an “all-or-nothing” spectrum: the energy is either there or it isn’t.
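To make the arithmetic of a line spectrum concrete, here is a minimal sketch in Python, assuming a fundamental of 125 Hz; it simply lists the only frequencies at which such a voiced sound carries energy.

```python
# Minimal sketch: the line spectrum of a voice with f0 = 125 Hz (assumed).
# Energy exists only at integer multiples of the fundamental;
# a frequency such as 130 Hz is simply not part of the spectrum.

f0 = 125  # fundamental frequency in Hz

harmonics = [n * f0 for n in range(1, 9)]
print(harmonics)          # [125, 250, 375, 500, 625, 750, 875, 1000]
print(130 in harmonics)   # False: no energy between the harmonics
```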

With the exception of percussion, all music also has a discrete or line spectrum. For middle C (262 Hz) there is energy at 262 Hz, 2 x 262 Hz, 3 x 262 Hz, and so on.  These harmonics are well defined; the relative amplitudes of the harmonics define the timbre and help us identify which musical instrument we are hearing.  This is as true for low-frequency sounds as it is for very high-frequency sounds (or harmonics).
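As an illustration of how the harmonic amplitudes carry the timbre, here is a hedged sketch (Python with numpy assumed, illustrative amplitude values only) that builds two tones on the same middle-C line spectrum; the frequencies are identical, only the weighting of the harmonics differs.

```python
import numpy as np

# Sketch: two tones share the same line spectrum (harmonics of middle C)
# but differ in harmonic amplitudes, i.e., in timbre.
fs = 44100                      # sample rate in Hz
t = np.arange(0, 1.0, 1 / fs)   # one second of samples
f0 = 262                        # middle C, approximately 262 Hz

def harmonic_tone(amplitudes, f0=f0):
    """Sum of sinusoids at n * f0, weighted by the given amplitudes."""
    return sum(a * np.sin(2 * np.pi * n * f0 * t)
               for n, a in enumerate(amplitudes, start=1))

bright = harmonic_tone([1.0, 0.8, 0.6, 0.5, 0.4, 0.3])     # strong upper harmonics
mellow = harmonic_tone([1.0, 0.3, 0.1, 0.05, 0.02, 0.01])  # weak upper harmonics
# Both tones have energy only at 262, 524, 786, ... Hz; only the
# harmonic weighting, and therefore the timbre, differs.
```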

The reason why frequency compression can be so useful for speech is that speech is made up not only of discrete line spectra for the voiced sonorants (i.e., vowels, nasals, and the liquids [l] and [r]) but also of higher-frequency continuous spectra. The higher-frequency continuous spectra belong to the obstruents and voiceless sounds such as [s] and [š], as in ‘see’ and ‘she’ respectively.  These high-frequency continuous spectra, which do not rely on the well-defined properties of harmonic spacing, are usually the ones that fall near cochlear dead regions.  Transposing sounds with continuous spectra away from such a region has minimal effect on speech intelligibility, but this should not be extrapolated to music.
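The contrast can be shown with a toy frequency-compression map (hypothetical knee-point and ratio, not any manufacturer’s algorithm): frequencies above a knee-point are squeezed toward it. A band of fricative noise is simply relocated, but when the same map is applied to a harmonic series, the shifted partials are no longer integer multiples of the fundamental.

```python
# Toy frequency-compression map (hypothetical parameters): above a
# knee-point, frequencies are compressed toward the knee by a fixed ratio.
knee = 1500.0   # Hz, assumed knee-point
ratio = 2.0     # assumed compression ratio

def compress(f):
    return f if f <= knee else knee + (f - knee) / ratio

f0 = 262  # middle C
original = [n * f0 for n in range(1, 13)]
shifted = [round(compress(f)) for f in original]

for f, g in zip(original, shifted):
    note = "" if g % f0 == 0 else "(no longer a multiple of 262 Hz)"
    print(f, "->", g, note)
```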

Any frequency compression applied to a discrete or line spectrum (such as music) would have disastrous effects.  Figures such as the one shown to the right are misleading for music: continuous spectra do not exist for music, just line spectra with well-defined energy at the harmonics and absolutely nothing in between.

Assuming that because frequency compression may work for speech it should also be useful for music is erroneous, and the reason has nothing to do with how the brain encodes speech and music.  Changing the harmonic relationships of music will never improve the quality of the sound.

In cases of cochlear dead regions while listening to music, less may be more: simply reducing the gain in these damaged frequency regions, rather than shifting or transposing away from them, would have greater clinical success.
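One way to read that suggestion, as a sketch rather than a prescription (the dead-region edge and the amount of gain reduction are illustrative assumptions): attenuate the harmonics that fall in the damaged region instead of moving them, so their frequencies, and therefore their integer relationships, are left untouched.

```python
# Sketch of the "less may be more" idea (illustrative numbers only):
# attenuate harmonics that fall in a presumed dead region instead of
# remapping them. Frequencies, and hence harmonic ratios, are preserved.
dead_region_start = 3000.0   # Hz, assumed lower edge of the damaged region
gain_reduction_db = 20.0     # assumed amount of gain reduction

f0 = 262
harmonics = [(n * f0, 1.0) for n in range(1, 16)]   # (frequency, amplitude)

processed = [
    (f, a * 10 ** (-gain_reduction_db / 20) if f >= dead_region_start else a)
    for f, a in harmonics
]
# Every component is still at an exact multiple of 262 Hz; only the
# level of the high-frequency harmonics has been reduced.
```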
