In last week’s blog entry, the psycho-acoustics of hearing two tones as “beating”, as tonal, or simply as a fuzzy mess was discussed. In short, if two tones are more than 30 Hz apart, they will sound tonal; if they are between 20 and 30 Hz apart, they will sound fuzzy when played together; and if they are less than 20 Hz apart, one perceives a beating or pulsing sound, where the rate of the beating equals the frequency difference between the two close tones.
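To make that rule of thumb a little more concrete, here is a small Python sketch. The 20 Hz and 30 Hz boundaries are just the approximate values from last week’s entry, and the example tones are ones I picked for illustration:

```python
def perception_of_two_tones(f1_hz, f2_hz):
    """Rough classification of how two simultaneous pure tones are heard,
    using the approximate 20 Hz / 30 Hz boundaries described above."""
    diff = abs(f1_hz - f2_hz)
    if diff < 20:
        # Beating: a pulsing sound whose rate equals the frequency difference
        return f"beating at about {diff:.0f} Hz"
    elif diff <= 30:
        return "fuzzy or rough"
    else:
        return "two separate, tonal sounds"

print(perception_of_two_tones(440, 452))   # 12 Hz apart -> beating at about 12 Hz
print(perception_of_two_tones(440, 465))   # 25 Hz apart -> fuzzy or rough
print(perception_of_two_tones(440, 523))   # 83 Hz apart -> two separate, tonal sounds
```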
So what does this have to do with music and hearing aids?
These limitations of our auditory system explain why some notes sound great when played together and other combinations sound lousy. They also help us understand why musical keys are the way they are: if two (or more) notes, or the harmonics of those notes, are within 30 Hz of each other, they will not sound as tonal or as musical as they should. Musical keys that are quite different from one another contain notes (and associated harmonics) that are simply too close to each other in frequency. This is a feature of all music, and a direct consequence of having a human (or, more generally, a mammalian) auditory system. Whether this would still sound as bad to a Martian auditory system is an open question, and would make for a great AuD Capstone study.
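As a crude way of seeing this, the sketch below lists the harmonic pairs of two notes that fall inside that 30 Hz zone. Pairs that are within a couple of Hz of each other are skipped, since they merely beat very slowly (more of a tuning issue than a roughness issue); the particular notes, the 10-harmonic limit, and the 3 Hz floor are my own assumptions for the example:

```python
def rough_harmonic_pairs(f1_hz, f2_hz, n_harmonics=10,
                         min_hz=3.0, max_hz=30.0):
    """List harmonic pairs of two notes whose spacing falls between
    min_hz and max_hz. The 30 Hz ceiling is the rule of thumb from
    last week's entry; pairs closer than a few Hz are skipped because
    they simply beat very slowly rather than sounding rough."""
    h1 = [f1_hz * k for k in range(1, n_harmonics + 1)]
    h2 = [f2_hz * k for k in range(1, n_harmonics + 1)]
    return [(round(a, 1), round(b, 1))
            for a in h1 for b in h2
            if min_hz <= abs(a - b) <= max_hz]

# A perfect fifth near middle C (C4 ~261.6 Hz, G4 ~392.0 Hz): no offending pairs.
print(rough_harmonic_pairs(261.6, 392.0))
# A minor second an octave lower (C3 ~130.8 Hz, C#3 ~138.6 Hz): several pairs
# land in the beating/rough zone, which is why the combination sounds muddy.
print(rough_harmonic_pairs(130.8, 138.6))
```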
Speech has its lowest frequency at the fundamental frequency, which is the rate of vibration of the vocal cords. Regardless of the vocal tract configuration, there cannot be any sound energy lower than this fundamental frequency. Except for gigantic truckers who are over 6’6” tall, male fundamental frequencies are rarely below 100 Hz. This means that the harmonics of speech cannot be spaced less than 100 Hz apart. Music that emanates from the left side of the piano keyboard, in contrast, can and does have fundamental (or tonic) notes whose harmonics are spaced far less than 100 Hz apart. And music, unlike speech, is made up of two or more notes that occur simultaneously. Speech is made up of only the fundamental frequency and integer multiples of that fundamental, called harmonics. With speech, two or more notes cannot occur simultaneously.
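The arithmetic here is worth a quick sketch: since harmonics are integer multiples of the fundamental, the spacing between neighbouring harmonics is always exactly the fundamental frequency. The 100 Hz voice and the 27.5 Hz piano note (A0, the left-most key) are simply illustrative values:

```python
def harmonics(f0_hz, n=8):
    """Harmonics are integer multiples of the fundamental, so the spacing
    between neighbouring harmonics is always exactly f0_hz."""
    return [f0_hz * k for k in range(1, n + 1)]

# A low male speaking voice: harmonics can never be closer than ~100 Hz.
print(harmonics(100.0))   # [100.0, 200.0, 300.0, ..., 800.0]

# A0, the lowest note on the piano (27.5 Hz): its harmonics are only 27.5 Hz
# apart, well inside the "fuzzy" and "beating" zones if a neighbouring note
# (or a frequency-lowered band) lands between them.
print(harmonics(27.5))    # [27.5, 55.0, 82.5, ..., 220.0]
```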
So, that’s a “brief” summary of last week’s blog entry.
What does all of this say about the frequency-altering algorithms found in modern hearing aids, whether they use frequency shifting or frequency transposition?
Figures and graphs from hearing aid manufacturers tend to show narrow (and some wider) bands of speech (and music) being shifted downwards in frequency. Although this looks very pretty, it is a fallacy. For voiced sounds, continuous spectra do not really exist; only “line spectra” exist. For example, a person speaking with a fundamental frequency of 125 Hz (such as myself) has speech energy at 125 Hz, 250 Hz, 375 Hz, and so on, but none at 130 Hz or 140 Hz. For speech and for music, energy is found only at the harmonics, which are exact integer multiples of the fundamental, and nowhere in between.
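Here is a minimal sketch of what a line spectrum means, using a toy “voiced” signal built from a 125 Hz fundamental and its first few harmonics (the sample rate, one-second duration, and number of harmonics are arbitrary choices for the demonstration):

```python
import numpy as np

fs = 8000                     # sample rate in Hz
t = np.arange(fs) / fs        # one second of time
f0 = 125.0                    # fundamental frequency, as in the example above

# A crude "voiced" signal: the fundamental plus its first few harmonics.
signal = sum(np.sin(2 * np.pi * f0 * k * t) for k in range(1, 6))

spectrum = np.abs(np.fft.rfft(signal)) / len(signal)
freqs = np.fft.rfftfreq(len(signal), d=1 / fs)

# With a one-second window, each FFT bin is 1 Hz wide, so we can read off
# the energy at exact frequencies.
for f in (125, 130, 140, 250, 375):
    print(f"{f} Hz: {spectrum[freqs == f][0]:.3f}")
# 125, 250 and 375 Hz show energy; 130 and 140 Hz show essentially none.
```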
For speech, this is not a large issue. Frequency transposition or shifting schemes can move sound energy to a lower frequency region, and a properly adjusted scheme allows the harmonic structure to remain “relatively” intact. It would take a bad stroke of luck for the algorithm to shift some energy to just beside an already existing harmonic. And even if it did, we are talking about shifting the higher-frequency obstruents, such as the ‘s’ and the ‘sh’ sounds, down to a lower frequency; these are voiceless sounds and don’t have harmonics in any event. Even the higher-frequency voiced obstruents, such as ‘z’ and ‘j’ (as in ‘judge’), have minimal overall energy.
So for speech, frequency transposition of any sort, although it may require some period of adaptation, should not be a big issue in terms of creating rogue sounds that push the limits of our auditory system: voiced sounds have line spectra, and the voiceless, lower-energy sounds have continuous spectra.
For music, however, frequency transposition of any sort runs the risk of creating harmonics that are very closely spaced in frequency. Here all spectra are line spectra, and the result can easily be (and typically is) a fuzzy, atonal quality that destroys the beauty of the music.
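As a toy illustration, and emphatically not a model of any manufacturer’s actual algorithm, suppose a linear frequency-lowering scheme moves everything above 2 kHz down by a fixed 500 Hz (both numbers are my own arbitrary assumptions). Whether the lowered energy lands safely away from the existing harmonics, or right next to one of them, then depends entirely on which note is being played:

```python
def min_clash_hz(f0_hz, shift_hz, cutoff_hz=2000.0, n_harmonics=20):
    """Shift every harmonic above cutoff_hz down by shift_hz (a toy stand-in
    for linear frequency lowering) and return the smallest gap between a
    shifted harmonic and an original harmonic that was left alone."""
    original = [f0_hz * k for k in range(1, n_harmonics + 1)]
    kept = [f for f in original if f <= cutoff_hz]
    shifted = [f - shift_hz for f in original if f > cutoff_hz]
    return min(abs(s - k) for s in shifted for k in kept)

# The same 500 Hz lowering applied to three different notes:
for note, f0 in [("A3 (220 Hz)", 220.0),
                 ("C4 (261.6 Hz)", 261.6),
                 ("E4 (329.6 Hz)", 329.6)]:
    print(note, "-> smallest gap:", round(min_clash_hz(f0, 500.0), 1), "Hz")
```

With these made-up parameters, the same setting that is relatively harmless for A3 and E4 drops the lowered energy only about 23 Hz away from an existing harmonic of C4, squarely inside the fuzzy and beating zones described above. And since the notes in real music change from moment to moment, some of them will always land badly.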
Personally, I cannot think of any frequency transposition algorithm that would not distort music. Clinically, my intuition is that when confronted with a cochlear dead region, less is more: simply reduce the gain in that frequency region, but do not transpose it in a music program.