Hearing But Not Understanding

Wayne Staab
August 25, 2014

Speech Perception to Speech Recognition

Hearing but not understanding is the general theme of this post. Acoustic characteristics of speech that lead to “hearing” have been the subject of the last two posts. However, being able to “hear” does not necessarily translate to “understanding.” This post will conclude the discussion of the acoustic characteristics of speech by briefly covering suprathreshold loudness discrimination, and then move on to a short discussion of speech recognition – phenomena that contribute to speech understanding. Recall that we are still reviewing how all of this relates to the potential benefits of amplified mid-range frequencies.

Suprathreshold loudness discrimination (acoustic characteristic)

How does loudness relate to the speech presentation level – how much change must there be for a person to notice a difference?

Figure 1. Mean intensity change as a function of two suprathreshold starting intensity levels – 80 and 100 dB SPL (After Harbart et al., 1972).

Figure 1 shows the average intensity change (in dB) required, by frequency, for a noticeable difference to be heard in signals above threshold{{1}}[[1]]Harbart F, Paris D, Wenner C: Factors affecting the loudness discrimination of suprathreshold signals increasing or decreasing in intensity. J Aud Res 1972;12:149-153[[1]]. While various methods have been used to make this determination, a difference was noted between the 80-dB and 100-dB SPL presentation levels (levels consistent with hearing aid usage). For the 80-dB signal, the largest loudness change required for a listener to notice a difference (JND, or just noticeable difference) was 4.2 dB, at 1000 Hz (blue line). The smallest change required was 3.5 dB, at 4000 Hz.
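
To make the figure’s numbers concrete, here is a minimal Python sketch that checks a level change against the two 80-dB SPL JND values quoted above. Only those two data points are used; the function name and structure are illustrative, not part of the study.

```python
# Illustrative only: the two JND values quoted in the text for the
# 80-dB SPL curve of Figure 1 (Harbart et al., 1972). The rest of the
# curve is omitted.
JND_80_DB_SPL = {
    1000: 4.2,  # largest JND reported (dB)
    4000: 3.5,  # smallest JND reported (dB)
}

def is_noticeable(freq_hz: int, level_change_db: float) -> bool:
    """True if a level change at freq_hz meets or exceeds the quoted JND."""
    return level_change_db >= JND_80_DB_SPL[freq_hz]

print(is_noticeable(1000, 4.0))  # False: below the 4.2-dB JND at 1000 Hz
print(is_noticeable(4000, 4.0))  # True: above the 3.5-dB JND at 4000 Hz
```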

This suggests that the mid frequencies (around 1000 Hz) might accept greater amplification than previously thought, and that the high frequencies (2000 to 4000 Hz) require less. For the 100-dB SPL input (red line), the trend is the same, especially if frequencies of 500 Hz and below are ignored; at this higher input level, an even smaller intensity change is needed for a difference to be noticed. This behavior is exploited in WDFR (Wide Dynamic Frequency Response) circuitry in hearing aids, where amplification decreases as the amplified signal (input plus gain) increases. In other words, a smaller intensity change is needed at high frequencies to notice a loudness difference than at lower frequencies, which seems to argue for less high-frequency gain than is often applied.
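
The level-dependent behavior described here can be sketched with a generic compression rule. The sketch below is not the actual WDFR circuit (whose parameters are not given in this post); the 65-dB knee point, 30-dB maximum gain, and 2:1 ratio are arbitrary assumptions chosen only to show gain shrinking as the input grows.

```python
def compressed_gain(input_db_spl: float,
                    max_gain_db: float = 30.0,     # assumed value
                    knee_db_spl: float = 65.0,     # assumed value
                    ratio: float = 2.0) -> float:  # assumed value
    """Level-dependent gain: full gain below the knee, reduced above it.

    Above the knee, each extra dB of input adds only 1/ratio dB of
    output, so the effective gain falls as the input level rises.
    """
    if input_db_spl <= knee_db_spl:
        return max_gain_db
    excess = input_db_spl - knee_db_spl
    return max_gain_db - excess * (1.0 - 1.0 / ratio)

for level in (60, 80, 100):
    print(f"{level} dB SPL in -> {compressed_gain(level):.1f} dB gain")
```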

Speech Recognition

The shift now is from speech perception to speech recognition. Mid- and low-frequency components of speech can also improve speech intelligibility by offering suprasegmental, or prosodic, cues. Rhythm, intonation, stress, and duration play important roles in speech intelligibility and are phenomena of frequencies essentially below 2000 Hz. Keep in mind that this discussion relates to basic speech recognition involving the mid frequencies, and is separate from issues that affect overall amplification use, such as distance, echo, SNR, and reverberation.
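
As a simple illustration of isolating that prosody-bearing band, the following sketch low-pass filters a signal at 2000 Hz. The 16-kHz sample rate and fourth-order Butterworth design are assumptions made for the example, not recommendations from the post.

```python
import numpy as np
from scipy.signal import butter, sosfilt

FS = 16000  # sample rate in Hz (an assumption for this sketch)

# Fourth-order low-pass at 2 kHz: roughly the band said above to carry
# the suprasegmental cues (rhythm, intonation, stress, duration).
sos = butter(4, 2000, btype="low", fs=FS, output="sos")

# Any speech recording would do here; white noise stands in as input.
rng = np.random.default_rng(0)
signal = rng.standard_normal(FS)   # one second of noise
below_2k = sosfilt(sos, signal)    # the prosody-bearing region
```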

Formant transitions

Transitions are the movement of one sound into another, often identified as how a vowel seems to “glide” into a consonant. This movement between vowels and consonants is very important because such changes are known to be cues for the perception of consonants{{2}}[[2]]Lehiste I, Peterson GE: Transitions, glides, and diphthongs. J Acoust Soc Am 1961; 33(3):268-277[[2]]. For the words Tom, Tim, Tam, and Tum, for example, if just the initial consonant and vowel in each are phonated, the final sound can be anticipated with a great deal of accuracy – even though the final consonant is never produced. This implies the necessity of a good vowel-frequency response range in hearing aids (essentially below 2000 Hz, because few vowels have second-formant energy above this frequency, and the second formant is considered the most important for vowel recognition). In the example, the transition information from the vowel helps significantly.
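
As a rough picture of what such a glide looks like acoustically, the sketch below synthesizes a single falling second formant. The 1800-to-1200 Hz trajectory, 150-ms duration, and every other parameter are hypothetical values for illustration, not measurements from the cited work.

```python
import numpy as np

FS = 16000            # sample rate (Hz), assumed
DUR = 0.15            # transition duration (s), assumed
t = np.arange(int(FS * DUR)) / FS

# Hypothetical second-formant (F2) glide from a vowel toward a
# consonant locus. A real syllable would also carry F1, F3, etc.
f2 = np.linspace(1800.0, 1200.0, t.size)   # instantaneous frequency

# Integrate the instantaneous frequency to get phase, then synthesize.
phase = 2.0 * np.pi * np.cumsum(f2) / FS
transition = np.sin(phase)
```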

Frequency analysis

Frequency analysis consists of frequency discrimination and frequency selectivity, both of which seem to be reduced in cochlear hearing losses. As a result, listeners with cochlear losses require larger frequency differences to discriminate between two tones, and they have a wider critical band.

Frequency discrimination, one component of frequency analysis, refers to the ability to distinguish one frequency from another. For human hearing, the frequency difference limen becomes smaller as the sound becomes louder, and is markedly smaller below 1000 Hz. Cochlear losses show larger DLs (difference limens) than normal and conductive losses (four to eight times larger). This relates to speech discrimination in that the smaller the frequency difference limen, the better the discrimination.
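
A quick back-of-the-envelope example of what a four-to-eight-times-larger DL means; the 2-Hz normal-hearing baseline is an assumed round number, not a value from the cited study.

```python
# Hypothetical baseline DL at 1000 Hz for a normal ear (assumed).
normal_dl_hz = 2.0

# The four-to-eight-times range quoted above for cochlear losses.
for factor in (4, 8):
    print(f"cochlear-loss DL at {factor}x: {normal_dl_hz * factor:.0f} Hz")
```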

Gross pitch characteristic – voice pitch. An important range for pitch identification is from 300 to 2000 Hz, and if voice pitch is not amplified properly, it will sound incorrect{{3}}[[3]]Miller JD: Auditory processing of the acoustic patterns of speech. Arch Otolaryngol 1984; 110:154-159[[3]]. Voice pitch is also important in transitions and in deciding whether the input is periodic, aperiodic, or a mixture. It might play a role in the identification of voiced phonetic segments and is crucial to suprasegmental analysis. Therefore, some low- and mid-frequency information is required in the amplification system. Many who fit hearing aids frequently, and perhaps unknowingly, use a non-standardized “voice test” to judge the success of a fitting when they ask patients, “How does my voice sound? How does your voice sound to you? How does your wife’s/husband’s voice sound to you?”

Fine pitch discrimination. This is used to identify the appropriate frequency regions for second-formant transitions, and also to identify certain consonants (discussed in “formant transitions”).

Frequency selectivity is the ability to detect one frequency in the presence of other frequencies. It is critical to understanding speech, which consists of sounds containing many different frequencies: the listener must be able to analyze speech sounds into their component frequencies, especially formants.

According to one study, that ability varies among people having the same hearing levels{{4}}[[4]]Scharf B: Comparison of normal and impaired hearing II: Frequency analysis, speech perception. Scand Audiol (Suppl) 1978; 6:81-106[[4]]. The study presents evidence that frequency selectivity is reduced in cochlear impairment, that such patients need a greater frequency difference to discriminate two tones, and that they have a wider critical band. If an abnormally wide bandwidth of internal auditory filters is present, other bands interfere with (mask) the primary signal. The result is a reduction of the differences in amplitude between the spectral peaks and valleys (Figure 2), with a resultant uncertainty in the locations of spectral peaks{{5}}[[5]]Dorman MF: Temporal resolution, frequency selectivity and the identification of speech. Hear J 1986; 41(3):24-26[[5]].

Figure 2. Hypothetical internal auditory representation of a vowel processed by arrays of sharply tuned auditory filters (top right) and of broadly tuned auditory filters (bottom right). Spectral peaks are not resolved effectively by the broadly tuned filters. (After Bailey, 1983 in Dorman, 1986).
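
The smearing effect pictured in Figure 2 can be sketched numerically. The snippet below smooths a toy three-formant spectrum with narrow and wide Gaussian “auditory filters”; the formant positions, bandwidths, and amplitudes are all illustrative assumptions, not data from the cited papers.

```python
import numpy as np
from scipy.ndimage import gaussian_filter1d

# Toy vowel-like log-magnitude spectrum with three hypothetical
# formant peaks at 500, 1500, and 2500 Hz on a 10-Hz frequency grid.
freqs = np.arange(0.0, 4000.0, 10.0)
spectrum = np.zeros_like(freqs)
for formant in (500.0, 1500.0, 2500.0):
    spectrum += 20.0 * np.exp(-0.5 * ((freqs - formant) / 80.0) ** 2)

def auditory_smear(spec: np.ndarray, bandwidth_hz: float) -> np.ndarray:
    """Smooth the spectrum with a Gaussian 'auditory filter'."""
    return gaussian_filter1d(spec, sigma=bandwidth_hz / 10.0)  # 10-Hz bins

sharp = auditory_smear(spectrum, 100.0)  # sharply tuned filter array
broad = auditory_smear(spectrum, 500.0)  # broadly tuned filter array

# The broad filters flatten the peak-to-valley contrast, mimicking the
# loss of spectral detail shown in the lower panel of Figure 2.
print(f"contrast, sharp filters: {np.ptp(sharp):.1f} dB")
print(f"contrast, broad filters: {np.ptp(broad):.1f} dB")
```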

A different approach, comparing damped and undamped hearing aid frequency responses (with damping elements located in the hearing aid earhook), found that the hearing-impaired subjects in the study judged the undamped frequency responses (with a greater primary peak around 1200 Hz) to produce clearer, more pleasant, and more natural-sounding speech than the damped responses{{6}}[[6]]Cox RM, Gilmore C: Damping the hearing aid frequency response: Effects on speech clarity and preferred listening level. J Speech Hear Res 1986; 29:357-365[[6]]. These effects were smaller, however, for hearing-impaired listeners than for normal-hearing persons.

Listener Preference Judgments

Attempts to justify high-frequency emphasis (HFE) hearing aids have not met with unanimous approval. Listener preference for amplified sound as a judgment of hearing aid satisfaction was reported as early as 1947{{7}}[[7]]Davis H, Stevens SS, Nichols RH Jr, et al: Hearing Aids: An Experimental Study of Design Objectives. Cambridge, MA, Harvard University Press, 1947[[7]]. Although not totally related to mid-frequency amplification, listening-preference judgments are usually considered to be vowel (and hence low- or mid-frequency) dominated and have been used consistently as the dominant factor in hearing aid preference judgments{{8}}[[8]]Tecca JE, Goldstein DP: Effect of low-frequency hearing aid response on four measures of speech perception. Ear Hear 1984; 5(1):22-29[[8]]{{9}}[[9]]Punch JL, Beck EL: Low frequency response of hearing aids and judgments of aided speech quality. J Speech Hear Disord 1980; 45(3)[[9]]{{10}}[[10]]Punch JL, Parker CA: Pairwise listener preferences in hearing aid evaluation. J Speech Hear Res 1981; 24:366-374[[10]].

A major argument against using listener judgments is that the listener might not select the hearing aid that provides the best speech intelligibility. However, the results of an investigation by Tecca and Goldstein into the relationship between low-frequency amplification, presentation level, and listener-preference judgments offered no indication that selecting hearing aids by listener-preference judgments is an invalid method. They further suggested that interest in using these judgments for evaluating hearing aid characteristics probably could be attributed to the finding that phonetically balanced word lists are insensitive to small differences in hearing aid characteristics. They concluded that if speech intelligibility is not affected, it is reasonable to expect listener preferences to favor the rich quality traditionally associated with low-frequency amplification.

Another study found that in noisy situations the subjects preferred a high-pass hearing aid, but in quiet situations they felt they missed parts of speech with the high-pass aid and preferred a conventional aid{{11}}[[11]]Harford E, Fox J: The use of high-pass amplification for broad frequency sensorineural hearing loss. Audiology 1978; 12:10-26[[11]].

Summary

The purpose of these last three posts was not to discredit approaches that advocate high-frequency emphasis in the hearing aid fitting process. Certainly, high frequencies are important. However, a number of hearing aids for high-frequency hearing losses are fitted with primarily mid-frequency amplification – sometimes restricted to it by attempts to reduce the acoustic feedback that results from venting, and in some cases primarily because of consumer acceptance of the sound quality. These situations seem to be more common with custom-molded hearing aids, though they are not restricted to them. The posts on this subject have been an attempt to provide some insight into why mid-frequency amplification has been acceptable to many aided hearing-impaired listeners.

As was mentioned earlier in one of these posts, and paraphrasing J.D. Harris (going from memory), it is often possible to determine that a particular event is important when investigated in isolation, but it may not reach significance in an investigation in which all factors are included.
