by Christopher Schweitzer, Ph.D.
H. Christopher Schweitzer, PhD has a long history of research, development, and clinical activity related to hearing and hearing aids, and continues to own the Family Hearing Centers of Colorado. He is a frequent contributor to HHTM (Hearing Health and Technology Matters).
Background
Students in Signals and Systems learn that there is an important difference between ‘precision’ and ‘accuracy’ in the engineering-related sciences. If an archer sends multiple consecutive arrows to the upper left of the red bull’s-eye, he receives high points for precision (and its cousin, reliability), but not for accuracy, if the goal is to hit the center of the target (Figure 1). While there is a need for both, the user of a system can be misled by a sufficiency of one and a lack of the other. Such is the case with the measurement of real ears.
In the case of Real Ear Measures (REMs), it can be argued that they tend to be rich in the virtue of precision but poor in accuracy when used as a tool to ‘verify’ or ‘validate’ hearing aid performance for individuals. Blistering arguments are often made about how unprofessional hearing aid fittings are when accomplished without the use of REMs. Indeed, entire careers have been attached to that premise, as textbooks, articles and graduate degrees have focused on the importance of carefully obtained, and presumably vital, REMs. But this paper puts forth a case that it is well past time to go beyond REMs, and to pursue simple Real Hear measures for more relevant (accurate) listener benefit.
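For readers who like to see the distinction in numbers, the archer analogy can be sketched in a few lines of Python. This is only an illustration: the arrow coordinates are hypothetical, and the function name `accuracy_and_precision` and the choice of "bias" and "spread" as summary values are assumptions made for this sketch, not a standard from the measurement literature.

```python
import math
import statistics

def accuracy_and_precision(shots, target=(0.0, 0.0)):
    """Return (bias, spread) for a list of (x, y) arrow positions.

    bias   : distance from the mean impact point to the target (poor accuracy = large bias)
    spread : root-mean-square distance of the shots from their own mean
             (poor precision = large spread)
    """
    mean_x = statistics.fmean(x for x, _ in shots)
    mean_y = statistics.fmean(y for _, y in shots)
    bias = math.hypot(mean_x - target[0], mean_y - target[1])
    spread = math.sqrt(
        statistics.fmean((x - mean_x) ** 2 + (y - mean_y) ** 2 for x, y in shots)
    )
    return bias, spread

# Hypothetical arrows (arbitrary units), tightly grouped up and to the left of centre.
tight_but_off = [(-3.0, 3.1), (-3.1, 2.9), (-2.9, 3.0), (-3.0, 3.0)]
# Hypothetical arrows scattered widely but centred on the target.
centred_but_loose = [(-2.0, 0.0), (2.0, 0.0), (0.0, -2.0), (0.0, 2.0)]

for name, shots in [("tight_but_off", tight_but_off),
                    ("centred_but_loose", centred_but_loose)]:
    bias, spread = accuracy_and_precision(shots)
    print(f"{name}: bias={bias:.2f}, spread={spread:.2f}")
```

The first group has a small spread but a large bias (precise, not accurate). The second has no bias but a large spread (accurate on average, not precise).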
It Starts with the Audiogram
The audiogram’s familiar graphic portrayal of a listener’s barely audible sensitivity pattern for selected frequencies in unnaturally isolated ears is the nearly universal ‘go to’ starting point for most hearing aid fittings. The audiogram’s historic value has always been as a robust and reliable tool for differential diagnosis and for monitoring medically related progressive changes in hearing impairment. For that purpose, carefully obtained audiometric data can serve well with both accuracy and precision. But recall that those values represent minimum sensitivity for single tones with controlled durations presented to unnaturally isolated ears. Neurologically, the procedure may represent activation of several thousand peripheral neurons. To then apply these data to make predictions about comfortable, clear listening levels for complex signals, which involve hundreds of millions of cortical neurons along with multiple binaural interactions to extract rapidly updated, time-varying acoustic messages, is a massive stretch of confidence. Yet the pure-tone audiogram, done on isolated ears in abnormal listening conditions, is the basis for hearing aid fittings in offices around the world. Even data from speech materials, while admittedly more relevant for the hearing aid user, are still generally collected on unnaturally isolated ears under contrived circumstances. To the credit of many researchers, valiant attempts have been made to launch those clinical arrows towards the ‘target’ of comfortably clear listening. There remains a need to scrutinize the premises and outcomes of present approaches with a view towards greater accuracy and higher listener satisfaction. Editor’s note: The “problem” with basic audiometric testing has been a revisited topic in HHTM publications: A, B, C, D.
Omega
Consider the important notion of omega, Ω. In psychophysical research it is arguably the focal point of validating a measurement, as expressed in the simple formula Ω = ƒ(S). To paraphrase Yost [1], this is the ‘gold standard’ of psychophysics research, of which clinical audiology is essentially a professional subset. The formula simply states that a behavioral measure (Ω), such as a threshold audiogram, has a functional relationship to the stimuli (S). This relationship between the physical properties of acoustic stimuli and the behavior reported in standard audiometric tests is admittedly precise, given all the standardized care applied to control ambient noise, signal levels and their construction. So, once again, precision is generally not an issue of concern. But these measures of mostly peripheral reception of simplistic signals are obtained at a troubling distance from the acquisition of time-varying spoken streams of messages. They are unfulfilling at best as a means of working out the ‘appropriate’ pattern of amplification details, such as the slope of the frequency response, which often flattens as loudness increases above threshold. If the target is to reduce the burden on the listener to interpret the brief burps of sound that convey clumps of meaning in spoken conversations, the “probe” of information needs to move above the tympanic membrane. While REMs are not behavioral measures, and hence fall outside the omega assumptions, they are generally coupled to audiograms in the protocols.
Consider Also Duration
The standard audiometric signal is designed to be presented for one second, or sometimes two seconds, with controlled rise and fall times. Audiometric ‘pulsed tones,’ which most people find easier to hear at threshold, are typically half a second long or sometimes as brief as 0.2 second (200 msec). The 200 msec minimum relates to the temporal integration properties of the auditory system. Signals shorter than 100 to 200 msec require higher intensity to be perceived, as Zwislocki [2] and others reported many years ago. Figure 2 is a reminder of temporal integration, showing how very brief signals require greater intensity to match the perceptual level of those that reach full integration of energy after approximately 100 msec or more, depending on frequency. Does it matter, one might ask? Given that many speech elements have durations of less than 50 msec, the answer would seem to be yes: it is important enough to take into consideration.
While vowel components of speech are generally several hundred milliseconds in duration, long enough for full integration of loudness, many speech plosives are much shorter, and they can reasonably be presumed to require higher sound levels to achieve audibility. So, while a 3 kHz audiometric tone of 500 msec may properly represent a threshold for that particular signal construction, it is entirely possible that a significant portion of the plosive /t/ or /k/ energy at 3 kHz may not be audible, because the rapidly spoken transient phoneme can last less than 10 msec [3].
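To put a rough number on the duration effect described above, here is a minimal sketch in Python. It assumes a simple equal-energy rule: below the integration window, the threshold rises by 10·log10(window/duration) dB, and beyond the window there is no further change. That rule is a textbook simplification chosen for illustration, not Zwislocki’s full model. The 200 msec default window is taken from the range quoted above, and the function name `threshold_elevation_db` is hypothetical. Real listeners, and especially impaired ears, may deviate considerably.

```python
import math

def threshold_elevation_db(duration_ms, integration_ms=200.0):
    """Approximate threshold increase (dB) for a tone shorter than the
    integration window, under a simple equal-energy assumption.

    Durations at or beyond the window are assumed to need no extra level.
    """
    if duration_ms <= 0:
        raise ValueError("duration_ms must be positive")
    if duration_ms >= integration_ms:
        return 0.0
    # Equal energy: halving the duration costs about 3 dB.
    return 10.0 * math.log10(integration_ms / duration_ms)

# A 500 ms audiometric tone versus progressively shorter signals.
for d in (500, 200, 100, 50, 10):
    print(f"{d:>4} ms -> +{threshold_elevation_db(d):.1f} dB")
```

Under these assumptions, a 10 msec burst would need roughly 13 dB more level than a fully integrated tone to be equally detectable. That margin does not appear anywhere on a conventional audiogram.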
And Frequency Modulations
Since seminal work at the Haskins Laboratories in the 1950s [4,5], it has been well understood that many crucial elements that differentiate some speech sounds change in frequency over short periods of time, i.e., they are frequency modulated. Indeed, Eimas and his colleagues [6] were among several groups that showed that newborn babies appear to be ‘pre-wired’ to hear those brief FM signatures that differentiate, for example, /ba/ from /da/. Recent work in the neurosciences shows much more vigorous electrical activity in auditory regions of the human brain for FM signals than for simpler, non-modulated acoustic signals [8,9]. Yet FM signals are notably absent from the standard audiogram; they appear in audiometric testing only as the occasional ‘warble tones’ used in some sound field measures.
As a reminder to the reader, Figure 3 illustrates the well-studied formant transitions (frequency modulations over a brief period of time) that distinguish the spoken syllables /ba/, /ga/ and /da/, as originally reported by the Haskins Laboratories group.
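The contrast between a steady audiometric tone and a formant-like transition can be made concrete with a short synthesis sketch in Python. Everything numeric here is a hypothetical placeholder (a 40 msec glide from 900 Hz into a 1200 Hz steady portion, sampled at 16 kHz), not a value taken from the Haskins data or from Figure 3. The function name `glide_then_steady` is likewise an assumption made for this illustration.

```python
import math

def glide_then_steady(f_start_hz, f_end_hz, glide_ms, total_ms, sample_rate=16000):
    """Return (samples, inst_freqs) for a tone whose frequency moves linearly
    from f_start_hz to f_end_hz over glide_ms, then holds at f_end_hz.
    """
    n_total = int(sample_rate * total_ms / 1000)
    n_glide = int(sample_rate * glide_ms / 1000)
    samples, inst_freqs = [], []
    phase = 0.0
    for n in range(n_total):
        if n < n_glide:
            # Linear frequency transition (the "FM" part of the signal).
            freq = f_start_hz + (f_end_hz - f_start_hz) * n / n_glide
        else:
            # Steady portion, comparable to a plain pure tone.
            freq = f_end_hz
        inst_freqs.append(freq)
        samples.append(math.sin(phase))
        # Accumulate phase so the waveform stays continuous as frequency changes.
        phase += 2.0 * math.pi * freq / sample_rate
    return samples, inst_freqs

# Hypothetical values: a 40 ms rising glide into a 1200 Hz steady portion.
samples, freqs = glide_then_steady(900.0, 1200.0, glide_ms=40, total_ms=250)
print(len(samples), freqs[0], freqs[-1])
```

Setting `glide_ms=0` yields an ordinary steady tone of the kind used in audiometry. The only difference between the two signals is the brief initial transition, which is the portion argued here to carry much of the consonant identity.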
These properties of duration and modulation in the auditory system raise a simple question: how can clinicians characterize a listener’s struggles with hearing speech using signals that do not represent the way the auditory system is organized to receive speech? It is a fundamental inadequacy of test sensitivity as a professional approach to alleviating the stress of spoken communication for hearing-impaired individuals. There must be a willful acknowledgement that hearing for tones is not the same as hearing for speech at the neurophysiological level. Extraction of ‘meaning’ from the patterned pulses of speech is immensely more complex than reporting the audibility of barely audible sinusoids. It is assumed that hearing professionals know these facts, but have no convenient and established alternative to the conventional pure-tone audiogram for verifying hearing aids.
Next week: Evolution of Electroacoustic Measures of Hearing Aids
References
1. Yost, WA, Popper, AN, Fay, RR. (1993) Human Psychophysics. Springer-Verlag, New York. Chapt 1, Psychoacoustics, 1-12.
2. Zwislocki, J. (1960) Theory of temporal auditory summation. J. Acoust. Soc. Am. 32, 1046.
3. Wieringen, A, Pols, L. (2006) Perception of highly dynamic properties of speech. Chapt 2 in Listening to Speech: An Auditory Perspective. Greenberg, S, Ainsworth, W. (eds) Lawrence Erlbaum Pub., Mahwah, NJ, 21-38.
4. Delattre, PC, Liberman, AM, Cooper, FS. (1955) Acoustic loci and transitional cues for consonants. J. Acoust. Soc. Am. 27, 769–773.
5. Liberman, AM, Harris, KS, Hoffman, HS, Griffith, BC. (1957) The discrimination of speech sounds within and across phoneme boundaries. J. Exp. Psychol. 54, 358–368.
6. Eimas, PD, Siqueland, ER, Jusczyk, P, Vigorito, J. (1971) Speech perception in infants. Science 171(3968), 303-306.
7. Miller, CL, Morse, PA. (1976) The “heart” of categorical speech discrimination in young infants. J. Speech Hear. Res. 19(3), 578-589.
8. Hart, HC, Palmer, AR, Hall, DA. (2003) Amplitude and frequency-modulated stimuli activate common regions of human auditory cortex. Cerebral Cortex 13(7), 773-781.
9. Okamoto, H, Kakigi, R. (2015) Encoding of frequency-modulation (FM) rates in human auditory cortex. Scientific Reports 5, Article No. 18143.