Researchers at Bielefeld University in Germany have found evidence that suggests why high-frequency treble notes are written higher up, nearer the top of the treble clef, whereas the lower-frequency bass notes are found nearer the bottom. Historically this was thought to be metaphorical, since the notes could just as easily have been represented in the reverse order, with higher-frequency notes denoted nearer the bottom of a clef.
From extensive research in spatial localization we know that we localize high-frequency cues (above 1000 Hz) based on the relative sound level at the two ears: a high-frequency sound arrives at a higher level at the ear closer to the sound origin than at the opposite ear. Essentially the head casts an acoustic shadow for these shorter-wavelength, higher-frequency cues, such that there is a sound level asymmetry between the two ears.
In contrast, lower-frequency sounds, with longer wavelengths, are not obstructed by the human head; the wavelengths are just too large relative to the 8-9” diameter of the human head. There is essentially no head shadow for lower-frequency sounds with energy below 1000 Hz.
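To put rough numbers on the head-shadow argument, here is a minimal sketch that compares the wavelength of a tone with the diameter of the head. The speed of sound (343 m/s, room-temperature air), the 8.5” head diameter (the midpoint of the range above), and the function names are assumptions chosen for illustration, not values from the study.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C
METRES_PER_INCH = 0.0254


def wavelength_m(frequency_hz):
    """Wavelength in metres of a pure tone at the given frequency."""
    return SPEED_OF_SOUND_M_S / frequency_hz


def wavelength_to_head_ratio(frequency_hz, head_diameter_in=8.5):
    """How many head diameters long one wavelength is (assumed 8.5 inch head)."""
    return wavelength_m(frequency_hz) / (head_diameter_in * METRES_PER_INCH)


for frequency_hz in (250, 1000, 4000):
    ratio = wavelength_to_head_ratio(frequency_hz)
    print(f"{frequency_hz} Hz: wavelength {wavelength_m(frequency_hz):.3f} m, "
          f"{ratio:.2f} head diameters")
```

With these assumptions, a 250 Hz tone has a wavelength several times the width of the head, while a 4000 Hz tone has a wavelength well under one head diameter. That is the sense in which the head obstructs only the shorter wavelengths. The 1000 Hz boundary is a rule of thumb, not a sharp cutoff.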
This explanation is quite adequate to explain most aspects of sound localization on the horizontal plane.
It gets a little more complicated for sound localization above or below the horizontal plane. We have known for years that the concha in the outer ear not only has a resonance around 4500-5000 Hz, which contributes to the REUR (Real-Ear Unaided Response), but also serves to generate a notch or anti-resonance in this frequency region. It is this notch that many researchers have found to provide a cue for vertical sound localization and orientation.
The group from Bielefeld University and the Max Planck Institute for Biological Cybernetics in Tübingen have come up with what may be another piece of the puzzle. In an article entitled “Why auditory pitch and spatial elevation get high together: Shape of human ear may have evolved to mirror acoustics in natural world”, Dr. Cesare Parise and his colleagues looked at this problem from three points of view:
1. They found that sounds with a significant amount of high-frequency energy originated from high above the horizontal plane (e.g., sky, trees, etc.);
2. They noted that the outer ear was configured to receive more sound in the higher frequency region; and
3. When subjects were given a series of sounds from different azimuths and elevations, they found that most of the higher-pitched sounds were perceived as coming from above the horizontal plane.
The authors concluded that because these three findings were consistent, and all indicated that higher frequencies come from above, this must be a universal phenomenon.
“These results are especially fascinating, because they do not just explain the origin of the mapping between frequency and elevation,” says Parise. “They also suggest that the very shape of the human ear might have evolved to mirror the acoustic properties of the natural environment. What is more, these findings are highly applicable and provide valuable guidelines for using pitch to develop more effective 3D audio technologies, such as sonification-based sensory substitution devices, sensory prostheses, and more immersive virtual auditory environments.”
I tend to be slightly more skeptical that this is the reason why treble is up and bass is down. I feel that the question of why high musical notes are written higher up than the lower bass notes on a musical clef is still up in the air (sorry… bad pun). The three-pronged approach that these researchers took is scant evidence for this being an explanation. The second part of the quote above, “…they also suggest that the very shape of the human ear might have evolved to mirror the acoustic properties of the natural environment”, is probably true insofar as there may be a correlation, but I doubt that it is cause and effect.
If the human ear had evolved to be more biased to low-frequency notes, the pinna would have to be almost 4 feet long from top to bottom, and the ear canal would have to be about 15 times as long as it is now (more than a foot), in order to resonate best at middle C (262 Hz). And even in this case, this giant ear would be equally sensitive to low- and high-frequency sounds (since the harmonics of the ear resonances would extend into the higher frequency region). I am not an evolutionary biologist, but I do know something about acoustics. I am sure that having to drag around large unwieldy ears would not provide an evolutionary advantage, especially when running through pointy branches and other poky shrubs. Imagine the size of the earrings that people would have to wear, and earmuffs against the cold would have to be the size of sabre-toothed tigers.
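As a rough check on the scale involved, the ear canal can be treated as a simple tube closed at one end (the eardrum), which resonates at odd multiples of the frequency whose quarter wavelength equals the tube length. The sketch below uses that idealized model. The 343 m/s speed of sound and the function names are assumptions for illustration, and a real ear canal is not a uniform rigid tube, so the figures are order-of-magnitude only.

```python
SPEED_OF_SOUND_M_S = 343.0  # assumed: air at roughly 20 degrees C


def quarter_wave_length_m(resonant_hz):
    """Length of an idealized closed tube whose lowest resonance is resonant_hz."""
    return SPEED_OF_SOUND_M_S / (4.0 * resonant_hz)


def quarter_wave_resonances_hz(length_m, count=3):
    """First `count` resonances of the tube: odd multiples of the fundamental."""
    fundamental_hz = SPEED_OF_SOUND_M_S / (4.0 * length_m)
    return [(2 * k + 1) * fundamental_hz for k in range(count)]


middle_c_hz = 262.0
canal_length_m = quarter_wave_length_m(middle_c_hz)
print(f"Canal length for {middle_c_hz:.0f} Hz: {canal_length_m:.3f} m")
print("Resonances (Hz):", [round(f) for f in quarter_wave_resonances_hz(canal_length_m)])
```

On this simplified model a canal tuned to middle C comes out at roughly a third of a metre, and its higher resonances fall at odd multiples of 262 Hz. This is the sense in which even a giant ear would still respond in the higher frequency region.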
I also wonder how much of this is environmentally and socially conditioned, rather than being a universal. Try this experiment (as I have on numerous occasions). Play for a group of people (or a class of students) a swept tone that starts at about 20 Hz and rises right up to 20,000 Hz. As the tone gets higher and higher, people will lift their heads slightly higher. Another experiment would be to have someone sing a scale going from low to high; again, the head starts low and then extends up. In this case, with the head down, there is simply less contraction of the various muscles surrounding the larynx, which makes it easier to vocalize a low-pitched sound. That is a muscular reason, but not necessarily one that gives insight into any universal or evolutionary traits.
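For anyone who wants to try the swept-tone demonstration, here is a minimal sketch that generates such a sweep using only the Python standard library. The logarithmic sweep shape, the 10-second duration, the 48 kHz sample rate, the playback level, and all function names are assumptions made for illustration; any tone generator that glides from about 20 Hz to 20,000 Hz would serve.

```python
import math
import struct
import wave


def sweep_frequency_hz(t_s, f_start_hz=20.0, f_end_hz=20000.0, duration_s=10.0):
    """Instantaneous frequency of a logarithmic sweep at time t_s (seconds)."""
    return f_start_hz * (f_end_hz / f_start_hz) ** (t_s / duration_s)


def log_sweep(f_start_hz=20.0, f_end_hz=20000.0, duration_s=10.0, sample_rate_hz=48000):
    """Samples in [-1, 1] of a sine whose frequency rises exponentially in time."""
    ratio = f_end_hz / f_start_hz
    # The phase is the time integral of 2*pi*f(t) for the sweep defined above.
    scale = 2.0 * math.pi * f_start_hz * duration_s / math.log(ratio)
    n_samples = int(duration_s * sample_rate_hz)
    return [
        math.sin(scale * (ratio ** (i / sample_rate_hz / duration_s) - 1.0))
        for i in range(n_samples)
    ]


def write_wav(path, samples, sample_rate_hz=48000, peak=0.3):
    """Write samples as 16-bit mono PCM, scaled to a modest (assumed) peak level."""
    frames = b"".join(struct.pack("<h", int(peak * 32767 * s)) for s in samples)
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(sample_rate_hz)
        wav_file.writeframes(frames)
```

Calling write_wav("sweep.wav", log_sweep()) would produce a file to play to the group (the filename is arbitrary). The sample rate needs to stay above 40,000 Hz so that the top of the sweep can be represented, and the playback level should be kept low, since most loudspeakers and listeners will not reproduce or hear the extremes of the range equally.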