How is the Stereophonic Sense of Hearing Formed?
1 The Concept of Stereo Sound
Stereo is originally a geometric concept: it describes objects that occupy a position in three-dimensional space. Is sound also stereo? By analogy, yes. Sound sources have definite positions in space, so sounds arrive from definite directions, and human hearing can discern those directions. In particular, when several sources sound at once, a listener can perceive how the group of sounds is distributed in space. In that sense sound is 'stereo', or more precisely, 'the original sound is stereo.' When sound is recorded, amplified, and reproduced, however, everything may come out of a single speaker. Because all sounds then emanate from the same point, the original spatial sense, and especially the sense of how the sound group is distributed in space, disappears. Such reproduced sound is called 'mono' (Mono). If the playback system can restore the spatial sense of the original sound at least partially, the reproduced sound is called 'stereo' (Stereo). Since the original sound is inherently 'stereo,' the term 'stereo sound' refers specifically to reproduced sound that has some sense of space (or directionality).
2 Binaural Effect
To restore spatial sense in reproduced sound, we first need to understand why the human auditory system can discern sound source direction. Research shows this is primarily because people have two ears, not just one.
The ears sit on opposite sides of the head. They are separated in space and the head itself stands between them, so the sound received by one ear differs from the sound received by the other in several ways. It is mainly these differences that let people locate a sound source in space. The main differences are:
(1) Interaural Time Difference (ITD)
Because the ears are some distance apart, sound arriving from any direction other than directly in front or directly behind reaches one ear before the other, creating a time difference. Sound from a source on the right reaches the right ear first and the left ear afterwards, and vice versa for a source on the left. The more lateral the source, the greater the time difference. Experiments show that artificially creating an ITD produces the illusion that the sound source has shifted to one side. When the ITD reaches about 0.6 ms, the sound seems to come entirely from one side.
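The relation between source direction and time difference can be sketched with a simple path-difference model. This is only an illustration under stated assumptions: the source is far away, the ear spacing is taken as 0.20 m (borrowed from the head diameter quoted later in this section), the speed of sound is taken as 343 m/s, and the function name `itd_seconds` is hypothetical. A real head bends the sound path, so measured values differ somewhat.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)
EAR_SPACING = 0.20      # m, assumed equal to the ~20 cm head diameter


def itd_seconds(azimuth_deg, spacing=EAR_SPACING, c=SPEED_OF_SOUND):
    """Approximate ITD for a distant source.

    0 degrees is straight ahead, +90 degrees is fully to the right.
    Uses the straight-line path difference spacing * sin(azimuth).
    """
    return spacing * math.sin(math.radians(azimuth_deg)) / c


for angle in (0, 30, 60, 90):
    print(f"{angle:3d} deg -> {itd_seconds(angle) * 1000:.2f} ms")
```

Under these assumptions the value at 90 degrees comes out a little under 0.6 ms, in line with the figure quoted above.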
(2) Interaural Level Difference (ILD)
Although the ears are close together, the head obstructs the sound, so the sound pressure level (SPL) differs between the two ears: it is higher at the ear nearer the source and lower at the farther ear. Experiments show that the maximum ILD can reach about 25 dB.
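To get a feel for what a level difference in decibels means, it can be converted to a ratio of sound pressures using the standard definition of SPL difference, 20·log10(p1/p2). The sketch below is illustrative only, and the helper name `pressure_ratio` is hypothetical.

```python
def pressure_ratio(level_difference_db):
    """Sound pressure ratio that corresponds to a level difference in dB."""
    return 10 ** (level_difference_db / 20.0)


for db in (6, 12, 25):
    print(f"{db} dB -> pressure ratio of about {pressure_ratio(db):.1f}")
```

At 25 dB the pressure at the nearer ear is a little under 18 times the pressure at the farther ear.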
(3) Interaural Phase Difference (IPD)
Sound propagates as a wave, and the wave's phase generally differs from point to point in space (unless the points are separated by a whole number of wavelengths). Because the ears are separated in space, the waves reaching them can differ in phase. The eardrums vibrate with the sound waves, and this phase difference becomes a factor in judging direction. Experiments show that even when SPL and time of arrival are identical at both ears, changing only the phase significantly alters the perceived location of the sound source.
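For a steady pure tone, an arrival-time difference and a phase difference are related by phase = 360° × frequency × delay, and the phase repeats every full cycle. The following sketch illustrates this. The function name `ipd_degrees` and the 0.5 ms delay are assumptions chosen for the example, not values from the experiments mentioned above.

```python
import math


def ipd_degrees(frequency_hz, delay_s):
    """Interaural phase difference of a pure tone, wrapped to [-180, 180] degrees."""
    return math.remainder(360.0 * frequency_hz * delay_s, 360.0)


delay = 0.0005  # s, an assumed interaural delay of 0.5 ms
for f in (200, 500, 1000, 2000):
    print(f"{f:5d} Hz -> {ipd_degrees(f, delay):7.1f} deg")
```

With this delay, the 2000 Hz tone completes exactly one full cycle between the ears, so its wrapped phase difference is essentially zero even though the delay is not. This is one way to see why a phase cue on its own becomes ambiguous as frequency rises.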
(4) Interaural Timbral Difference
If a sound wave comes from the right, it must diffract around the head to reach the left ear. How well a wave diffracts around an obstacle depends on the ratio of its wavelength to the size of the obstacle. The diameter of the human head is about 20 cm, which equals the wavelength of a sound wave of about 1,700 Hz in air. The head therefore shadows sound components above roughly a thousand hertz. The components of a single sound diffract around the head to different degrees, and the higher the frequency, the greater the attenuation. Consequently, the timbre heard by the left ear differs from the timbre heard by the right ear, which provides a cue for judging direction.
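The 1,700 Hz figure follows from the relation wavelength = speed of sound / frequency. A minimal sketch, assuming a speed of sound of 343 m/s (the text does not state the value it used) and hypothetical helper names:

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)
HEAD_DIAMETER = 0.20    # m, the figure given in the text


def wavelength_m(frequency_hz, c=SPEED_OF_SOUND):
    """Wavelength in metres of a tone of the given frequency."""
    return c / frequency_hz


def frequency_for_wavelength(wavelength, c=SPEED_OF_SOUND):
    """Frequency in hertz whose wavelength equals the given length."""
    return c / wavelength


print(f"Wavelength equals head diameter at about "
      f"{frequency_for_wavelength(HEAD_DIAMETER):.0f} Hz")

# Ratio of wavelength to head size: large ratios diffract around the head easily.
for f in (100, 1000, 5000, 10000):
    print(f"{f:6d} Hz -> wavelength / head diameter = "
          f"{wavelength_m(f) / HEAD_DIAMETER:.2f}")
```

With these values the crossover comes out at about 1,715 Hz, close to the 1,700 Hz quoted above.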
(5) Difference between Direct Sound and Successive Reflections
Besides the direct sound, the ears also receive sound that has been reflected once or several times by surrounding obstacles. These reflections arrive in successive groups after the direct sound. The difference between the direct sound and the groups of reflected sound therefore carries information about the spatial distribution of the sound source.
(6) Difference Caused by the Pinna
The pinna (outer ear) faces forward, clearly helping distinguish front from back. Its complex shape causes intricate effects on sounds from different directions, providing additional directional cues.
Experience shows that, among these differences, ILD, ITD, and IPD have the greatest influence on auditory localization. Their relative importance varies with conditions, however. In general, the phase difference matters more at low and middle frequencies, while the ILD dominates at middle and high frequencies. For transient sounds, the ITD is particularly important. For vertical localization, the pinna plays the larger role. In practice the binaural effect is a combined one: the auditory system most likely judges direction from all of these cues together.
Incidentally, the human auditory system shows many effects beyond the perception of loudness, timbre, and direction. One that is closely related to our topic is the 'Precedence Effect' (also known as the 'Haas Effect'). Experiments show that when two identical sounds reach the ears one after the other and the delay between them is within 30 ms, the delayed sound is not perceived as a separate sound; the listener notices only a change in timbre and loudness. With longer delays the situation changes. When the time difference between the two sounds exceeds 50 ms to 60 ms (equivalent to a path difference of more than 17 m), the listener becomes aware of the delayed sound.
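The link between delay and path difference is simply distance = speed of sound × time. The sketch below converts the delays mentioned above into path differences and labels them using the thresholds quoted in the text. The speed of sound (343 m/s), the use of 50 ms as the upper threshold, and the function names are assumptions made for illustration; the text does not describe what is heard between 30 ms and 50 ms, so the code does not either.

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 °C (assumed value)


def path_difference_m(delay_ms, c=SPEED_OF_SOUND):
    """Extra distance travelled by a sound that arrives delay_ms later."""
    return c * delay_ms / 1000.0


def describe_delay(delay_ms):
    """Label a delay using the thresholds quoted in the text."""
    if delay_ms <= 30:
        return "not perceived as a separate sound"
    if delay_ms > 50:  # lower end of the 50 ms to 60 ms range (assumption)
        return "delayed sound becomes noticeable"
    return "between the thresholds quoted in the text"


for delay in (10, 30, 50, 60):
    print(f"{delay:3d} ms -> {path_difference_m(delay):5.2f} m, "
          f"{describe_delay(delay)}")
```

A 50 ms delay corresponds to about 17 m of extra path, which matches the figure given above.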