BASICS OF SOUND AND HEARING
Author: Prof. Ibrahim ELnoshokaty
Introduction
Everyday your world is filled with a multitude of sounds. Sound can let
you communicate with others or let others communicate with you. It can
be a warning of danger or simply an enjoyable experience. Some sounds
can be heard by dogs or other animals but cannot be heard by humans. The
ability to hear is definitely an important sense, but people who are deaf are
remarkable in the ways that they can compensate for their loss of hearing.
All of the sounds you hear occur because mechanical energy produced
by a source was transferred to your ear through the movement of atomic
particles. Sound is a
pressure disturbance that moves through a medium in the form of mechani-
cal waves. When a force is exerted on an atom, it moves from its rest or equi-
librium position and exerts a force on the adjacent particles. These adjacent
particles are moved from their rest position and this continues throughout
the medium. This transfer of energy from one particle to the next is how
sound travels through a medium. The words “mechanical wave” are used to
describe the distribution of energy through a medium by the transfer of en-
ergy from one particle to the next. Waves of sound energy move outward in
all directions from the source. Your vocal cords and the strings on a guitar
are both sources which vibrate to produce sound waves. Without energy,
there would be no sound. Let’s take a closer look at sound waves.
Sound or pressure waves are made up of compressions and rarefactions.
Compression happens when particles are forced, or pressed, together. Rar-
efaction is just the opposite: it occurs when particles are given extra space
and allowed to expand. Remember that sound is a type of kinetic energy.
As the particles are moved from their rest position, they exert a force on
the adjacent particles and pass on the kinetic energy. Thus sound energy trav-
els outward from the source. Sound travels through air, water, or a block
of steel; thus, all are mediums for sound. Without a medium there are no
particles to carry the sound waves. The word “particle” suggests a tiny con-
centration of matter capable of transmitting energy. A particle could be an
atom or molecule. In places like space, where there is no atmosphere, there
are too few atomic particles to transfer the sound energy. Let's look at the
example of a stereo speaker. To produce sound, a thin surfaced cone, called
a diaphragm, is caused to vibrate using electromagnetic energy. When the
diaphragm moves to the right, its energy pushes the air molecules on the
right together, opening up space for the molecules on the left to move into.
We call the molecules on the right compressed and the molecules on the
left rarefied. When the diaphragm moves to the left, the opposite happens.
Now, the molecules to the left become compressed and the molecules to the
right are rarefied. These alternating compressions and rarefactions produce
a wave. Together, one compression and one rarefaction make up one wavelength. Differ-
ent sounds have different wavelengths.
As the diaphragm vibrates back and forth, the sound waves produced
move the same direction (left and right). Waves that travel in the same direc-
tion as the particle movement are called longitudinal waves. Longitudinal
sound waves are the easiest to produce and have the highest speed. However,
it is possible to produce other types. Waves which move perpendicular to
the direction of particle movement are called shear waves or transverse waves.
Shear waves travel at slower speeds than longitudinal waves, and can only
be made in solids. Think of a stretched-out slinky: you can create a longitu-
dinal wave by quickly pushing and pulling one end of the slinky. This causes
longitudinal waves to form and propagate to the other end. A shear wave
can be created by taking one end of the slinky and moving it up and down.
This generates a wave that moves up and down as it travels the length of the
slinky. Another type of wave is the surface wave. Surface waves travel at the
surface of a material, with the particles moving in elliptical orbits. They are
slightly slower than shear waves and fairly difficult to make. A final type
of sound wave is the plate wave. The particles of these waves also move in
elliptical orbits but plate waves can only be created in very thin pieces of
material.
Sound and speed
If you have ever been to a baseball game or sat far away from the stage
during a concert, you may have noticed something odd. You saw the batter
hit the ball, but did not hear the crack of the impact until a few seconds
later. Or, you saw the drummer strike the drum, but it took an extra mo-
ment before you heard it. This is because the speed of sound is slower than
the speed of light, which we are used to seeing. The same thing is at work
during a thunderstorm. Lightning and thunder both happen at the same
time. We see the lightning almost instantaneously, but it takes longer to hear
the thunder. How much longer it takes to hear the thunder tells us how
far away the storm is. The longer it takes to hear the thunder, the farther the
distance its sound had to travel and the farther away the storm is. The flash
of light from lightning travels at about 300,000 kilometers per second or
186,000 miles per second. This is why we see it so much sooner than we hear
the thunder. If lightning occurs a kilometer away, the light arrives almost
immediately (1/300,000 of a second) but it takes sound nearly 3 seconds
to arrive. If you prefer to think in terms of miles, it takes sound nearly 5
seconds to travel 1 mile. Next time you see lightning count the number of
seconds before the thunder arrives, then divide this number by 5 to find out
how far away the lightning is in miles.
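A quick sketch of that rule in code (assuming a speed of sound of about 343 m/s in air; the helper name is ours, for illustration):

```python
# Estimate how far away lightning struck from the light-to-thunder delay.
SPEED_OF_SOUND = 343.0  # m/s in air at about 20 C

def storm_distance(delay_seconds):
    """Return the distance to the strike in kilometers and miles."""
    meters = SPEED_OF_SOUND * delay_seconds
    return meters / 1000.0, meters / 1609.34

km, miles = storm_distance(5.0)  # thunder heard 5 seconds after the flash
print(f"{km:.2f} km, {miles:.2f} miles")  # ~1.7 km, ~1.1 miles: the divide-by-5 rule
```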
Imagine you are in a long mining tunnel deep under the earth, with a friend
several thousand feet away from you in the tunnel. You tell this person,
using a walkie-talkie, to yell and clang on the pipes on the tunnel floor at
the same time. You will hear the clang through the steel pipes before you
hear the yell through the air, because sound travels much faster in a solid
than in a gas.
The speed of sound is not always the same. Remember that sound is a
vibration of kinetic energy passed from molecule to molecule. The closer
the molecules are to each other and the tighter their bonds, the less time
it takes for them to pass the sound to each other and the faster sound can
travel. It is easier for sound waves to go through solids than through liquids
because the molecules are closer together and more tightly bonded in solids.
Similarly, it is harder for sound to pass through gases than through liquids,
because gaseous molecules are farther apart. The speed of sound is faster in
solid materials and slower in liquids or gases. The velocity of a sound wave is
affected by two properties of matter: the elastic properties and density. The
relationship is described by the following equation:

V = √(C/ρ)

where V is the speed of sound, C is the relevant elastic constant (modulus) of the material, and ρ is its density.
The phase of matter has a large impact upon the elastic properties of
a medium. In general, the bond strength between particles is strongest in
solid materials and is weakest in the gaseous state. As a result, sound waves
travel faster in solids than in liquids, and faster in liquids than in gases.
While the density of a medium also affects the speed of sound, the elastic
properties have a greater influence on the wave speed.
Density
The density of a medium is the second factor that affects the speed of
sound. Density describes the mass of a substance per unit volume: a denser
substance packs more mass into the same volume. Usually, larger
molecules have more mass. If a material is more dense because its molecules
are larger, it will transmit sound slower. Sound waves are made up of kinetic
energy. It takes more energy to make large molecules vibrate than it does
to make smaller molecules vibrate. Thus, sound will travel at a slower rate
in the more dense object if they have the same elastic properties. If sound
waves were passed through two materials with approximately the same elas-
tic moduli, such as aluminum (about 10 × 10⁶ psi) and gold (about 10.8 × 10⁶ psi),
sound will travel about twice as fast in the aluminum (0.632 cm/microsecond)
as in the gold (0.324 cm/microsecond). This is because the aluminum has a den-
sity of 2.7 grams per cubic cm, which is less than the density of gold, which
is about 19 grams per cubic cm. The elastic properties usually have a larger
effect than the density, so it is important to consider both material properties.
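A rough numerical check of this comparison (a sketch only: it uses the thin-rod approximation V = √(E/ρ) with approximate handbook moduli, so the speeds come out somewhat lower than the bulk longitudinal values quoted above, but the roughly two-to-one ratio survives):

```python
import math

# name: (Young's modulus in Pa, density in kg/m^3) - approximate handbook values
materials = {
    "aluminum": (6.9e10, 2700.0),
    "gold":     (7.5e10, 19300.0),
}

for name, (E, rho) in materials.items():
    v = math.sqrt(E / rho)  # thin-rod speed of sound, m/s
    print(f"{name}: {v:.0f} m/s ({v / 1e4:.3f} cm/microsecond)")
# Similar moduli, very different densities: sound in aluminum comes out
# roughly 2.5x faster than in gold.
```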
Air Density and Temperature
Suppose that two volumes of a substance such as air have different densi-
ties. We know the more dense substance must have more mass per volume.
More molecules are squeezed into the same volume, therefore, the mole-
cules are closer together and their bonds are stronger (think tight springs).
Since sound is more easily transmitted between particles with strong bonds
(tight springs), sound travels faster through denser air.
However, sound travels
faster in warmer 40°C air than in cooler 20°C air. This doesn't seem
right, because the cooler air is more dense. However, in gases, an increase
in temperature causes the molecules to move faster, and this accounts for the
increase in the speed of sound. This will be discussed in more detail on the
next page.
Temperature and the speed of sound
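A standard approximation for the speed of sound in air (a common textbook formula, assumed here) is v ≈ 331.4 + 0.6·T m/s, with T in °C. A minimal sketch using that formula:

```python
# Speed of sound in air vs. temperature, using the common textbook
# approximation v = 331.4 + 0.6*T (T in degrees Celsius).
def speed_of_sound(temp_c):
    return 331.4 + 0.6 * temp_c

for t in (0, 20, 40):
    print(f"{t:>2} C: {speed_of_sound(t):.1f} m/s")
# 0 C: 331.4, 20 C: 343.4, 40 C: 355.4 - warmer air carries sound faster,
# even though it is less dense.
```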
Human ear
The human ear has three main sections, which consist of the outer ear,
the middle ear, and the inner ear. Sound waves enter your outer ear and
travel through your ear canal to the middle ear. The ear canal channels the
waves to your eardrum, a thin, sensitive membrane stretched tightly over
the entrance to your middle ear. The waves cause your eardrum to vibrate.
It passes these vibrations on to the hammer, one of three tiny bones in your
ear. The hammer vibrating causes the anvil, the small bone touching the
hammer, to vibrate. The anvil passes these vibrations to the stirrup, another
small bone which touches the anvil. From the stirrup, the vibrations pass
into the inner ear. The stirrup touches a liquid-filled sac, and the vibrations
travel into the cochlea, which is shaped like a shell. Inside the cochlea, there
are hundreds of special cells attached to nerve fibers, which can transmit
information to the brain. The brain processes the information from the ear
and lets us distinguish between different types of sounds. As you know, there
are many different sounds: fire alarms are loud, whispers are soft, sopranos
sing high, tubas play low, and every one of your friends has a different voice. The
differences between sounds are caused by intensity, pitch, and tone.
Ear and Hearing
The Tympanic Membrane
Intensity
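Intensity is normally quoted on the logarithmic decibel scale. As a minimal sketch (the reference intensity I₀ = 10⁻¹² W/m², the nominal threshold of hearing, and the example levels are standard textbook figures):

```python
import math

I0 = 1e-12  # reference intensity in W/m^2 (nominal threshold of hearing)

def decibels(intensity):
    """Convert a sound intensity in W/m^2 to a level in decibels."""
    return 10.0 * math.log10(intensity / I0)

print(decibels(1e-12))  # 0 dB   - threshold of hearing
print(decibels(1e-6))   # 60 dB  - roughly conversational speech
print(decibels(1.0))    # 120 dB - near the threshold of pain
```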
Sounds and their Decibels
Pitch
Pitch helps us distinguish between low and high sounds. Imagine that
a singer sings the same note twice, one an octave above the other. You can
hear a difference between these two sounds. That is because their pitch is
different. Pitch depends on the frequency of a sound wave. Frequency is the
number of wavelengths that fit into one unit of time. Remember that a wave-
length is equal to one compression and one rarefaction. Even though the
singer sang the same note, because the sounds had different frequencies, we
heard them as different. Frequencies are measured in hertz. One hertz is
equal to one cycle of compression and rarefaction per second. High sounds
have high frequencies and low sounds have low frequencies. Thunder has a
frequency of only 50 hertz, while a whistle can have a frequency of 1,000
hertz. The human ear is able to hear frequencies of 20 to 20,000 hertz. Some
animals can hear sounds at even higher frequencies. The reason we cannot
hear dog whistles, while they can, is because the frequency of the whistle is
too high to be processed by our ears. Sounds that are too high for us to hear
are called ultrasonic.
Ultrasonic waves have many uses. In nature, bats emit ultrasonic waves
and listen to the echoes to help them know where walls are or to find prey.
Captains of submarines and other boats use special machines that send
out and receive ultrasonic waves. These waves help them guide their boats
through the water and warn them when another boat is near.
Pitch = frequency of sound
For example, middle C in equal temperament = 261.6 Hz
Sounds may be generally characterized by pitch, loudness, and quality.
The perceived pitch of a sound is just the ear’s response to frequency, i.e.,
for most practical purposes the pitch is just the frequency. The pitch percep-
tion of the human ear is understood to operate basically by the place theory,
with some sharpening mechanism necessary to explain the remarkably high
resolution of human pitch perception.
The place theory and its refinements provide plausible models for the
perception of the relative pitch of two tones, but do not explain the phenom-
enon of perfect pitch.
The just noticeable difference in pitch is conveniently expressed in cents,
and the standard figure for the human ear is 5 cents.
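A short sketch of the cents computation behind that figure (1200 cents per octave; middle C from the lines above is used as the example):

```python
import math

def cents(f1, f2):
    """Interval between two frequencies in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f2 / f1)

c4 = 261.6                  # middle C in equal temperament, Hz
jnd = c4 * 2 ** (5 / 1200)  # a pitch 5 cents sharp of middle C
print(f"{jnd:.2f} Hz is {cents(c4, jnd):.1f} cents above {c4} Hz")
# ~262.36 Hz - about the smallest pitch difference a typical ear can detect
```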
Details About Pitch
Although for most practical purposes, the pitch of a sound can be said
to be simply a measure of its frequency, there are circumstances in which a
constant frequency sound can be perceived to be changing in pitch.
One of the most consistently observed “psychoacoustic” effects is that a sus-
tained high frequency sound (>2kHz) which is increased steadily in inten-
sity will be perceived to be rising in pitch, whereas a low frequency sound
(<2kHz) will be perceived to be dropping in pitch.
The perception of the pitch of short pulses differs from that of sustained
sounds of the same measured frequency. If a short pulse of a pure tone is
decaying in amplitude, it will be perceived to be higher in pitch than an
identical pulse which has steady amplitude. Interfering tones or noise can
cause an apparent pitch shift.
Further discussion of these and other perceptual aspects of pitch may be
found in Chapter 7 of Rossing, The Science of Sound, 2nd. Ed.
Effect of Loudness Changes on Perceived Pitch
Perfect Pitch
Tone & Harmonics
Another difference you may have noticed between sounds is that some
sounds are pleasant while others are unpleasant. A beginning violin player
sounds very different than a violin player in a symphony, even if they are
playing the same note. A violin also sounds different than a flute playing
the same pitch. This is because they have a different tone, or sound qual-
ity. When a source vibrates, it actually vibrates with many frequencies at
the same time. Each of those frequencies produces a wave. Sound quality
depends on the combination of different frequencies of sound waves. Im-
agine a guitar string tightly stretched. If we strum it, the energy from our
finger is transferred to the string, causing it to vibrate. When the whole
string vibrates, we hear the lowest pitch. This pitch is called the fundamen-
tal. Remember, the fundamental is really only one of many pitches that the
string is producing. Parts of the string vibrating at frequencies higher than
the fundamental are called overtones, while those vibrating in whole num-
ber multiples of the fundamental are called harmonics. A frequency of two
times the fundamental will sound one octave higher and is called the second
harmonic. A frequency four times the fundamental will sound two octaves
higher and is called the fourth harmonic. Because the fundamental is one
times itself, it is also called the first harmonic.
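The harmonic relationships described here are simple multiplication; a minimal sketch (the 110 Hz fundamental is just an illustrative pitch):

```python
fundamental = 110.0  # Hz, an illustrative open-string pitch (A2)

for n in range(1, 6):
    print(f"harmonic {n}: {n * fundamental:.0f} Hz")
# harmonic 2 (220 Hz) sounds one octave above the fundamental;
# harmonic 4 (440 Hz) sounds two octaves above it.
```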
What is the difference between music and noise?
Both music and noise are sounds, but how can we tell the difference?
Some sounds, like construction work, are unpleasant, while others, such as
your favorite band, are enjoyable to listen to. If this were the only way to tell
the difference between noise and music, everyone’s opinion would be differ-
ent. The sound of rain might be pleasant music to you, while the sound of
your little brother practicing piano might be an unpleasant noise. To help
classify sounds, there are three properties which a sound must have to be
musical. A sound must have an identifiable pitch, a good or pleasing quality
of tone, and a repeating pattern or rhythm to be music. Noise, on the other
hand, has no identifiable pitch, no pleasing tone, and no steady rhythm.
Loudness
Loudness is not simply sound intensity!
Since “loudness” is a subjective measurement of perception, one must be
careful about how much accuracy one attributes to it. But though ff is much
louder than p in dynamic level, it is not 1000x louder, so one must attempt
to develop a scale of loudness that comes closer to mapping the ear’s per-
ception. The “rule of thumb” for loudness is one way to attempt that.
“Rule of Thumb” for Loudness
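The rule in question, developed below, is that roughly ten times the intensity (a 10 dB increase) is needed to double the perceived loudness. A minimal sketch of that rule:

```python
# "Rule of thumb": each 10 dB increase roughly doubles perceived loudness,
# so the loudness ratio is about 2**(delta_dB / 10).
def loudness_ratio(delta_db):
    return 2.0 ** (delta_db / 10.0)

for delta in (10, 20, 30):
    print(f"+{delta} dB -> about {loudness_ratio(delta):.0f}x as loud")
# +30 dB is 1000x the intensity but only ~8x as loud - which is why
# ff is far from "1000 times louder" than p.
```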
Why is it that doubling the sound intensity to the ear does not produce
a dramatic increase in loudness? We cannot give answers with complete
confidence, but it appears that there are saturation effects. Nerve cells have
maximum rates at which they can fire, and it appears that doubling the
sound energy to the sensitive inner ear does not double the strength of the
nerve signal to the brain. This is just a model, but it seems to correlate with
the general observations which suggest that something like ten times the
intensity is required to double the signal from the inner ear.
One difficulty with this “rule of thumb” for loudness is that it is applica-
ble only to adding loudness for identical sounds. If a second sound is widely
enough separated in frequency to be outside the critical band of the first,
then this rule does not apply at all.
While not a precise rule even for the increase of the same sound, the rule
has considerable utility along with the just noticeable difference in sound
intensity when judging the significance of changes in sound level.
Adding Loudness
When one sound is produced and another sound is added, the increase
in loudness perceived depends upon its frequency relative to the first sound.
Insight into this process can be obtained from the place theory of pitch per-
ception. If the second sound is widely separated in pitch from the first, then
they do not compete for the same nerve endings on the basilar membrane of
the inner ear. Adding a second sound of equal loudness yields a total sound
about twice as loud. But if the two sounds are close together in frequency,
within a critical band, then the saturation effects in the organ of Corti are
such that the perceived combined loudness is only slightly greater than ei-
ther sound alone. This is the condition which leads to the commonly used
rule of thumb for loudness addition.
Critical Band
When two sounds that are equally loud when sounded separately are close
together in pitch, their combined loudness when sounded together will be
only slightly greater than that of one of them alone. They may be said to be in the
same critical band where they are competing for the same nerve endings
on the basilar membrane of the inner ear. According to the place theory
of pitch perception, sounds of a given frequency will excite the nerve cells
of the organ of Corti only at a specific place. The available receptors show
saturation effects which lead to the general rule of thumb for loudness by
limiting the increase in neural response.
If the two sounds are widely separated in pitch, the perceived loudness of
the combined tones will be considerably greater because they do not overlap
on the basilar membrane and compete for the same hair cells. The phenom-
enon of the critical band has been widely investigated.
Backus reports that this critical band is about 90 Hz wide for sounds
below 200 Hz and increases to about 900 Hz for frequencies around 5000
Hertz. It is suggested that this corresponds to a roughly constant length on
the basilar membrane of about 1.2 mm, involving some 1300 hair
cells. If the tones are far apart in frequency (not within a critical band), the
combined sound may be perceived as twice as loud as one alone.
Critical Band Measurement
For low frequencies the critical band is about 90 Hz wide. For higher
frequencies, it is between a whole tone and 1/3 octave wide.
Center Freq (Hz)    Critical bandwidth (Hz)
100                 90
200                 90
500                 110
1000                150
2000                280
5000                700
10000               1200
Timbre
Harmonic Content
The recognition of different vowel sounds of the human voice is largely
accomplished by analysis of the harmonic content by the inner ear. Their
distinctly different quality is attributed to vocal formants, frequency ranges
where the harmonics are enhanced.
Attack and Decay
The illustration above shows the attack and decay of a plucked guitar
string. The plucking action gives it a sudden attack characterized by a rapid
rise to its peak amplitude. The decay is long and gradual by comparison. The
ear is sensitive to these attack and decay rates and may be able to use them
to identify the instrument producing the sound.
This shows the sound envelope of striking a cymbal with a stick. The at-
tack is almost instantaneous, but the decay envelope is very long. The time
period shown is about half a second. The interval shown with the guitar
string above is also about half a second, but since its frequency is much
lower, you can resolve the individual periods in that sound envelope.
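As a rough sketch of the kind of envelope described, with a fast attack and a gradual exponential decay (the time constants here are illustrative, not measured values):

```python
import math

def pluck_envelope(t, attack=0.005, decay=0.3):
    """Amplitude envelope: linear rise over `attack` seconds, then exponential decay."""
    if t < attack:
        return t / attack
    return math.exp(-(t - attack) / decay)

for t in (0.0, 0.005, 0.1, 0.25, 0.5):
    print(f"t = {t:5.3f} s  amplitude = {pluck_envelope(t):.3f}")
# Rapid rise to the peak, then a long, gradual decay over ~half a second.
```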
Vibrato/Tremolo
Above is an amplitude plot of a sustained “ee” vowel sound produced
by a female voice. The periodic amplitude change would be described as
tremolo by the ordinary definition of it. You could also hear pitch variation
along with it, so vibrato was present as well. That is commonly the case. The
period of the amplitude modulation is about 0.17 seconds, or a modulation
frequency of about 5.8 Hz superimposed on a tone of frequency centered
at about 395 Hz. Rough frequency measurements gave frequencies of 392
Hz when the amplitude was high and 399 Hz when the amplitude was low.
It is not known whether or not this kind of variation is typical. Scaling the
amplitude variation gives a range of about 7 dB in intensity associated with
the amplitude modulation.
In his “The Acoustical Foundations of Music”, Ch 11, John Backus com-
ments that voice measurements have shown a pitch variation of a singing
voice some six to seven times per second usually accompanied by an ampli-
tude variation at the same rate. He references Sacerdote.
The comments of Berg and Stork in their book “The Physics of Sound”, 2nd
ed, are very close to what I would conclude from my experience and reading.
“The vibrato of a singer’s voice, for example, aids significantly in distinguish-
ing the voice from other musical sounds. The term ‘vibrato’ in general use
refers not only to periodic changes in pitch, but also to periodic changes in
amplitude, which should more correctly be called tremolo. The ‘diaphragm
vibrato’ of a flute player is close to pure tremolo; the vibrato obtained when
a trombone player wiggles the slide in and out is almost a pure pitch vibrato.
Singing vibrato is actually a mixture of true vibrato and tremolo. Vibrato on a
violin or other string instrument is close to pure pitch vibrato.”
Chapter 2
Signal sources
Geometric Waves
Simple geometric waves are often used in sound synthesis since they
have a rich complement of harmonics. These harmonics can be filtered to
produce a variety of sounds.
Square Wave
Sawtooth Wave
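A minimal additive-synthesis sketch of both waveforms (the sample times and frequency are illustrative): a square wave is the sum of odd harmonics with amplitudes 1/n, while a sawtooth uses all harmonics at 1/n.

```python
import math

def partial_sum(t, f0, harmonics):
    """Sum sine partials at n*f0 with amplitude 1/n, evaluated at time t."""
    return sum(math.sin(2 * math.pi * n * f0 * t) / n for n in harmonics)

def square(t, f0, n_partials=20):
    return partial_sum(t, f0, range(1, 2 * n_partials, 2))  # 1, 3, 5, ...

def sawtooth(t, f0, n_partials=40):
    return partial_sum(t, f0, range(1, n_partials + 1))     # 1, 2, 3, ...

# One sample of each, 1 ms into a 100 Hz tone:
print(square(0.001, 100.0), sawtooth(0.001, 100.0))
```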
Erase Head
In a tape recorder, the tape passes over the erase head immediately before
passing over the record head. The erase head applies a high-amplitude,
high-frequency AC magnetic field to the tape to erase any previously
recorded signal and to thoroughly randomize the magnetization of the
magnetic emulsion.
The gap in the erase head is wider than those in the record head; the
tape stays in the field of the head longer to thoroughly erase any previously
recorded signal.
Biasing
High fidelity tape recording requires a high frequency biasing signal to
be applied to the tape head along with the signal to “stir” the magnetization
of the tape and make sure each part of the signal has the same magnetic
starting conditions for recording. This is because magnetic tapes are very
sensitive to their previous magnetic history, a property called hysteresis.
A magnetic “image” of a sound signal can be stored on tape in the form
of magnetized iron oxide or chromium dioxide granules in a magnetic
emulsion. The tiny granules are fixed on a polyester film base, but the direc-
tion and extent of their magnetization can be changed to record an input
signal from a tape head.
Tape Playback
When a magnetized tape passes under the playback head of a tape re-
corder, the ferromagnetic material in the tape head is magnetized and that
magnetic field penetrates a coil of wire which is wrapped around it. Any
change in magnetic field induces a voltage in the coil according to Faraday’s
law. This induced voltage forms an electrical image of the signal which is
recorded on the tape.
The voltage induced in the playback head is proportional
to the rate at which the magnetization in the coil changes. This means that
for a signal with twice the frequency, the output signal is twice as great for
the same degree of magnetization of the tape. It is therefore necessary to
compensate for this increase in signal to keep high frequencies from being
boosted by a factor of two for each octave increase in pitch. This compensa-
tion process is called equalization.
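Since the head output rises 6 dB per octave, the equalization stage applies the complementary cut; a tiny sketch (the reference frequency is arbitrary):

```python
import math

def playback_eq_gain_db(freq_hz, ref_hz=1000.0):
    """Equalization gain that cancels the head's +6 dB/octave rise."""
    return -6.0 * math.log2(freq_hz / ref_hz)

for f in (250, 500, 1000, 2000, 4000):
    print(f"{f:4d} Hz: {playback_eq_gain_db(f):+.1f} dB")
# Each doubling of frequency gets 6 dB less gain, flattening the response.
```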
Sound Synthesis
Methods of Synthesis
• linear predictive coding - technique for speech synthesis
• direct digital synthesis - computer modification of generated wave-
forms
• wave sequencing - linear combinations of several small segments to
create a new sound
• vector synthesis - technique for fading between any number of differ-
ent sound sources
• physical modeling - mathematical equations of acoustic characteristics
of sound (see the sketch below)
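As one concrete illustration of the physical-modeling idea in the last item, here is a minimal sketch of the classic Karplus-Strong plucked-string algorithm (a standard example, not taken from this text): a delay line seeded with noise is repeatedly averaged, damping the high frequencies the way a real string does.

```python
import random

def karplus_strong(frequency, sample_rate=44100, duration=1.0, damping=0.996):
    """Generate plucked-string samples with the Karplus-Strong algorithm."""
    n = int(sample_rate / frequency)  # delay-line length sets the pitch
    buf = [random.uniform(-1.0, 1.0) for _ in range(n)]  # the "pluck": a noise burst
    out = []
    for _ in range(int(sample_rate * duration)):
        sample = buf.pop(0)
        # Averaging adjacent samples is a gentle low-pass filter,
        # so the tone loses brightness as the "string" rings down.
        buf.append(damping * 0.5 * (sample + buf[0]))
        out.append(sample)
    return out

samples = karplus_strong(220.0)  # one second of a plucked A3
print(len(samples))
```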
MIDI for Music
One practical benefit of MIDI is compactness:
the storage of a minute’s worth of digitally precise and clear CD quality
music directly on a computer disc might take 10 MB of memory. The MIDI
file is just a digital representation of the sequence of notes with information
about pitch, duration, voice, etc., and that takes much less memory than the
digitally recorded image of the complex sound.
Other practical benefits include the ability to transpose music without
changing its duration, to change its tempo without changing its pitch, or
change the synthetic instruments used to perform the piece of music. Draw-
backs include the inability to easily include a recorded voice part or played
instrument along with the MIDI sequenced sound, but on the other hand,
the music can be easily synchronized with multimedia events in a produc-
tion.
Phonograph Cartridge
The movement of a coil of wire in a magnetic field generates a voltage
according to Faraday’s law. The tracking of a groove on a vinyl record by
the needle on a phonograph cartridge may cause a tiny coil to move in a
magnetic field, generating an electrical image of the sound.
Chapter 3
Audio compact disk
Compact Disc Audio
Laser for Compact Discs
The detection of the binary data stored in the form of pits on the com-
pact disc is done with the use of a semiconductor laser. The laser is focused
to a diameter of about 0.8 mm at the bottom of the disc, but is further
focused to about 1.7 micrometers as it passes through the clear plastic sub-
strate to strike the reflective layer.
The Philips CQL10 laser has a wavelength of 790 nm in air. The depth
of the pits is about a quarter of the wavelength of this laser in the substrate
material.
Polarizing Prism
Photodiode Detection
Laser light from the reflective layer of the disc returns through the quar-
ter-wave plate. This causes it to reflect in the beam-splitter so that it reaches
the photodiode for detection. However, if the beam strikes one of the pits,
which are about a quarter-wavelength in depth, the light is out of phase
with the light reflecting from the unaltered plane around it and tends to
cancel it. This produces enough change in light level to be detected by the
photodiode, and to be coded as the 0’s and 1’s of binary data.
Laser Beam Positioning
In order to be reliably decoded, the laser beam must be focused within
about 0.5 micrometers of the reflective surface, but the location of the bot-
tom of the disc may be uncertain by about 0.5 mm during rotation. To keep
the beam focused, a positioning coil drives the focusing lens up or down
in response to an error voltage from the detector. One scheme uses a cylin-
drical lens arrangement to focus light on the detector. When the beam is
properly focused, it projects a round beam and a zero error voltage results.
Digital Sampling
For the purpose of storing audio information in digital form, as on a com-
pact disc, the normal continuous-wave audio signal (analog) must undergo
analog-to-digital (A/D) conversion. As a crude example, imagine an A/D
conversion using digits 0-9; practical schemes store the numbers
in binary form. The number of bits in the binary sampler determines the
accuracy with which the analog signal can be represented in digital form.
From this crude picture of digitizing in steps, perhaps you can appreciate
the industry standard of 16 bit sampling in which the voltage is sampled into
65,536 steps. In addition to the number of steps, the rate of sampling also af-
fects the fidelity of representation of the analog waveform. The standard sam-
pling rate is 44.1 kHz, so the assignment of one of 65,536 values to the signal is
done 44,100 times per second. If this recorded data is read from the CD in real
time, you are processing 1.4 million bits of information per second (1.4 Mbps).
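A minimal sketch of the sampling arithmetic just described, using the CD figures (16 bits, 44.1 kHz, two channels):

```python
import math

SAMPLE_RATE = 44100   # samples per second
BITS = 16             # bits per sample
LEVELS = 2 ** BITS    # 65,536 quantization steps

def quantize(x):
    """Map an analog value in [-1, 1] onto one of 65,536 integer steps."""
    return max(-LEVELS // 2, min(LEVELS // 2 - 1, round(x * (LEVELS // 2 - 1))))

# One millisecond of a 1 kHz sine tone, sampled at the CD rate:
samples = [quantize(math.sin(2 * math.pi * 1000 * i / SAMPLE_RATE))
           for i in range(SAMPLE_RATE // 1000)]
print(samples[:5])

# Stereo CD bit rate: 44,100 samples/s x 16 bits x 2 channels
print(SAMPLE_RATE * BITS * 2)  # 1,411,200 bits/s - the ~1.4 Mbps above
```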
Implication of Number of Bits
Bits and Dynamic Range
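A standard relationship lies behind this heading (assumed here, since it is a common digital-audio figure): each added bit doubles the number of levels and adds about 6 dB of dynamic range.

```python
import math

def dynamic_range_db(bits):
    """Dynamic range of an n-bit sampler: 20*log10(2**n), ~6.02 dB per bit."""
    return 20.0 * math.log10(2 ** bits)

for bits in (8, 12, 16):
    print(f"{bits} bits: {dynamic_range_db(bits):.1f} dB")
# 16 bits -> ~96 dB, consistent with the "can exceed 90 dB" figure quoted later.
```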
Digital Data on a Compact Disc
Binary data (0’s and 1’s) are encoded on a compact disc in the form of
pits in the plastic substrate which are then coated with an aluminum film to
make them reflective.
The data is detected by a laser beam which tracks the spiral
line of pits. The pits are 0.8 to 3 micrometers long and the rows are sepa-
rated by 1.6 micrometers.
Compact Disc Drive Details
In a compact disc player, a laser beam must track a spiral row of pits
which are 0.5 micrometers wide with track spacing 1.6 micrometers. Track-
ing is aided by a three-beam laser arrangement. In addition to staying on
the track, which is much narrower than the 100 micrometer groove sepa-
ration on a vinyl record, the rotation speed must be adjusted as the beam
tracks inward or outward. A linear speed of 1.25 m/s is maintained by in-
creasing the rotation speed from 3.5 to 8 revolutions per second as the beam
tracks inward toward the center of the disc.
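Those rotation figures follow from the constant linear velocity; a quick check (program-area radii of roughly 25 mm and 58 mm are assumed):

```python
import math

LINEAR_SPEED = 1.25  # m/s, held constant along the track

for label, radius_m in (("inner edge", 0.025), ("outer edge", 0.058)):
    rev_per_s = LINEAR_SPEED / (2 * math.pi * radius_m)
    print(f"{label}: {rev_per_s:.1f} rev/s")
# inner edge ~8.0 rev/s, outer edge ~3.4 rev/s - matching the 8 to 3.5 range.
```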
Laser Tracking on the CD
In the three-beam arrangement, the two side beams return equal amounts
of light when the main beam
is centered on the track. If they are unequal, then their difference can be
used to generate an error voltage to correct the tracking. The illustration of
the side beam positions is not to scale; they deviate about 20 micrometers
from the main beam.
Scaled Views of a Compact Disc
Data on a compact disc is stored in the form of pits in the plastic sub-
strate.
A reflective layer of aluminum is applied to reflect the laser beam. A protec-
tive coating is then applied to the top. The laser system reads the data from below.
Detection of CD Pits
The tracking laser beam sees the pits as raised areas which are about a
quarter-wavelength high for the laser light.
The reflected light from the pit is then 180° out of phase with the reflec-
tion from the flat area, so the reflected light intensity drops as the beam
moves over a pit. The threshold of the photodiode detector can be adjusted
to switch on this light level change.
CD Response to Defects
Error-Correction of CD Signals
The data on a compact disc is encoded in such a way that some well-de-
veloped error-correction schemes can be used. A sophisticated error-cor-
rection code known as CIRC (Cross-Interleave Reed-Solomon Code) is used
to deal with both burst errors from dirt and scratches and random errors
from inaccurate cutting of the disc. The data on the disc are formatted in
frames which contain 408 bits of audio data and another 180 bits of data
which include parity and sync bits and a subcode. A given frame can con-
tain information from other frames and the correlation between frames can
be used to minimize errors. Errors on the disc could lead to some output
frequencies above 22kHz (half the sampling frequency of 44.1 kHz) which
could cause serious problems by “aliasing” down to audible frequencies. A
technique called oversampling is used to reduce such noise. Using a digital
filter to sample four times and average provides a 6-decibel improvement in
signal-to-noise ratio. For more details, see the references.
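CIRC itself is elaborate, but the interleaving idea at its heart is easy to sketch: spreading consecutive symbols far apart on the disc turns one long burst error (a scratch) into several isolated errors that per-frame codes can correct. A toy illustration, not the actual CIRC layout:

```python
def interleave(data, depth):
    """Spread consecutive symbols apart (row-major in, column-major out)."""
    rows = len(data) // depth
    return [data[j * depth + i] for i in range(depth) for j in range(rows)]

def deinterleave(data, depth):
    rows = len(data) // depth
    out = [None] * len(data)
    for i in range(depth):
        for j in range(rows):
            out[j * depth + i] = data[i * rows + j]
    return out

sent = interleave(list(range(20)), depth=4)
received = sent[:8] + ["X"] * 4 + sent[12:]  # a "scratch" wipes 4 symbols in a row
print(deinterleave(received, depth=4))
# The four "X" errors land in four separate neighborhoods, where an
# error-correcting code could fix each one in isolation.
```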
Data Encoding on Compact Discs
When the laser in a compact disc player sweeps over the track of pits
which represents the data, a transition from a flat area to a pit area or vice
versa is interpreted as a binary 1, and the absence of a transition in a time
interval called a clock cycle is interpreted as a binary 0. This kind of detec-
tion is called an NRZI code. The particular NRZI code used with compact
discs is EFM (eight-to-fourteen modulation) in which eight bits of data are
represented by fourteen channel bits. In addition to the actual digital sound
data, parity and sync bits and a subcode are also recorded on the disc in
“frames”. In a given frame, 408 bits of audio data are recorded with another
180 bits of data which permit a sophisticated error-correction code to be
used. A given frame can contain information from other frames and the
correlation between frames can be used to minimize errors. In addition to
detection, a significant amount of computation must be done to decode the
signal and prepare it for conversion back to analog form with a DAC.
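A minimal sketch of the transition rule just described (the level pattern is invented for illustration; a real disc adds EFM coding and merging bits on top of this):

```python
def nrzi_decode(levels):
    """Pit/land level per clock cycle -> bits: a transition is 1, no change is 0."""
    return [1 if cur != prev else 0 for prev, cur in zip(levels, levels[1:])]

# Land, land, pit, pit, pit, land over six clock cycles:
print(nrzi_decode([0, 0, 1, 1, 1, 0]))  # [0, 1, 0, 0, 1]
```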
Detection of Compact Disc Data
The pits which encode the digital data on a compact disc are tracked by
a laser. The reflected light from the pits is out of phase with that from the
surrounding area, so the reflected light intensity drops when the laser moves
over a pit area. The nature of a photodiode is such that it can be used as the
sensing element in a light-activated switch. It conducts an electric current
which is proportional to the light falling on it. The photodiode and switch
can be adjusted so that a transition to a pit area will switch it off, and a
transition from a pit area will switch it on. Either transition is interpreted
as a binary 1, while the absence of a transition in a given clock cycle is in-
terpreted as a binary zero. The data on the disc is encoded in a sophisticated
way, so that decoding is necessary before sending the digital signal repre-
senting the sound to a digital-to-analog converter (DAC) for reconversion
to analog form.
Cylindrical Lens for Positioning
CD Storage Capacity
A compact disc can store more than 6 billion bits of binary data. This is
equivalent to 782 megabytes, and at 2000 characters per page this is equiva-
lent to about 275,000 pages of text (Rossing). Because the analog-to-digital
conversion for making CDs involves 16-bit sampling of sound waveforms
at 44.1 kHz, the amount of data involved in the recording of high-fidelity
sound is very large. The 12 cm diameter compact disc can hold 74 minutes
of digital audio covering frequencies over the full audible range of 20-
20,000 Hz. The signal-to-noise ratio and the dynamic range can exceed 90
decibels.
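The audio-data arithmetic behind these figures is easy to verify:

```python
SAMPLE_RATE = 44100    # samples per second
BYTES_PER_SAMPLE = 2   # 16-bit sampling
CHANNELS = 2           # stereo
MINUTES = 74

audio_bytes = SAMPLE_RATE * BYTES_PER_SAMPLE * CHANNELS * MINUTES * 60
print(f"{audio_bytes / 1e6:.0f} MB")                # ~783 MB of audio data
print(f"{audio_bytes * 8 / 1e9:.1f} billion bits")  # ~6.3 billion bits
```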
Broadcast Signals
AM Radio
FM Radio
In FM radio, the audio signal
is used to modulate the frequency of the carrier wave transmitted from
the broadcast antenna of the radio station. This is in contrast to AM radio
where the signal is used to modulate the amplitude of the carrier.
The FM band of the electromagnetic spectrum is between 88 MHz and
108 MHz and the carrier waves for individual stations are separated by 200
kHz for a maximum of 100 stations. These FM stations have a 75 kHz maxi-
mum deviation from the center frequency, which leaves 25 kHz upper and
lower “guard bands” to minimize interaction with the adjacent frequency
band. This separation of the stations is much wider than that for AM sta-
tions, allowing the broadcast of a wider frequency band for higher fidelity
music broadcast. It also permits the use of sub-carriers which make possible
the broadcast of FM Stereo signals.
Frequency Modulation
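A minimal numerical sketch of frequency modulation itself (the carrier, deviation, and rates here are scaled-down illustrative values, far below the broadcast figures above): the instantaneous frequency is the carrier frequency plus a deviation proportional to the audio signal.

```python
import math

def fm_signal(message, sample_rate, carrier_hz, deviation_hz):
    """Frequency-modulate a carrier: the phase advances at f_c + deviation * m(t)."""
    phase, out = 0.0, []
    for m in message:  # message samples in [-1, 1]
        inst_freq = carrier_hz + deviation_hz * m
        phase += 2 * math.pi * inst_freq / sample_rate
        out.append(math.cos(phase))
    return out

sr = 48000
tone = [math.sin(2 * math.pi * 440 * n / sr) for n in range(sr // 10)]  # 0.1 s of audio
signal = fm_signal(tone, sr, carrier_hz=10_000.0, deviation_hz=3_000.0)
print(len(signal), round(signal[0], 3))
```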
Digital Surround Sound
Perceptual Encoding for Digital Sound
MP3 Digital Sound
MP3 stands for MPEG 1 Layer 3. MPEG is a compression type for digi-
tal data. MP3 is a variation of this format that allows sound files to be
compressed by 90 percent without major degradation of the quality of the
sound. The compressed audio file takes up so much less storage space than
on a regular compact disc or tape that it has become very convenient for
transfer on the Internet.
The real possibility for sound compression without audible loss comes
from the fact that the sampling for CDs contains far more than the nec-
essary data. Sixteen-bit digital sampling at 44.1 kHz gives you a stagger-
ing amount of information. From the audio CD you get about 1.4 million
bits per second of information, much more information than your ears can
process. To create the MP3 signal, the information from the CD format is
divided into frequency subbands. Then the signal in each subband is exam-
ined in the process of encoding to decide how many bits to allocate to it. The
process employs a “psychoacoustic model” to decide which subbands will be
recorded most accurately and which will be discriminated against. The idea
is that only that which can realistically be heard by the ear is kept.
The favorite visual metaphor is the “polar bear in the snow storm”.
Against a dark mountain on a clear day, you would have to paint the polar
bear with great definition. But if the polar bear is in a snowstorm, you don’t
have to provide as much detail, because you are not going to see much detail
anyway. By analogy, if a sound in a particular subband is going to be masked
out by other subbands so that you won’t hear it anyway, you might as well
save the bits you were going to use to record it. The “psychoacoustic model”
makes judgements about which sounds will be masked out.
Some model is applied in the encoding of the high-resolution digital
sound image to MP3, and that model is inevitably going to take out some
audible information. You can improve the model by encoding at a higher bit
rate, because you are putting in more information. Typical current bit rates
are 128, 160, 192, 256 and 320 kbps. Tests show that the accuracy increases
significantly up to 256 kbps with some current decoders, so 256 kbps is
perhaps a good comparison standard. At 256,000 bits of information per
second, you have reduced the 1.4 Mbps to about 18% - compression by more
than five to one. Of course you can get ten to one at 128 kbps, but you can’t
expect to get it without noticeable loss of sound quality.
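The compression arithmetic quoted above is easy to check:

```python
CD_BIT_RATE = 1_411_200  # bits/s for 16-bit, 44.1 kHz stereo

for mp3_rate in (128_000, 192_000, 256_000, 320_000):
    print(f"{mp3_rate // 1000} kbps: {CD_BIT_RATE / mp3_rate:.1f}:1 "
          f"({mp3_rate / CD_BIT_RATE:.0%} of the CD rate)")
# 256 kbps keeps ~18% of the CD bit rate (about 5.5:1 compression);
# 128 kbps gives roughly the "ten to one" quoted above.
```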
Masking
Masking Curves
Masking of one sound by another is a common
experience and of considerable significance to orchestration. It is easy to
create circumstances where a strong bass brass section can mask the softer,
higher frequency sounds of the woodwind section.
Audibility Threshold, Second Sound
Because masking alters what is audible,
you should be quite wary of using the 1 dB JND at all except for the assess-
ment of how much you should increase the dB level of the same sound to
produce an audible difference.
Calculation Details, Masking Threshold
Chapter 4
Electrical principles
Microphones
Dynamic Microphones
Advantages:
• Relatively cheap and rugged.
• Can be easily miniaturized.
Disadvantages:
The uniformity of response to different frequencies does not match that
of the ribbon or condenser microphones.
Principle: sound moves the cone and the attached coil of wire moves in
the field of a magnet. The generator effect produces a voltage which “im-
ages” the sound pressure variation - characterized as a pressure microphone.
The geometry of a dynamic microphone is like that of a tiny loudspeak-
er, and that is not just a coincidence. A dynamic microphone is essentially
the inverse of a dynamic loudspeaker. In a dynamic microphone, the sound
pressure variations move the cone, which moves the attached coil of wire
in a magnetic field, which generates a voltage. In the loudspeaker, the in-
verse happens: the electric current associated with the electrical image of the
sound is driven through the coil in the magnetic field, generating a force on
that coil. The coil moves in response to the audio signal, moving the cone
and producing sound in the air.
A small loudspeaker can be used as a dynamic microphone, and this fact
is exploited in the construction of small intercom systems. Depending upon
the position of the Talk-Listen switch, the device on either end of the inter-
com system can be used as a microphone or a loudspeaker. Of course, this
is not a high fidelity process, and for commercial dynamic microphones, the
device is optimized for use as a microphone, not a loudspeaker.
Ribbon Microphones
Principle: the air movement associated with the sound moves the metal-
lic ribbon in the magnetic field, generating an imaging voltage between the
ends of the ribbon which is proportional to the velocity of the ribbon - char-
acterized as a “velocity” microphone.
Advantages:
• Adds “warmth” to the tone by accenting lows when close-miked.
• Can be used to discriminate against distant low frequency noise in its
most common gradient form.
Disadvantages:
• Accenting lows sometimes produces “boomy” bass.
• Very susceptible to wind noise. Not suitable for outside use unless
very well shielded.
Condenser Microphones
Disadvantages:
• Expensive
• May pop and crack when close-miked
• Requires a battery or external power supply to bias the plates.
Crystal Microphone
Parabolic Microphone
Amplifiers
The task of an audio amplifier is to take a small signal and make it bigger
without making any other changes in it. This is a demanding task, because
the ear is remarkably sensitive to any distortion introduced along the way.
In choosing an amplifier, the main practical concern is to make
sure that the amplifier can provide enough power to drive the existing loud-
speakers, but otherwise amplifiers are typically one of the most trouble-free
elements of a sound system.
Impedance Matching
In the early days of high fidelity music systems, it was crucial to pay at-
tention to the impedance matching of devices since loudspeakers were driv-
en by output transformers and the input power of microphones to preamps
was something that had to be optimized. The integrated solid state circuits
of modern amplifiers have largely removed that problem, so this section
just seeks to establish some perspective about when impedance matching is
a valid concern.
As a general rule, the maximum power transfer from an active device
like an amplifier or antenna driver to an external device occurs when the
impedance of the external device matches that of the source. That optimum
power is 50% of the total power when the impedance of the amplifier is
matched to that of the speaker. Improper impedance matching can lead to
excessive power use, distortion, and noise problems. The most serious prob-
lems occur when the impedance of the load is too low, requiring too much
power from the active device to drive the load at acceptable levels. On the
other hand, the prime consideration for an audio reproduction circuit is
high fidelity reproduction of the signal, and that does not require optimum
power transfer.
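A quick numerical illustration of that maximum-power-transfer statement, using a simple source-resistance model (the voltage and impedances are illustrative, and real loudspeaker impedances are not purely resistive):

```python
V, RS = 20.0, 8.0  # illustrative source: 20 V behind an 8-ohm internal impedance

for RL in (2.0, 4.0, 8.0, 16.0, 32.0):
    current = V / (RS + RL)
    p_load = current ** 2 * RL    # power delivered to the load
    p_total = V ** 2 / (RS + RL)  # total power drawn from the source
    print(f"{RL:4.0f} ohm load: {p_load:5.2f} W delivered "
          f"({p_load / p_total:.0%} of total)")
# The matched 8-ohm load receives the most power - exactly 50% of the total.
```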
In modern electronics, the integrated circuits of an amplifier have at
their disposal hundreds to thousands of active transistor elements which
can with appropriate creative use of feedback make the performance of the
amplifier almost independent of the impedances of the input and output
devices within a reasonable range.
On the input side, the amplifier can be made to have almost arbitrarily
high input impedance, so in practice a microphone sees an impedance consid-
erably higher than its own impedance. Although that does not optimize power
transfer from the microphone, that is no longer a big issue since the amplifier
can take the input voltage and convert it to a larger voltage - the term currently
used is “bridging” to a larger image of the input voltage pattern.
On the output side, a loudspeaker may still have a nominal impedance
of something like 8 ohms, which formerly would have required having an
amplifier output stage carefully matched to 8 ohms. But now with the active
output circuitry of audio amplifiers, the effective output impedance may be
very low. The active circuitry controls the output voltage to the speaker so
that the appropriate power is delivered.
Matching Amplifier to Loudspeaker
To emphasize the oversimplification involved in the above model, it
should be noted that the loudspeaker is not a simple resistor - it contains a
coil or coils with significant inductance, and is typically composed of two
or three speakers with a crossover network that has capacitance and in-
ductance. So the impedance of the loudspeaker will inevitably vary with
frequency. The only present-day amplifiers that would have a characteristic
output impedance like that shown would be those designed with
“valve” or “vacuum tube” output stages.
Note that it is safer in terms of total power to go to higher impedance
speakers (series speakers), but more typical practice is to put speakers in paral-
lel, lowering the impedance. Note in the table above that lowering the imped-
ance below the output impedance of the amplifier not only reduces the output
power but increases the internally dissipated power in the amplifier.
This diagram shows the relationships used to obtain the power values
in the table above. Note that it assumes a resistive nature of both the loud-
speaker impedance and the internal impedance, neither of which is strictly
true.
Amplifier Distortion
Harmonic Distortion
In the diagram, the input is a single frequency (pure sine wave), but
the output waveform is clipped by the amplifier. The result is that harmonic
frequencies not present in the original signal are produced at the output
(harmonic distortion). This harmonic distortion contains only odd harmon-
ics if the clipping is symmetrical. For example, a geometrical square wave
has only odd harmonics, and as a signal is clipped, it approaches a square
wave rather than a sine wave.
The frequency spectrum at right is that measured at the output of a
particular amplifier driven above its rated power. The spectrum has a larger
amount of odd harmonic than even harmonic output, but the fact that even
harmonics are present suggests that the distortion was not symmetrical with
respect to the waveform.
An amplifier can be said to be linear if the output voltage is strictly
proportional to the input signal. Any nonlinearity, such as that arising from
the semiconductor devices themselves, will give rise to harmonic distortion.
Such defects in the performance of the devices can be minimized by using
negative feedback in the circuit so long as the output is not overdriven to
the point of clipping.
Plots of frequency spectra such as those illustrated here can be impor-
tant diagnostic and research tools. Converting a signal from a plot as a func-
tion of time to a plot as a function of frequency is called Fourier analysis,
and a common display is the Fast Fourier Transform or FFT of the signal.
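A small numerical experiment along the lines described, assuming NumPy is available: clip a pure sine symmetrically and inspect its FFT. Only odd harmonics appear, as the text predicts.

```python
import numpy as np

sr = 8000  # sample rate; using sr samples gives 1 Hz frequency bins
t = np.arange(sr) / sr
clipped = np.clip(np.sin(2 * np.pi * 100 * t), -0.5, 0.5)  # symmetric clipping

spectrum = np.abs(np.fft.rfft(clipped)) / sr
for harmonic in range(1, 8):
    print(f"harmonic {harmonic}: {spectrum[harmonic * 100]:.4f}")
# Harmonics 3, 5, 7 carry energy; the even harmonics are essentially zero.
```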
Intermodulation Distortion
Dynamic Loudspeaker Principle
An amplifier must make the electrical image of the sound large enough to
drive the coils of a loudspeaker. Having a “high fidelity” amplifier means
that it makes the image larger without changing any of its properties. Any changes
would be perceived as distortions of the sound since the human ear is amaz-
ingly sensitive to such changes. Once the amplifier has made the electrical
image large enough, it applies it to the voice coils of the loudspeaker, mak-
ing them vibrate with a pattern that follows the variations of the original
signal. The voice coil is attached to and drives the cone of the loudspeaker,
which in turn drives the air. This action on the air produces sound that
more-or-less reproduces the sound pressure variations of the original signal.
Loudspeaker Basics
The sound from the back of the speaker cone will tend to cancel the sound
from the front, especially for low frequencies. The free cone speaker is very
inefficient at producing sound wavelengths longer than the diameter of the
speaker.
Loudspeaker Details
Types of Enclosures
The nature of the enclosure can affect the efficiency and the directional-
ity of a loudspeaker. The use of horn loudspeakers can provide higher effi-
ciency and more directionality, but in extremes can reduce the fidelity of the
sound. Line array enclosures can provide some directionality.
The term “infinite baffle” is often encountered in discussions of loudspeaker
installations. It visualizes a loudspeaker mounted in an infinite plane with
unlimited volume behind it, but in practical use may refer to a loudspeaker
mounted in the surface of a flat wall with considerable volume of air behind
it. Because of the elastic properties of the loudspeaker suspension, it will
still exhibit its natural free-cone resonance, but will be free of the diffraction
effects observed with a small box speaker, and essentially free of the effects
of the compression of the air behind the loudspeaker cone.
Use of Multiple Drivers in Loudspeakers
Horn Loudspeakers
Line Array or Column Loudspeakers
Example of a line array loudspeaker collection which is ceiling-mounted in
an auditorium. It points generally at the audience and spreads the sound
perpendicular to the array.
Directionality of Loudspeakers
Monaural and Stereo Signals
Surround Sound
Dolby Pro-Logic
AC-3 Digital Surround Sound
AC-3 dynamically allocates
signal-handling capability to the channel with the greatest current demand.
AC-3 was originally developed for HDTV. AC stands for Audio Coding
and 3 is the generation of the design. The designation “Dolby digital” is
sometimes used as a name for this system.
Dolby Signal Processing
Dolby Stereo is the name given to the four-channel surround sound de-
veloped by Dolby Laboratories and introduced into movie theaters in the
70’s. It employed a matrix encoding scheme called Dolby Surround which
recorded four channels of information on two channels. The two channels
are decoded into L, R, Center and Surround upon playback. The center
channel is recorded identically on the left and right channels.
Riggs, Michael, “Digital Surround Comes Home,” Stereo Review, May
1995, p. 62.
Ranada, David, “Inside Dolby Digital,” Stereo Review 61, Oct. 1996, pp. 81-84.
Digital Surround Sound
Digital surround refers to surround sound systems which employ dis-
crete digital recordings of five channels of sound information. Digital sur-
round sound has been introduced into movie theaters in a form called
Dolby Stereo Digital. At the heart of Dolby Stereo Digital is an encoding
scheme called AC-3. The AC-3 based systems are now often referred to as
just “Dolby digital” in the consumer market. It’s hard to tell which is chang-
ing faster: the technology or the terminology.
Dolby Stereo Digital uses a digital data stream running at 320 kilobits
per second. The HDTV and laserdisc version of Dolby Surround AC-3 Digi-
tal runs at 384 kilobits per second and dynamically allocates the bits to
the channel with the most demanding signal. Use is made of perceptual
encoding to decide which parts of the audio signal would not be heard and
therefore can be eliminated. The system provides a slight delay in the center
channel sound to achieve a more realistic experience of the sounds arriving
at the listener’s location from the other speakers.
Riggs, Michael, “Digital Surround Comes Home,” Stereo Review, May
1995, p. 62.
Ranada, David, “Inside Dolby Digital,” Stereo Review 61, Oct. 1996, pp. 81-84.
Chapter 5
Simplified model of sound system
Simplified Model: Sound Reinforcement
The loudspeaker provides more sound to the listener than would other-
wise have been received, but it also produces sound at the location of the
microphone. This feedback to the microphone limits the amount of ampli-
fication, which can be used. Control of the feedback generally is the deter-
mining factor for the potential acoustic gain that can be achieved by a sound
reinforcement system.
The microphone creates an electrical image of the sound, which is ampli-
fied and used to drive a loudspeaker.
Inverse Square Law Assumption
Omnidirectional Assumption
Numerical Example:
Need for Amplification
By the inverse square law, a doubling of distance will drop the sound
intensity to 1/4, corresponding to a drop of 6 decibels. Note that in the table
below, the distance is doubled in each successive step.
Level (dB)   Distance (ft)
80           2
74           4
68           8
62           16
56           32
50           64
44           128

Starting with 80 dB at two feet and using the fact that every doubling of
distance will drop the level by 6 dB, we learn that a listener at 128 ft would
receive a sound intensity of only 44 dB!
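The table follows directly from the inverse square law: every doubling of distance quarters the intensity, a drop of 6 dB. A sketch reproducing it:

```python
import math

def level_at(distance_ft, ref_level_db=80.0, ref_distance_ft=2.0):
    """Free-field level via the inverse square law: a -20*log10(d/d_ref) falloff."""
    return ref_level_db - 20.0 * math.log10(distance_ft / ref_distance_ft)

for d in (2, 4, 8, 16, 32, 64, 128):
    print(f"{d:4d} ft: {level_at(d):.0f} dB")
# 128 ft -> 44 dB, matching the table above.
```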
Maximum Amplification Condition
The Limitation of Feedback
“Ringing the System”
When the gain on a sound amplification system is turned too high, the output from the loudspeaker changes to an unpleasant, loud, usually high-pitched sound. This is the result of too much feedback: instead of reproducing the sound being amplified, the system produces a single pitch at the frequency which is amplified the most by the sound system/room combination.
Note that an ideal sound system responds equally to all frequencies. This not only gives a high-fidelity reproduction of the sound, it also gives a higher potential acoustic gain from the amplifier system: a system with an ideally flat response is still well short of the feedback level at the gain where a system with peaks in its response begins to ring. The procedure of filtering the frequency response to approach the ideal flat response is called equalization.
Increasing Potential Acoustic Gain
The first practical step usually taken when a portable sound system rings from feedback is to move the loudspeaker farther from the microphone. The amount of anticipated improvement in the potential acoustic gain can be modeled for the simplified amplification system. In a real auditorium you cannot achieve as much improvement as the modeled amount, because of reverberation. In general, it does no good to move the loudspeaker out past the critical distance, at which the reverberant sound field contributes as much to the feedback as the direct sound field. As a practical measure, this critical distance for the speakers must be determined by experiment, moving the speakers farther out until you can no longer increase the gain before feedback.
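Under the simplified model, the anticipated improvement is easy to state: if the feedback follows the inverse square law, moving the loudspeaker from a distance d1 to a distance d2 from the microphone lowers the feedback level by 20*log10(d2/d1) dB, so the gain can be raised by that same amount before ringing resumes. A minimal sketch, with made-up distances:

    import math

    def gain_improvement_db(d_old, d_new):
        # Extra usable gain (dB) from moving the loudspeaker farther from the
        # microphone, assuming inverse-square feedback (the simplified model;
        # reverberation in a real auditorium will reduce this figure).
        return 20.0 * math.log10(d_new / d_old)

    print(gain_improvement_db(4.0, 16.0))   # ~12 dB for a move from 4 ft to 16 ft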
Critical Distance for Speaker Placement
You can get more amplification of sound without the annoying ringing by moving the speakers farther from the microphone. However, this has practical limits.
The direct sound field from a point source in an auditorium drops off according to the inverse square law. To the extent that the speaker can be considered a point source, the feedback from that speaker to the microphone decreases as the speaker is moved farther away.
The reverberant sound field, on the other hand, more or less fills the entire room, and the contribution of the loudspeaker to the reverberation does not decrease as you move the speaker farther from the microphone. The critical distance is defined as the distance at which the reverberant sound is equal in intensity to the direct sound. At distances greater than the critical distance the reverberant sound is dominant, and you get no further increase in potential acoustic gain by moving the speakers farther out.
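That definition can be turned into a quick numerical estimate. In the sketch below, the direct field falls off by the inverse square law while the reverberant field is taken as uniform through the room; the crossover point is the critical distance. Both level figures are made-up illustration values, not measurements from the text:

    L_direct_1m = 90.0   # direct level at 1 m from the speaker (dB) - assumed
    L_reverb = 60.0      # reverberant field level, roughly uniform (dB) - assumed

    # Direct field: L(d) = L_direct_1m - 20*log10(d). The critical distance
    # d_c is where the direct level has fallen to the reverberant level:
    #     L_direct_1m - 20*log10(d_c) = L_reverb
    d_c = 10.0 ** ((L_direct_1m - L_reverb) / 20.0)
    print(f"critical distance ~ {d_c:.1f} m")   # ~31.6 m for these values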
Move Loudspeaker Closer to Listener
Moving the loudspeaker closer to the listener without changing its distance from the microphone will increase the available amplification. In most practical applications this means adding extra speakers which are closer to the listener. However, this creates sound image problems, and the use of a digital delay for the signal to those extra speakers is recommended.
One of the obvious ways to get more amplified sound to the listener is to move the loudspeaker closer to the listener. The amount of anticipated improvement in the potential acoustic gain can be modeled for the simplified amplification system. As a practical matter in larger auditoriums, this means using additional speakers closer to the listener to add to the sound from a main speaker cluster. A problem arises because the signal from the amplifier to the distant speaker travels at essentially the speed of light, whereas the direct sound from the source travels at the speed of sound. The sound from the nearby speaker therefore reaches the listener before the sound from the visible source at the front of the auditorium, and since your ear locates a sound partly by time of arrival, you hear it coming from the speaker. The location conflict between your ears and your eyes can be disconcerting. This is typically overcome by using a digital delay for the sound signal going to the distant speakers.
Use of Digital Delay
To maintain the perception that the sound is coming from the front of the auditorium, it is necessary to use digital delay for the speakers under balconies, etc., where they are much closer to the listener than the main speakers. The signal from the microphone to the speakers travels at essentially the speed of light, so without a delay the sound would arrive at the listener first from the closest speaker. Precedence has a strong localizing influence, and all the sound would seem to be coming from the nearby speaker. With appropriate delays, the sound for all listeners seems to come from the main speaker.
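The required delay is simply the difference in acoustic path length divided by the speed of sound (about 343 m/s in room-temperature air), often plus a few milliseconds so the main cluster still arrives first and precedence keeps the image at the front. A minimal sketch with made-up distances:

    SPEED_OF_SOUND = 343.0   # m/s in air at about 20 C

    d_main = 40.0    # main speaker cluster to an under-balcony listener (m) - assumed
    d_near = 5.0     # under-balcony fill speaker to that listener (m) - assumed

    # The electrical signal reaches both speakers essentially instantly, so the
    # nearby speaker must wait out the extra acoustic travel time from the main
    # cluster before it plays.
    delay_ms = (d_main - d_near) / SPEED_OF_SOUND * 1000.0
    print(f"delay ~ {delay_ms:.0f} ms")   # ~102 ms for these distances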
Equalization
One of the most powerful tools for increasing the potential acoustic gain of a sound amplification system is the production of a “mirror image” filter to level the frequency response of the system. Equalizing an auditorium in this way also improves the fidelity of the sound: leveling out the frequency response of the sound system removes the peaks which would otherwise ring the system before sufficient gain is achieved.
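As an illustration of the “mirror image” idea applied to a single peak, the sketch below uses scipy’s iirnotch to place a narrow dip at an assumed 2 kHz response peak. Equalizing a real room would involve measuring the response and applying many such corrections, typically with a multi-band equalizer:

    import numpy as np
    from scipy import signal

    fs = 48000          # sample rate (Hz)
    f_peak = 2000.0     # frequency of the response peak to cut (Hz) - assumed
    Q = 10.0            # quality factor: higher Q means a narrower notch

    # Design a notch (band-reject) filter centered on the peak frequency -
    # the "mirror image" of a single peak in the system response.
    b, a = signal.iirnotch(f_peak, Q, fs)

    # Apply it to an audio buffer (white noise stands in for program material).
    x = np.random.default_rng(0).standard_normal(fs)
    y = signal.lfilter(b, a, x)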
White Noise
White noise is defined as sound with equal power per Hz of bandwidth; it resembles the hissing sound of steam escaping from an overheated steam boiler. The ear is aware of a lot of high frequency sound in white noise, since the ear is more sensitive to high frequencies. Because each successive octave of frequency has twice as many Hz in its range, the power in white noise increases by a factor of two for each octave band. Twice the power corresponds to a 3 decibel increase, so white noise is said to increase 3 dB per octave in power. Representing the differences between white and pink noise in dB, rather than in linear intensity, makes the difference seem less drastic.
Pink Noise
Pink noise, rather than white noise, is often the choice for testing and
equalizing rooms and auditoriums. Broad-band noise signals are desirable
for such testing.
Whereas white noise is defined as sound with equal power per Hz in fre-
quency, pink noise is filtered to give equal power per octave or equal power
per 1/3 octave. Since the number of Hz in each successive octave increases by a factor of two, the power of pink noise per Hz of bandwidth decreases by a factor of two, or 3 decibels, per octave.
Since pink noise has relatively more bass than white noise, it sounds
more like the roar of a waterfall than like the higher hissing sound of white
noise.
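The 1/f character of pink noise can be made concrete with a short sketch. Below, white noise is shaped into pink noise by scaling each spectral amplitude by 1/sqrt(f), so that power per Hz falls as 1/f, and the power in successive octave bands is then compared. This FFT-based construction is just one common way to make pink noise:

    import numpy as np

    fs = 44100                  # sample rate (Hz)
    n = 2 ** 18                 # number of samples

    rng = np.random.default_rng(0)
    white = rng.standard_normal(n)

    # Scale each spectral amplitude by 1/sqrt(f): power per Hz then falls as
    # 1/f, i.e. 3 dB per octave, giving equal power per octave (pink noise).
    spectrum = np.fft.rfft(white)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    scale = np.ones_like(freqs)
    scale[1:] = 1.0 / np.sqrt(freqs[1:])    # leave the DC bin alone
    pink = np.fft.irfft(spectrum * scale, n)

    def octave_band_powers(x):
        # Total spectral power in octave bands from 20 Hz to 20.48 kHz.
        spec = np.abs(np.fft.rfft(x)) ** 2
        f = np.fft.rfftfreq(len(x), d=1.0 / fs)
        edges = 20.0 * 2.0 ** np.arange(11)
        return [spec[(f >= lo) & (f < hi)].sum()
                for lo, hi in zip(edges[:-1], edges[1:])]

    for name, sig in [("white", white), ("pink", pink)]:
        p = octave_band_powers(sig)
        step_db = 10 * np.diff(np.log10(p))  # change per octave, in dB
        print(name, np.round(step_db, 1))    # ~ +3 dB/octave white, ~ 0 pink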
When pink noise is chosen for equalizing auditoriums, real-time analyzers can be set up so that they display a straight horizontal line when they receive pink noise. With pink noise input to the sound system, the response curve can then be adjusted until the sound measured in the auditorium by the real-time analyzer is again pink noise. This provides optimum fidelity and increases the potential acoustic gain of the sound amplification system.
With pink noise, the intensity is filtered to drop 30 dB over the roughly ten-octave audible frequency range.
The difference between pink noise and white noise looks exaggerated when the vertical axis is made linear in intensity, because the ear is definitely not linear in its response to sound. The sound intensity of pink noise drops by a factor of 1000 over the audible frequency range, and that sounds very drastic. That drop should be considered in light of the “rule of thumb” for loudness perception: dropping the sound intensity by a factor of 10, i.e. by 10 dB, results in a sound that is perceived to be half as loud to the human ear. If each 10 dB of drop results in a sound half as loud, then a 30 dB drop will result in a sound perceived as 1/8 as loud - significantly less, to be sure, but not as drastic as the factor of 1/1000 in intensity would imply.
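A quick arithmetic check of that comparison, using only the figures given above:

    # A 30 dB drop expressed as an intensity ratio:
    intensity_ratio = 10 ** (-30 / 10)     # = 0.001, i.e. a factor of 1/1000

    # Rule of thumb: each 10 dB drop sounds about half as loud, so a 30 dB
    # drop sounds about (1/2)**3 as loud:
    loudness_ratio = 0.5 ** (30 / 10)      # = 0.125, i.e. 1/8 as loud

    print(intensity_ratio, loudness_ratio)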
Glossary:
Inverse Square Law:
The sound intensity from a point source of sound will obey the inverse square law if there are no reflections or reverberation.
Reverberation:
It is the collection of reflected sounds from the surfaces in an enclosure like an auditorium.
Sound Synthesis:
Periodic electric signals can be converted into sound by amplifying them and driving a loudspeaker with them.
Faraday’s Law:
Any change in the magnetic environment of a coil of wire will cause a voltage (emf) to be “induced” in the coil.
Atmospheric Pressure
The surface of the earth is at the bottom of an atmospheric sea. The standard atmospheric pressure is measured in various units: 1 atmosphere = 101.325 kPa = 760 mm of mercury = 14.7 lb/in^2.
Threshold of Hearing
Sound level measurements in decibels are generally referenced to a standard threshold of hearing at 1000 Hz for the human ear, which can be stated in terms of sound intensity as I0 = 10^-12 W/m^2 (assigned the value 0 decibels).
Sensitivity of Human Ear
The human ear can respond to minute pressure variations in the air if
they are in the audible frequency range, roughly 20 Hz - 20 kHz.
Audible Sound
Usually “sound” is used to mean sound which can be perceived by the
human ear, i.e., “sound” refers to audible sound unless otherwise classified.
Transverse Waves
For transverse waves the displacement of the medium is perpendicular
to the direction of propagation of the wave.
Longitudinal Waves
In longitudinal waves the displacement of the medium is parallel to the
propagation of the wave. A wave in a “slinky” is a good visualization.
Fourier Analysis and Synthesis
The mathematician Fourier proved that any continuous function could
be produced as an infinite sum of sine and cosine waves.
Fast Fourier Transforms
Fourier analysis of a periodic function refers to the extraction of the series of sines and cosines which, when superimposed, will reproduce the function. The fast Fourier transform (FFT) is an efficient algorithm for carrying out this analysis on sampled data.
Musical Intervals
The term musical interval refers to a step up or down in pitch which is
specified by the ratio of the frequencies involved.
Period: the time required to complete a full cycle, T, in seconds/cycle.
Frequency: the number of cycles per second, f, in 1/seconds or hertz (Hz); f = 1/T.
Amplitude: the maximum displacement from equilibrium, A.
FM Stereo Broadcast Band
The bandwidth assigned to each FM station is sufficiently wide to broad-
cast high-fidelity, stereo signals.
About the author
…nected network by RDS alternative frequency. During that period he earned a master’s degree from Georgia State University in 2003. After the master’s degree he became technical director for Melody TV, where he produced some famously crystal-clear screens such as Melody Drama, Melody Tuns, Melody Aflam, Melody Classic, and Melody Trix. He then moved to Modern Sport as CTO, where he launched the first free-to-air sports channel, the number one channel in Egypt in 2007. During that time he started his own business, Enoshmink Technology, in 2005; it remained a very small company until 2007, when GN4me began a promising project that rolled out 400 cinema screens in four years. That project was in Enoshmink Tech’s hands through Dr. Ibrahim Elnoshokaty’s efforts in the field: customization and R&D of sound isolation materials, real-time equalization of the auditoriums that made the movie-watching experience good for the audience, the first rear projection in Egypt (Deep Mall, Alex), and the first auditorium at El-Najaf El-Ashraf, Iraq. His company also works on sound innovations such as sound-treatment painting, which was approved for the Elmasa Hotel in the Administrative Capital, where it works to reduce reverb in the dome hall. A second product, Enocrso, is a monitoring and management system for public address systems; Sharedin is an online plug-and-play radio and social media posting platform. He has also received a number of awards.
During his PhD he obtained the following certificates (2006):
He holds a Certificate of Excellence from Georgia State University in the U.S.A.
He holds a Certificate of Distinction from Georgia State University in the U.S.A.
He holds a Bachelor of Sound Engineering (1999).
HONOURS AND ACTIVITIES
Member of the Acoustical Society of America.
Member of the Acoustical Society of Egypt.
Member of the International Society of Physics.
Appreciation Certificates:
Appreciation Certificate from the Housing & Building National Research Center, 2013.
Appreciation Certificate from Cofermetal, 2012.
Appreciation Certificate from Armstrong, 2011.
Appreciation Certificate from Palestine Radio, 2009.
Appreciation Certificate from Ecreso (FM transmitter manufacturer), 2008.
Appreciation Certificate from the Egyptian Radio and Television Union, 2007.
Appreciation Certificate from Nugoom F.M, 2007.
Appreciation Certificate from Spin F.M, 2007.
Appreciation Certificate from Sout Elmadena, 2007.
Appreciation Certificate from Modern Sport, 2005.
Appreciation Certificate from Melody, 2002.