Chapter 2
Computer System Organization
Prof. Yazeed Al-Sbou
Outlines
• 2.1 Processors
– CPU Organization
– Instruction execution
– Instruction Level Parallelism
– Processor level parallelism
• 2.2 Primary Memory
• 2.3 Secondary Memory
• 2.4 Input/Output
Basic Computer Organization
• Control Unit – contains circuits that direct and coordinate the proper sequence of operations; it interprets each instruction and applies the proper signals to the ALU and registers.
• Registers – high-speed temporary data storage areas that support execution activities. Advantages: shorter references within the processor, faster access, and ease of programming.
• ALU – performs the arithmetic and logical execution; it has no internal storage.
Inside the Computer
power supply
I/O devices
fan
motherboard
The motherboard has three parts
– I/O connections
– Memory
– Processors
The processor
• A typical computer is an interconnected system of processors, memories, and I/O devices. These components are connected via buses, which may be internal or external.
• Central Processing Unit (CPU) – the most important processor and the 'brain' of the computer. It executes programs/instructions stored in memory by fetching, examining, and executing them.
• It consists of an Arithmetic & Logic Unit (ALU), a Control Unit (CU), and small fast memories (registers):
1. ALU – performs logic and arithmetic operations, e.g., add, subtract, shift, rotate
2. Control Unit – controls fetching of instructions from memory, determines their type (decodes them), and directs data to the right place at the right time, choosing the circuits and data paths to use
3. Registers – small, high-speed memory very close to the CPU
• General purpose – store data, control information, addresses
• Special purpose – only data or only addresses, e.g. PC, IR etc.
• These registers like Program Counter (PC), Instruction Register
(IR), Memory Address Register (MAR), Memory Data or
Buffer Register (MDR, MBR)
• Accumulator – main register that holds the data the CPU needs to process and the results of operations.
• PC – holds the memory address of the next instruction to be executed
• IR – holds the instruction currently being executed
• MAR – holds the address of the memory location referenced
• MDR – holds the data read from or to be written to memory
CPU Organization
• The CPU organization uses the von Neumann
machine model to perform arithmetic and logic
operations using what is called the Data Path. This
consists of:
– An Arithmetic Logic Unit (ALU): Performs additions,
subtractions, multiplications, divisions, comparisons, bit
shifting, … etc
– Registers (typically 1 to 32)
– Several buses connecting the pieces.
– Basically, it consists of lots of gates
Data Path of the von Neumann Machine (Addition Example)
• The CPU (Data Path) performs additions, subtractions,
multiplications, divisions, comparisons, and bit shifting on
its inputs. The output can be then stored back into a register
which will be then stored in memory.
• Instructions may be categorized into one of the following:
1. Register-memory instructions – load data from memory into
registers to be used as inputs for the ALU, and allow the
output data to be stored back into memory.
2. Register-register instructions – do some operations on data in
registers and then store the result back in one of the used
registers.
• Data path – defines flow of data from memory to registers to
ALU for computation and back to registers and/or memory.
• Data path cycle – the process of running data through the
data path (flow) described above; its speed directly affects the
speed of the machine. The faster the cycle is, the faster the
machine runs.
Instruction Execution Steps
• An instruction is a word defining a basic machine operation
and its operands. As an example, for a register machine:
add r0, r1, r2.
• Instruction execution follows the fetch-decode-execute (FDE) cycle:
the CPU executes instructions in a series of small steps, which is
central to the operation of all computers. These steps are (see the
sketch after this list):
1. Fetch the next instruction from memory into the instruction
register
2. Change the program counter to point to the next (i.e., following)
instruction
3. Determine the type of instruction just fetched
4. If the instruction uses a word in memory, determine its location
5. Execute the instruction
6. Go back to step 1 to begin executing the following instruction
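As an illustration only, here is a minimal sketch (in Python) of the fetch-decode-execute loop for a toy accumulator machine. The instruction names, the accumulator-style ISA, and the memory layout are invented for this example; each pass through the loop corresponds to steps 1-6 above.

# A toy fetch-decode-execute loop (illustrative sketch, not a real ISA).
memory = [
    ("LOAD", 6),   # acc = memory[6]
    ("ADD", 7),    # acc = acc + memory[7]
    ("STORE", 8),  # memory[8] = acc
    ("HALT", 0),
    0, 0,          # padding (addresses 4-5)
    5,             # address 6: first operand
    7,             # address 7: second operand
    0,             # address 8: result goes here
]

pc = 0          # program counter
acc = 0         # accumulator
running = True

while running:
    ir = memory[pc]          # 1. fetch the instruction into the IR
    pc += 1                  # 2. advance the PC to the following instruction
    opcode, addr = ir        # 3. decode: determine the instruction type
    if opcode == "LOAD":     # 4-5. locate the operand (if any) and execute
        acc = memory[addr]
    elif opcode == "ADD":
        acc += memory[addr]
    elif opcode == "STORE":
        memory[addr] = acc
    elif opcode == "HALT":
        running = False
    # 6. loop back to fetch the next instruction

print(memory[8])  # prints 12 (5 + 7)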
Design Principles for Modern
Computers
• This is a set of principles which are called “RISC
Design Principles”.
1. All instructions are directly executed by hardware
– Not interpreted by microinstructions
2. Maximize rate at which instructions are issued
– Minimize execution time
– Exploit parallelism for better performance
3. Instructions should be easy to decode
4. Only loads and stores should reference memory
5. Provide plenty of registers
Parallelism
• It is the process of doing two or more things
at once to get better performance.
– Instruction-Level Parallelism:
• Pipelining: Technique in which the execution of
several instructions is overlapped.
• Superscalar architectures:
– Using parallel pipelines
– Processor-level Parallelism:
• Array computers
• Multiprocessors
• Multicomputers
Instruction-Level Parallelism
• Fetching instructions from memory is a bottleneck
for the execution speed.
• To solve this, the instructions are stored in a set of
registers called Prefetch Buffers.
• Therefore, when an instruction is needed, it could
easily be taken from the Prefetch buffer rather
than waiting for a memory read to complete.
• This divides instruction execution into two steps: fetching and
execution.
Pipelining
• It is a type of Instruction-Level Parallelism.
• Using pipelining, the process of executing an instruction is
divided into many parts or stages.
• Each stage is handled by a dedicated piece of hardware, so all
of these stages can run in parallel.
• See the following example.
Cont. Pipelining
• This is a five-stage pipeline
• Using pipelining, each instruction is broken into several
stages.
• Stages can operate concurrently PROVIDED WE HAVE
SEPARATE RESOURCES FOR EACH STAGE!
• Note: execution time for a single instruction is NOT
improved. Throughput of several instructions is improved.
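As a rough back-of-the-envelope illustration (Python, with assumed numbers): with a k-stage pipeline and an ideal one stage per clock cycle, n instructions finish in about n + k - 1 cycles instead of n * k, so per-instruction latency is unchanged while throughput improves.

def cycles(n_instructions, stages, pipelined):
    """Idealized cycle count: no hazards, one cycle per stage."""
    if pipelined:
        return n_instructions + stages - 1   # fill the pipe once, then 1 per cycle
    return n_instructions * stages           # each instruction runs start to finish

n, k = 1000, 5
print(cycles(n, k, pipelined=False))  # 5000 cycles
print(cycles(n, k, pipelined=True))   # 1004 cycles: ~5x throughput, same per-instruction latency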
Pipelining Benefits
• Performance: pipelining allows a trade-off between:
– Speed or bandwidth (Millions of Instructions per
Second, MIPS)
– Latency/Execution time
• A completely hardware mechanism
• All modern machines are pipelined
– This was the key technique for advancing
performance in the 80’s
– In the 90’s the move was to multiple pipelines
• Beware, no benefit is totally free/good
– Problem: Watch for hazards!!!
Superscalar Architectures
• Dual five-stage pipelines with a common instruction fetch
unit:
– Each pipeline with its own hardware for each stage,
providing duplicate decoding and duplicate ALU's.
– The two instructions must not conflict over resource
usage (registers)
– The main pipeline is called the u pipeline and the second
one the v pipeline
Superscalar Architectures
• The term also refers to a single pipeline with multiple functional
units at one or more stages of the pipeline.
• What are the benefits?
A superscalar processor with five functional units.
• Today’s definition of Superscalar architecture: Processor that
can issue multiple instructions per clock cycle (often four or
six).
Processor-Level Parallelism
• Many applications require fast computers; high-speed computers
have become a must for solving modern problems.
• Even though CPUs keep getting faster and faster, they will
eventually run into limits imposed by the speed of light.
• Instruction-level parallelism (pipelining and superscalar design)
provides some help, but much higher speeds can only be achieved
by designing computers with multiple CPUs.
• Processor-Level Parallelism refers to a computer system
with multiple processors:
1. Array computers
2. Multiprocessors
3. Multicomputers
Array Computers
– An array processor has a large number of identical processors
that perform the same sequence of instructions on different sets of
data. Two methods have been used to execute large scientific
problems:
1. Array processors: have one control unit that directs an array of
identical processors to execute the same instruction at the same
time, but on different data.
SIMD Processor: Single Instruction, Multiple Data.
An array processor of the ILLIAC IV type.
Cont… array computers
2. Vector Processor:
• It resembles an array processor in that it executes a
sequence of operations on data elements efficiently, but
unlike an array processor, all addition operations are
performed in a single, heavily pipelined adder.
• Both array processors and vector processors work on
arrays of data. The result is another array in the array
processor and another vector in the vector processor.
Multiprocessors
• Multiprocessor: is a system which consists of
multiple CPUs sharing a common memory.
Therefore, these CPUs must be controlled to avoid
conflict. Different implementations are possible.
a) A single-bus multiprocessor: multiple independent CPU's
sharing a common memory and other resources. This type
will face the problem of conflicts if CPUs try to access the
memory at the same time because they are using same bus
and the same resource.
b) To reduce the contention between the processors and
improve performance, a multiprocessor with local
memories may be used; this reduces the number of
conflicts because accesses to a local memory do not
need to use the main bus
Processor-Level Parallelism
a- Single-bus multiprocessor
b- Multiprocessor with local memories
Outline
• 2.1 Processors
• 2.2 Primary Memory
– Bits
– Memory Address
– Byte Ordering
– Error-correcting codes
– Cache memory
– Packaging
• 2.3 Secondary Memory
• 2.4 Input/Output
Memory
• In the von Neumann architecture, memory is the part where
programs and data are stored; without it, no stored
programs or data are available.
• The main memory is built from multiple DRAM
chips
– Dynamic Random Access Memory: using the
DRAM, individual storage locations in the
main memory may be accessed in any order and
at very high speed.
• Caches are smaller but faster memories that are
used for performance
– Caches are built out of SRAM technology
• Static Random Access Memory
• More expensive than DRAM (and less dense)
Memory Subsystem Organization
• Types of memory
– Read Only Memory (ROM): ROM chips are designed
for applications in which data is only read; they retain
their data even when power to the chip is turned off.
• Masked ROM
• Programmable ROM (PROM)
• Erasable PROM (EPROM)
– Random Access Memory (RAM), or Read/Write
memory: used to store data that changes.
• Dynamic RAM
• Static RAM: once written, contents stay valid
– Cache memory
Processor-Memory Interconnections
Memory Addresses
• Memory is viewed as an array which consists of a
number of cells (or memory locations) to store
information.
• Each cell has an address, which is an index into the
array, where the programs can refer to (e.g., memory
of n cells will have addresses of (0 to n-1)). By
default, adjacent cells have successive addresses.
• All cells have the same number of bits (e.g., if a cell
consists of k bits, then it can hold any of the 2^k
possible bit combinations).
As an example, the following Figures show three
ways for organizing a 96-bit memory.
• Memory addresses are expressed in binary, which means that if an
address has n bits, then the maximum number of addressable cells
is 2^n. Therefore, how many bits do we need to express each
address in the previous memory organization figure? (A small
sketch answering this appears after these bullets.)
• The number of bits in the address determines the
maximum number of addressable cells in the
memory.
• The address is independent of the number of bits
per cell.
• Therefore, the cell is the smallest addressable unit; it has been
standardized at a length of 8 bits (a byte).
• Bytes are grouped in words, e.g.,
– 32-bit word has 4 bytes/word
– 64-bit word has 8 bytes/word.
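The relationship between the number of cells and the number of address bits can be shown with a small Python sketch; the 12x8, 8x12, and 6x16 layouts used below are assumed to be the three 96-bit organizations in the figure.

import math

def address_bits(num_cells):
    """Smallest number of address bits that can address num_cells cells."""
    return math.ceil(math.log2(num_cells))

# Assumed organizations of the 96-bit memory from the figure:
for cells, bits_per_cell in [(12, 8), (8, 12), (6, 16)]:
    print(cells, "cells of", bits_per_cell, "bits ->", address_bits(cells), "address bits")

# Bytes per word for common word sizes:
print(32 // 8, "bytes in a 32-bit word;", 64 // 8, "bytes in a 64-bit word")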
Error Correcting Codes
• Computer memories can make errors. These errors in memory
system may occur due to:
– Garbled bits when fetching or storing data
– Hard (permanent) errors: like manufacturing defects
– Soft (transient) errors: random, non-destructive, e.g., power supply
problems.
• Therefore, some memories use error-correcting and/or error-
detecting codes to detect and correct these errors.
• If these methods are used, additional bits are added for each
memory byte or word to ensure integrity of data. Then, each word
will contain:
– m-bit data
– r redundant or check bits
– Consequently, each word will have n = m + r bits (referred as
Codeword).
• To check whether errors have occurred, error
detection/correction can be done using the Hamming distance.
• The Hamming distance is the number of 1s in the
EXCLUSIVE OR (XOR) of two codewords. If the Hamming
distance is d, then d single-bit errors are required to convert
one codeword into the other.
• As an example, the codewords 11110001 and 00110000 have a
Hamming distance of 3 (their XOR is 11000001, which contains
three 1s). Therefore, 3 single-bit changes are needed to convert
one word into the other (a one-line check appears below).
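A minimal Python illustration of this definition, using the two codewords from the example:

def hamming_distance(a, b):
    """Number of bit positions in which two equal-length codewords differ."""
    return bin(a ^ b).count("1")   # XOR, then count the 1 bits

print(hamming_distance(0b11110001, 0b00110000))  # 3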
• Therefore, in order to check the words for error (error-
detecting), a single-bit called parity bit is added to the data.
• This bit is chosen so that the number of 1 bits is even (or
odd). Now if any bit is changed, the number of 1 bits is
odd (or even).
• Also, the Hamming algorithm can be used to construct
error-correcting codes for any word size.
• The number of parity bits needed depends on the word size,
as shown in the following table.
• From this table, as the size of the word increases, the
percentage overhead decreases.
• Using the Hamming algorithm for constructing error-correcting
codes, r parity bits are added to an m-bit word, forming a new
word of (m + r) bits. All bits whose position number is a power of
2 are parity bits; the others are data bits. Each parity bit checks
specific bit positions and is set so that the number of 1s in the
checked positions is even. Bit b is checked by those parity bits
b1, b2, ..., bj such that b1 + b2 + ... + bj = b.
• As an example, take the word 1111000010101110.
– The word size is m = 16 bits, which is 2^4 (i.e., n = 4).
– The number of parity check bits required is r = n + 1 = 5, and the
total size of the memory word is r + m = 16 + 5 = 21 bits.
– The parity bits are in positions 1, 2, 4, 8, and 16 (powers of 2). So
the first step is to write the word in positions 3, 5, 6, 7, 9, 10, 11, 12,
13, 14, 15, 17, 18, 19, 20, and 21, leaving positions 1, 2, 4, 8, and 16
blank.
– Bit 1 checks bits 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21 (take 1, skip 1 and
so on).
– Bit 2 checks bits 2, 3, 6, 7, 10, 11, 14, 15, 18, 19 (take 2, skip 2 and so
on).
– Bit 4 checks bits 4, 5, 6, 7, 12, 13, 14, 15, 20, 21 (take 4, skip 4 and so
on).
– Bit 8 checks bits 8, 9, 10, 11, 12, 13, 14, 15 (take 8, skip 8 and so on).
– Bit 16 checks bits 16, 17, 18, 19, 20, 21 (take 16, skip 16 and so on).
– For bit 1, the total of the bits it checks is 1 + 1 + 1 + 0 + 0 + 1 + 1 + 0
+ 1 + 0 = 6. This is even, so bit 1 is 0.
– For bit 2, the total of the bits it checks is 1 + 1 + 1 + 0 + 0 + 0 + 1 + 1
+ 1 = 6. This is even, so bit 2 is 0.
– For bit 4, the total of the bits it checks is 1 + 1 + 1 + 0 + 1 + 0 + 1 + 1
+ 0 = 6. This is even, so bit 4 is 0.
– For bit 8, the total of the bits it checks is 0 + 0 + 0 + 0 + 1 + 0 + 1 =
2. This is even, so bit 8 is 0.
– For bit 16, the total of the bits it checks is 0 + 1 + 1 + 1 + 0 = 3. This
is odd, so bit 16 is 1
• So the word is stored in memory as 0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 0 1 1 1 0 (positions 1 to 21)
• Now the grand total for bits 1, 2, 4, 8, and 16 will be even
• Bit 1 = 0 + 1 + 1 +1 + 0 + 0 + 1 + 1 + 0 + 1 + 0 = 6
Bit 2 = 0 + 1 + 1 + 1 + 0 + 0 + 0 + 1 + 1 + 1 = 6
Bit 4 = 0 + 1 + 1 + 1 + 0 + 1 + 0 + 1 + 1 + 0 = 6
Bit 8 = 0 + 0 + 0 + 0 + 0 + 1 + 0 + 1 = 2
Bit 16 = 1 + 0 + 1 + 1 + 1 + 0 = 4
• Suppose the word somehow changes, e.g., bit 11 is altered from 0
to 1.
• To check for and detect the error so that it can be corrected:
– Bit 1 becomes 0 + 1 + 1 +1 + 0 + 1 + 1 + 1 + 0 + 1 + 0 = 7
which is odd
– Bit 2 becomes 0 + 1 + 1 + 1 + 0 + 1 + 0 + 1 + 1 + 1 = 7
which is odd
– Bit 4 becomes 0 + 1 + 1 + 1 + 0 + 1 + 0 + 1 + 1 + 0 = 6
which is even
– Bit 8 becomes 0 + 0 + 0 + 1 + 0 + 1 + 0 + 1 = 3 which is odd
– Bit 16 becomes 1 + 0 + 1 + 1 + 1 + 0 = 4 which is even.
• All parity checks must be even, so parity bits 1, 2, and 8 are
incorrect. Since 1 + 2 + 8 = 11, we know that bit 11 has been
changed. Hamming's code not only tells us that a bit has been
changed but which bit has changed (a small sketch of the whole
procedure follows).
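Below is a small Python sketch of the scheme described above (even parity, parity bits at the power-of-two positions); the function names are invented for this example. Running it on the 16-bit word above reproduces the 21-bit codeword and locates the flipped bit 11.

def hamming_encode(data_bits):
    """Even-parity Hamming code: data_bits is a list of 0/1 values (16 here)."""
    m = len(data_bits)
    r = 0
    while 2 ** r < m + r + 1:      # choose r so that 2**r >= m + r + 1
        r += 1
    n = m + r
    code = [0] * (n + 1)           # index 0 unused; positions are 1..n
    it = iter(data_bits)
    for pos in range(1, n + 1):
        if pos & (pos - 1) != 0:   # not a power of two -> data position
            code[pos] = next(it)
    for p in (2 ** i for i in range(r)):
        # set each parity bit so the positions it checks hold an even number of 1s
        total = sum(code[pos] for pos in range(1, n + 1) if pos & p)
        code[p] = total % 2
    return code[1:]

def hamming_check(codeword):
    """Return 0 if no single-bit error, else the 1-based position of the flipped bit."""
    n = len(codeword)
    syndrome = 0
    p = 1
    while p <= n:
        total = sum(codeword[pos - 1] for pos in range(1, n + 1) if pos & p)
        if total % 2:
            syndrome += p
        p *= 2
    return syndrome

word = [1,1,1,1,0,0,0,0,1,0,1,0,1,1,1,0]      # the 16-bit example word
stored = hamming_encode(word)                  # the 21-bit codeword
stored[10] ^= 1                                # flip bit 11 (1-based), as in the example
print(hamming_check(stored))                   # -> 11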
Cache Memory
• CPUs have always been faster than the memories.
• Chip manufacturers use pipelining and superscalar design to improve
CPU speed
• Memory designers have concentrated on increasing capacity, while
speed has remained relatively the same.
→ Problem: if the CPU issues a memory word access request, it will be idle
for multiple CPU cycles before getting the word (a lot of delay), i.e.,
the slower the memory, the more cycles the CPU needs to wait.
• To deal with this kind of problem, there are two solutions:
1. The simplest is to start memory READs when they are encountered
but continue executing, stalling the CPU only if an instruction tries
to use the memory word before it has arrived.
2. The other solution is to require compilers not to generate code to
use words before they have arrived. Often after a LOAD there is
nothing else to do, so the compiler is forced to insert NOP (no
operation) instructions, which do nothing, but occupy a slot and
waste time.
• Engineers can build fast memories, but they have to be
located on the CPU chip (going over the bus to memory is
slow). This makes the chip bigger and more expensive. So the
choice is between a small amount of fast memory and a large
amount of slow memory.
• Ideally, we would like large, fast, and cheap memories!
• To achieve some of that, a small amount of fast memory is mixed
with a large amount of slow memory to get the speed of the fast
memory and the capacity of the large memory at a moderate
price. Here, the small fast memory is called a Cache.
• The basic idea of Cache is simple, it keeps the most heavily used
memory words. When the CPU needs a word, it looks first in the
cache. If the word is not there, it checks the main memory. If a
substantial fraction of the words are in cache, the access time is
reduced.
• Success or failure depends on what fraction of the words
are in cache .
• Programs do not access their memories completely at
random. The next memory reference will be in the general
vicinity of the previous one.
• Except for branches and procedure calls, instructions are
fetched from consecutive locations in memory. While in
loops, a limited number of instructions are repeatedly
executed.
The Principle of Locality in Cache Memory
• The memory references made in any short time interval use only a
small fraction of the total memory. This principle forms the basis
of all cache memory systems.
• When a word is referenced, it and some of its neighbors are
brought from slow memory into the cache, so that the next time it
is required, it can be accessed quickly.
• If a word is read or written k times in a short interval, the
computer will need 1 reference to the slow memory (main) and
(k - 1) references to the fast memory (Cache). The larger k is, the
better the overall performance.
• If c is the cache access time, m is the main memory access time,
and h is the hit ratio, which is the fraction of all references that
can be satisfied out of cache and can be given by h = (k - 1)/k.
The miss ratio is given by (1 – h).
• The mean access time = c + (1 - h)*m.
1. If h → 1, then the access time ≈ c, and all references are
satisfied from the cache.
2. If h → 0, then the access time ≈ c + m: a main memory
access is needed every time, first an unsuccessful time c to
check the cache, and then a time m to do the memory
reference.
• For this reason, the main memory reference can be started in parallel
with the cache search, so that if a cache miss occurs, the memory cycle
is already under way (a numeric illustration follows).
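A small numeric illustration of the mean access time formula (Python); the 2 ns and 70 ns access times are assumed values, not figures from the text.

def mean_access_time(c, m, h):
    """Mean access time given cache time c, main-memory time m, and hit ratio h."""
    return c + (1 - h) * m

c, m = 2, 70              # assumed: 2 ns cache, 70 ns main memory
for h in (0.0, 0.90, 0.99, 1.0):
    print(f"h = {h:.2f}: {mean_access_time(c, m, h):.1f} ns")
# h -> 1 gives about c (everything hits the cache); h -> 0 gives c + m.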
• Cache design should be compatible with the CPU performance.
If the CPU is of high-performance, the cache design
considerations are:
1. Cache size: the bigger the cache, the better the performance
but with higher cost.
2. The size of the cache line: e.g., a 16-KByte cache can be
divided into:
• 1024 lines of 16 bytes, or
• 2048 lines of 8 bytes, or
• And so on…
3. The cache organization: how to keep track of which memory
words are currently held in the cache.
4. Are the instructions and the data kept in the same cache or
not?
1. Unified cache: both (instructions and the data ) are in
the same cache.
2. Split cache (Harvard Architecture): instructions and the
data are in different caches.
5. The last issue is the number of cache memories
Memory Packaging and Types
• From the early days up to the 1990s, memory was manufactured and
sold as separate chips with typical sizes from 1 Kbit to 1 Mbit.
• Nowadays, chips are installed in groups of 8, 16, or 32 on printed
circuit boards. Therefore, current memories come on a single
plug-in board in either a SIMM (1-sided) or DIMM (2-sided)
package.
• Single Inline Memory Module (SIMM): row of connectors on
one side
– Typical SIMM board has 72 connectors on one side of the board and
delivers 32 bits at once
• Dual Inline Memory Module (DIMM): row of connectors on
both sides
– Typical DIMM board has 84 connectors on each side of the board with
a total of 168 connectors and delivers 64 bits at once
– Commonly used today
A single inline memory module (SIMM) holding 256 MB. Two of the
chips control the SIMM.
Two types of SIMM have been in general use:
1. 30-pin SIMMs have 8-bit data buses;
2. 72-pin SIMMs have 32-bit data buses.
Dual Inline Memory Module
A 512-MB, 168-pin DIMM, SDRAM, PC133 memory module.
SDRAM: Synchronous Dynamic RAM, a variant of DRAM in which the
memory speed is synchronized with the clock pulse from the CPU.
This enables the SDRAM to pipeline read and write requests.
Outline
2.1 Processors
2.2 Primary Memory
2.3 Secondary Memory
– Magnetic Disks
– Floppy Disks
– SCSI Disks
– RAID
– CD-ROMs
– DVDs
Secondary Memory
• Main memory is always too small; it is usually not large enough
for users' needs. Therefore, additional storage memories are
required.
• The traditional solution for this is Memory Hierarchy.
Memory Hierarchy
• The upper levels are faster but smaller; the lower levels are larger
but slower. Data moves between levels as instructions/operands,
blocks, pages, and files.
A five-level memory hierarchy
• These levels are:
1. At the top are the CPU registers, which can be
accessed at full CPU speed.
2. Next comes the cache memory, which ranges from 32
KB to a few megabytes
3. Next is main memory, from 16 MB to tens and
hundreds of gigabytes.
4. Then come magnetic disks for permanent storage.
5. Finally, there is magnetic tape and optical disks for
archival storage.
• As we move down the hierarchy, the access
time gets bigger:
1. CPU registers can be accessed in a few
nanoseconds.
2. Cache memories take a small multiple of the register
access time.
3. Main memory needs a few tens of nanoseconds.
4. Disk access times are at least 10 msec,
5. Tape and optical disk times are measured in
seconds.
• But the storage capacity increases as we go
down:
1. CPU registers hold on the order of 128 bytes,
2. Caches a few megabytes,
3. Main memories tens to thousands of megabytes,
4. Magnetic disks a few gigabytes to tens of gigabytes.
5. Tapes and optical disks are kept off-line, so they have
unlimited capacity.
• Number of bits per dollar also increases as we
go down the hierarchy:
– Main memory is measured in dollars/megabyte,
– magnetic disk storage in pennies/megabyte, and
– magnetic tape in dollars/gigabyte or less.
Magnetic Disks
• Long term and large-size nonvolatile storage: come in floppy and
hard configurations.
• A magnetic disk consists of one or more aluminum platters with a
thin magnetizable coating/film. They are 3 to 12 cm in diameter
and less than 3 cm for notebook computers.
• A disk head containing an induction coil floats just over the
surface, resting on a cushion of air (it touches the surface on floppy
disks).
• When a current passes through the head, it magnetizes the surface
just beneath the head, aligning the magnetic particles facing left or
right depending on the polarity of the drive current.
• When the head passes over a magnetized area, a positive or
negative current is induced in the head, making it possible to read
back the previously stored bits. Thus as the platter rotates under
the head, a stream of bits can be written and later read back.
• The circular sequence of bits written as the disk makes a complete
rotation is called a track.
• Each track is divided up into fixed-length sectors, typically containing
512 data bytes, preceded by a preamble that allows the head to
identify the start of the sector and to synchronize before reading or
writing.
• Following the data is an ECC, either Hamming code
or Reed Solomon Code.
• Between consecutive sectors is an intersector gap.
• Disks have movable arms that move in and out to different
radial distances. At each radial distance, a different track can
be written.
• The tracks are thus a series of concentric circles about the
spindle. Disks have between 5000 and 10,000 tracks per cm, so
track widths are 1 to 2 microns. Linear recording densities are
50,000 to 100,000 bits/cm.
• Disks have multiple platters stacked vertically. Each surface has its
own arm and head.
• All the arms are ganged together so they move to different radial
positions all at once. The set of tracks at a given radial position is
called a Cylinder.
• The platters are often 2-sided, and most hard drives have multiple
platters; modern PC disks possess 6 to 12 platters per drive,
which provide 12 to 24 recording surfaces.
• To read or write a sector, the arm is first moved to the right radial
position; this is called a seek and takes 5 to 10 msec between
random tracks, but about 1 msec between consecutive tracks (seek time).
• Rotational latency is the delay until the desired sector rotates
under the head. Disks rotate at 3600, 5400, 7200, or 10,800
RPM, so the average delay (half a rotation) is 3 to 6 msec.
• Transfer time depends on the linear density and rotation
speed. With transfer rates of 20 to 40 MB/sec, a 512-byte sector
takes 13 to 26 microseconds (a combined example follows).
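Putting the three components together, a rough Python sketch; the 7 ms seek, 7200 RPM, and 25 MB/sec figures are assumed values picked from the ranges quoted above.

def disk_access_ms(seek_ms, rpm, transfer_mb_s, sector_bytes=512):
    """Average time to read one sector: seek + half a rotation + transfer."""
    rotational_latency = 0.5 * 60_000 / rpm                 # half a rotation, in ms
    transfer = sector_bytes / (transfer_mb_s * 1e6) * 1e3    # ms to move one sector
    return seek_ms + rotational_latency + transfer

print(f"{disk_access_ms(7, 7200, 25):.2f} ms")   # roughly 11 ms, dominated by seek + rotation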
• Modern disks divide the surface into zones. All tracks within a
zone have the same number of sectors per track; this number
increases moving outward, which increases the drive capacity.
A disk with five zones. Each zone has many tracks.
• Therefore, a magnetic-disk system consists of one or more disks
mounted on a common spindle. Each drive has three key parts:
– disk: assembly of disk platters
– disk drive: electromechanical mechanism that spins the disk
and moves the read/write heads
– disk controller: electronic circuitry chip that controls the
operation of the system (drive). Some of the controller's tasks
include
1. Accepting commands from the software, such as READ, WRITE,
and FORMAT,
2. Controlling the arm motion,
3. Detecting and correcting errors, and
4. Converting 8-bit bytes read from memory into a serial bit stream
and vice versa.
5. Remapping of bad sectors.
Floppy Disks
• The diskette or floppy disk is a small removable medium,
introduced by IBM to record maintenance information.
• It has general characteristics similar to the magnetic (hard)
disks discussed earlier, BUT floppy disk heads touch the
surface, so the media and heads wear out.
• To reduce wear and tear, personal computers retract the heads
and stop the rotation when a drive is not reading or writing.
• When the next read or write command is encountered, there
is a delay of half a second while the motor gets up to speed.
• Mostly, they are not used in these days in modern
computers.
• The following is a comparison between some floppy disk types:

Parameter           5.25" LD   5.25" HD   3.5" LD   3.5" HD
Size (inches)       5.25       5.25       3.5       3.5
Capacity (bytes)    360 K      1.2 M      720 K     1.44 M
Tracks              40         80         80        80
Sectors/track       9          15         9         18
Heads               2          2          2         2
Rotations/min       300        360        300       300
Data rate (kbps)    250        500        250       500
Type                Flexible   Flexible   Rigid     Rigid
SCSI Disks
• Most disk units are designed to
connect to standard buses.
• SCSI (Small Computer System
Interface) disks use a different
interface and typically enjoy higher
data transfer rates. Various versions
use either 8 or 16-bit parallel
connections and achieve hundreds
of megabytes per second.
• Because they have high rates, they
are used in the UNIX workstations,
Macintoshes, network servers.
• A SCSI controller can act as a bus managing a wide range
of devices (up to 7), not just disk drives. Each SCSI device
has an ID, and devices can be chained to be handled by a
single controller.
• SCSI controllers are common on servers and high-end
workstations.
Some of the possible SCSI parameters.
RAID
• CPU speeds are growing fast.
• Therefore, the gap between the CPU and the disk performance
has become larger over time.
• To improve disk performance and reliability, parallelism may
be used.
• To achieve fast and reliable storage based on multiple disks, a
technology called Redundant Array of Inexpensive/Independent
Disks (RAID) was proposed in 1988.
• RAID is a box consisting of a set of disks hidden behind a
controller that appears to the host computer as a single large
logical disk driven by the OS.
• Because SCSI disks have a good performance and low price,
RAID may consist of RAID SCSI controller in addition to a box
of SCSI disks (which appear as a single large disk).
• The data are distributed over the drives to allow parallel
operation.
• Several schemes have been proposed to arrange the
disks.
• The most common one is the 6 levels/configurations
(not a hierarchy), RAID level 0 to RAID level 5.
• Terminology:
– Data striping: a single large file is stored on several disk
units by breaking the file up into a number of smaller pieces
and storing those pieces on different disks.
• These six different categorizations are simply as
follows:
• RAID level 0: Figure (a)
– Disks are divided into strips of k sectors.
– Data is striped across the disks sequentially in
round-robin fashion: k sectors to disk 1, the next k
to disk 2, etc. (a sketch of this mapping follows below).
– It works better with large requests.
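A minimal Python sketch of the round-robin mapping used in RAID level 0; disk numbering starts at 0 here, and the function name is invented for the example.

def locate_strip(logical_strip, num_disks):
    """RAID 0 round-robin: which disk holds a logical strip, and where on that disk."""
    disk = logical_strip % num_disks        # strips go to disks 0, 1, 2, ... in turn
    offset = logical_strip // num_disks     # position of the strip within that disk
    return disk, offset

for strip in range(8):
    print("strip", strip, "-> disk %d, offset %d" % locate_strip(strip, 4))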
• RAID level 1: Figure (b)
–Same organization as level 0, but with
mirrored disks (duplication): 2 copies of each
strip on separate disks.
– Data is always written to both drives, but
when reading, either drive can be used.
– In cases of fault or drive crash, Recovery is
simple: swap faulty disk & re-mirror with no
down time.
– Expensive.
• RAID level 2: Figure (c)
– This level works on a word or byte basis:
single byte/word.
– Each byte is split into a pair of 4-bit nibbles, and 3
parity bits (positions 1, 2, and 4) are added to each
nibble to form a 7-bit word.
– The 7-bit word is written across 7 disks, one bit per
disk; the disks must be synchronized with each other.
– Error correction is calculated across corresponding
bits on the disks.
• RAID level 3 - Simplifies level 2, Figure (d)
– Single parity bit is added for each data word and
written to a parity drive.
– Here drives should also be synchronized.
– If any drive crashes, 1-bit error correction is
possible because the position of the erroneous bit is known.
– Data on a failed drive can be reconstructed from the
surviving data and the parity information.
– Very high transfer rates
• RAID level 4 - Figure (e).
– It works with strips.
– A strip-by-strip parity is written onto an extra drive.
– It extends level 0 by adding a parity strip on an
extra drive. This allows full recovery from the parity
drive if a drive fails. This organization places a heavy
load on the parity drive.
• RAID level 5 - Figure (f). It uses the same
approach as level 4, but
– Parity striped across all disks
– Round robin allocation for parity stripe
– Avoids RAID 4 bottleneck at parity disk
– Commonly used in network servers
Optical Storage: CD-ROMs
• Due to their large capacity and low price, optical disks
are widely used for storing and distributing books,
movies, software, etc.
• The dimensions of CDs (Compact Disk) are 120 mm
diameter, 1.2 mm thickness, with 15 mm hole in the
middle.
• A CD is prepared by using a high-power laser to burn
0.8-micron holes in a coated glass master disk, producing
bumps.
• Molten polycarbonate is then used to form the disk surface
with the same pattern, a thin aluminum layer is deposited on
the polycarbonate, and it is topped by a protective lacquer.
• The burned areas are called pits, and the unburned areas
between the pits are called lands.
• These pits may be used to
record 0s and lands to
record 1s or vice versa.
• The pits and lands are
written in a continuous
spiral starting near the
hole and going out 32 mm
to the edge with 22,188
revolutions. The total
length is 5.6 km.
• The rotation rate of the CD is reduced as the head moves
farther from the center, to keep the linear density of the
data the same along the entire length of the track. The
rotational speed varies from 530 to 200 RPM.
Recording structure of a Compact Disk or CD-ROM.
• In 1984, CD-ROMs appeared.
• CD-ROMs have the same physical size as the audio
CDs and mechanically and optically compatible with
them.
• The basic format of the CD-ROMs is based on encoding
every byte in a 14-bit symbol. A group of 42 consecutive
symbols forms a 588-bit frame.
• Every frame has 192 data bits (24 bytes). The other 396
bits are used for control and error correction.
• At a higher level, every 98 frames are grouped and called
a sector.
Logical data layout on a CD-ROM.
• Every CD-ROM sector starts with a 16-byte preamble that
identifies the start of the sector and contains the sector
number and the mode.
• Two modes allow a tradeoff between reliable storage of bytes
and efficient storage of video or audio data, bypassing error
correction.
• Mode 1 stores 2048 bytes of data + 288 bytes of error correction.
• Mode 2 combines the data and the error correction into a 2336-byte
data field. This mode is used for applications in which ECC is not
important.
• Single-speed CD-ROM drives operate at 75 sectors/sec, which
gives a data rate of 153,600 bytes/sec in mode 1 and
175,200 bytes/sec in mode 2 (see the short check below).
• A standard CD holds 74 minutes, for a capacity of about 650
MB in mode 1.
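These figures follow directly from the sector sizes and the 75 sectors/sec rate; a short Python check:

SECTORS_PER_SECOND = 75             # single-speed drive

mode1 = SECTORS_PER_SECOND * 2048   # user data per second in mode 1
mode2 = SECTORS_PER_SECOND * 2336   # the larger data field of mode 2
print(mode1, "bytes/sec in mode 1")                      # 153,600
print(mode2, "bytes/sec in mode 2")                      # 175,200
print(f"{74 * 60 * mode1 / 2**20:.0f} MB in 74 minutes (mode 1)")  # ~650 MB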
• In 1986, it became possible to merge multimedia applications
(graphics, audio, video, etc.) and save them on the same CD-ROM.
• Information written to a CD-ROM cannot be erased.
CD Recordable
• In the 1990s, CD recorders became available as a backup
medium.
• Data written to a CD-Recordable (CD-R) is stored permanently.
• The 1st CD-Rs were similar to the regular CD-ROMs with 120
mm polycarbonate blanks, except that they were gold colored on
top instead of silver for the reflective layer.
• Here, the reflections from the pits and the lands are simulated by
adding a dye layer between the polycarbonate and the reflective
layer.
• The added dye layer is transparent which allows the laser light to
pass through and reflects off the reflective layer.
• To write on a CD-R,
– The laser is first turned up to high power (8-16 mW)
– When the beam hits a spot on the dye layer, it heats the dye and creates a dark spot.
• When the CD-R is read back, the photo-detector detects a
difference between the dark and transparent areas. This difference
is interpreted as the difference between the pits and lands.
• A newer format, CD-ROM XA, allows CD-R tracks to be written
incrementally (rather than all at once). This organization requires
each track to have its own volume table of contents (VTOC). The
reader searches for the most recent VTOC and uses it as the index
to the CD contents.
• Because the track on a CD must be written in a single, continuous
process, the computer writing to the CD must be able to provide
data at the required rate for a sustained period of time.
CD Re-Writable
• For some applications, people still need rewritable CDs.
• To achieve that, a technology called CD-RW uses the same size
media as CD-R, but for the recording layer it uses a different
material: an alloy of silver, indium, and tellurium.
• CD-RW drives have three different laser powers:
1. High power: to melt the alloy to convert it from high-reflectivity to
low-reflectivity to represent a pit.
2. Medium power to melt the alloy to represent a land
3. Low power to sense the state of the alloy (for reading).
• The reasons why the CD-RW has not replaced the CD-R:
1. CD-RW blanks are more expensive
2. For security applications, the CD-R can not accidentally be erased,
while the CD-RW can be.
DVD
• DVD (Digital Versatile (or Video) Disk) uses the same general
design and storage techniques as normal CDs, but uses
– Smaller pits (0.4 microns vs. 0.8 microns for CDs),
– A tighter spiral (closer tracks) (0.74 microns vs. 1.6 microns for CDs), and
– A higher frequency red laser (can focus the light to a smaller spot) (i.e.,
0.65 microns vs. 0.78 microns for CDs).
• Due to these differences, a single DVD can store 7 times more
than a normal CD, i.e., up to 4.7 GB with data rate of 1.4MB/sec
compared to 150KB/sec of the CDs.
• Using some compression techniques, a 4.7 GB can hold 133
minutes of full motion, full screen and high resolution video.
• To increase the DVD storage capacity other than the 4.7 GB
which is a single-sided, single-layer format, other formats have
been defined:
1. 8.5 GB which is a single-sided, dual-layer format
2. 9.4 GB which is a double-sided, single-layer format
3. 17 GB which is a double-sided, dual-layer format.
Blu-Ray
• The successor to DVD is the Blu-Ray.
• It differs from the DVD in the following ways:
1. It uses a blue laser instead of the red laser used in DVDs
2. A blue laser can be focused more accurately than a red one
3. Therefore, smaller pits and lands can be used
4. Single-sided Blu-ray disks hold about 25 GB, and
double-sided disks hold 50 GB, with a data rate of
4.5 MB/sec.
5. These data rates are still insignificant compared
to those of magnetic disks.
Summary
• Memory is an important component in
computers
– Capacity and speed characteristics determine
performance of computer
• Primary storage
– Large main memory implemented with dynamic memory
chips
• Secondary storage in the form of magnetic and
optical disks provides the largest capacity in
memory.
• The data rates of optical disks are insignificant
compared to those of magnetic disks.
Outline
2.1 Processors
2.2 Primary Memory
2.3 Secondary Memory
2.4 Input/Output
– Buses
– Terminals
– Mouse
– Printers
– Telecommunication equipment
– Character Codes
Input/Output
• The computer system’s I/O architecture is its interface
to the outside world. A set of I/O modules interface to
the processor and memory via the system bus.
• I/O: is a subsystem of components that moves coded
data between external devices and a host system.
– Interfaces to external components (keyboard,
scanner, disk, printers, …)
– Cabling or communication links between host
system and its peripherals.
Buses
• The logical arrangement and structure of a typical PC
is as follows.
• It can be seen that one single bus has been used to
connect the CPU, memory and the I/O devices.
• Some systems may possess two or more buses.
Logical structure of a simple personal computer.
• Buses carry data between the major components of a
computer system.
– Shared resource requires a controller that handles the bus
communications.
• Controllers are either part of the main system board or are
located on plug-in circuit boards.
• The controllers are connected to its corresponding device by
a cable.
• Controllers are used to control the I/O device and switch
bus access to it.
• Some systems allow controllers to access the main memory
of the system. This allows data transfer without intervention
by the CPU and is called Direct Memory Access (DMA):
– When the transfer is complete, the controller signals the CPU by
sending it an interrupt, which forces the CPU to suspend its
current program and execute an interrupt handler to take any
action needed and tell the OS that the I/O is finished.
– When the interrupt handler is finished, the CPU resumes the
original program that was suspended by the interrupt.
• Now if the bus is needed by the CPU and the I/O at the
same time, what will be the case?
• Arbitration procedures are used to mediate bus requests
when the bus is busy. It is common for I/O devices to be given
preference for bus access because they may lose data if made
to wait.
– Cycle stealing: the DMA uses memory cycles that would
otherwise be used by the CPU.
• Bus technology has changed over the years, and each
new idea brings up the issue of compatibility.
– The early PC had an 8-bit ISA (Industry Standard
Architecture) bus, which was later widened to 16 bits.
– The EISA bus was a backward-compatible extension with a
32-bit data path.
– The Peripheral Component Interconnect (PCI) bus, far more
popular than EISA, should be considered its successor.
Terminals
• A computer terminal consists of two parts:
1. Keyboard:
• Attaches to the PC and communicates using a
serial protocol.
• It is connected to a dedicated processor
called the keyboard controller.
– The controller detects key-press events and generates an
interrupt on the main processor, which then
handles the key event.
2. Monitor
• Cathode Ray Tube (CRT) Monitors
• Flat panel displays
• Video RAM
Cathode Ray Tube Monitors
• CRT:
– Electron gun shoots an
electron beam against the
phosphorescent screen
– The beam scans across the
face of the tube, one scan
line at a time. A typical
monitor scans at least 500
lines to display a full screen,
and does this at least 60
times each second.
– Because CRT images are displayed one line at a time,
the CRT is called a raster scan device.
• Color monitors have three electron guns (red, green, blue).
(a) Cross section of a CRT. (b) CRT scanning pattern.
• Horizontal and vertical sweeping is controlled
by applying a voltage to its corresponding
plates.
• The voltage applied to the grid causes the
corresponding bit pattern to appear on the
screen.
• This allows for the binary signals to be
converted into a visual display consisting of
bright and dark spots.
Flat Panel Displays
• CRTs are too heavy and bulky to be used in notebook
computers.
• Therefore, another technology is used based on the
LCD (Liquid Crystal Display) technology.
• Liquid Crystal Display technology
– Uses an electrical field to change the optical properties
of a liquid crystal by changing the molecular alignment.
– This allows control of the intensity of light passing
through the crystal.
– This can be used to construct flat panel displays which
use a lighted panel behind the liquid crystal layer.
• Many types of displays are based on this technology,
such as the Twisted Nematic (TN) and active matrix
displays.
(a) The construction of an LCD screen. (b) Twisted Nematic
display: the grooves on the rear and front plates are
perpendicular to one another.
Video RAM
• Video RAM: is a special memory set aside to store the
digital data required to drive and refresh the display unit.
– Located on the display controller.
• Each pixel is represented by a data value in the video RAM
– A common representation of a pixel uses a 24-bit (3-byte) RGB value
(one byte for the intensity of each of the three colors), allowing 2^24
(16 million) colors.
– indexed color: most common reduced color scheme uses 8-bit
number to represent a color ( i.e., palette of 256 RGB values). This
will reduce the required memory by 2/3.
• Video requires high data rates to be transferred to the
display (an illustration follows).
• Therefore, special buses are needed, like the Accelerated
Graphics Port (AGP), which has a base data rate of 252 MB/sec
and comes in different speed versions.
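As an illustration of why such bandwidth is needed, a short Python sketch; the 1024 x 768 resolution and 60 Hz refresh rate are assumed example values.

def framebuffer_bytes(width, height, bits_per_pixel):
    """Video RAM needed for one frame at the given resolution and color depth."""
    return width * height * bits_per_pixel // 8

full_color = framebuffer_bytes(1024, 768, 24)   # 24-bit RGB
indexed    = framebuffer_bytes(1024, 768, 8)    # 8-bit indexed color
print(full_color, "bytes vs", indexed, "bytes")  # indexed color needs 1/3 of the memory
print(f"refreshing 60 times/sec moves ~{full_color * 60 / 1e6:.0f} MB/s of pixel data")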
Mice
1. Mechanical mice
– motion of the wheels is turned into electrical signals measuring
movement in two directions.
2. An optical mouse
– It has no wheels and uses a LED to reflect light from the mouse
pad: the reflections are picked up by a photo detector which
determines how far the mouse has moved in each direction.
3. Opto-mechanical mouse.
• The mouse transmits movement information to the computer
through its tail. Usually three bytes are sent each time the
mouse detects a movement above some minimum (the
minimum is called a Mickey). The bytes represent x and y
change (signed) and the state of the mouse buttons. Software
running under the operating system control translates the
mouse data into cursor movement or other application specific
actions.
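As an illustration only, here is a Python sketch that decodes an assumed 3-byte packet (one byte of button flags followed by signed x and y movement in mickeys); real mouse protocols differ in detail, so the layout here is hypothetical.

import struct

def decode_mouse_packet(packet):
    """Decode an assumed 3-byte packet: button bits, then signed dx and dy (in mickeys)."""
    buttons, dx, dy = struct.unpack("Bbb", packet)   # unsigned byte + two signed bytes
    return {
        "left":   bool(buttons & 0x1),
        "right":  bool(buttons & 0x2),
        "middle": bool(buttons & 0x4),
        "dx": dx,
        "dy": dy,
    }

print(decode_mouse_packet(bytes([0x01, 5, 0xFB])))   # left button down, moved +5 in x, -5 in y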
Printers
• Monochrome printers
– Matrix printer: print head with 7-24 electromagnetically
activated needles
– Inkjet printers: movable head which holds an ink cartridge
• Thermal inkjet (bubble jet) printers
• Piezoelectric inkjet printers
– Laser printers: higher quality, great speed, flexibility, and
moderate cost.
(a) The letter "A" on a 5 x 7 matrix. (b) The letter "A" printed with
24 overlapping needles.
Operation of a laser printer.
Color printers
– Produce colors by combining cyan, yellow, and magenta
pigments. These colors result when red, blue, and green are
absorbed instead of reflected; they are essentially the inverse
of the RGB colors a screen uses to produce color with light.
These are called CMYK printers (K stands for black) because
mixing the three colors does not produce a true black, so a
separate black ink is added.
– The total set of colors a device can produce is called its gamut.
– Converting a colored image from the screen to an
identical printed one has some problems and therefore
requires special calibration and sophisticated software, in
addition to expertise.
• Five color-printer technologies are commonly used:
1. Color inkjet printers: use four cartridges (C, M, Y, and K).
Good quality and medium cost.
2. Solid ink jet printers: use four solid blocks of special waxy
ink, which are melted into hot liquid ink and then sprayed
onto the paper.
3. Color laser printers: work in the same way as a normal laser
printer but with four different toners.
4. Wax printers: use a wide ribbon of four-color (CMYK) wax
and thousands of heating elements to melt the wax, which is
then fused to the paper in the form of pixels.
5. Dye sublimation printers: CMYK dyes pass over a thermal
head containing thousands of programmable heating elements;
the dyes are vaporized and absorbed by a special paper. Each
element can produce 256 colors depending on the temperature.
Telecommunications Equipment
• Modem (modulator/demodulator) :
– Used to send digital data across an analog telephone
network.
– Modulates the digital data by changing the binary
information into an analog signal and on the
receiving end demodulates the analog signal to
recover the binary data.
– Use amplitude, frequency, and phase changes to
encode the binary data.
• The simplest scheme transmits an audio signal that switches
between two volumes (AM) or tones (FM) to represent the 0s
and 1s. Faster modems use a phase-switching technique to
increase the number of bits that can be transmitted per second,
the bit rate. This is different (in some cases) from the baud
rate, which is the number of signal changes possible per second
(a small numeric illustration follows the figure caption).
• Full-duplex communication (2-way) is possible if different
frequencies are used for transmit and receive.
• Modems are limited to 57,600 bits per second due to the
characteristics of common telephone systems, which restrict
the frequency of transmission between the substation and your
phone.
Transmission of the binary number 01001010000100 over a telephone
line bit by bit. (a) Two-level signal. (b) Amplitude modulation.
(c) Frequency modulation. (d) Phase modulation.
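The relation between baud rate and bit rate can be illustrated with a short Python sketch; the 2400-baud figure and the signal-level counts are assumed example values.

import math

def bit_rate(baud, levels):
    """Bits per second = signal changes per second * bits carried per change."""
    return int(baud * math.log2(levels))

print(bit_rate(2400, 2))    # 2400 bps: two tones or levels, 1 bit per change
print(bit_rate(2400, 4))    # 4800 bps: e.g., four phase shifts, 2 bits per change
print(bit_rate(2400, 16))   # 9600 bps: 16 signal combinations, 4 bits per change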
Digital Subscriber Lines
• Because the modem has a maximum data rate of about 56 kbps,
services with higher data rates, called broadband, are
required for Internet access.
• The reason the modem is so slow compared with other
technologies is that it uses the ordinary telephone network,
which was designed for human voices: all incoming signals
are restricted to about 3-4 kHz by a filter.
• By removing the frequency filter, telephone
companies can offer high speed data lines as part of
the original telephone system. This is called DSL
(Digital Subscriber Lines). The higher frequency
spectrum allows the equivalent of about 250 modems
to utilize a single telephone line.
Operation of ADSL.
A typical ADSL equipment configuration.
Internet over Cable
• Cable companies also offer broadband access by utilizing extra
bandwidth already present in the cable connections to people's
homes. Everyone in a neighborhood shares a single wire that is
connected to a headend (similar to a telephone substation) that
has very high bandwidth to the cable company's main office.
The bandwidth of the local cable is 750 MHz, of which about
200 MHz is available for data. The ADSL connection provides
only about 1 MHz, but it is not shared. The quality of Internet
service through cable thus depends on your neighbors'
usage.
Frequency allocation in a typical cable TV system used for Internet
access
Character Codes
• Every computer has a set of characters that it uses. These
include numbers, letters, signs, etc.
• For the computer to be able to handle these characters, each
one is assigned a number. This mapping between characters
and numbers is called a character code.
• For computers to be able to communicate with
each others, they should use the same code.
• There are two common character codes:
1. American Standard Code for Information Interchange
(ASCII)
2. Unicode
ASCII Character Codes
• American Standard Code for Information Interchange (ASCII)
– It is a standard character code used by most PC's since the 1980's
– It assigns a 7-bit number to each encodable character,
which allows 128 characters to be represented.
– ASCII characters are two types:
1. ASCII control characters for data transmission (non-printing).
2. ASCII printing characters (numbers, letters, signs,… etc.)
The ASCII Character set
Unicode Character Codes
• PC systems soon added another 128 characters to provide
graphics on text-based displays. Officially, the ASCII set
was extended to Latin-1 (8 bits), which included symbols and
letters with diacritical marks.
• To support international character sets, ASCII has been
extended to the Unicode character set, in which every character
or symbol is assigned a unique number called a code point.
• Various encoding schemes are used to store code points: UTF-8
is a variable-length code that uses the byte values 00 through
7F to represent the ASCII characters, while UCS-2 is a fixed-size
16-bit code used to represent only the most common characters
(see the example below).
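As an illustration, Python's built-in functions make the character-to-number mapping and the encodings visible; the characters chosen are arbitrary examples.

# Character <-> code-point mapping, and UTF-8 encoding of ASCII vs non-ASCII text.
print(ord("A"), chr(65))                 # 65 'A'  (ASCII code points fit in 7 bits)
print("ABC".encode("utf-8").hex())       # 414243: ASCII characters keep byte values 00-7F
print("é".encode("utf-8").hex())         # c3a9:   non-ASCII characters take multiple bytes
print("é".encode("utf-16-le").hex())     # e900:   a fixed 16-bit encoding of the same code point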