COA Unit 4 Digital Notes

Please read this disclaimer before proceeding:
This document is confidential and intended solely for the educational purpose of RMK Group of Educational Institutions. If you have received this document through email in error, please notify the system manager. This document contains proprietary information and is intended only for the respective group / learning community as intended. If you are not the addressee, you should not disseminate, distribute or copy it through e-mail. Please notify the sender immediately by e-mail if you have received this document by mistake, and delete it from your system. If you are not the intended recipient, you are notified that disclosing, copying, distributing or taking any action in reliance on the contents of this information is strictly prohibited.


22CS302 - COMPUTER ORGANIZATION AND ARCHITECTURE
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING

BATCH 2023-2027 & II-YEAR

Created By,
Dr.S.MUTHUSUNDARI, Associate Professor, CSE, R.M.D.E.C
Mrs.A.TAMIZHARASI, Assistant Professor, CSE,R.M.D.E.C
Mrs.J.GEETHAPRIYA, Assistant Professor, CSE,R.M.D.E.C

Date: 21.08.2023
Table of Contents

1. Contents
2. Course Objectives
3. Pre Requisites (Course Name with Code)
4. Syllabus (with Subject Code, Name, LTPC details)
5. Course Outcomes (6)
6. CO-PO/PSO Mapping
7. Lecture Plan (Sl. No., Topic, No. of Periods, Proposed Date, Actual Lecture Date, pertaining CO, Taxonomy Level, Mode of Delivery)
8. Activity Based Learning
9. Lecture Notes (with links to videos, e-book references, PPTs, quizzes and other learning materials)
10. Assignments (for higher-level learning and evaluation; examples: case study, comprehensive design, etc.)
11. Part A Q & A (with K level and CO)
12. Part B Qs (with K level and CO)
13. Supportive Online Certification Courses (NPTEL, Swayam, Coursera, Udemy, etc.)
14. Real-Time Applications in Day-to-Day Life and in Industry
15. Contents Beyond the Syllabus (COE-related value-added courses)
16. Assessment Schedule (Proposed Date & Actual Date)
17. Prescribed Text Books & Reference Books
18. Mini Project

COURSE OBJECTIVES
To describe the basic principles and operations of digital computers.
To design arithmetic and logic unit for various fixed and floating point
operations
To construct pipeline architectures for RISC processors.
To explain various memory systems & I/O interfacings
To discuss parallel processor and multi-processor architectures
PRE REQUISITES

❖ 22EC101 Digital Principles and Systems Design (Lab Integrated)
Syllabus

22CS302 COMPUTER ORGANIZATION AND ARCHITECTURE (Common to CSE, ADS and CSD) - L T P C: 3 0 0 3
OBJECTIVES:

The Course will enable learners to:


To describe the basic principles and operations of digital computers.
To design arithmetic and logic unit for various fixed and floating point operations
To construct pipeline architectures for RISC processors.
To explain various memory systems & I/O interfacings
To discuss parallel processor and multi-processor architectures.
UNIT I COMPUTER FUNDAMENTALS 9
Computer Types - Functional Units – Basic Operational Concepts - Number Representation
and Arithmetic Operations - Performance Measurements- Instruction Set Architecture:
Memory Locations and Addresses - Instructions and Instruction Sequencing - Addressing
Modes
UNIT II COMPUTER ARITHMETIC 9
Addition and Subtraction of Signed Numbers - Design of Fast Adders - Multiplication of
Unsigned Numbers - Multiplication of Signed Numbers - Fast Multiplication - Integer Division
- Floating-Point Numbers and Operations
UNIT III BASIC PROCESSING UNIT AND PIPELINING 9
Basic Processing Unit: Concepts - Instruction Execution - Hardware Components -
Instruction Fetch and Execution Steps - Control Signals - Hardwired Control Pipelining - Basic
Concept - Pipeline Organization - Pipelining Issues - Data Dependencies - Memory Delays -
Branch Delays - Resource Limitations - Performance Evaluation - Superscalar Operation.
UNIT IV I/O AND MEMORY 9
Input/Output Organization: Bus Structure - Bus Operation - Arbitration The Memory System:
Basic Concepts - Semiconductor RAM Memories - Read-only Memories - Direct Memory
Access - Memory Hierarchy - Cache Memories - Performance Considerations - Virtual
Memory - Memory Management Requirements -
Secondary Storage.
UNIT V PARALLEL PROCESSING AND MULTICORE COMPUTERS 9
Parallel Processing: Use of Multiple Processors - Symmetric Multiprocessors - Cache
Coherence - Multithreading and Chip Multiprocessors - Clusters - Nonuniform Memory Access
Computers - Vector Computation - Multicore Organization.
TOTAL: 45 PERIODS
TEXTBOOKS:
1. Carl Hamacher, Zvonko Vranesic, Safwat Zaky, Computer Organization, Tata McGraw Hill, Sixth edition, 2012.
2. David A. Patterson and John L. Hennessy, Computer Organization and Design - The Hardware/Software Interface, 5th edition, Morgan Kaufmann, 2013.
Course Outcomes

CO1: Explain the basic principles and operations of digital computers (K2)
CO2: Design Arithmetic and Logic Unit to perform fixed and floating point operations (K3)
CO3: Develop pipeline architectures for RISC processors (K3)
CO4: Summarize various memory systems & I/O interfacings (K4)
CO5: Recognize parallel processor and multi-processor architectures (K5)

Knowledge Level - Description
K6 - Evaluation
K5 - Synthesis
K4 - Analysis
K3 - Application
K2 - Comprehension
K1 - Knowledge
CO - PO/PSO Mapping Matrix
CO PO PO PO PO PO PO PO PO PO PO PO PO PSO PSO PS0
1 2 3 4 5 6 7 8 9 10 11 12 1 2 3

1 3 2 1 1 3

2 3 3 2 2 3

3 3 3 1 1 3

4 3 3 1 1 3

5 3 3 1 1 3

6 2 2 1 1 3
Lecture Plan – Unit 4 – I/O AND MEMORY
Sl. No. | Topic | No. of Periods | CO | Taxonomy Level | Mode of Delivery
1. Input/Output Organization: Bus Structure - Bus Operation, Arbitration | 1 | CO4 | K2 | PPT / Chalk & Talk
2. Interface Circuits | 1 | CO4 | K2 | PPT / Chalk & Talk
3. Interconnection Standards - USB, SATA | 1 | CO4 | K2 | PPT / Chalk & Talk
4. The Memory System: Basic Concepts - Semiconductor RAM Memories | 1 | CO4 | K2 | PPT / Chalk & Talk
5. Read-only Memories | 1 | CO4 | K4 | PPT / Chalk & Talk
6. Direct Memory Access - Memory Hierarchy | 1 | CO4 | K4 | PPT / Chalk & Talk
7. Cache Memories - Performance Considerations | 1 | CO4 | K3 | PPT / Chalk & Talk
8. Virtual Memory - Memory Management Requirements | 1 | CO4 | K3 | PPT / Chalk & Talk
9. Secondary Storage | 1 | CO4 | K3 | PPT / Chalk & Talk
Activity Based Learning
CROSS WORD PUZZLE – MEMORY AND I/O

Down:
1. Time that elapses between the initiation of an operation and its completion, e.g. the time between a Read request and the MFC signal.
2. Used to send or receive data in groups of 8 or 16 bits simultaneously.
4. Transfer of large blocks of data at high speed, without continuous intervention by the processor.
5. The processor originates most memory access cycles, so the DMA controller can be said to 'steal memory cycles' from the processor.
7. Cache and main memory locations are updated simultaneously.
9. Standard I/O interface.
13. The arbitration process is performed by this circuit.

Across:
3. How the correspondence between main memory blocks and those in the cache is specified.
6. A main memory block can be placed in any cache block position.
8. A single large file is stored on several separate disk units by breaking the file into a number of smaller pieces and storing these pieces on different disks.
10. Indicates whether the block has been modified.
11. When the cache is full and a memory word that is not in the cache is referenced, the cache control hardware decides which block to remove.
12. A recently executed instruction is likely to be executed again very soon.
14. Instructions in close proximity to a recently executed instruction are likely to be executed soon.
15. A feature that allows connection of a new device at any time, while the system is in operation.
Lecture Notes
1. BUS STRUCTURE
o A bus connects the processor and I/O devices.
o The three types of bus lines are: address lines, data lines and control lines.

o Each I/O device has a unique address.


o Processor places address in address bus, device recognizes and
responds to the commands issued on the control lines.
o Processor requests either a read or write and data is transferred
over the data lines.

[Figure 4.1: A single-bus structure - the processor, memory, and I/O devices 1 through n all attach to one common bus.]

Types of I/O transfer:


1. Memory mapped I/O:
o I/O & memory share the same address space.
o Any machine instruction can access the memory and I/O
device.

o (e.g.) MOVE DATAIN, R0 (R0 ← keyboard buffer)
o MOVE R0, DATAOUT (display buffer ← R0)
o Simple method.
[Figure 4.2: I/O interface for an input device - the bus's address, data, and control lines connect through an address decoder, control circuits, and data and status registers to the input device.]

2. Separate Address space


o Separate I/O instructions
o Separate address space for I/O devices & memory
o Fewer address lines
o Not necessarily a separate bus. The same address bus is used, but control lines indicate whether the requested read/write is an I/O operation or a memory operation.

o Address decoder recognizes its address


o Data register holds data to be transferred
o Status register holds information relevant to the operation of
the I/O device
o Because of the difference in speed between I/O devices and the processor, buffers are used.

3. Program controlled I/O


       Move      #LINE,R0       Initialize memory pointer.
WAITK  TestBit   #0,STATUS      Test SIN.
       Branch=0  WAITK          Wait for character to be entered.
       Move      DATAIN,R1      Read character.
WAITD  TestBit   #1,STATUS      Test SOUT.
       Branch=0  WAITD          Wait for display to become ready.
       Move      R1,DATAOUT     Send character to display.
       Move      R1,(R0)+       Store character and advance pointer.
       Compare   #$0D,R1        Check if Carriage Return.
       Branch≠0  WAITK          If not, get another character.
       Move      #$0A,DATAOUT   Otherwise, send Line Feed.
       Call      PROCESS        Call a subroutine to process the input line.

Figure 4.4 A program that reads one line from the keyboard, stores it in a memory buffer, and echoes it back to the display.
Consider a task that reads in character input from a keyboard and
produces character output on a display screen. A simple way of performing
such I/O tasks is to use a method known as program-controlled I/O. The
rate of data transfer from the keyboard to a computer is limited by the
typing speed of the user, which is unlikely to exceed a few characters per
second. The rate of output transfers from the computer to the display is
much higher. It is determined by the rate at which characters can be
transmitted over the link between the computer and the display device,
typically several thousand characters per second. However, this is still much
slower than the speed of a processor that can execute many millions of
instructions per second. The difference in speed between the processor
and I/O devices creates the need for mechanisms to synchronize the
transfer of data between them.

[Figure 4.3: Registers in keyboard and display interfaces - data registers DATAIN and DATAOUT, a STATUS register (bits 7..0) containing the flags DIRQ, KIRQ, SOUT (bit 1) and SIN (bit 0), and a CONTROL register containing the enable bits DEN and KEN.]

A solution to this problem is as follows: On output, the processor sends


the first character and then waits for a signal from the display that the
character has been received. It then sends the second character, and so
on. Input is sent from the keyboard in a similar way; the processor waits
for a signal indicating that a character key has been struck and that its
code is available in some buffer register associated with the keyboard.
Then the processor proceeds to read that code.
The keyboard and the display are separate devices, as shown in Figure 4.3. The action of striking a key on the keyboard does not automatically cause the corresponding character to be displayed on the screen. One block of instructions in the I/O program transfers the character into the processor, and another associated block of instructions causes the character to be displayed.

Consider the problem of moving a character code from the keyboard to


the processor :
o Processor continuously checks the status flag (SIN) (i.e) polls
the device.
o When a key is pressed on the keyboard, the character is placed in the buffer and the SIN flag is set to 1. The processor continuously checks the SIN flag; when it is 1, the processor reads the character from the buffer and resets the SIN flag to 0.
o Similarly, to display a character, the processor waits until the SOUT flag is 1 (display ready), then places the character in the display buffer, which clears SOUT to 0. When the display unit has accepted the character and displayed it, it sets SOUT back to 1.
o Disadvantage: continuous processor involvement
The buffer registers DATAIN and DATAOUT and the status flags SIN and
SOUT are part of circuitry commonly known as a device interface. The
circuitry for each device is connected to the processor via a bus, as indicated in Figure 4.3.
The main idea is that the processor monitors the status flag by executing
a short wait loop and proceeds to transfer the input data when SIN is set to
1 as a result of a key being struck. The Input operation resets SIN to 0.
An analogous sequence of operations is used for transferring output to the
display.
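The wait-loop behavior described above can be sketched as a small simulation. This is only an illustration of the polling idea - the class and method names are invented, modeled on the DATAIN/SIN registers in the notes, not real device code:

```python
# Toy model of program-controlled I/O: the processor busy-waits on the
# SIN status flag, then reads DATAIN, which clears the flag.

class KeyboardInterface:
    def __init__(self, typed):
        self.pending = list(typed)  # characters the "user" will type
        self.SIN = 0                # status flag: 1 = character ready
        self.DATAIN = None          # character buffer register

    def device_tick(self):
        # Device side: a keystroke loads DATAIN and sets SIN.
        if self.SIN == 0 and self.pending:
            self.DATAIN = self.pending.pop(0)
            self.SIN = 1

def read_line(kbd):
    # Processor side: poll SIN in a wait loop, read until Carriage Return.
    line = []
    while True:
        while kbd.SIN == 0:          # the wait loop (WAITK in Figure 4.4)
            kbd.device_tick()
        ch, kbd.SIN = kbd.DATAIN, 0  # reading DATAIN resets SIN
        line.append(ch)
        if ch == "\r":               # Carriage Return ends the line
            return "".join(line)

print(read_line(KeyboardInterface("hi\r")))
```

The inner `while kbd.SIN == 0` loop is exactly the disadvantage noted above: the processor does nothing useful while it spins.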
We have assumed that the addresses issued by the processor to access
instructions and operands always refer to memory locations. Many
computers use an arrangement called memory-mapped I/O in which
some memory address values are used to refer to peripheral device buffer
registers, such as DATAIN and DATAOUT. Thus, no special instructions are
needed to access the contents of these registers; data can be transferred
between these registers and the processor using instructions that we have
already discussed, such as Move, Load, or Store. For example, the
contents of the keyboard character buffer DATAIN can be transferred to
register R1 in the processor by the instruction

MoveByte DATAIN,R1
Similarly, the contents of register R1 can be transferred to DATAOUT by
the instruction
MoveByte R1,DATAOUT
The status flags SIN and SOUT are automatically cleared when the buffer
registers DATAIN and DATAOUT are referenced, respectively.
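The effect of memory-mapped I/O can be modeled by letting a couple of addresses in an ordinary address space refer to device buffer registers instead of RAM, so the same load/store path reaches the device. The addresses below are invented purely for illustration:

```python
# Toy address space: most addresses are RAM, but two addresses are
# mapped to device buffer registers (DATAIN / DATAOUT).
DATAIN_ADDR  = 0x4000   # keyboard character buffer (hypothetical address)
DATAOUT_ADDR = 0x4004   # display character buffer (hypothetical address)

class Bus:
    def __init__(self):
        self.ram = {}
        self.datain = ord('A')      # pretend a key was already struck
        self.dataout = None

    def load(self, addr):           # models MoveByte DATAIN,R1
        if addr == DATAIN_ADDR:
            return self.datain      # read reaches the device register
        return self.ram.get(addr, 0)

    def store(self, addr, value):   # models MoveByte R1,DATAOUT
        if addr == DATAOUT_ADDR:
            self.dataout = value    # write reaches the device register
        else:
            self.ram[addr] = value

bus = Bus()
r1 = bus.load(DATAIN_ADDR)          # MoveByte DATAIN,R1
bus.store(DATAOUT_ADDR, r1)         # MoveByte R1,DATAOUT
```

With this arrangement no special I/O instructions are needed: the MoveByte transfers above are just a load from DATAIN's address and a store to DATAOUT's address.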

Program-controlled I/O requires continuous involvement of the processor


in the I/O activities. Almost all of the execution time for the program in
Figure 4.4 is accounted for in the two wait loops, while the processor
waits for a character to be struck or for the display to become available. It
is desirable to avoid wasting processor execution time in this situation.
Other I/O techniques, based on the use of interrupts, may be used to
improve the utilization of the processor.

4. Interrupts
o I/O device sends a special signal (interrupt) over the bus
whenever it is ready for a data transfer operation.

5. Direct Memory Access (DMA)


o Used for high speed I/O devices
o Device interface transfers data directly to or from the memory
o Processor not continuously involved

2. BUS ARBITRATION
There are occasions when two or more entities contend for the use of a
single resource in a computer system. For example, two devices may need
to access a given slave at the same time. In such cases, it is necessary to
decide which device will access the slave first. The decision is usually
made in an arbitration process performed by an arbiter circuit. The
arbitration process starts by each device sending a request to use the
shared resource. The arbiter associates priorities with individual requests.
If it receives two requests at the same time, it grants the use of the slave
to the device having the higher priority first.
The device that initiates data transfer requests on the bus is the bus
master - the processor. It is possible that several devices in a computer
system need to be bus masters to transfer data. Since the bus is a single
shared facility, it is essential to provide orderly access to it by the bus
masters.

• A device that wishes to use the bus sends a request to the arbiter.
• When multiple requests arrive at the same time, the arbiter selects
one request and grants the bus to the corresponding device.
• For some devices, a delay in gaining access to the bus may lead to
an error. Such devices must be given high priority.
• If there is no particular urgency among requests, the arbiter may
grant the bus using a simple round-robin scheme.
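The round-robin option can be sketched as follows; this is a generic illustration of the policy, not the circuit of any particular arbiter:

```python
# Round-robin bus arbiter: the search for the next grant starts just
# after the device granted last time, so no requester is starved.

def round_robin_grant(requests, last_granted, n_devices):
    """requests: set of device ids currently asserting Bus-request.
    Returns the id to grant next, or None if there are no requests."""
    for offset in range(1, n_devices + 1):
        candidate = (last_granted + offset) % n_devices
        if candidate in requests:
            return candidate
    return None

# Devices 0 and 2 request; device 0 was served last, so 2 goes first.
print(round_robin_grant({0, 2}, last_granted=0, n_devices=4))  # prints 2
```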

• Bus arbitration:
o The device that is allowed to initiate data transfers on the
bus at any given time is called the bus master.
o When the current master relinquishes control of the bus,
another device can acquire this status.
o Bus arbitration is the process by which the next device to
become the bus master is selected and bus mastership is
transferred to it.

o Two approaches:
▪ Centralized arbitration
▪ Distributed arbitration

o Centralized arbitration:
There are two Bus-request lines, BR1 and BR2, and two Bus-grant lines,
BG1 and BG2, connecting the arbiter to the masters. A master requests
use of the bus by activating its Bus-request line. If a single Bus-request is
activated, the arbiter activates the corresponding Bus-grant. This indicates
to the selected master that it may now use the bus for transferring data.
When the transfer is completed, that master deactivates its Bus-request,
and the arbiter deactivates its Bus-grant.
Figure 7.9 illustrates a possible sequence of events for the case of three
masters.
• Assume that master 1 has the highest priority, followed by
the others in increasing numerical order.

• Master 2 sends a request to use the bus first. Since there


are no other requests, the arbiter grants the bus to this
master by asserting BG2.
• When master 2 completes its data transfer operation, it
releases the bus by deactivating BR2. By that time, both
masters 1 and 3 have activated their request lines.
• Since device 1 has a higher priority, the arbiter activates
BG1 after it deactivates BG2, thus granting the bus to
master 1.
• Later, when master 1 releases the bus by deactivating BR1,
the arbiter deactivates BG1 and activates BG3 to grant the
bus to master 3.
• Note that the bus is granted to master 1 before master 3
even though master 3 activated its request line before
master 1.
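The grant sequence in this example follows from a simple rule: among the pending requests, the lowest-numbered master wins. A sketch replaying the scenario (Python is used only to model the arbiter's decisions, not the BR/BG signal lines themselves):

```python
# Fixed-priority centralized arbiter: master 1 has the highest
# priority, so among pending Bus-requests the smallest number wins.

def next_grant(pending):
    return min(pending) if pending else None

grants = []
pending = {2}                        # master 2 asserts its request first
grants.append(next_grant(pending)); pending.discard(grants[-1])
pending |= {1, 3}                    # BR1 and BR3 arrive while 2 holds the bus
grants.append(next_grant(pending)); pending.discard(grants[-1])
grants.append(next_grant(pending)); pending.discard(grants[-1])
print(grants)                        # grant order: [2, 1, 3]
```

The output matches the narrative above: master 2 is served first because it was alone, then master 1 pre-empts master 3 despite requesting later.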

o Distributed arbitration:
o All devices waiting to use the bus have equal responsibility in carrying out the arbitration process, without using a central arbiter.

o Each device on the bus is assigned a 4-bit identification number.


o When one or more devices request the bus, they assert the
START_ARBITRATION signal and place their 4-bit ID numbers
on the lines, ARB0 through ARB3.
o A winner is selected as a result of the interaction among the
signals transmitted over these lines by all contenders. The net
outcome is that the code on the four lines represents the
request that has the highest ID number.
o The connection performs an OR function in which logic 1 wins.
E.g.
o Assume that two devices, A and B, having ID numbers 5 and 6,
respectively are requesting the use of the bus. Device A
transmits the pattern 0101, and device B transmits the pattern
0110.
o The code seen by both devices is 0111.
o Each device compares the pattern on the arbitration lines to its
own ID, starting from the most significant bit. If it detects a
difference at any bit position, it disables the lines at that bit position and
for all lower-order bits. It does so by placing a 0 at the input of these
lines.
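The knockout process can be checked with a short simulation: each line carries the wired-OR of all IDs still driving it, and a device that sees a 1 where its own bit is 0 withdraws from that bit downward, so the highest ID is left standing. A sketch with 4-bit IDs, as in the example:

```python
# Distributed arbitration over open-collector lines ARB3..ARB0.
# Logic 1 wins on each line, so a line reads the OR of all drivers.

def arbitrate(ids, width=4):
    active = set(ids)               # devices still driving their full ID
    for bit in range(width - 1, -1, -1):            # settle MSB first
        line = any(i >> bit & 1 for i in active)    # wired-OR of this line
        if line:
            # Devices with a 0 at this position see the 1, disable their
            # drivers for this and all lower-order bits, and drop out.
            active = {i for i in active if i >> bit & 1}
    return active.pop()             # the remaining contender wins

# Devices A (0101 = 5) and B (0110 = 6): both first see 0111 on the
# lines (5 | 6), then A drops out at bit 1 and B wins.
print(format(5 | 6, "04b"), arbitrate([5, 6]))   # prints: 0111 6
```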

[Figure 4.22: A distributed arbitration scheme - open-collector (O.C.) lines ARB3 through ARB0 and Start-Arbitration, pulled up to Vcc; the interface circuit for device A drives 0101 while the lines settle to 0111.]

3. BUS OPERATIONS
o The processor, main memory, and I/O devices can be interconnected by means of a common bus whose primary function is to provide a communications path for the transfer of data.
o A bus protocol is the set of rules that govern the behavior of the various devices connected to the bus: when to place information on the bus, when to assert control signals, and so on.
o The bus lines used for transferring information are of three types: data, address, and control.

o The control signals :


o Specify whether a read or a write operation is to be
performed. A single R/W line is used that specifies read
when set to 1 and write when set to 0.

o For bulk data transfer, the required size of data is indicated.


o They also carry timing information.
o The device that initiates data transfers by issuing read or write commands on the bus may be called an initiator or a master. Normally, the processor acts as the master, but other devices with DMA capability may also become bus masters.
o The device addressed by the master is referred to as the slave or target.

o Buses are broadly classified as:
o Synchronous bus
o Asynchronous bus

1. Synchronous bus:
o All devices derive timing information from a common clock line. Equally spaced intervals constitute a bus cycle, during which one data transfer can take place.

Sequence of events during a read operation:


o Refer fig. 4.23
o At time t0, the master places the device address on the address lines and sends an appropriate command on the control lines (e.g., read, and the length of the operand to be read).

o The clock pulse width, t1 – t0,


o must be longer than the maximum propagation delay
between two devices connected to the bus.
o It also has to be long enough to allow all devices to decode
the address and control signals so that the addressed device
can respond at time t1.
o The addressed slave places the requested input data on the data
lines at time t1.
o At the end of the clock cycle, at time t2, the master strobes
(captures) the data on the data lines into its input buffer.
o The period , t2 – t1 must be greater than the maximum
propagation time on the bus plus the setup time of the input buffer
register of the master.
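Taken together, the two constraints put a lower bound on the clock period. A quick back-of-the-envelope helper with purely illustrative numbers (real values depend on the bus and the devices attached to it):

```python
# Lower bound on a synchronous bus cycle, from the two constraints:
#   t1 - t0  >  max propagation delay + address-decode time
#   t2 - t1  >  max propagation delay + input-buffer setup time
# All figures are in nanoseconds and purely illustrative.

def min_bus_cycle(t_prop_max, t_decode, t_setup):
    phase1 = t_prop_max + t_decode   # address/command phase (t1 - t0)
    phase2 = t_prop_max + t_setup    # data-return phase (t2 - t1)
    return phase1 + phase2           # whole bus cycle (t2 - t0)

print(min_bus_cycle(t_prop_max=10, t_decode=5, t_setup=2))   # prints 27
```

Note that the worst-case propagation delay is paid twice per cycle, once in each phase - which is why a long bus or a slow device drags down every transfer on a single-cycle synchronous bus.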
[Figure 4.23: Timing of an input transfer on a synchronous bus - bus clock, address and command, and data signals over one bus cycle, with edges at t0, t1, and t2.]

o Refer Fig. 4.24.
o The master sends the address and command signals on the rising edge at the beginning of clock period 1 (t0).
o But these signals do not appear on the bus until tAM, due to the delay in the bus driver circuit.
o A little later, at tAS, the signals reach the slave.
o The slave decodes the address and at t1 sends the requested data.
o The data signals do not appear on the bus until tDS.
o These data signals travel toward the master and arrive at tDM.
o At time t2, the master loads the data into its input buffer. The period t2 – tDM is the setup time for the master's input buffer.
o The data must continue to be valid after t2 for a period equal to the hold time of that buffer.

Multiple-Cycle transfers: Refer Fig. 4.25


o Problems in a single-cycle transfer:
o Because a transfer has to be completed within one clock cycle, the clock period, t2 – t0, must be chosen to accommodate the longest delay on the bus and the slowest device interface.
o The processor has no way of determining whether the addressed device has actually responded.

o Solution:
o Control signals are introduced that represent a response from the
device. These signals inform the master that the slave has
recognized its address & that it is ready to participate in a data-
transfer operation.
[Figure 4.25: An input transfer using multiple clock cycles - clock, address, command, data, and Slave-ready signals over clock cycles 1 through 4.]

o A high frequency clock signal is used such that a complete data


transfer cycle would span several clock cycles. The number of clock
cycles involved vary from one device to another.
o At clock cycle 1, the master sends address and command information on the bus, requesting a read operation.
o At the beginning of clock cycle 2, the slave receives this information and decodes it.
o During clock cycle 2, it makes a decision to respond and begins to access the requested data.
o The data becomes ready and it is placed on the bus at clock cycle 3.
At the same time , the slave asserts a control signal Slave-Ready.
o The master which has been waiting for this signal, strobes the data
into its input buffer at the end of clock cycle 3.
o The bus transfer operation is now complete, and the master may
send a new address to start a new transfer in clock cycle 4.
o The Slave-Ready signal is an acknowledgement from the slave to
the master, confirming that valid data have been sent.
Advantages of Slave-Ready signal:

1. This signal allows the duration of the data transfer to vary from one device to another.
2. If the addressed device does not respond at all, the master waits for some predefined maximum number of clock cycles, then aborts the operation. This could be the result of an incorrect address or a device malfunction.

2. Asynchronous bus:
o An alternate scheme for controlling data transfers on the bus is
based on the use of handshake between the master and slave. The
common clock is replaced by two timing control lines, Master-Ready
and Slave-Ready.

Handshake protocol:
o The master places the address and command information on the
bus.
o Then it indicates to all devices that it has done so by activating the
Master-ready line.
o This causes all devices on the bus to decode the address.
o The selected slave performs the required operation and informs the
processor by activating the Slave-Ready line.
o The master waits for Slave-Ready to become asserted before it
removes its signals from the bus.
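The protocol above can be traced step by step in a simulation. What matters here is the event ordering - each transition on one timing line is answered by a transition on the other (a sketch; the only signal names taken from the notes are Master-ready and Slave-ready):

```python
# Full-handshake input transfer: trace of the t0..t5 event sequence.

def handshake_input(slave_data):
    events = []
    events.append("t0: master places address and command on the bus")
    events.append("t1: Master-ready -> 1")          # after skew allowance
    # The selected slave decodes the address, then responds:
    events.append("t2: slave drives data, Slave-ready -> 1")
    latched = slave_data                            # master strobes data
    events.append("t3: master latches data, Master-ready -> 0")
    events.append("t4: master removes address and command")
    events.append("t5: slave removes data, Slave-ready -> 0")
    return latched, events

data, trace = handshake_input(0x41)
print(hex(data), len(trace))                        # prints: 0x41 6
```

The pairing is visible in the trace: the slave raises Slave-ready only after seeing Master-ready rise, and drops it only after seeing Master-ready fall - no common clock is consulted anywhere.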
Sequence of events for input data transfer:
[Figure 4.26: Handshake control of data transfer during an input operation - address and command, Master-ready, Slave-ready, and data signals over a bus cycle from t0 to t5.]

t0 → the master places the address and command information on the


bus , and all devices on the bus begin to decode this information.

t1 → the master sets the Master-Ready line to 1. This informs the I/O devices that the address and command information is ready. The delay t1 – t0 allows for skew that may occur on the bus; t1 – t0 should be larger than the maximum possible bus skew. When the address information arrives at any device, it is decoded by the interface circuitry, and time must also be allowed for this decoding.

t2 → the selected slave performs the required operation by placing


the data from its data register on the data lines. At the same time , it
sets the Slave-Ready signal to 1. If extra delays are introduced by the
interface circuitry before it places the data on the bus, the slave must
delay the Slave-Ready signal accordingly. The period t2 – t1 depends
on the distance between master and the slave and on the delays
introduced by the slave’s circuitry.

t3 → the Slave-Ready signal arrives at the master, indicating that the input
data are available on the bus. The master should allow for bus skew.
It must also allow for the setup time needed by its input buffer. After a
delay equivalent to the maximum bus skew and the minimum setup
time, the master strobes the data into its input buffer. At the same
time it drops the Master-Ready signal , indicating that it has received
the data.

t4 → the master removes the address & command information from


the bus. The delay between t3 and t4 is needed for bus skew.

t5 → when the device interface receives a 1-to-0 transition of the


Master-Ready signal, it removes the data and the Slave-Ready signal
from the bus. This completes the data transfer.

[Figure 4.27: Handshake control of data transfer during an output operation - address and command, data, Master-ready, and Slave-ready signals over a bus cycle from t0 to t5.]

Sequence of events for the output operation:


Skew → skew occurs when two signals simultaneously transmitted
from one source arrive at the destination at different times. This
happens because different lines of the bus may have different
propagation speeds.
Full handshake → a change of state in one signal is followed by a
change in the other signal.

Advantage of asynchronous bus:


1. The handshake process eliminates the need for synchronization of the sender and receiver clocks, thus simplifying timing design.
2. When delays (propagation, setup time) change, the timing of the data transfer adjusts automatically to the new conditions. For a synchronous bus, by contrast, the clock circuitry must be designed carefully to ensure proper synchronization.
3. However, the rate of data transfer on an asynchronous bus is limited by the handshake itself, because each transfer involves two round-trip delays. Synchronous buses can achieve faster data rates because each transfer involves only one end-to-end propagation delay.

4. INTERFACE CIRCUITS
An input/output interface consists of the circuitry required to connect an
input/Output device to a computer bus. On one side of the interface we
have the bus signals for address, data and control and on the other side,
a data path with its associated controls to transfer data between the
interface and the input/output device. This side is called a port. A port can be classified as a serial port or a parallel port.

Functions of Input/Output Interface:


1. Provides a storage buffer for at least one word of data (or) one
byte in the case of byte-oriented devices.
2. Contains status flags that can be accessed by the processor to
determine whether the buffer is full.
3. Contains address-decoding circuitry to determine when it is being
addressed by processor.
4. Generates the appropriate timing signals required by the bus
control scheme.
5. Performs any format conversion that may be necessary to transfer data between the bus and the input/output device.

4.1 Parallel Port

A parallel port is used to send or receive data in groups of 8 or 16 bits simultaneously. According to hardware and control signal requirements, parallel ports are classified as input ports, used to receive data, and output ports, used to send data.

Figure 4.28 shows the hardware components needed for connecting a


keyboard to a processor. When a key is pressed, its switch closes and
establishes a path for an electrical signal. This signal is detected by an
encoder circuit that generates the ASCII code for the corresponding
character. The output of the encoder consists of the bits that represent
the encoded character and a control signal called valid, which indicates
that a key is pressed. This information is sent to the interface circuit
where the character is loaded into the DATAIN register and the SIN flag is
set. The interface circuit is connected to an asynchronous bus on which
transfers are controlled using the handshake signals Master-ready and
Slave-ready.

Figure 4.29 shows a suitable circuit for an input interface. The output lines
of the DATAIN register are connected to the data lines of the bus, which
are turned on when the processor issues a read instruction with the
address that selects this register. The SIN signal generated by a status flag circuit is sent to bus line D0, which means it will appear as
bit 0 of the status register. An address decoder is used to select the input
interface when the high-order 31 bits of an address correspond to any of
the addresses assigned to this interface. Address bit A0 determines
whether the status or the data registers is to be read when the Master-
ready signal is active. The control handshake is accomplished by activating
the Slave-ready signal when either Read-status or Read-data is equal to 1.
Figure 4.30 shows the possible implementation of the status flag circuit.
Figure 4.31 shows an output interface that can be used to connect an
output device, such as a printer, to a processor. The printer operates
under the control of the handshake signals Valid and Idle in a manner
similar to the handshake used on the bus with the Master-ready and
slave-ready signals.
When it is ready to accept a character, the printer asserts its Idle
signal. The interface circuit can then place a new character on the data
lines and activate the Valid signal. In response, the printer starts
printing the new character and negates the Idle signal, which in turn
causes the interface to deactivate the Valid signal. The interface
contains a data register, DATAOUT, and a status flag, SOUT. The SOUT
flag is set to 1 when the printer is ready to accept another character,
and it is cleared to 0 when a new character is loaded into DATAOUT by
the processor.
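The polled-output behaviour of the SOUT flag described above can be modelled in software. This is only a toy sketch: the register and flag names (DATAOUT, SOUT) follow the text, while the class and function names are ours, and the printer's response (normally asynchronous hardware) is simulated inside the polling loop.

```python
# Toy model of polled output through DATAOUT/SOUT (names from the text;
# everything else is a hypothetical simplification).

class PrinterInterface:
    def __init__(self):
        self.sout = 1        # 1 = printer ready to accept a character
        self.dataout = None  # the DATAOUT register
        self.printed = []

    def write_char(self, ch):
        self.dataout = ch
        self.sout = 0        # SOUT cleared when DATAOUT is loaded

    def printer_takes_char(self):
        # Models the printer accepting the character and raising Idle.
        self.printed.append(self.dataout)
        self.sout = 1        # ready for the next character

def send(iface, text):
    for ch in text:
        while iface.sout == 0:          # poll the status flag
            iface.printer_takes_char()  # in hardware this happens asynchronously
        iface.write_char(ch)
    iface.printer_takes_char()          # let the printer drain the last character

iface = PrinterInterface()
send(iface, "OK")
print("".join(iface.printed))  # -> OK
```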

Figure 4.34 shows a general purpose parallel interface circuit that can be
configured in a variety of ways. Data lines P7 through P0 can be used for
either input or output purposes. The DATAOUT register is connected to
these lines via three-state drivers that are controlled by a data direction
register, DDR. The processor can write any 8-bit pattern into DDR. For a
given bit, if the DDR value is 1, the corresponding data line acts as an
output line; otherwise, it acts as an input line.
Two lines, C1 and C2 are provided to control the interaction between the
interface circuit and the I/O device it serves. These lines are also
programmable. Line C2 is bidirectional to provide several different modes
of signaling, including the handshake. The ready and accept lines are the
hand-shake control lines on the processor bus side, and hence would be
connected to Master-ready and Slave-ready. The input signal My-address
should be connected to the output of an address decoder that recognizes
the address assigned to the interface. There are three register select lines,
allowing up to eight registers for various modes of operation. An
interrupt-request output is also provided.
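The effect of the data direction register can be illustrated with a small model: for each bit, DDR = 1 makes the corresponding line an output driven from DATAOUT, while DDR = 0 makes it an input read from the external device. DDR and DATAOUT follow the text; the function `port_lines` and its behaviour are our simplification of the three-state drivers.

```python
# Sketch of DDR-controlled line direction on an 8-bit parallel port.

def port_lines(ddr, dataout, external):
    """Return the 8-bit value seen on lines P7..P0."""
    value = 0
    for bit in range(8):
        mask = 1 << bit
        if ddr & mask:                 # output line: driven by DATAOUT
            value |= dataout & mask
        else:                          # input line: driven by the device
            value |= external & mask
    return value

# Low nibble configured as outputs, high nibble as inputs:
print(hex(port_lines(ddr=0x0F, dataout=0x05, external=0xA0)))  # -> 0xa5
```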

Asynchronous Transmission
This approach uses a technique called start-stop transmission. Data are
organized in small groups of 6 to 8 bits, with a well-defined beginning and
end. In a typical arrangement, alphanumeric characters encoded in 8 bits
are transmitted as shown in Figure 7.16. The line connecting the
transmitter and the receiver is in the 1 state when idle. A character is
transmitted as a 0 bit, referred to as the Start bit, followed by 8 data bits
and 1 or 2 Stop bits. The Stop bits have a logic value of 1. The 1-to-0
transition at the beginning of the Start bit alerts the receiver that data
transmission is about to begin. Using its own clock, the receiver
determines the position of the next 8 bits, which it loads into its input
register. The Stop bits following the transmitted character, which are equal
to 1, ensure that the Start bit of the next character will be recognized.
When transmission stops, the line remains in the 1 state until another
character is transmitted.
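The start-stop frame described above (idle line at 1, a 0 Start bit, 8 data bits, one or more Stop bits at 1) can be sketched as follows. The LSB-first bit order on the wire is an assumption of this sketch, not something the text specifies.

```python
# Sketch of start-stop (asynchronous) framing of one character.

def frame(byte, stop_bits=1):
    bits = [0]                                   # Start bit
    bits += [(byte >> i) & 1 for i in range(8)]  # 8 data bits, LSB first
    bits += [1] * stop_bits                      # Stop bit(s), logic 1
    return bits

def deframe(bits):
    assert bits[0] == 0 and bits[9] == 1         # Start bit 0, Stop bit 1
    return sum(b << i for i, b in enumerate(bits[1:9]))

f = frame(ord('A'))            # 'A' = 0x41
assert deframe(f) == ord('A')  # round-trip recovers the character
print(f)
```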
Synchronous Transmission
In the start-stop scheme described above, the position of the 1-to-0
transition at the beginning of the start bit in Figure 7.16 is the key to
obtaining correct timing information. This scheme is useful only where the
speed of transmission is sufficiently low and the conditions on the
transmission link are such that the square waveforms shown in the figure
maintain their shape. For higher speed a more reliable method is needed
for the receiver to recover the timing information. In synchronous
transmission, the receiver generates a clock that is synchronized to that of
the transmitter by observing successive 1-to-0 and 0-to-1 transitions in
the received signal. It adjusts the position of the active edge of the clock
to be in the center of the bit position. A variety of encoding schemes are
used to ensure that enough signal transitions occur to enable the receiver
to generate a synchronized clock and to maintain synchronization. Once
synchronization is achieved, data transmission can continue indefinitely.
Encoded data are usually transmitted in large blocks consisting of several
hundreds or several thousands of bits. The beginning and end of each
block are marked by appropriate codes, and data within a block are
organized according to an agreed upon set of rules. Synchronous
transmission enables very high data transfer rates.
5.5 Standard I/O Interfaces (USB)
A standard I/O Interface is required to fit the I/O device with an Interface
circuit. The processor bus is the bus defined by the signals on the
processor chip itself. The devices that require a very high speed
connection to the processor such as the main memory, may be connected
directly to this bus. The bridge connects two buses, which translates the
signals and protocols of one bus into another. The bridge circuit
introduces a small delay in data transfer between processor and the
devices.

We have three widely used Bus standards. They are,


• PCI(Peripheral Component Inter Connect) - PCI defines an
expansion bus on the motherboard.
• SCSI(Small Computer System Interface) - SCSI bus is a high
speed parallel bus intended for devices such as disk and video
display.
• USB(Universal Serial Bus) - USB uses serial transmission to
suit the needs of equipment ranging from keyboards to game
controls to internet connections.
• SCSI and USB are used for connecting additional devices both
inside and outside the computer box.

3. USB – Universal Serial Bus


• USB supports 3 speeds of operation. They are,
✓ Low speed (1.5 Mb/s)
✓ Full speed (12 Mb/s)
✓ High speed (480 Mb/s)
• The USB has been designed to meet the key objectives. They are,
✓ It provides a simple, low cost & easy to use interconnection
system that overcomes the difficulties due to the limited number
of I/O ports available on a computer.
✓ It accommodates a wide range of data transfer characteristics
for I/O devices including telephone & Internet connections.
✓ Enhance user convenience through ‘Plug & Play’ mode of
operation.

Port Limitation:-
✓ Normally the system has a few limited ports.
✓ To add new ports, the user must open the computer box to gain
access to the internal expansion bus & install a new interface
card.
✓ The user may also need to know how to configure the device &
the s/w.

Merits of USB:-
✓ USB helps to add many devices to a computer system at any
time without opening the computer box.

Device Characteristics:-
✓ The kinds of devices that may be connected to a computer
cover a wide range of functionality.
✓ The speed, volume & timing constrains associated with data
transfer to & from devices varies significantly.

Eg:1 Keyboard
✓ Since the event of pressing a key is not synchronized to any
other event in a computer system, the data generated by
keyboard are called asynchronous.
✓ The data generated from keyboard depends upon the speed of
the human operator which is about 100bytes/sec.
Eg-2: Requirements for Sampled Voice:-
✓ It is important to maintain precise time (delay) in the sampling
& replay process.
✓ A high degree of jitter (Variability in sampling time) is
unacceptable.

Eg-3:Data transfer for Image & Video:-


✓ The transfer of images & video require higher bandwidth.
✓ The bandwidth is the total data transfer capacity of a
communication channel.
✓ To maintain high picture quality, the image should be
represented by about 160 kb, and it is transmitted 30 times per
second, for a total bandwidth of 44 MB/s.

Plug & Play:-


• The main objective of USB is that it provides a plug & play
capability.
• The plug & play feature enables the connection of a new device at
any time, while the system is in operation.
• The system should,
➢ Detect the existence of the new device automatically.
➢ Identify the appropriate device driver s/w.
➢ Establish the appropriate addresses.
➢ Establish the logical connection for communication.
• The USB is also hot-pluggable, which means a device can be
plugged into or removed from a USB port while power is turned on.
USB Architecture:-
• USB has a serial bus format which satisfies the low-cost & flexibility
requirements.
• Clock & data information are encoded together & transmitted as a
single signal.
• There are no limitations on clock frequency or distance arising from
data skew, & hence it is possible to provide a high data transfer
bandwidth by using a high clock frequency.
• To accommodate a large no. of devices that can be added /
removed at any time, the USB has the tree structure.
• Each node of the tree has a device called a hub, which acts as an
intermediate control point between the host & I/O devices.
• At the root of the tree, the root hub connects the entire tree to the
host computer.
• The leaves of the tree are the I/O devices being served (for
example, keyboard, Internet connection, speaker, or digital TV),
which are called functions in USB terminology.

Fig. 4.43: USB Tree Structure


• The tree structure enables many devices to be connected while
using only simple point-to-point serial links.
• Each hub has a number of ports where devices may be connected,
including other hubs.
• In normal operation, a hub copies a message that it receives from
its upstream connection to all its downstream ports.
• As a result, a message sent by the host computer is broadcast to all
I/O devices, but only the addressed device will respond to that
message.

Addressing :
• Each device on the USB, whether it is a hub or an I/O device, is
assigned a 7-bit address.
• This address is local to the USB tree and is not related in any way
to the addresses used on the processor bus.
• When a device is first connected to a hub, or when it is powered on,
it has the address 0.
• The hardware of the hub to which this device is connected is
capable of detecting that the device has been connected and it
records this fact as part of its own status information. Periodically,
the host polls each hub to collect status information and learn
about new devices that may have been added or disconnected.
When the host is informed that a new device has been connected,
it uses a sequence of commands to send a reset signal on the
corresponding hub port, read information from the device about its
capabilities, send configuration information to the device, and
assign the device a unique USB address. Once this sequence is
completed the device begins normal operation and responds only
to the new address.

• This is a key feature that gives the USB its plug-and-play capability.
• When a device is powered off, a similar procedure is followed. The
corresponding hub reports this fact to the USB system software,
which in turn updates its table. The USB software must maintain a
complete picture of the bus topology and the connected devices at
all times.

• Locations in the device to or from which data transfer can take
place, such as status, control, and data registers, are called
endpoints. They are identified by a 4-bit number. Actually, each
4-bit value identifies a pair of endpoints, one for input and one
for output. Thus, a device may have up to 16 input/output pairs of
endpoints.
• A USB pipe, which is a bidirectional data transfer channel, is
connected to one such pair.

USB Protocols:
• All information transferred over the USB is organized in packets,
where a packet consists of one or more bytes of information.
• The information transferred on the USB can be divided into two
broad categories: control & data.
• Control packets perform such tasks as addressing a device to
initiate data transfer, acknowledging that data have been received
correctly, or indicating error.
• Data packets carry information that is delivered to a device. For
example, input and output data are transferred inside data packets.
• A packet consists of one or more fields containing different kinds of
information. The first field of any packet is called the packet
identifier, PID, which identifies the type of that packet.
• Control packets used for controlling data transfer operations are
called token packets.
• A token packet starts with the PID field, using one of two PID
values to distinguish between an IN packet and an OUT packet,
which control input and output transfers, respectively. The PID field
is followed by the 7-bit address of a device and the 4-bit endpoint
number within that device. The packet ends with 5 bits for error
checking, using a method called cyclic redundancy check (CRC).
The CRC bits are computed based on the contents of the address
and endpoint fields. By performing an inverse computation, the
receiving device can determine whether the packet has been
received correctly.
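The cyclic-redundancy-check idea behind the 5 check bits can be sketched as polynomial long division. USB's CRC5 generator is x^5 + x^2 + 1, but the real USB bit-ordering and register-inversion conventions are omitted here, so this shows only the divide-and-verify principle, not wire-accurate USB CRC values.

```python
# Sketch of CRC generation/checking with the 5-bit generator
# polynomial x^5 + x^2 + 1 (top x^5 term implicit in POLY).

POLY = 0b00101

def crc5(bits):
    reg = 0
    for b in bits:                     # divide the message stream by POLY
        top = (reg >> 4) & 1
        reg = ((reg << 1) | b) & 0b11111
        if top:
            reg ^= POLY
    for _ in range(5):                 # flush: multiply message by x^5
        top = (reg >> 4) & 1
        reg = (reg << 1) & 0b11111
        if top:
            reg ^= POLY
    return reg

addr, endpoint = 0b0010101, 0b0110     # hypothetical 7-bit address, 4-bit endpoint
payload = [(addr >> i) & 1 for i in range(6, -1, -1)] + \
          [(endpoint >> i) & 1 for i in range(3, -1, -1)]
check = crc5(payload)
# Appending the check bits makes the whole codeword divide evenly:
print(crc5(payload + [(check >> i) & 1 for i in range(4, -1, -1)]))  # -> 0
```

The receiver performs the same division on the received bits (the "inverse computation" of the text); a non-zero remainder signals a transmission error.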

Isochronous traffic on USB


• One of the key objectives of the USB is to support the transfer of
isochronous data, such as sampled voice, in a simple manner.
Devices that generate or receive isochronous data require a time
reference to control the sampling process.
• Isochronous data are allowed only on full-speed and high-speed
links.

Electrical Characteristics
USB connections consist of four wires:
two carry power (+5 V and Ground), and
two carry data.
Simple devices can draw their power from the bus, so no separate
power supply is needed for them; devices with higher power
requirements use their own supply.
Two methods are used to send data over a USB cable:
single-ended transmission
differential signaling
Single-ended transmission
When sending data at low speed, a high voltage relative to
Ground is transmitted on one of the two data wires to
represent a 0 and on the other to represent a 1. The
Ground wire carries the return current in both cases.
This scheme is highly susceptible to noise, because the
voltage on the ground wire is common to all the devices
connected to the computer.

Differential signaling
The data signal is injected between two data wires twisted
together.

The ground wire is not involved.


The receiver senses the voltage difference between the two
signal wires directly, without reference to ground.
This arrangement is very effective in reducing the noise seen
by the receiver, because any noise injected on one of the
two wires of the twisted pair is also injected on the other.
Since the receiver is sensitive only to the voltage difference
between the two wires, the noise component is cancelled
out.
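The common-mode cancellation argument above can be demonstrated with a toy model: the same noise voltage couples onto both wires of the twisted pair, so the difference the receiver senses is unaffected. The voltage levels and noise range are illustrative only.

```python
# Toy demonstration that differential signaling rejects common-mode noise.

import random

def transmit(bit, noise):
    d_plus = (3.3 if bit else 0.0) + noise   # noise couples onto BOTH wires
    d_minus = (0.0 if bit else 3.3) + noise
    return d_plus, d_minus

def receive(d_plus, d_minus):
    # The receiver senses only the voltage difference, not ground.
    return 1 if d_plus - d_minus > 0 else 0

random.seed(0)
bits = [random.randint(0, 1) for _ in range(8)]
recovered = [receive(*transmit(b, noise=random.uniform(-2, 2))) for b in bits]
print(bits == recovered)  # -> True: the noise cancels in the difference
```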

4. SATA - Serial Advanced Technology Attachment


SATA is an interface for transferring data between a computer's
central circuit board and storage devices.
In the early days of the personal computer, the bus of a popular
IBM computer called the AT, which was based on Intel's 80286
microprocessor bus, became an industry standard. It was named
ISA, for Industry Standard Architecture.
An enhanced version, including a definition of the basic software
needed to support disk drives, was later named ATA, for AT
Attachment bus.

A serial version of the same architecture became known as SATA,


which is now widely used as an interface for disks.
Like all standards, several versions of SATA have been developed
with added features and higher speeds.
The original parallel version has been renamed PATA, but it is no
longer used in new equipment.
The basic SATA connector has 7 pins, connecting two twisted pairs
and three ground wires.
Differential transmission is used, with signaling rates ranging
from 1.5 to 6.0 Gigabits/s.
Some of the recent versions provide an isochronous transmission
feature to support audio and video devices.

Memory
5.6 Basic concepts
Ideal memory – fast, large and inexpensive.
A 16-bit computer generates 16-bit addresses: 2^16 = 2^10 × 2^6
= 64K memory locations.
32-bit addresses: 2^32 = 2^30 × 2^2 = 4G memory locations.
40-bit addresses: 2^40 = 1T memory locations.

(2^10 = Kilo, 2^20 = Mega, 2^30 = Giga, 2^40 = Tera)
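The address-size arithmetic above can be written as a small helper (a sketch; the function name is ours):

```python
# Number of addressable locations for a k-bit address: 2^k.

def addressable(k_bits):
    """Number of addressable locations for a k-bit address."""
    return 2 ** k_bits

print(addressable(16))            # -> 65536 (64K locations)
print(addressable(32) // 2**30)   # -> 4 (i.e. 4G locations)
print(addressable(40) // 2**40)   # -> 1 (i.e. 1T locations)
```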
[Figure 5.1: Connection of the memory to the processor. The processor's
MAR drives a k-bit address bus and its MDR connects to an n-bit data
bus; the memory contains up to 2^k addressable locations with a word
length of n bits. Control lines carry R/W, MFC, etc.]

• Data transfer between the memory and the processor takes place
through the use of two processor registers, usually called MAR
(Memory Address Register) and MDR (Memory Data Register).
• If MAR is k bits long and MDR is n bits long, then the memory unit
may contain upto 2k addressable locations. During a memory cycle, n
bits of data are transferred between the memory and the processor.
This transfer takes place over the processor bus, which has k address
lines and n data lines.
• The bus also includes the control lines Read/Write (R/W) and Memory
Function Completed (MFC) for coordinating data transfers.
• The processor reads data from the memory by loading the address of
the required memory location into the MAR register and setting the
R/W line to 1. The memory responds by placing the data from the
addressed location onto the data lines, and confirms this action by
asserting the MFC signal. Upon receipt of the MFC signal, the
processor loads the data lines into the MDR register.
• The processor writes data into a memory location by loading the
address of this location into MAR and loading the data into MDR. It
indicates that a write operation is involved by setting the R/W line to 0.
• If a read or write operation involves consecutive address locations in
the main memory, then a "block transfer" operation can be performed
in which the only address sent to the memory is the one that identifies
the first location.
• Memory accesses may be synchronized using a clock, or they may be
controlled using special signals that control transfers on the bus.
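The read handshake just described — load MAR, set R/W = 1, wait for MFC, then latch the data lines into MDR — can be modelled schematically. The register names mirror the text; the class and synchronous timing are our simplification.

```python
# Schematic model of a processor read using MAR/MDR and the MFC signal.

class Memory:
    def __init__(self, contents):
        self.contents = contents

    def read(self, address):
        data = self.contents[address]   # memory places data on the data lines
        mfc = True                      # and asserts Memory Function Completed
        return data, mfc

mem = Memory({0x100: 0xDEAD})
MAR = 0x100                             # processor loads the address into MAR
data_lines, MFC = mem.read(MAR)         # R/W line set to 1 (read)
MDR = data_lines if MFC else None       # MDR latched upon receipt of MFC
print(hex(MDR))  # -> 0xdead
```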

Memory access time :


➢ Time that elapsed between the initiation of an operation and the
completion of that operation. (e.g.) the time between Read and the
MFC signal.

Memory cycle time:


➢ Which is the min. time delay required between the initiation of two
successive memory operations. (e.g.) time between the initiation of
two successive Read operations.

RAM: Any location can be accessed for a Read or Write operation in some
fixed amount of time that is independent of the location’s address.

1. Cache Memory
➢ Small, fast memory between main memory & processor.
➢ Reduces memory access time.
➢ Holds the currently active segments of a program and their data.

2. Virtual Memory
➢ The address generated by the processor is a virtual or logical
address.
➢ The virtual address space is mapped onto the physical address
space.
➢ Used to increase the apparent size of physical memory.
5.7 Semiconductor RAM Memories
5.7.1 Internal organization of a memory chip (Fig 5.2)

➢ Memory cells are usually organized in the form of an array, in which


each cell is capable of storing one bit of information.
➢ Each row of cells constitutes a memory word, and all cells of a row
are connected to a common line referred to as the word line, which
is driven by the address decoder on the chip.
➢ The cells in each column are connected to a Sense/Write circuit by
two bit lines. The Sense/Write circuits are connected to the data
input/output lines of the chip.
➢ During a Read operation, these circuits sense, or read, the
information stored in the cells selected by a word line and transmit
this information to the output data lines.
➢ During a Write operation, the Sense/Write circuits receive input
information and store it in the cells of the selected word.
➢ The figure 5.2 is an example of a very small memory chip
consisting of 16 words of 8 bits each, i.e. a 16 × 8 organization.

16 × 8 bits = 2^4 × 2^3 bits: 2^4 locations of 8 bits each
→ 4 address lines
8 data lines
+ 2 control signals – R/W and CS (chip select, used in a multi-chip
memory system)
= 14 + 2 external connections (Vcc & ground) = 16 pins

* 1K memory cells = 1024 memory cells = 2^10 = 2^7 × 2^3
= 2^7 locations of 8 bits each (128 × 8)
→ 7 address lines
8 data lines
2 control signals
= 17 + 2 external connections = 19 pins

b 7 b 7 b 1 b 1 b 0 b 0

W 0




FF FF
A 0 W 1




A 1
Address • • • • • • Memory
decode r • • • • • • c e l ls
A 2
• • • • • •
A 3

W 15



S ens e /W r i te c i S ens e/W ri te c i S ens e /W r i te c i R /W
r c u it r c u it r c u it
C S

D a t a i n p u /t o u t p u t l i n e s : b 7 b 1 b 0

Figure5.2.Organiza tionofbitce llsinamem oryc hip .

1k 1 format = 210  1 format


10 bit address
1 data line
2 control signals

13 +2
4M bit chip = 2 2  2 20
bits = 219  2 3 bits
19 address lines
8 data lines

27
+2 control signals

29 +2
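The pin-count bookkeeping in these examples follows one pattern, which can be written out as a helper. The "+2 control, +2 power" convention is taken from the worked examples above; the function name is ours.

```python
# External-connection count for a chip of N locations x w bits:
# log2(N) address lines + w data lines + 2 control (R/W, CS) + 2 power.

from math import log2

def external_connections(locations, bits_per_word):
    address_lines = int(log2(locations))
    data_lines = bits_per_word
    control = 2                       # R/W and CS
    power = 2                         # Vcc and ground
    return address_lines + data_lines + control + power

assert external_connections(16, 8) == 16          # 16x8 chip: 4+8+2+2
assert external_connections(1024, 1) == 15        # 1Kx1 chip: 10+1+2+2
assert external_connections(512 * 1024, 8) == 31  # 512Kx8 chip: 19+8+2+2
print("ok")
```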
[Figure 5.3: Organization of a 1K × 1 memory chip. The 10-bit address
is split into a 5-bit row address, decoded to select one row (W0...W31)
of a 32 × 32 cell array, and a 5-bit column address that drives a
32-to-1 output multiplexer and input demultiplexer through the
Sense/Write circuitry, under control of R/W and CS.]

5.7.2 Static Memories
Memories that consist of circuits capable of retaining their state as long as
power is applied are known as static memories.

(e.g.) Static Ram (SRAM) refer fig. 5.4


➢ Two inverters are cross connected to from a latch.
➢ The latch is connected to two bit lines by transistors T1 &
T2 . These transistors act as switches that can be opened
or closed under control of the word line. When the word
line is at ground level, the transistors are turned off and
the latch retains its state.
➢ For example, when the cell is in state 1, the logic value
at point X is 1 and at point Y is 0. This state is
maintained as long as the signal on the word line is at
ground level.

Read Operation:
• The word line is activated to close switches T1 & T2

If the cell is in state 1, the signal on bit line b is high and the signal
on bit line b' is low. The opposite is true if the cell is in state 0.


Sense / Write circuits at the end of the bit lines monitor the state of
b and b’ and set the output accordingly.
Write Operation:
• The state of the cell is set by placing the appropriate value on bit
line b and its complement b’ and then activating the word line.
This forces the cell into the corresponding state. The required
signals are generated by the Sense/Write circuit.
[Figure 5.4: A static RAM cell. Two cross-coupled inverters form a
latch (points X and Y), connected to bit lines b and b' by transistors
T1 and T2, which are controlled by the word line.]

CMOS Cell: (SRAM) complementary metal-oxide semiconductor. (Fig 5.5)


• Transistor pairs (T3, T5) and (T4, T6) form the inverters in the
latch.
• To maintain a state 1, the voltage at point X is maintained high by
having transistors T3 & T6 on, while T4 & T5 are off.
• A continuous power is needed for the cell to retain its state. If
power is interrupted, the cell’s contents will be lost. When the
power is restored, the latch will settle into a stable state, but it will
not necessarily be the same state the cell was in before the
interruption.

Disadvantages:
• volatile memory
• 6 transistors are needed for each cell, hence the size is large.
• High cost because of the number of transistors.

Advantages:
• Low power consumption because current flows in the cell only
when the cell is being accessed.
• Access times are very less and are used in applications where
speed is of critical concern.

[Figure 5.5: An example of a CMOS memory cell. Transistor pairs
(T3, T5) and (T4, T6), connected to Vsupply, form the two inverters;
T1 and T2 connect points X and Y to the bit lines under control of
the word line.]

5.7.3 Dynamic RAM’s

Asynchronous DRAM:
Dynamic RAM -- Less expensive because simpler cells are used.
-- These cells do not retain their state indefinitely.

Dynamic RAM:
•Information is stored in a dynamic memory cell in the form of a
charge on a capacitor, and this charge can be maintained for only
tens of milliseconds.
•Since the cell is required to store information for a much longer
time, its contents must be periodically refreshed by restoring the
capacitor charge to its full value.
Bit line

Word line

T
C

Figure 5.6. A single-transistor dynamic memory cell

To store:
• The transistor T is turned on and an appropriate voltage is applied
to the bit line. This causes a known amount of charge to be stored
in the capacitor.
• After the transistor is turned off, the capacitor begins to
discharge. This is caused by the capacitor's own leakage resistance
and by the fact that the transistor continues to conduct a tiny
amount of current after it is turned off.


• Hence the information stored in the cell can be retrieved correctly
only if it is read before the charge on the capacitor drops below
some threshold value.
[Figure 5.7: Internal organization of a 2M × 8 dynamic memory chip.
The multiplexed address lines (A20-9, then A8-0) feed a row address
latch, strobed by RAS (Row Address Strobe), driving the row decoder of
a 4096 × (512 × 8) cell array, and a column address latch, strobed by
CAS (Column Address Strobe), driving the column decoder; Sense/Write
circuits, controlled by CS and R/W, connect to data lines D7-D0.]

To Read:
• The transistor on the selected cell is turned on.
• A sense amplifier connected to the bit line detects whether the
charge stored on the capacitor is above the threshold value.

• If the charge is above the threshold value, the sense amplifier
- drives the bit line to the full voltage that represents logic
value 1;
- this recharges the capacitor to logic value 1.
• Else, if the charge is below the threshold, the sense amplifier
- pulls the bit line to ground level;
- which ensures there is no charge in the capacitor,
- representing the value 0.
• Reading the cell automatically refreshes the contents of the entire
row.

[Fig. 5.7] Memory capacity = 16-megabit DRAM:

16 M bits = 2^4 × 2^20 bits = 2^24 bits
= 2^21 × 2^3 bits = 2M × 8 bits

Viewed as a square array: 2^24 bits = 2^12 × 2^12 = 4K × 4K array.

(A 64M-bit chip, 2^26 bits, could similarly be organized as 16M × 4,
8M × 8, or 4M × 16.)

4K × 4K = 4096 × 4096 bits = 4096 rows × (2^9 × 2^3) bits
= 4096 rows, each holding 512 bytes of data.
Address bits needed: row selection = 12 bits (2^12 = 4096),
column (byte) selection = 9 bits (2^9 = 512); total = 21 bits.
• To reduce the number of pins needed for external connections, the
row and the column addresses are multiplexed.
• During a read or a write operation, the row address is applied first, it
is then loaded into the row address latch in response to a signal
pulse on the RAS input of the chip. Then the read is initiated.
• Shortly after the row address is loaded, the column address is
applied to the address pins and loaded into the column address latch
under the control of Column Address Strobe (CAS) signal.
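The multiplexed row/column addressing just described — 12 row bits (A20-9) latched by RAS, then 9 column bits (A8-0) latched by CAS — amounts to splitting the 21-bit address. A sketch (the helper name is ours):

```python
# Splitting a 21-bit address into the 12-bit row and 9-bit column
# parts for the 4096-row x 512-byte array described above.

def split_address(addr21):
    row = (addr21 >> 9) & 0xFFF      # A20-9: selects one of 4096 rows
    col = addr21 & 0x1FF             # A8-0: selects one of 512 bytes
    return row, col

row, col = split_address(0b101010101010_110011001)
print(row, col)  # -> 2730 409
```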

Refresh circuit :
- each row of cell is accessed periodically automatically.
- Many chips incorporate this refresh facility within the
chips.
- The dynamic nature of these memory chips is invisible
to the user.
Memory device timing controlled asynchronously.
➢ Row Address Strobe (RAS), Column Address Strobe (CAS) signals
govern the timing.
➢ The processor must take into account the delay in the response
of the memory.
➢ Such memories are referred to as asynchronous DRAM’s.
Fast page mode: bulk transfer.
The row selection address A20-9 is placed first, then the column
(byte) selection address A8-0, and the selected 8 bits of data appear
on D7-0.


To access the other bytes in the same row without having to reselect the
row,

- place a latch between the Sense circuit and Column decoder.


- row address will load the latch with all the bits in the selected
row & column.
- then apply the next column address to place the different byte
on the data lines.
-transfer the bytes of a row in sequential order under the control
of successive CAS signals.
Advantage:
- transfers block of data at a much faster rate.
- Block transfer capacity – fast mode.

5.7.4 Synchronous DRAM


-Operations are directly synchronized with a clock signal.
[fig.5.8]
SDRAM’s modes of operation:
(i) burst operation ---block transfer capability.[fig.5.9]
Memory Latency: Amount of time to transfer a word of data to or from
the memory.
For Burst operations - time taken to transfer the 1st word of data.

Clock frequency = 100 MHz.

Clock period = 1 / (clock frequency) = 1 / (100 × 10^6 Hz)
= 10^-8 s = 10 ns.

(10^-3 s = ms, 10^-6 s = µs, 10^-9 s = ns)

The 1st word is transferred 5 clock cycles after the assertion of the
RAS signal, so
Latency = 5 × 10 ns = 50 ns.
[Figure 5.8: Synchronous DRAM. A refresh counter and row/column
address latches (with a column address counter) feed the row and
column decoders of the cell array; Read/Write circuits and latches
connect through data input and output registers to the data lines.
A mode register and timing-control block is clocked and driven by
RAS, CAS, R/W and CS.]

[Figure 5.9: Burst read of length 4 in an SDRAM. After the row and
column addresses are presented under R/W, RAS and CAS control, four
data words D0-D3 appear on successive clock cycles.]

Bandwidth – number of bits or bytes that can be transferred in one


second.
Bandwidth depends on:
- speed of the memory.
- Speed of access to the stored data.
- Number of bits that can be accessed in parallel.
- Transfer capability of the links that connect the
memory and the processor. ( i.e. speed of the bus)
Bandwidth = rate at which data are transferred * width of the data bus
Double Data Rate SDRAM (DDR SDRAM)
*Standard SDRAM
• data transfers are performed on the rising edge of the
clock signal.

• Block transfers.
DDR SDRAM
o Transfers data on both edges of the clock.
o Bandwidth is doubled for long burst transfers.
o Cells are organized in two banks.
o each bank can be accessed separately.
o Consecutive words of a given block are stored in different
banks.
o Such interleaving of words allows simultaneous access to two
words that are transferred on successive edges of the clock.

o Block transfers
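The bandwidth relation above, including the doubling DDR achieves by transferring on both clock edges, can be sketched with illustrative numbers (the 100 MHz / 64-bit figures are ours, not from the text):

```python
# Bandwidth = transfer rate x width of the data bus; DDR transfers on
# both clock edges, doubling the burst bandwidth.

def bandwidth_bytes_per_sec(transfer_rate_hz, bus_width_bits, edges_per_clock=1):
    return transfer_rate_hz * edges_per_clock * bus_width_bits // 8

sdr = bandwidth_bytes_per_sec(100_000_000, 64)                      # one edge
ddr = bandwidth_bytes_per_sec(100_000_000, 64, edges_per_clock=2)   # both edges
print(sdr, ddr, ddr == 2 * sdr)  # DDR doubles the burst bandwidth
```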

5.7.5 Structure of Larger Memories


Static Memory System:
[fig. 5.10]
Memory capacity = 2M words of 32 bits each, implemented using
512K × 8 static memory chips.

2M × 32 bits = 2 × 2^20 × 2^5 bits = 2^26 bits
= 16 × 2^19 × 2^3 bits
= 16 × 512K × 8 bits

512K × 8 = 2^9 × 2^10 × 2^3 bits = 2^19 × 8 bits


[Figure 5.10: Organization of a 2M × 32 memory module using 512K × 8
static memory chips. Of the 21 address bits A20-A0, the high-order two
bits drive a 2-bit decoder whose outputs select one row of chips via
their chip-select inputs; the remaining 19 bits form the internal chip
address. The four chips in each row supply the data lines D31-24,
D23-16, D15-8 and D7-0.]

2M × 32 bits can be implemented using 16 chips of 512K × 8:
512K = 2^9 × 2^10 = 2^19 locations per chip
4 chip rows × 2^19 = 2^2 × 2^19 = 2^21 = 2M locations
Each of the 2M locations holds 32 bits (8 bits from each of the 4
chips in a row), so 21 address bits are needed.

Each chip has a control input called chip select.

Address bits (21 total):
high-order 2 bits – used to select the chip row
(row 0 → 00, row 1 → 01, row 2 → 10, row 3 → 11)
low-order 19 bits – used to access the specific byte location inside
each chip of the selected row.
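The 2-bit chip-row select / 19-bit internal address split described above can be sketched as follows (the helper name is ours):

```python
# Decoding a 21-bit address into a chip-row select and an internal
# chip address, as in the 2M x 32 module built from 512K x 8 chips.

def decode(addr21):
    chip_row = (addr21 >> 19) & 0b11      # high 2 bits: which row of chips
    internal = addr21 & ((1 << 19) - 1)   # low 19 bits: location inside chip
    return chip_row, internal

# Address with high bits 10 selects chip row 2, internal location 3:
print(decode(0b10_0000000000000000011))  # -> (2, 3)
```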

Dynamic memory system:


-physical implementation is done in the form of memory
modules.(because of packing constraints, they come as separate modules)
- SIMM’s (Single In-line Memory Module)
- DIMM’s (Dual In-line memory modules)
It is an assembly of several memory chips on a separate small board that
plugs vertically into a single socket on the mother board.

Advantage:
- occupy smaller amount of space on motherboard.
- allows easy expansion by using larger modules in the same
socket.

5.7.6 Memory System Considerations


Choice of RAM chips depends on the following factors:
→ cost
→speed
→power dissipation and
→size of the chip.

Static RAM's:
- very fast
- cost & size → high
- used in cache memory.

Dynamic RAM's:
- high density is achievable, so small size
- cheap
- used for larger memories, e.g. main memory.

Memory Controller:
- To reduce the number of address pins, dynamic memory chips use
multiplexed address inputs.
- The address is divided into two parts:
High-order address bits – select a row in the cell array; provided
first; controlled by the RAS signal.
Low-order address bits – select a column; provided on the same pins
as the high-order bits; latched using the CAS signal.


[fig.5.11]
• The processor gives all bits of the address at the same time.

• The required multiplexing is performed by a memory controller

circuit with the help of RAS & CAS timing signals.


• It also sends the R/W , CS signals to memory.
• When used with DRAM chips, which do not have self-refreshing
capability, the memory controller has to provide all the
information needed to control the refreshing process. It contains
a refresh counter that provides successive row addresses. Its
function is to cause the refreshing of all rows to be done within
the period specified for a particular device.
• Request signal →Processor tells memory operation is needed.
Synchronous DRAM → clock signal needed.

R/W → memory controller passes it to memory.


Data →connected directly between the processor & the memory.

[Figure 5.11. Use of a memory controller: the processor sends the full
address, R/W, Request, and Clock to the memory controller, which supplies
the memory with the multiplexed Row/Column address, RAS, CAS, R/W, CS,
and Clock. The data lines connect the processor and memory directly.]


Refresh overhead
Given: SDRAM with refresh period = 64 ms (all rows must be refreshed
       within this time)
       Clock frequency = 133 MHz
       Time to read each row = 4 clock cycles
       Total rows = 8K

Find the refresh overhead.

Number of rows = 8 × 2^10 = 2^13 = 8192 rows
Time taken to read each row = 4 clock cycles
Cycles needed for 8192 rows = 8192 × 4 = 32768 cycles
(each refresh operation is essentially a read operation)

Clock period = 1 / (133 × 10^6) s

Time taken to refresh 8192 rows = cycles needed × clock period
    = 32768 × 1 / (133 × 10^6) s
    = 246.4 × 10^-6 s
    ≈ 0.246 ms

In SDRAM, a typical period for refreshing all rows is 64 ms, so

Refresh overhead = 0.246 / 64 = 0.00384, i.e. about 0.384% of the total
time available for memory access.
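The same calculation can be checked with a few lines of Python:

```python
rows = 8 * 1024            # 8K rows
cycles_per_row = 4         # clock cycles to read (refresh) one row
clock_hz = 133e6           # 133 MHz clock
refresh_period = 64e-3     # all rows must be refreshed every 64 ms

refresh_time = rows * cycles_per_row / clock_hz    # seconds
overhead = refresh_time / refresh_period
print(f"{refresh_time * 1e3:.3f} ms")   # 0.246 ms
print(f"{overhead * 100:.3f} %")        # 0.385 % (0.384 % if truncated)
```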
5.7.7 RAM BUS memory:

Memory performance depends on


o Latency & bandwidth.
Speed of transfer depends on
o Speed of the memory device
o Speed of the bus
o Number of bits transferred per second (bus width)
Ways to increase the amount of data transfer on a speed limited bus:
- is to increase the width of the bus.
- by providing more data lines.
Disadvantage:
- expensive.
- requires lot of space on a motherboard.
Solution:
-RAM BUS technology.
Rambus technology:
-uses a narrow bus that is much faster.
-uses a fast signaling method which is used to transfer
information between chips.
- Instead of using signals that have voltage levels of either 0 or
Vsupply to represent the logic values, the signals have much smaller
swings around a reference voltage, Vref (about 2 V):
    logic 0 → Vref − 0.3 V        logic 1 → Vref + 0.3 V
This type of signaling is known as differential signaling.
Advantages:
- small voltage swings make it possible to have short transition
times.
- allows high speed of transmission.
Disadvantage:
-special techniques needed for bus design.
- needs special circuit interfaces.
-needs specially designed memory chips.
Memory chips:
-cell arrays.
-multiple banks of cell arrays are needed to access more than one
word at a time.
- circuitry needed to interface to the Rambus channel is included on
the chip.

Rambus DRAM (RDRAM)                  Direct RDRAM
- 9 data lines                       - 18 data lines
  (8 data + 1 parity checking)       - transfers 2 bytes of data at a time

• Communication between the processor, or some other device that

can serve as a master, and RDRAM modules, which serve as slaves,


is carried out by means of packets transmitted on the data lines.

• There are three types of packets: request, acknowledge, and data.

• A request packet issued by the master indicates the type of

operation that is to be performed. It contains the address of the


desired memory location and includes an 8-bit count that specifies
the number of bytes involved in the transfer. The operation type
includes memory reads and writes, as well as reading and writing of
various control registers in the RDRAM chips.

• When the master issues a request packet, the addressed slave

responds by returning a positive acknowledgement packet if it can


immediately satisfy the request. Otherwise, the slave indicates that it
is busy by returning a negative acknowledgement packet, in which
case the master will try again.

• The number of bits in the request packet exceeds the number of

data lines, which means that several clock cycles are needed to
transmit the entire packet. Using a narrow communication link is
compensated by the very high rate of transmission.

Compare DDR SDRAM and RDRAM :


DDR SDRAM RDRAM
-open Standard -- proprietary design of Rambus Inc.
- cheap (free) -- costly.
-- user should pay a royalty.

5.8 Read only Memories (ROM)


SRAM & DRAM → volatile
Nonvolatile memory: (ROM)

-holds instruction to load the boot program from the disk.


-used in embedded systems.

5.8.1 ROM
[fig. 5.12]
-logic value 0 is stored if the transistor is connected to the ground at
point P; otherwise a 1 is stored.

- data are written into a ROM when it is manufactured.


[Figure 5.12. A ROM cell: a transistor T connects the bit line to ground
at point P when activated by the word line. P connected → stores a 0;
P not connected → stores a 1.]

5.8.2 PROM
-allow data to be loaded by user.
-programmable ROM.
-before programming, all memory locations contain 0.
- A fuse is inserted at point P in fig. 5.12
- burning the fuse makes it 1.
- irreversible process.
PROM                                   ROM
- flexible and convenient              - cheap when prepared in large numbers
- faster and less expensive            - expensive write operation
  approach to write

5.8.3 EPROM
-allows stored data to be erased and new data to be loaded.
- Erasable Programmable ROM.
-similar to ROM fig. 5.12
-but uses a special transistor that can function either as a normal
transistor or as a disabled transistor (i.e. one that is always turned
off).
-contents can be erased by dissipating the charges trapped in the
transistors of memory cells, this is done by exposing the chip to UV light.

5.8.4 EEPROM
Disadvantage of EPROM
-chip must be physically removed from the circuit for
reprogramming and entire contents are erased by UV light.
EEPROM

Advantage :
-programmed and erased electrically.
-do not have to be removed for erasure.
-selective erasure possible.
Disadvantage of EEPROM
-different voltages needed for erasing, writing and reading the stored data.

5.8.5 Flash memory
- a flash cell is based on a single transistor controlled by trapped charge
like EEPROM.

- In an EEPROM, the contents of a single cell can be read or written.


Flash device:
- can read the contents of a single cell, but writes are performed on an
entire block of cells.
-prior to writing, the blocks contents are erased.
- greater density so higher capacity and lower cost/bit.
-single power supply and consume less power.
-applications include hand-held comp, cell phones , digital cameras and
MP3 music players.

Larger memory modules consisting of a no. of chips are needed:


Two popular choices are (i) Flash cards (ii) Flash drives.
Flash cards:
-standard interface.
-a card is simply plugged into a conveniently accessible slot.
Flash drives:
-emulate disk drives (hard disks)
- fitted into standard disk bays.
-storage capacity low (< 1GB) but hard disks can be many GB’s
Advantages:

-solid state electronic devices so no movable parts.


-shorter seek and access times, i.e. faster response.
-low power consumption (can be used for battery appliances)
-insensitive to vibration.
Disadvantages:
-smaller capacity.
-higher cost per bit.
-will deteriorate after it has been written a number of times (about
1 million times).
Hard disks→ extremely low cost /bit

5.9 Direct Memory Access (DMA)


Data transfer between processor and I/O is done either by polling or by
an interrupt request from I/O device.

Overhead in these methods:


o Several program instructions must be executed for each data word
transferred.
o The program polls the status register of the device.
o Instructions are needed for incrementing the memory address, and
o a word count must be maintained.
o For interrupts, there is the additional overhead of saving and
restoring the program state information that is needed.

Solution:
o DMA – transfer of large blocks of data at high speed
o DMA – without continuous intervention by processor
DMA Controller:
o DMA transfers are performed by a control circuit that is part of the
I/O device interface called the DMA Controller.
o This performs the functions that would normally be carried out by
the processor when accessing the main memory.

Steps:
(i) To initiate the transfer of a block of words, the processor sends
the
o Starting address
o The number of words in the block and
o The direction of the transfer (R/W)
(ii) On receiving this information , the DMA Controller performs the
requested operation.

o For each word transferred , DMA controller provides


o The memory address &
o All the bus signals that control data transfer.
o For block data transfer, the DMA controller
o Increments the memory address for successive words
&

o Keeps track of the number of transfers.


(iii) When the entire block has been transferred , the DMA controller
informs the processor by raising an interrupt signal.

[Figure 4.18. Registers in a DMA interface: a 32-bit Starting address
register, a 32-bit Word count register, and a Status and control
register whose flags include Done (bit 0), R/W (bit 1), IE (bit 30),
and IRQ (bit 31).]

o When the DMA controller has completed transferring a


block of data and is ready to receive another command,
it sets the Done flag to 1.
o When the IE flag is set to 1, the DMA Controller raises an
interrupt.
o Finally, the controller sets IRQ to 1, when it has
requested an interrupt.
o The status register can also be used to record other
information, such as whether the transfer took place
correctly or errors occurred.

Multiple devices in a DMA controller:


o Refer fig. 4.19
o Disk controller controls two disks, provides DMA capabilities and
provides two channels.
o It can perform two independent DMA operations as if each disk had
its own DMA controller.
o The register needed to store memory address, the word count are
duplicated.

Bus Access:
o Memory access by the processor and the DMA Controllers are
interwoven.
o Requests by DMA devices for using the bus are always given higher
priority than processor requests.

[Figure 4.19. Use of DMA controllers in a computer system: the processor
and main memory are connected to the system bus, along with a disk/DMA
controller serving two disks, a printer, a keyboard, and a DMA
controller for a network interface.]

o Two types of bus access:


(i) Block or burst mode:
The DMA controller may be given exclusive access to the
main memory to transfer a block of data without interruption.
This is known as block or burst mode.

(ii) Cycle stealing:


Processor originates most memory access cycles, so the
DMA controller can be said to ‘steal memory cycles’ from the
processor. This technique is called cycle stealing. This allows
the DMA controller to transfer one data word at a time, after
which it must return control of the busses to the CPU. The
CPU merely delays its operation for one memory cycle to
allow the direct memory I/O transfer to “steal” one memory
cycle.

o Conflicts:
o A conflict may arise if both the processor & DMA controller or
two DMA controllers try to use the bus at the same time to
access the main memory. To resolve these conflicts, an
arbitration procedure is implemented on the bus to coordinate
the activities of all devices that requested memory transfer.

o Bus arbitration:
o The device that is allowed to initiate data transfers on the bus
at any given time is called the bus master.
o When the current master relinquishes control of the bus,
another device can acquire this status.
o Bus arbitration is the process by which the next device to
become the bus master is selected and bus mastership is
transferred to it.
5.10 Cache Memories
-makes the main memory appear to the processor to be faster than it
really is.
Locality of reference:
-most of the execution time is spent on routines in which many
instructions are executed repeatedly.

(eg) loop, nested loops, procedures.


-many inst. in localized areas of the programs are executed repeatedly
during some time period and the remainder of the program is accessed
relatively infrequently.

Temporal locality                     Spatial locality
- a recently executed instruction     - instructions in close proximity to
  is likely to be executed again        a recently executed instruction
  very soon                             are likely to be executed soon
- when an item is first needed, it    - instead of fetching one item from
  is brought into the cache and         the main memory, it is useful to
  remains there until it is needed      fetch several items that reside at
  again                                 adjacent addresses as well, i.e. a
                                        block – a set of contiguous
                                        address locations
[Figure 5.13. Memory hierarchy: processor registers, primary (L1) cache,
secondary (L2) cache, main memory, and magnetic disk secondary memory.
Moving down the hierarchy the size increases, while the speed and the
cost per bit decrease.]

[Figure 5.14. Use of a cache memory: Processor ↔ Cache ↔ Main memory.]

-mapping function → correspondence between the main memory


blocks and those in the cache is specified by a mapping function

- Replacement algorithm → when the cache is full and a memory word that
is not in the cache is referenced, the cache control hardware decides
which block should be removed, to create space for the new block that
contains the referenced word. These rules form the replacement
algorithm.

Write Hit
- Write operation, two policies:
1. Write-through:
   - cache and main memory location are updated simultaneously.
2. Write-back (or copy-back):
   - uses a Dirty (Modified) bit per block.
   - updates only the cache location and marks it as updated with the
     associated flag bit.
   - main memory is updated only when this block is removed from the
     cache to make room for a new block.

Comparison:
Write-through
➢ Simpler
➢ Results in unnecessary write operations when a given cache word is
  updated several times

Write-back
➢ Results in unnecessary writes when only a few words in a block have
  been updated, because the entire block is written back

Read Operation:
When a read miss occurs ,
Method 1:
➢ A block of words that contains the requested word is copied from
the main memory into the cache
➢ After the entire block is loaded into the cache, the requested word
is forwarded to the processor
Method 2: ( Load through or early restart )
➢ The requested word is first forwarded to the processor as soon as it
is read form the main memory

➢ Reduces the processor’s waiting period


➢ Complex circuit

Write Miss:
1.Write through
➢ Written directly into the main memory

2. Write Back
➢ Block containing the addressed word is 1st brought into the cache
and then the desired word in the cache in overwritten with the new
information.

1. MAPPING FUNCTIONS :
➢ The correspondence between the main memory blocks and those in the
  cache is specified by a mapping function.

E.g., Cache size = 128 blocks of 16 words each
                 = 128 × 16 words = 2048 words
                 = 2^11 words = 2 × 2^10 words
      i.e. the cache has 2^7 blocks of 16 words each.

Main memory → 16-bit addresses
      i.e. 2^16 words = 2^6 × 2^10 words = 64K words

∴ Total no. of blocks in main memory = 64K / 16 = 4K blocks
∴ Main memory has 4K (2^12) blocks of 16 words each.

1. Direct Mapping:
➢ Simplest method: not flexible ( fixed mapping )
➢ Block ‘ j ‘ of the main memory is mapped onto block j mod
128 of the cache
e.g., main memory blocks 0, 128, 256, ... → cache block 0
      main memory blocks 1, 129, 257, ... → cache block 1
The 16-bit main memory address is divided into three fields:
➢ Word (4 bits)  : selects one of 16 words (each block has 16 = 2^4 words)
➢ Block (7 bits) : points to a particular block in the cache (128 = 2^7)
➢ Tag (5 bits)   : compared with the tag bits stored at that cache
  location, to identify which of the 32 main memory blocks that map to
  this position (4096/128 = 32) is currently resident in the cache.

Contention Problems
➢ More than one block mapped onto the same cache block
position
➢ Same cache block may be replaced even when other blocks
are empty
e.g ., branch from block 1 to block 129

[Figure 5.15. Direct-mapped cache: main memory blocks 0–4095 map onto
the 128 cache blocks (block j → cache block j mod 128); each cache block
has an associated tag. Main memory address fields: Tag (5 bits),
Block (7 bits), Word (4 bits).]

⚫ Tag: 11101
⚫ Block: 1111111 = 127, i.e. the 127th block of the cache
⚫ Word: 1100 = 12, i.e. the 12th word of the 127th cache block
➢ The position of a block in the cache is determined by its memory
  address (16 bits).
➢ The high-order 5 bits are stored as tag bits; they identify which of
  the main memory blocks that map to this cache position (0 or 128 or
  256 ...) is currently resident in the cache.

Memory Access:
(i) The 7-bit cache block field of the address selects the cache
    block.
(ii) The high-order 5 bits of the memory address are compared with the
     tag field of that block.
If they match,
➢ the desired word is in that block
➢ the word is read using the low-order 4 bits
Else,
➢ the block is not found in the cache
➢ it is loaded from main memory
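The direct-mapped address decomposition can be expressed as a small Python sketch (field widths taken from the example above: 5 tag, 7 block, 4 word bits; the function name is illustrative):

```python
def split_direct(addr):
    """Split a 16-bit main memory address for the direct-mapped cache:
    | tag (5 bits) | block (7 bits) | word (4 bits) |"""
    word = addr & 0xF            # low-order 4 bits: word within block
    block = (addr >> 4) & 0x7F   # next 7 bits: cache block number
    tag = addr >> 11             # high-order 5 bits: tag
    return tag, block, word

print(split_direct(0b1110111111111100))  # (29, 127, 12): tag 11101, block 127, word 12

# Contention: main memory blocks 1 and 129 map to the same cache block.
assert split_direct(1 * 16)[1] == split_direct(129 * 16)[1]
```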

2. Associative Mapping:
Advantages :
➢ More flexible method
➢ Main memory block can be placed in any cache block
position
➢ Space in cache used more effectively
➢ Word field (4 bits) : one of 16 words (each block has 16 = 2^4 words)
➢ Tag field (12 bits) : identifies which of the 4096 (= 2^12) main
  memory blocks is resident in that cache block.
[Figure 5.16. Associative-mapped cache: any main memory block (0–4095)
can be placed in any of the 128 cache blocks; each cache block stores a
tag. Main memory address fields: Tag (12 bits), Word (4 bits).]

⚫ Tag: 111011111111
⚫ Word:1100=12, the 12th word of a block in the cache

Disadvantages :
➢ If the cache is full, and if a new block has to be brought into
the cache, an existing block has to be replaced by the
replacement algorithm
➢ Cost of associative mapping is higher than the cost of direct
mapping, because all 128 tags must be searched to
determine whether the given block is in the cache or not.
This search is called ASSOCIATIVE SEARCH.

3. Set- Associative Mapping :


➢ Combination of the direct and associative mapping
➢ Blocks of cache grouped into sets
➢ Mapping allows a block of the main memory to reside in any
block of a specified set
➢ Contention problem reduced – few choices for block
placement
➢ Hard ware cost reduced by decreasing the size of associative
search ( only done within a set )
➢ Word field (4 bits) : one of 16 words (each block has 16 = 2^4 words)
➢ Set field (6 bits)  : points to a particular set in the cache
  (128/2 = 64 = 2^6 sets)
➢ Tag field (6 bits)  : compared to check whether the desired block is
  present (4096/64 = 2^6 blocks map to each set).
[Figure 5.17. Set-associative-mapped cache with two blocks per set: main
memory blocks 0–4095 map to the 64 sets (set 0 holds blocks 0, 64, 128,
...; set 1 holds blocks 1, 65, 129, ...); each cache block stores a tag.
Main memory address fields: Tag (6 bits), Set (6 bits), Word (4 bits).]

Example address: 111011 111111 1100

⚫ Tag: 111011
⚫ Set: 111111 = 63, i.e. set 63 of the cache
⚫ Word: 1100 = 12, the 12th word within the block in set 63
e.g., 2 blocks per set

Total no. of blocks in cache = 128
∴ No. of sets = 128 / 2 = 64 sets (2^6 sets)
Memory address = 16 bits.
Main Memory:
Capacity = 64K words = 2^16 words, i.e. 4K (2^12) blocks
No. of main memory blocks that map to each set = 2^12 / 2^6 = 2^6 = 64
Memory blocks 0, 64, 128, ..., 4032 map into cache set 0, and each can
occupy either of the two block positions within this set.

Extreme Conditions:
➢ 128 blocks per set requires no set bits. => fully associative
technique( 12 tag bits )

➢ One block per set => Direct mapping


➢ A Cache that has K blocks per set is referred to as a K-way set-
associative cache
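A single sketch covers all three mappings, since direct mapping is 1-way and fully associative mapping is 128-way set-associative for this cache (the function and parameter names are illustrative):

```python
def split_address(addr, words_per_block=16, cache_blocks=128, assoc=2):
    """Split a 16-bit address for a k-way set-associative cache.
    assoc=1 reduces to direct mapping; assoc=128 to fully associative."""
    word_bits = (words_per_block - 1).bit_length()    # 4 bits here
    sets = cache_blocks // assoc
    set_bits = (sets - 1).bit_length()                # 0 when sets == 1
    word = addr & (words_per_block - 1)
    set_no = (addr >> word_bits) & (sets - 1)
    tag = addr >> (word_bits + set_bits)
    return tag, set_no, word

addr = 0b1110111111111100
print(split_address(addr, assoc=2))    # (59, 63, 12): tag 111011, set 63
print(split_address(addr, assoc=1))    # (29, 127, 12): direct mapping
print(split_address(addr, assoc=128))  # (3839, 0, 12): fully associative
```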

CONTROL BITS:
Valid Bit:
➢ Every block has a valid bit
➢ Indicates whether the block contains valid data

Dirty Bit:
➢ Indicates whether the block has been modified

Valid Bit:
➢ Initialized with 0, when power is initially applied
➢ Set to 1, when the block in cache is loaded from main memory
➢ When a main memory block is updated directly, bypassing the cache, a
  check is made to determine whether the block is in the cache; if it is
  found, its valid bit is cleared to 0 (this ensures stale data will not
  exist in the cache).

Use of Valid Bits :


➢ During DMA transfer, between MAIN MEMORY & Disk
o Processor updates cache block
o Cache uses write-back protocol
o So data in MAIN MEMORY not the latest
o DMA transfer does not use processor or cache

➢ Solution :
o Flush the cache by forcing the dirty data to be written back
to the memory before the DMA transfer takes place.

2. Replacement Algorithm
Direct Mapped Cache → position of each block is predetermined, so no
need of replacement strategy

Objective:-
To keep the blocks in the cache that are likely to be referenced in the
near future. (property of locality of reference)

1. Least Recently Used Algorithm:


2. Farthest
3. Random

1. Least Recently Used Algorithm (LRU):


• Overwrite a block that has not been referenced for a long time.
• Because there is a high probability that the blocks that have been
referenced recently will be referenced again. (temporal locality)
Method:
• A 2-bit counter is maintained for each block.
• When a hit occurs, the counter of the referenced block is set to 0 and
  all other block counters are incremented.
• When a miss occurs and the cache is not full, the newly loaded block's
  counter is set to 0 and all other block counters are incremented.
• When the cache is full,
  o the block with the highest counter value is replaced,
  o the new block is loaded from main memory, and
  o its counter is set to 0.
Problem:
• Performs poorly when accesses are made to sequential elements of
an array that is slightly larger than the cache.
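The counter scheme above can be modeled for one cache set. This is a simplified sketch (plain integer counters rather than saturating 2-bit hardware counters), and the class name is illustrative:

```python
class LRUSet:
    """Counter-based LRU for one cache set, following the steps above:
    the referenced (or newly loaded) block's counter is reset to 0 and
    all other counters are incremented; on a miss with a full set, the
    block with the highest counter is replaced."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.counters = {}   # resident block tag -> age counter

    def access(self, tag):
        hit = tag in self.counters
        if not hit and len(self.counters) == self.capacity:
            victim = max(self.counters, key=self.counters.get)
            del self.counters[victim]        # evict least recently used
        for t in self.counters:
            self.counters[t] += 1            # age the other blocks
        self.counters[tag] = 0               # most recently used
        return hit

s = LRUSet(capacity=2)
print([s.access(t) for t in "ABACB"])
# [False, False, True, False, False] -- C evicts B, so the final B misses
```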

2. Farthest:
•Replace the block that is farthest from the current block.
Problem:
• Does not take into account the recent pattern of access to blocks in
the cache.
3.Random:
•Choose a random block for replacement
Advantage:

• Practically proven to be effective


• Simple algorithm

5.11 PERFORMANCE CONSIDERATIONS:


Two key factors
1. Performance
2. Cost

A common measure of success is the price / performance ratio


1. Memory Hierarchy - Short access time
2. Interleaving
3. Hit rate & Miss Penalty 
4. Cache on processor chips
5. Other Enchancements
➢ Write Buffer
➢ Prefetching
➢ Lockup- Free Cache

5.11.1 INTERLEAVING :
-Main memory is structured as a collection of physically separate
modules, each with its own Address Buffer Register (ABR) and Data
Buffer Register (DBR).
-Memory operations may proceed in more than one module at the same
time

-Rate of data transfer increased.

Method 1

- refer fig -5.25 (a)


[Figure 5.25. Addressing multiple-module memory systems: (a) consecutive
words in a module – the high-order k bits select the module and the m
low-order bits give the address within the module; (b) consecutive words
in consecutive modules – the low-order k bits select the module. Each
module has its own ABR and DBR.]

-Successive memory locations are found in a single module


Adv : 1. Only one module is involved in bulk data transfer

2. At the same time, DMA may access memory

Method 2
- refer fig 5.25 (b)
- Consecutive addresses are located in successive modules (memory
  interleaving).
- A bulk transfer keeps several modules busy at any one time, so:
Advantages :
- faster data transfer
- more than one word can be transferred in parallel
- higher memory utilization

Given:
Cache block = 8 words
Hardware properties:
➢ Sending an address to memory takes 1 clock cycle (cc)
➢ Slow DRAM chip:
   - first word accessed in 8 cc
   - subsequent words in 4 cc per word
➢ Sending 1 word to the cache takes 1 cc

∴ Total time needed to load the desired block (8 words) into the cache
using a single memory module
   = 1 + 8 + (7 × 4) + 1 = 38 cycles.

For 4 interleaved modules:
➢ After the address is sent, each module has one word of data in its
  DBR after 8 cc.
➢ These words are transferred to the cache, one word at a time, during
  the next 4 cc; during this time the next word in each module is
  accessed. It then takes another 4 cc to transfer these words to the
  cache.
Total time needed to load the block = 1 + 8 + 4 + 4 = 17 cc.
Interleaving reduces the block transfer time by more than a factor of 2.
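These two timing calculations can be packaged into a small helper. It is a sketch that mirrors the worked example above; the overlap assumption for the interleaved case matches the example for these particular parameters:

```python
def block_load_cycles(words=8, modules=1, first=8, rest=4):
    """Cycles to load one cache block, using the timing above:
    1 cycle to send the address, 8 cycles for the first DRAM access,
    4 cycles for each subsequent access, and 1 cycle to move each word
    to the cache (transfers overlap with accesses when interleaved)."""
    if modules == 1:
        # address + first word + remaining accesses + final transfer
        return 1 + first + (words - 1) * rest + 1
    rounds = words // modules        # parallel access rounds
    # address + first round + later rounds + final batch of transfers
    return 1 + first + (rounds - 1) * rest + modules * 1

print(block_load_cycles(modules=1))  # 38 cycles, single module
print(block_load_cycles(modules=4))  # 17 cycles, 4 interleaved modules
```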

5.11.2 Hit Rate & Miss Penalty :


Hit rate (h)     = No. of hits / Total no. of attempted accesses
Miss rate        = No. of misses / Total no. of attempted accesses
Miss penalty (M) = extra time needed to bring the desired information
                   into the cache
C = time to access the cache

Average access time for the processor: t_avg = hC + (1 − h)M

Problem :
1. Given:
   Main memory access = 10 clock cycles
   Cache block size = 8 words
   Time to load a block from main memory to cache = 17 cycles
   30% of the instructions in the program perform a read or write
   Hit rates in the cache: for instructions = 0.95, for data = 0.9

Solution :
Total memory accesses = 100 instruction + 30 data = 130 memory accesses
for every 100 instructions.
Cache access = 1 clock cycle.

Time with cache, for 100 instructions
   = 100[(0.95 × 1) + (0.05 × 17)] + 30[(0.9 × 1) + (0.1 × 17)]
   = 100 × 1.8 + 30 × 2.6 = 258 cycles                ----- (1)

Time without cache = 130 × 10 = 1300 cycles           ----- (2)

Speed up = Time without cache / Time with cache = (2)/(1)
         = 1300 / 258 = 5.04

=> the computer with a cache performs about 5 times faster. Compare the
processor performance with an ideal hit rate of 100%:

Time for ideal cache = (1 × 1) × 130 = 130 cycles

Speed up = Time with cache / Time for ideal cache = 258 / 130 = 1.98

The ideal cache is almost 2 times faster.
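The speedup computation can be verified directly (numbers taken from the problem statement above):

```python
def t_avg(h, C, M):
    """Average access time per memory reference: t_avg = hC + (1 - h)M."""
    return h * C + (1 - h) * M

# 130 memory accesses per 100 instructions (100 instruction + 30 data)
with_cache = 100 * t_avg(0.95, 1, 17) + 30 * t_avg(0.90, 1, 17)
without_cache = 130 * 10
ideal = 130 * 1

print(round(with_cache, 2))                  # 258.0 cycles
print(round(without_cache / with_cache, 2))  # 5.04
print(round(with_cache / ideal, 2))          # 1.98
```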

How can the hit rate be improved ?
1. Make the cache larger → increased cost.
2. Increase the block size while keeping the total cache size constant,
   to take advantage of spatial locality.
   Adv.    – With parallel access to blocks in an interleaved memory,
             many words can be accessed in parallel.
   Disadv. – If the block is very large, some items may not be
             referenced before the block is replaced; this increases
             the miss penalty.
3. The miss penalty can be reduced if the load-through approach is used
   when loading new blocks into the cache.
4. Cache on the processor chip vs. on a separate chip:
   Same chip – speed; but space on the processor chip is limited, which
             limits the size of the cache.
   Separate chip – delay.

5. Common data and instruction cache


➢ Better hit rate
➢ Flexible
6. Separate Data & Instruction Cache
Adv:–

➢ Access both caches at the same time


➢ Leads to parallelism
➢Better performance
Disadv:-

➢ Complex circuitry
7. Two levels of cache
   L1 – smaller, faster cache
   L2 – slower, larger, higher hit rate
   h1 → hit rate of the L1 cache
   h2 → hit rate of the L2 cache
   C1 → time to access the L1 cache
   C2 → time to access the L2 cache
   M  → time to access main memory
   Average access time for the processor:
   t_avg = h1·C1 + (1 − h1)·h2·C2 + (1 − h1)·(1 − h2)·M
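The two-level formula can be evaluated with assumed, illustrative numbers (the hit rates and access times below are not from the notes):

```python
def t_avg_two_level(h1, C1, h2, C2, M):
    """t_avg = h1*C1 + (1 - h1)*h2*C2 + (1 - h1)*(1 - h2)*M"""
    return h1 * C1 + (1 - h1) * h2 * C2 + (1 - h1) * (1 - h2) * M

# Assumed values: 95% L1 hits at 1 cycle, 80% L2 hits (of the L1 misses)
# at 6 cycles, and a 40-cycle main memory access.
print(round(t_avg_two_level(0.95, 1, 0.80, 6, 40), 2))  # 1.59 cycles
```

Even a modest L2 hit rate cuts the average access time sharply, because it applies only to the small fraction of references that miss in L1.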
Enhancements to reduce the miss rate / penalty:
1. Using Write buffer:
➢ In write through protocol, processor writes to cache memory &
main memory, but instead of writing into the main memory
processor writes into the ‘Buffer’. Processor does not wait for
main memory write to complete.
➢ Read requests should be serviced immediately, so reads are given
  priority over writes.
➢ A read request may be for data still held in the buffer, so on a
  cache miss the read request also checks the write buffer.
➢ During Read Miss, if cache memory is full & if it replaces a block
with dirty bit, cache memory block should be written to main
memory, then new block from main memory is brought to cache
memory.
➢ When dirty blocks are to be written to main memory, instead of
writing it into main memory, it is written to the buffer, read
request is processed and then latter written to the main
memory.

2. Using Prefetch instructions


➢ Prefetch instructions are inserted into the program either by the
programmer or compiler.
➢ When this instruction is executed, the next block will be copied
from main memory to cache memory, but the processor does
not wait for the operation to complete.

3. Lock up free cache


➢ When prefetching takes place, it stops other instructions to
access the cache until the prefetch is completed. A cache of this
type is said to be locked, while it services a miss.

➢ Solution : Cache miss is given priority over prefetching. A cache


that can support multiple outstanding misses is called lock up
free cache.

5.12 Virtual Memory


Problems:
• The physical main memory need not be as large as the address space
  spanned by an address issued by the processor.
• A processor issuing 32-bit addresses can address 2^32 memory
  locations, i.e. an address space of 4G bytes, but a typical main
  memory size is 1 GB.
• A program may not fit into the main memory, so the parts of the
  program that are not being executed are stored on a secondary
  storage device.
• A program need not be aware of the limitations imposed by the
  available main memory.

• Techniques that automatically move program and data block into


the physical main memory when they are required for execution
are called Virtual Memory Technique.
• The program & the processor reference an instruction and data by
using binary addresses that are independent of the available
physical memory. This binary address issued by the processor is

called Virtual or Logical address.


• When the requested logical address is found in the main memory,
the data is given to the processor. If it is not found, the requested
data should be brought from the hard disk to the main memory.
Memory Management Unit (MMU):
• Translates logical address into physical address
• If the requested data is not in the main memory , the MMU causes
the OS to bring the data into the main memory from disk.
• Transfer of data between the disk & main memory is performed
using the DMA scheme.

[Figure 5.26. Virtual memory organization: the processor issues a
virtual address to the MMU (Memory Management Unit), which produces a
physical address used to access the cache and main memory; data moves
between the main memory and disk storage by DMA transfer.]

Address Translation:
• Programs & data are composed of fixed length units called pages.
• Pages consists of a block of words that occupy contiguous locations
in main memory.
• Pages are the basic unit that is moved between main memory &
disk.

• Virtual Memory mechanism :


o Bridges the size & speed gap between the main memory &
secondary storage
o Implemented in part by software technique.
o Similar in concept to cache memory
o Pages → set of words
• Cache Memory:
o Bridges the size & speed gap between processor & main
memory.

o Implemented in hardware
o Blocks → set of words
• Virtual / Logical address:
• Information about the main memory location of each page is kept
in a page table.

• The page table holds, for each page, the main memory address of the
  page (page frame) and the current status of the page.
• An area in main memory that can hold one page is called a page
frame.
• Starting address of the page table is kept in a page table base
register
• Page table address = page no. in virtual address + [ page table
base register]
• The calculated address gives the starting address of the page in
main memory.
• There is an entry in the page table for every page in the program.
• It also includes some control bits – status of the page in main
memory.

• Control bits are:


o Validity bit
o Modified bit – determines whether the page should be
written back to the disk before it is removed from main
memory to make room for another page.

o Accessing bit – R/W protection


• Page table also resides in main memory.
• A copy of the portion of the page table can be accommodated
within the MMU (i.e.) the page table entries of the most recently
accessed page.
• A small cache called the Translation Look-aside Buffer (TLB) is
  incorporated into the MMU to hold the most recently used page table
  entries.
• TLB includes:
o Virtual address of the entry (i.e.) Virtual page no.
o Control bits
o Page frame no. in main memory.
• TLB uses associative mapping
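The translation path described above (TLB first, then page table, then page fault) can be sketched in Python; the page size and the dict-based structures are illustrative, not the hardware organization:

```python
PAGE_SIZE = 4096   # assumed page size for illustration

def translate(vaddr, tlb, page_table):
    """MMU lookup sketch: try the TLB first, fall back to the page
    table (updating the TLB), and signal a page fault when the page
    is not resident in main memory."""
    vpn, offset = divmod(vaddr, PAGE_SIZE)
    if vpn in tlb:                     # TLB hit
        frame = tlb[vpn]
    elif vpn in page_table:            # TLB miss: consult the page table
        frame = page_table[vpn]
        tlb[vpn] = frame               # cache the translation in the TLB
    else:                              # page fault: OS must load the page
        raise LookupError(f"page fault on virtual page {vpn}")
    return frame * PAGE_SIZE + offset

tlb, page_table = {}, {3: 7}           # virtual page 3 -> page frame 7
print(translate(3 * PAGE_SIZE + 100, tlb, page_table))  # 28772 (= 7*4096 + 100)
print(3 in tlb)  # True: the TLB now holds the translation
```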

[Figure 5.27. Virtual-memory address translation: the virtual address
from the processor is split into a virtual page number and an offset.
The virtual page number is added to the contents of the page table base
register to locate the page table entry, which holds control bits and
the page frame number in memory; the page frame concatenated with the
offset gives the physical address in main memory.]

Translation process:
• Processor gives virtual address to MMU.
• MMU looks into the TLB for the virtual address entry.
• If found (hit) then the physical address i.e. physical page no. in
main memory is got, and main memory or cache memory is
accessed for data, which is then forwarded to the processor.
• If not found (miss) then the MMU fetches the required page table
entry from main memory, updates the TLB and then accesses the main
memory.
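The translation steps above can be sketched as follows. This is a simplified model with hypothetical names: the TLB and page table are represented as dictionaries mapping virtual page numbers to page frame numbers.

```python
def translate(vpn, offset, tlb, page_table, page_size=4096):
    """Translate (virtual page number, offset) to a physical address."""
    if vpn in tlb:                       # TLB hit
        frame = tlb[vpn]
    else:                                # TLB miss: consult the page table
        frame = page_table[vpn]
        tlb[vpn] = frame                 # update the TLB for future accesses
    return frame * page_size + offset    # physical address in main memory
```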
Page Fault:
• When the requested page is not found in the main memory , a
page fault is said to have occurred.
• A whole page must be brought from the disk into the main memory
• MMU asks the OS to raise an interrupt.
o Processing of the active task is interrupted
o Control is transferred to the OS
o OS copies the requested page from disk to main memory
o And returns control to the interrupted task.
Replacement Algorithm:
• If a new page is brought from the disk when the main memory is
full, it replaces one of the resident pages.

• Least Recently used Algorithm is used.


• Modified pages should be written back to the disk before it is
removed from the main memory.
• Write-through protocol is not suitable for virtual memory.
To reduce the address translation time:
• Because of the locality of reference, it is likely that many successive
translations involve addresses on the same page.
• So one or more special registers that retain virtual page no. &
physical page frame of the most recently performed translation is
used.

• Information in these registers is accessed more quickly than in the TLB.


Figure 5.28. Use of an associative-mapped TLB.
(The virtual page number is compared with the entries in the TLB; on a
hit, the page frame number from the matching entry is combined with
the offset to form the physical address in main memory; on a miss, the
page table in main memory must be consulted.)

Memory Management Requirements


• More than one user / program uses the computer at the same time.
• Physical main memory should be shared by many users.
• The virtual address space is divided into a system space, where OS
routines reside, and a user space, where user application programs
reside.

• A separate page table for each user program is maintained.


• MMU uses page table base register to find the address of page
table.
• OS loads this with the starting address of the currently active page
table.
• Only the pages that belong to one of these spaces are accessible at
any given time.
• Protection:
o User program can not access page table
o Processor has 2 states.
▪ Supervisor → when OS routines are executed
▪ User state → when user programs are executed,
certain machine instructions cannot be executed.
These instructions are called privileged instructions
which include such operations as modifying the page
table base register.
o One application program accesses certain pages belonging
to another program.
▪ OS allows this by causing these entries in both page
tables
▪ i.e. shared pages will have entries in 2 different page
tables.

▪ Additional control bits to set access privileges needed.

5.13 – SECONDARY STORAGE


Main limitation of semiconductor memories – cost per bit of stored
information; secondary storage devices provide large amounts of
cheaper storage.
I. Magnetic Hard Disks:
o One or more magnetic disks mounted on a common spindle
o A thin magnetic film is deposited on both the sides
o Magnetized surfaces move in close proximity to read / write heads
o Disks rotate at a uniform speed
o Each R/W head consists of a magnetic yoke and a magnetizing coil
To store:

o Apply current pulses of suitable polarity to the magnetizing coil


o This causes the magnetization of the film in the area under the
head to switch to a direction parallel to the applied field
o Changes in the magnetic field caused by the movement of the film
relative to the yoke induce a voltage in the coil, which now serves
as a sense coil

o Control circuitry senses the change in voltage under the head


o Binary states 0 & 1 are represented by two opposite states of
magnetization
o A voltage is induced in the head only at 0-to-1 and 1-to-0
transitions in the bit stream
o A long string of 0s or 1s causes an induced voltage only at
the beginning and end of the string
o The number of consecutive 0s or 1s is determined with the help of
a clock
Clock:

o Clock must provide information for synchronization


o Clocking information is now combined with data
o Simple technique called phase encoding or Manchester encoding
( also called self clocking scheme )
o Changes in magnetization occur for each data bit, which is
guaranteed at the midpoint of each bit period; thus providing the
clocking information
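A minimal sketch of the Manchester (phase) encoding described above: every data bit is guaranteed a mid-bit transition, which carries the clocking information. The polarity convention used here (1 = low-to-high) is one of two common choices and is an assumption.

```python
def manchester_encode(bits):
    """Encode a bit list as half-bit signal levels with a mid-bit transition."""
    signal = []
    for b in bits:
        # 1 -> low-then-high, 0 -> high-then-low: a transition in every bit cell
        signal += [0, 1] if b == 1 else [1, 0]
    return signal
```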

Disadvantages
o Poor bit-storage density (ie) space required to represent each bit
must be large enough to accommodate two changes in
magnetization
o Read / Write heads must be maintained at a very small distance
from the moving disk surface
To achieve high bit densities and
reliable read/write operation:
o Air pressure builds between the disk surface and the head forces
the head away from the surface

A flexible spring keeps it in place


Solution:
The disks & read/write heads are placed in a sealed, air-filtered
enclosure. This approach is known as Winchester technology.
Advantages of Winchester Technology:
o Read /write heads can operate closer to the magnetized track
surfaces, because dust problem found in unsealed assemblies are
absent
o Since the head is clear, more density of data can be achieved along
the track & tracks in turn be closer to each other so larger capacity
o Data integrity is greater since the storage medium is not exposed
to contaminating elements

Read / Write head:


o Movable
o One head per surface
o Can move radially across the disk to access individual tracks
Three key parts:

o Assembly of disk platters called the disk


o Disk drive:
Electromechanical mechanism
Spins the disk & moves the Read / Write head
o Disk controller:
Electronic circuit that controls the operation of the
system

ORGANISATION AND ACCESSING OF DATA ON A DISK:


o Each surface is divided into concentric tracks and each track is
divided into sectors
o The set of corresponding tracks on all surfaces of a stack of disks
forms a logical cylinder
o The data on all tracks of cylinder can be accessed without moving
the read/write heads
o Data are accessed by specifying the surface number , track
number & Sector number

o Read / Write operations start at sector boundaries

Storage :
o Data bits are stored serially on each track
o Data is preceded by a sector header that contains addressing
information used to find the desired sector on the selected track
o Following the data, there are additional bits that constitute an
error-correction code ( ECC ) used to detect and correct errors
o To distinguish between two consecutive sectors, there is a small
inter-sector gap

Formatting :
o Physically divides the disk into tracks and sectors

Disk Controller :
o Keeps track of defective sectors
o Formatting information – sector header, ECC bits & inter-sector
gaps.

We can increase the storage density by placing more sectors on the
outer tracks, which have longer circumferences, at the expense of more
complicated access circuitry (used in larger disks).

ACCESS TIME:
1. Seek time : - Time required to move the Read / Write head to the
proper track –
It depends on initial position of the head relative to
the track specified in the address.

2. Rotational Delay or Latency Time :


This is the amount of time that elapses after the head
is positioned over the correct track until the starting
position of the addressed sector passed under the
Read/ Write head

Access Time = Seek time + Latency time


Example – disk capacity calculation:

Given:
No. of data recording surfaces = 20
No. of tracks / surface = 15,000
Average no. of sectors / track = 400
No. of bytes per sector = 512

Solution:
Total capacity = 20 × 15,000 × 400 × 512
≈ 60 × 10⁹ bytes
≈ 60 gigabytes
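The capacity calculation and the access-time formula above can be reproduced in code; the function names are illustrative, not from the notes.

```python
def disk_capacity(surfaces, tracks_per_surface, sectors_per_track, bytes_per_sector):
    """Total capacity = surfaces x tracks x sectors x bytes per sector."""
    return surfaces * tracks_per_surface * sectors_per_track * bytes_per_sector

def access_time(seek_ms, latency_ms):
    """Access time = seek time + rotational latency."""
    return seek_ms + latency_ms

# The worked example above: 20 surfaces, 15,000 tracks, 400 sectors, 512 bytes
capacity = disk_capacity(20, 15_000, 400, 512)   # about 60 GB
```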

Data Buffer / Cache


o A disk drive is connected to the computer system by a standard bus,
such as SCSI ( Small Computer System Interface )
o SCSI bus is capable of transferring data at much higher rates than
the rate at which data can be read from disk tracks
o A buffer between disk & bus is used to store a few megabytes of
data

o Disk → Buffer → MM
(the disk side is slow – limited by the rotational speed of the
disk; the buffer-to-MM side is fast – limited by the speed of the
bus)

Disk Controller (DC):


o Interface between the disk drive and the bus that connects it to the
rest of the system
o Uses a DMA scheme to transfer data between the disk & the MM
o Transfers are From / To the data buffer
o OS initiates the transfer by issuing Read / Write request
o Special registers used are
MM Address --- address of the first main-memory location of the
block of words involved in the transfer

Disk Address --- sector number and track number containing the
desired word

Word Count --- number of words in the block to be transferred

o OS issues logical address
o Disk controller keeps track of bad sectors and substitutes other
sectors

o Functions of D.C
1. Seek
2. Read – data are read serially from the disk, assembled
into words and placed in the data buffer
3. Write
4. Error checking – performed while reading; if an error is
found, the OS is informed
S/W & OS Implications :
o All data transfers involving disk are initiated by OS
o Booting – the OS is loaded into the MM
-- A ROM stores a small monitor program that can read & write MM
locations and read one block of data stored on the disk at
address 0. This block, referred to as the Boot Block, contains
a loader program.
-- After the boot block is loaded into MM, it loads the main
parts of the OS into the MM.
-- After the OS initiates a disk transfer, the OS switches to
other tasks instead of waiting.
-- The disk controller informs the OS when the transfer is
completed by raising an interrupt.
-- The OS can schedule overlapped I/O activities, e.g. a DMA
transfer on one disk while a seek on another disk is done.

FLOPPY DISKS :
➢ Smaller, simpler, cheaper disks
➢ Flexible, removable, plastic diskette
➢ Diskette enclosed in a plastic jacket
➢ Recording of data is done using phase or Manchester encoding –
single density
➢ Double density – requires more complex circuits in the disk
controller
Disadvantages:
➢ Smaller storage capacities
➢ Longer access time
➢ Higher failure rates
Larger Super-floppy disks – Zip disk – can store more than 100MB

RAID DISK ARRAYS:


➢ Storage system based on multiple disks
➢ Redundant Array of Inexpensive disks (RAID)
➢ Improves reliability of the overall system using multiple disks
Six different configurations of RAID:
RAID 0
➢ Basic configuration
➢ A single large file is stored in several separate disk units by
breaking the file up into a number of smaller pieces and storing
these pieces on different disks. This is called data striping

Advantages:
➢ All disks can deliver their data in parallel, when the file is read
➢ Total transfer time = (transfer time required in a single-disk
system) / (number of disks used in the array)

➢ Access time is not reduced – time is still needed to locate the
beginning of the data on each disk
➢ Buffering reassembles the files before sending to the processor as
a single entity

➢ Simplest disk array operation


➢ Data flow time performance is improved
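The data striping described for RAID 0 can be sketched as a round-robin distribution of fixed-size pieces across the disks of the array; the stripe size and function name are assumptions for illustration.

```python
def stripe(data, num_disks, stripe_size):
    """Split `data` into stripe_size pieces, distributed round-robin over disks."""
    disks = [bytearray() for _ in range(num_disks)]
    for i in range(0, len(data), stripe_size):
        disks[(i // stripe_size) % num_disks] += data[i:i + stripe_size]
    return disks
```

Reading the file back reassembles the pieces from all disks in parallel, which is where the transfer-time improvement comes from.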

RAID 1
➢ Better reliability by storing identical copies of data on two disks
rather than just one; the two disks are said to be mirrors of each
other

Advantages :
➢If one disk fails all R/W operations are directed to its mirror
Disadvantages :
➢ Costly way to improve the reliability because all disks are
duplicated
RAID 2, RAID 3 & RAID 4 :
➢ Achieve increased reliability through various parity-checking
schemes without requiring a full duplication of disks
➢ All of the parity information is kept on one disk
RAID 5 :
➢ Makes use of a parity – based error recovery scheme
➢ Parity information is distributed among all disks, rather than being
stored on one disk

RAID 10:
➢ Combines features of RAID 0 & RAID 1
RAID has been redefined by the industry to refer to “independent “ disks.

ATA / EIDE Disks :


EIDE – Enhanced Integrated Drive Electronics
ATA – Advanced Technology Attachment

➢ Low price
➢ Separate controller is needed for each drive if two drives are to be
used concurrently to improve performance

SCSI disks :
➢ Has interface designed for connection to a standard SCSI bus
➢ More expensive, better performance
➢ Concurrent accesses can be made to multiple disk drives because a
drive's interface is actively connected to the SCSI bus only when
the drive is ready for data transfer
Advantages :
➢ Performs well when there are a large number of requests for small
files
RAID disks :
➢ Excellent Performance
➢ Large & Reliable Storage
II OPTICAL DISKS
COMPACT DISCS ( C.Ds )
1. C.D Technology
o Laser light source
o A laser beam is directed onto the surface of the spinning
disk
o Physical indentations in the surface are arranged along
the tracks of the disk
o They reflect the focused beam towards photo-detector,
which detects the storage binary patterns
o Laser emits a light beam that is sharply focused on the
surface of the disk
o If a light beam is combined with another light beam that is in
phase with it, a brighter spot results
o If the two light beams are out of phase, they cancel
each other and a dark spot results.
Aluminium – reflects light
Acrylic -- Protective cover
o Laser source and the photo-detector are positioned below
the polycarbonate plastic
o Light travels through the plastic, reflects off the aluminium
layer, and travels back to the photo-detector
Three different Positions:
o When the light reflects solely from the pit or solely from the
land, the detector will see the reflected beam as a bright
spot
o When the beam moves through the edge where the pit
changes to the land, the reflected wave from the pit will be
180° out of phase with the wave reflected from the land,
canceling each other.
o Thus the detector will not see a reflected beam and will
detect a dark spot
o Each transition, detected as a dark spot, is taken to denote
the binary value 1, and the flat portions represent 0s
o The pattern is not a direct representation of the stored data.
Each byte of data is represented by a 14-bit code, which
provides considerable error checking

Storage:
o Pits are arranged along tracks on the surface of the disk
o There is just one physical track, spiraling from the middle of
the disk towards the outer edge

o But each circular path spanning 360° is a separate track


o 15,000 tracks on a disk
o If the entire track were unraveled, it would be over 5 km long
Figure 5.32. Optical disk.
(a) Cross-section: label, acrylic, aluminium, and polycarbonate
plastic layers, with pits and lands along the track.
(b) Transition from pit to land: the source/detector sees a reflection
over a pit or a land, but no reflection at a pit-to-land edge.
(c) Stored binary pattern, e.g. 0100100001000100100.

2. CD-ROM :
Disadvantages of CDs
o Biggest problem is to ensure integrity of stored data
o Because the pits are very small, it is difficult to implement all the
pits perfectly. This leads to errors during reading

Solution:
o Provide additional bits to provide error checking and correction
o C.Ds with such capability is called C.D ROM
Data Storage

o Data are stored in the tracks in the form of blocks called sectors

Mode 1 Format :
16-byte header + 2048 bytes of data + 288 bytes of error-correction
code = 2352 bytes per sector
o The number of sectors is larger on the longer outer tracks
Rotational Speeds:
o 1X → 75 sectors per second
o Provides a data rate of 153,600 bytes/sec, using the Mode 1 format
o Speed affects only data transfer rate but not the storage capacity
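The 1X data rate quoted above can be verified directly: 75 Mode 1 sectors per second, each carrying 2048 bytes of user data. The function name is illustrative.

```python
def cd_data_rate(sectors_per_second=75, data_bytes_per_sector=2048):
    """User data rate = sectors read per second x data bytes per sector."""
    return sectors_per_second * data_bytes_per_sector
```

Doubling the rotational speed (2X, 150 sectors/sec) doubles the data rate but, as noted, leaves the storage capacity unchanged.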

Disadvantages :
o Lower transfer rates than magnetic hard disks
o Longer seek time
Advantages
o Small size
o Low cost
o Larger capacity
o Ease of handling as a removable and transportable mass-storage
medium

o Faster access time than floppy disks and magnetic tapes

3. CD Recordable (CD-R) :
Creation of a CD-ROM
o Master disk produced with high –power laser to burn holes that
correspond to pits

o A mold is then made from master


o Then molten poly carbonate plastic is injected into the mold to
make a CD that has the same pattern of holes as the master disk
CD-R
o Spiral track implemented during the manufacturing process
o A laser in a CD-R drive is used to burn pits into organic dye on the
track
o When a burned spot is heated beyond a critical temperature it
becomes opaque

o Such burned sports reflect less light when subsequently read


o Written data are stored permanently
o Unused portions of a disk can be used to store additional data at a
later time

4. CD-REWRITABLE
o Can be written multiple times by the user .
o Instead of an organic dye in the recording layer ,an alloy of
silver ,indium ,antimony& tellurium is used .
o If it is heated above its melting point (500°C) & then cooled down,
it goes into an amorphous state in which it absorbs light.
o If it is heated to about 200°C, & this temperature is maintained for
an extended period, a process known as annealing takes place,
which leaves the alloy in a crystalline state that allows the light
to pass through.

o These crystals can represent lands.


o If heated beyond 500°C it represents pits.
o The stored data can be erased by using the annealing process, which
returns the alloy to the crystalline state.
o A reflective material is placed above the recording layer to reflect
the light when the disk is read.

o Uses 3 different laser powers:


o Highest power – used to record the pits.
o Middle power – used to put the alloy into its crystalline
state (erase power).
o Lowest power – used for the reading process.


o Limit on how many times a CD-RW disk can be rewritten =1000.
Advantages
o Low cost storage medium.
o Used for backup storage.

5. DVD TECHNOLOGY
o DVD- Digital Versatile Disk technology.
o Great storage capability.
o A red-light laser with a wavelength of 635 nm is used instead of
the infrared laser used in CDs, which has a wavelength of 780 nm.
o The shorter wavelength makes it possible to focus the light to a
smaller spot.

o Pits are smaller.


o Tracks are placed closer together.
o These improvements lead to a DVD capacity of 4.7G bytes.
o Two layered & two – sided disks.
Two layered disks
Method:1
o Translucent material acts as a semi reflector, this can be
programmed with pits to store data.
o The second layer is called reflective layer made up of
aluminum.
o Disk is read by focusing the light on the required layer.
o While reading layer 1, the light is focused on, and reflected by,
the translucent layer.
o The layer that is not being read also reflects some light, which
the circuits detect as noise.
Method:2
o Two single sided disks can be put together to form a
sandwich like structures, where the top disk is turned upside
down.
o This can be done for double layered disks also.
Speed:

o Access time is similar to that of CDs
o DVD rotational speed is similar to that of CDs
o But data transfer rates are much higher because of the
higher density of pits.

6. DVD-RAM:
Advantages:
o Rewritable version of DVD.
o Large storage capacity

Disadvantages:
o Higher price
o Slow writing speed
o Write verification is performed
o Reads the stored data
o Checks them against the original data

III MAGNETIC TAPE SYSTEM:


o Suited for off-line storage of large amounts of data.
o Used for backup purposes
o A magnetic film is deposited on a very thin 0.5 or 0.25 inch
wide plastic tape.
o Seven or 9 bits are recorded in parallel across the width of the
tape, perpendicular to the direction of motion.
o A separate R/W head is provided for each bit position on the
tape, all these bits can be read in parallel.

o One of the character bits is used as a parity bit

Figure 5.33. Organization of data on magnetic tape.
(Records separated by record gaps; each file begins with a file mark,
preceded by a file gap; 7 or 9 bits are recorded in parallel across
the width of the tape.)

Data Organization on tapes:


o Data are grouped into records
o Records are separated by gaps
o Tape motion is stopped only when a record gap is under the
R/W heads
o Record gaps have no magnetization. They allow records to be
detected independently of the recorded data.
o Beginning of the file is the File Mark. It is a sequence of special
single or multiple characters.
o Preceding the File Mark is the File Gap, which is longer than the
inter-record gap.
o The first record after the File Mark is the header / identifier for
the file.

Operations on the Tape:


o Rewind, erase, write , forward , backward

Two methods of formatting & using tapes:


Method 1:

o Records are variable in length


o Efficiently uses the tape
o Does not permit updating or overwriting of records in place.
Method 2:

o Fixed length records


o Possible to update records
Cartridge Tape System:

o Tape housed in a cassette


o Reading & writing is done by a helical scan system operating across
the tape, similar to that used in video cassette tape drives.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Assignments
1) One difference between a write-through cache and a write-back
cache can be in the time it takes to write. During the first cycle, we detect
whether a hit will occur, and during the second (assuming a hit) we actually write
the data. Let’s assume that 50% of the blocks are dirty for a write-back cache.
For this question, assume that the write buffer for the write through will never
stall the CPU (no penalty). Assume a cache read hit takes 1 clock cycle, the cache
miss penalty is 50 clock cycles, and a block write from the cache to main memory
takes 50 clock cycles. Finally, assume the instruction cache miss rate is 0.5% and
the data cache miss rate is 1%. Assuming that on average 26% and 9% of
instructions in the workload are loads and stores, respectively, estimate the
performance of a write-through cache with a two-cycle write versus a write-back
cache with a two-cycle write. [CO4, K3]

2) A block-set associative cache memory consists of 128 blocks divided into four
block sets . The main memory consists of 16,384 blocks and each block contains
256 eight bit words. [CO4, K3]

a. How many bits are required for addressing the main memory?
b. How many bits are needed to represent the TAG, SET and WORD
fields?

3) A 4-way set associative cache memory unit with a capacity of 16 KB is built using
a block size of 8 words. The word length is 32 bits. The size of the physical
address space is 4 GB. The number of bits for the TAG field is . [CO4, K3]
4) Consider a direct mapped cache with 8 cache blocks (0-7). If the memory block
requests are in the order- [CO4, K3]
3, 5, 2, 8, 0, 6, 3, 9, 16, 20, 17, 25, 18, 30, 24, 2, 63, 5, 82, 17, 24
Which of the following memory blocks will not be in the cache at
the end of the sequence?

1. 3
2. 18
3. 20
4. 30
Also, calculate the hit ratio and miss ratio.
5) Consider a fully associative cache with 8 cache blocks (0-7). The
memory block requests are in the order- [CO4, K3]

4, 3, 25, 8, 19, 6, 25, 8, 16, 35, 45, 22, 8, 3, 16, 25, 7


If LRU replacement policy is used, which cache block will have
memory block 7?
Also, calculate the hit ratio and miss ratio.

6) Two of the design choices in a cache are the row size (number of
bytes per row or line) and whether each row is organized as a single
block of data (direct mapped cache) or as more than one block (2-way
or 4-way set associative). The goal of a cache is to reduce overall
memory access time. Suppose that we are designing a cache and we
have a choice between a direct-mapped cache where each row has a
single 64-byte block of data, or a 2-way set associative cache where
each row has two 32-byte blocks of data. Which one would you
choose and why? Give a brief technical justification for your answer. If
the choice would make no difference in performance then explain why
not. [CO4, K3]
Part A – Questions &
Answers
1. What is temporal locality? [K2, CO4]
Temporal locality is a principle stating that if a data location is
referenced then it will tend to be referenced again soon which is
also called as locality in time.
Analogy: If you recently brought a book to your desk to look at,
you will probably need to look at it again soon.

2. What is spatial locality? [K2, CO4]


Spatial locality is a principle stating that if a data location is
referenced, data locations with nearby addresses will tend to be
referenced soon.

3. Define memory hierarchy. [K2, CO4]


Memory hierarchy is a structure that uses multiple levels of
memories, as the distance from the processor increases, the size of
the memories and the access time both increase.

4. Define block in memory hierarchy. [K2, CO4]


Block or line is the minimum unit of information that can be either
present or not present in a cache. For analogy, a book in a library.

5. What is hit rate in memory hierarchy? [K2, CO4]


If the data requested by the processor appears in some block in
the upper level, this is called a hit (analogous to your finding the
information in one of the books on your desk). The hit rate, or hit
ratio, is the fraction of memory accesses found in the upper level; it
is often used as a measure of the performance of the memory
hierarchy.
6. What is miss rate in memory hierarchy? [K2, CO4]
If the data requested by the processor is not found in the upper
level, the request is called a miss. The miss rate (1−hit rate) is the
fraction of memory accesses not found in the upper level.

7. Define hit time in memory hierarchy. [K2, CO4]


Hit time is the time to access the upper level of the memory
hierarchy, which includes the time needed to determine whether
the access is a hit or a miss (that is, the time needed to look
through the books on the desk).

8. Define miss penalty in memory hierarchy. [K2, CO4]


The miss penalty is the time to replace a block in the upper level
with the corresponding block from the lower level, plus the time to
deliver this block to the processor (or the time to get another book
from the shelves and place it on the desk). Because the upper level
is smaller and built using faster memory parts, the hit time will be
much smaller than the time to access the next level in the
hierarchy, which is the major component of the miss penalty. (The
time to examine the books on the desk is much smaller than the
time to get up and get a new book from the shelves.)

9. Define track in magnetic disks. [K2, CO4]


Track is one of thousands of concentric circles that make up the
surface of a magnetic disk.
10.Define sector in magnetic disks. [K2, CO4]
Sector is one of the segments that make up a track on a magnetic
disk. A sector is the smallest amount of information that is read or
written on a disk.

11.Explain the term seek in magnetic disks. [K2, CO4]


seek is the process of positioning a read/write head over the proper
track on a disk.

12.Define rotational latency. [K2, CO4]


Rotational latency also called rotational delay. The time required for
the desired sector of a disk to rotate under the read/write head.
Usually assumed to be half the rotation time.

13.Define cache memory. [K2, CO4]


Cache is a safe place for hiding or storing things.

14.What is direct-mapped cache? [K2, CO4]


Cache structure is called direct mapped, since each memory
location is mapped directly to exactly one location in the cache. For
example, almost all direct-mapped caches use this mapping to find
a block:
(Block address) modulo (Number of blocks in the cache)
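The direct-mapped placement rule above, written as code; the accompanying tag computation (block address divided by the number of cache blocks) is the standard complement of this mapping, and the function names are illustrative.

```python
def cache_index(block_address, num_blocks):
    """Cache block position = (block address) modulo (number of blocks)."""
    return block_address % num_blocks

def cache_tag(block_address, num_blocks):
    """Tag = the remaining high-order part of the block address."""
    return block_address // num_blocks
```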

15.Define tag in memory hierarchy. [K2, CO4]


Tag is a field in a table used for a memory hierarchy that contains
the address information required to identify whether the associated
block in the hierarchy corresponds to a requested word.

16.Define valid bit. [K2, CO4]


A field in the tables of a memory hierarchy that indicates that the
associated block in the hierarchy contains valid data.

17.Explain cache miss. [K2, CO4]


Cache miss is a request for data from the cache that cannot be
filled because the data is not present in the cache.

18.Define write-through scheme. [K2, CO4]


Write-through is a scheme in which writes always update both the
cache and the next lower level of the memory hierarchy, ensuring
that data is always consistent between the two.

19.Define write buffer. [K2, CO4]


Write buffer is a queue that holds data while the data is waiting to
be written to memory.

20.Define write-back. [K2, CO4]


write-back is a scheme that handles writes by updating values only
to the block in the cache, then writing the modified block to the
lower level of the hierarchy when the

block is replaced.
21.Explain split cache scheme. [K2, CO4]
A scheme in which a level of the memory hierarchy is composed of
two independent caches that operate in parallel with each other,
with one handling instructions and one handling data.

22.Discuss fully associative cache. [K2, CO4]


Fully associative cache is a cache structure in which a block can be
placed in any location in the cache.

23.Define set-associative cache. [K2, CO4]


Set-associative cache is a cache that has a fixed number of
locations (at least two) where each block can be placed.

24.Explain LRU scheme. [K2, CO4]


Least recently used (LRU) is a replacement scheme in which the
block replaced is the one that has been unused for the longest time.
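A small sketch of the LRU scheme, assuming a fully associative cache modeled as an ordered dictionary; the interface (`access` returning "hit" or "miss") is hypothetical.

```python
from collections import OrderedDict

def access(cache, block, capacity):
    """Return 'hit' or 'miss'; evict the least recently used block when full."""
    if block in cache:
        cache.move_to_end(block)        # mark as most recently used
        return "hit"
    if len(cache) >= capacity:
        cache.popitem(last=False)       # front of the dict is the LRU block
    cache[block] = True
    return "miss"
```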

25.Define multi level cache. [K2, CO4]


Multilevel cache is a memory hierarchy with multiple levels of
caches, rather than just a cache and main memory.

26.Define global miss rate. [K2, CO4]


Global miss rate is as the fraction of references that miss in all
levels of a multilevel cache.

27.Define local miss rate. [K2, CO4]


Local miss rate is defined as the fraction of references to one level
of a cache that miss, used in multilevel hierarchies.
28.Define virtual memory. [K2, CO4]
Virtual memory is a technique that uses main memory as a “cache”
for secondary storage.

29.Explain Physical memory address. [K2, CO4]


Physical address is an address in main memory.

30.Explain protection mechanism in memory sharing. [K2, CO4]


Protection is a set of mechanisms for ensuring that multiple
processes sharing the processor, memory, or I/O devices cannot
interfere, intentionally or unintentionally, with one another by
reading or writing each other’s data. These mechanisms also isolate
the operating system from a user process.

31.Define page fault. [K2, CO4]


Page fault is an event that occurs when an accessed page is not
present in main memory.

32.Define virtual address. [K2, CO4]


Virtual address is an address that corresponds to a location in
virtual space and is translated by address mapping to a physical
address when memory is accessed.

33.Define address translation. [K2, CO4]


Address translation is also called address mapping. It is the
process by which a virtual address is mapped to the address used to
access memory.

34.Explain segmentation. [K2, CO4]


Segmentation is a variable-size address mapping scheme in which
an address consists of two parts: a segment number, which is
mapped to a physical address, and a segment offset.

35.Define page table. [K2, CO4]


Page table is a table containing the virtual-to-physical address
translations in a virtual memory system. The table, which is stored
in memory, is typically indexed by the virtual page number; each
entry in the table contains the physical page number for that
virtual page if the page is currently in memory.

36.Define swap space. [K2, CO4]


Swap space is the space on the disk reserved for the full virtual
memory space of a process.

37.Explain reference bit. [K2, CO4]


Reference bit also called as use bit is field that is set whenever a
page is accessed and that is used to implement LRU or other
replacement schemes.

38.Define TLB. [K2, CO4]


Translation-look aside buffer is a cache that keeps track of recently
used address mappings to try to avoid an access to the page table.

39.What is a hit? [K2, CO4]


When CPU needs data, it immediately checks in cache memory
whether it has data or not. A cache hit describes the situation
where your site’s content is successfully served from the cache.

40.What is a miss? [K2, CO4]


When CPU needs data, it immediately checks in cache memory
whether it has data or not. A cache miss refers to the instance
when the memory is searched and the data isn’t found. When this
happens, the content is transferred and written into the cache.

41.Define hit rate. [K2, CO4]


Hit rate = No. of hits / Total no. of attempted accesses

42.Define miss rate. [K2, CO4]


Miss rate = No. of misses / Total no. of attempted accesses

43.What is miss penalty? [K2, CO4]


Miss penalty= Extra time needed to bring the desired information
into the cache
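The three quantities above combine into the standard average memory access time relation, AMAT = hit time + miss rate x miss penalty. A quick check with assumed numbers:

```python
def amat(hit_time, miss_rate, miss_penalty):
    """Average memory access time, in the same time unit as the inputs."""
    return hit_time + miss_rate * miss_penalty

# Illustrative values (assumed): 1-cycle hit, 5% miss rate, 100-cycle penalty
print(amat(1, 0.05, 100))  # 6.0 cycles on average per access
```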

44. Compare SRAM and DRAM. [K3, CO4]

SRAM (static RAM) stores each bit in a latch built from several
transistors; it is fast and needs no refreshing, but it is less dense
and more expensive, so it is used for cache memory. DRAM
(dynamic RAM) stores each bit as charge on a capacitor accessed
through a single transistor; it is slower and must be refreshed
periodically, but it is denser and cheaper, so it is used for main
memory.
45. What is locality of reference? [K2, CO4]
Locality of reference: most of the execution time is spent on
routines in which many instructions are executed repeatedly (e.g.,
loops, nested loops, procedures). Many instructions in localized
areas of the program are executed repeatedly during some time
period, while the remainder of the program is accessed relatively
infrequently.

Temporal locality: a recently executed instruction is likely to be
executed again very soon. When an item is first needed, it is
brought into the cache and remains there until it is needed again.

Spatial locality: instructions in close proximity to a recently
executed instruction are likely to be executed soon. Instead of
fetching one item from main memory, it is useful to fetch several
items that reside at adjacent addresses as well, i.e., a block: a set
of contiguous words.

46. What is temporal and spatial locality of reference? [K2, CO4]

Refer answer of Q 45.

47. What do you mean by contention problem? When do you get it? [K2, CO4]

In a direct-mapped cache, more than one memory block maps onto
the same cache block position. A cache block may therefore be
replaced even when other cache blocks are empty, e.g., a branch
from block 1 to block 129 when both map to the same position.
This is called the contention problem.
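The mapping behind this example can be verified directly: in a direct-mapped cache, memory block j goes to cache position j mod (number of cache blocks). The 128-block cache size below is assumed from the block 1 / block 129 example:

```python
NUM_CACHE_BLOCKS = 128  # direct-mapped cache size in blocks (assumed from the example)

def cache_position(block_number):
    """Direct mapping: memory block j maps to cache block j mod 128."""
    return block_number % NUM_CACHE_BLOCKS

# Blocks 1 and 129 collide in the same cache position, causing contention
print(cache_position(1), cache_position(129))  # 1 1
```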

48. What is memory interleaving? What are its advantages? [K2, CO4]
- Main memory is a collection of physically separate modules, each with its
own Address Buffer Register (ABR) and Data Buffer Register (DBR).
- Memory operations may proceed in more than one module at the same
time.
- The rate of data transfer is increased.

[Figure 5.25: Addressing multiple-module memory systems. (a) Consecutive
words in a module: the high-order k bits of the address select the module
and the remaining m bits select the word within it. (b) Consecutive words in
consecutive modules: the low-order k bits select the module.]

Method 1 (refer Fig. 5.25(a)):
- Successive memory locations are found in a single module.
Advantages:
1. Only one module is involved in a bulk data transfer.
2. At the same time, DMA may access memory in another module.

Method 2, memory interleaving (refer Fig. 5.25(b)):
- Consecutive addresses are located in successive modules.
Disadvantage:
- A bulk transfer keeps several modules busy at any one time.
Advantages:
- Faster data transfer.
- More than one word can be transferred in parallel.
- Higher memory utilization.
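The two addressing methods can be contrasted with a short sketch (the module capacity and module count are assumed for illustration):

```python
WORDS_PER_MODULE = 1024  # assumed module capacity (2**m words)
NUM_MODULES = 4          # assumed number of modules (2**k)

def method1(addr):
    """Consecutive words in a module: high-order bits select the module."""
    return addr // WORDS_PER_MODULE, addr % WORDS_PER_MODULE

def method2(addr):
    """Memory interleaving: low-order bits select the module."""
    return addr % NUM_MODULES, addr // NUM_MODULES

# Four consecutive addresses stay in one module under method 1,
# but spread across four modules under method 2 (enabling parallel access).
print([method1(a)[0] for a in range(4)])  # [0, 0, 0, 0]
print([method2(a)[0] for a in range(4)])  # [0, 1, 2, 3]
```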

49. Compare memory mapped I/O and Separate Address space. [K3, CO4]

Memory mapped I/O:
o I/O and memory share the same address space.
o Any machine instruction can access both memory and I/O devices.
o e.g. MOVE DATAIN, R0 (R0 ← keyboard data)
o MOVE R0, DATAOUT (display ← R0)
o Simple method.
Separate address space:
o Separate I/O instructions.
o Separate address space for I/O devices and memory.
o Fewer address lines.
o Not necessarily a separate bus: the same address bus is used,
but control lines tell whether the requested read/write is an I/O
operation or a memory operation.
o The address decoder recognizes its address.
o The data register holds data to be transferred.
o The status register holds information relevant to the operation of
the I/O device.
o Because of the difference in speed between I/O devices and the
processor, buffers are used.

50. Differentiate Program controlled I/O and Interrupt driven I/O. [K3, CO4]
Program-controlled I/O requires continuous involvement of the
processor in the I/O activities. The difference in speed between the
processor and I/O devices creates the need for mechanisms to

synchronize the transfer of data between them.


Interrupt driven I/O: I/O device sends a special signal (interrupt) over
the bus whenever it is ready for a data transfer operation.

51. What is polling? [K3, CO4]


o When an I/O device sends an interrupt request, the IRQ bit in its
status register is set to 1.
o The processor polls each I/O device and checks the IRQ bit of its
status register.
o The first device found with IRQ = 1 is serviced.
o Disadvantage: time is spent interrogating the IRQ bits of devices
that do not need service.

52.What is DMA? What are the advantages of DMA? [K3, CO4]


DMA – Direct Memory Access
Advantages:
• transfer of large blocks of data at high speed
• without continuous intervention by processor

53.What are the different modes of data transfer? [K3, CO4]


Two types of bus access:
(i) Block or burst mode:
The DMA controller may be given exclusive access to the
main memory to transfer a block of data without interruption.
This is known as block or burst mode.

(ii) Cycle stealing:
The processor originates most memory access cycles, so the
DMA controller can be said to "steal" memory cycles from the
processor. This technique is called cycle stealing. It allows
the DMA controller to transfer one data word at a time, after
which it must return control of the buses to the CPU. The
CPU merely delays its operation for one memory cycle to
allow the direct memory I/O transfer to steal one memory
cycle.

54.What is cycle stealing? [K3, CO4]


Refer Answer from Q 53

55. What are the two arbitration techniques in DMA? What is their purpose? [K3, CO4]
A conflict may arise if both the processor & DMA controller or two DMA
controllers try to use the bus at the same time to access the main
memory. To resolve these conflicts, an arbitration procedure is
implemented on the bus to coordinate the activities of all devices that
requested memory transfer.

Bus arbitration:
• The device that is allowed to initiate data transfers on the
bus at any given time is called the bus master.
• When the current master relinquishes control of the bus,
another device can acquire this status.
• Bus arbitration is the process by which the next device to
become the bus master is selected and bus mastership is
transferred to it.

The two methods of arbitration are :


i. Centralized arbitration
ii. Distributed arbitration

56. How is disk access time calculated? [K3,CO4]


ACCESS TIME:
1. Seek time: the time required to move the Read/Write head to the
proper track. It depends on the initial position of the head relative to
the track specified in the address.
2. Rotational delay (latency time): the amount of time that elapses
after the head is positioned over the correct track until the starting
position of the addressed sector passes under the Read/Write head.
Access Time = Seek time + Latency time
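A worked example of the formula, using assumed drive parameters (the average rotational latency is the time for half a revolution):

```python
def access_time_ms(avg_seek_ms, rpm):
    """Access time = seek time + average rotational latency (half a revolution)."""
    latency_ms = 0.5 * 60_000 / rpm   # one revolution takes 60000/rpm milliseconds
    return avg_seek_ms + latency_ms

# Assumed drive: 9 ms average seek time, 7200 RPM spindle
print(access_time_ms(9, 7200))  # 9 + 4.17 -> about 13.17 ms
```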
57. How many memory chips are needed to construct a 2M x 16
memory system using 512K x 8 static memory chips? [K3, CO4]

Eight 512K x 8 chips are needed to construct the 2M x 16
memory, arranged as four rows of two columns.
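The chip count follows from comparing capacities: depth expansion gives the rows and width expansion gives the columns.

```python
target_depth, target_width = 2 * 2**20, 16   # 2M x 16 memory system
chip_depth, chip_width = 512 * 2**10, 8      # 512K x 8 chip

rows = target_depth // chip_depth    # depth expansion: 2M / 512K = 4 rows
cols = target_width // chip_width    # width expansion: 16 / 8 = 2 columns
chips = rows * cols

print(rows, cols, chips)  # 4 2 8
```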

58. What is virtual memory and what are the benefits of virtual memory? [K3, CO4]
The purpose of virtual memory is to enlarge the address space, the
set of addresses a program can utilize. For example, virtual
memory might contain twice as many addresses as main memory.
A program using all of virtual memory, therefore, would not be able
to fit in main memory all at once. Nevertheless, the computer could
execute such a program by copying into main memory those
portions of the program needed at any given point during
execution.

Benefits: a) Increases the address space


b) Gives an effect to the processor that the entire
program is available in the main memory.

59.What is meant by bus arbitration? [K3, CO4]


The possibility exists that several master or slave units connected
to a shared bus will request access to the bus at the same time.
A selection mechanism called bus arbitration is therefore required
to enable the current master, which we still refer to as bus
controller to decide among such competing requests. Daisy
chaining, Polling and Independent requesting are used.

60.What is the use of EEPROM? [K3, CO4]


Advantages:
- It can be programmed and erased electrically.
- It does not have to be removed from the circuit for erasure.
- Selective erasure is possible.

61. State the hardware needed to implement the LRU replacement algorithm. [K3, CO4]

A 2-bit counter is maintained for each block. When a hit occurs,
the referenced block's counter is set to 0 and all other block
counters are incremented. When a miss occurs, the newly loaded
block's counter is set to 0 and all other block counters are
incremented. When the cache is full,

o The block with the highest counter value is replaced.
o The new block is loaded from main memory.
o Its counter is set to 0.
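A minimal simulation of this counter scheme (simplified exactly as described above: on each access every other counter is incremented, and a dictionary stands in for the cache; the block numbers are assumed for illustration):

```python
def lru_access(cache, block):
    """cache maps block number -> age counter; 0 = most recently used."""
    if block in cache:                       # hit
        for b in cache:
            cache[b] += 1                    # age all resident blocks
    else:                                    # miss with the cache full:
        victim = max(cache, key=cache.get)   # highest counter = least recently used
        del cache[victim]                    # replace it
        for b in cache:
            cache[b] += 1
    cache[block] = 0                         # referenced / newly loaded block

c = {10: 0, 20: 1, 30: 2}  # a full three-block cache (assumed contents)
lru_access(c, 40)          # miss: block 30 (highest counter) is evicted
print(sorted(c))           # [10, 20, 40]
```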

62.What is DDR SDRAM? [K3, CO4]


Double Data Rate SDRAM (DDR SDRAM)
o Transfers data on both edges of the clock.
o Bandwidth is doubled for long burst transfers.
o Cells are organized in two banks.
o each bank can be accessed separately.
o Consecutive words of a given block are stored in different banks.
o Such interleaving of words allows simultaneous access to two
words that are transferred on successive edges of the clock.

o Block transfers

63.What is TLB? [K3, CO4]


The page table resides in main memory. A copy of a portion of the
page table, i.e. the entries for the most recently accessed pages,
can be accommodated within the Memory Management Unit
(MMU). A small cache called the Translation Look-aside Buffer
(TLB) is incorporated into the MMU for this purpose. Each TLB
entry includes:

o The virtual address of the entry, i.e. the virtual page number.
o Control bits.
o The page frame number in main memory.
The TLB uses associative mapping. The processor gives a virtual
address to the MMU, and the MMU looks in the TLB for that virtual
address entry.
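The TLB-first lookup can be sketched as follows (the page table contents are hypothetical; a real TLB also holds control bits and has a limited, associatively searched capacity):

```python
page_table = {0: 7, 1: 2, 2: 9}  # hypothetical virtual page -> frame mapping
tlb = {}                         # small cache of recently used entries

def lookup(vpn):
    """Check the TLB first; fall back to the page table on a TLB miss."""
    if vpn in tlb:
        return tlb[vpn], "TLB hit"
    frame = page_table[vpn]      # TLB miss: access the page table in memory
    tlb[vpn] = frame             # cache the mapping for subsequent accesses
    return frame, "TLB miss"

print(lookup(1))  # (2, 'TLB miss') on the first access
print(lookup(1))  # (2, 'TLB hit') on the repeat access
```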

64. An address space is specified by 24 bits and the corresponding
memory space by 16 bits. How many words are there in (a) the
virtual memory (b) the main memory? [K3, CO4]

Number of words in virtual memory = 2^24. Number of words in
main memory = 2^16.
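The arithmetic is a direct power of two in each case:

```python
address_bits, memory_bits = 24, 16

virtual_words = 2 ** address_bits  # 16,777,216 words in the virtual memory
main_words = 2 ** memory_bits      # 65,536 words in the main memory

print(virtual_words, main_words)  # 16777216 65536
```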

65. Specify the different I/O transfer mechanisms available. [K3, CO4]

(i) Serial transfer (ii) Parallel transfer

66.What does isochronous data stream mean? [K3, CO4]


Audio and video signals must be converted into a digital form
before it can be handled by the computer. This is accompanied by
sampling the analog signal periodically. The sampling process yields
a continuous stream of digitized samples that arrive at regular
intervals, synchronized with the sampling clock. Such a data stream
is called isochronous, meaning that successive events are
separated by equal periods of time.
67.List the different types of ROM. [K3, CO4]

(i) PROM (ii) EPROM (iii) EEPROM

68. What is the use of DMA? [K3, CO4]


Direct memory access (DMA) is a feature of modern computers that
allows certain hardware subsystems within the computer to access
system memory independently of the central processing unit (CPU).
With DMA, the CPU initiates the transfer, does other operations
while the transfer is in progress, and receives an interrupt from the
DMA controller when the operation is done. This feature is useful
any time the CPU cannot keep up with the rate of data transfer, or
where the CPU needs to perform useful work while waiting for a
relatively slow I/O data transfer.
Part B Questions

1. Explain bus arbitration. [K2, CO4]
2. Discuss in detail synchronous and asynchronous bus communications with neat timing diagrams. [K2, CO4]
3. Discuss parallel interface with a neat diagram. [K2, CO4]
4. Discuss serial interface with a neat diagram. [K2, CO4]
5. Explain Standard I/O Interface circuits. [K2, CO4]
6. Explain memory hierarchy in detail. [K2, CO4]
7. Explain the various memory technologies in detail. [K2, CO4]
8. Explain cache memory in detail. [K2, CO4]
9. Explain cache memory mapping techniques in detail. [K2, CO4]
10. Describe in detail how cache memory performance can be measured and improved. [K2, CO4]
11. Explain virtual memory management technique in detail. [K2, CO4]
12. Explain the use of TLB in Virtual Memory. [K2, CO4]
13. Explain DMA. What are the various bus arbitration techniques? [K2, CO4]
14. Explain the mapping function in cache memory to determine how memory blocks are placed in a cache. [K2, CO4]
Supportive Online Certification Courses

1. Coursera: Computer Architecture

https://www.coursera.org/lecture/comparch/pipeline-basics-omMiS

2. NPTEL: Computer architecture and Organization

https://nptel.ac.in/courses/106/105/106105163/

3. Udemy Design of a CPU

https://www.udemy.com/topic/computer-architecture/

Real time applications in day to day life and to Industry

Home / Office Automation

Internet of Things

Cloud Storage

Data Centers

Content beyond syllabus
Blu-ray Disc
The Blu-ray Disc (BD), often known simply as Blu-ray, is
a digital optical disc storage format. It is designed to
supersede the DVD format, and capable of storing several
hours of high-definition video (HDTV 720p and 1080p). The
main application of Blu-ray is as a medium for video material
such as feature films and for the physical distribution of video
games for the PlayStation 3, PlayStation 4, PlayStation 5,
Xbox One, and Xbox Series X.
The name "Blu-ray" refers to the blue laser (which is
actually a violet laser) used to read the disc, which allows
information to be stored at a greater density than is possible
with the longer-wavelength red laser used for DVDs.
The plastic disc is 120 millimetres (4.7 in) in diameter and 1.2
millimetres (0.047 in) thick, the same size as DVDs and CDs.
Conventional or pre-BD-XL Blu-ray Discs contain 25
GB per layer, with dual-layer discs (50 GB) being the industry
standard for feature-length video discs. Triple-layer discs (100
GB) and quadruple-layer discs (128 GB) are available for BD-
XL re-writer drives.
High-definition (HD) video may be stored on Blu-ray Discs
with up to 1920×1080 pixel resolution, at 24 progressive or
50/60 interlaced frames per second. DVD-Video discs
were limited to a maximum resolution of 480i (NTSC,
720×480 pixels) or 576i (PAL, 720×576 pixels). Besides these
hardware specifications, Blu-ray is associated with a set of
multimedia formats.

Assessment Schedule
(Proposed Date & Actual Date)

Sl.No.  Assessment                    Proposed Date               Actual Date
1       First Internal Assessment     22.08.2024 to 30.08.2024    22.08.2024
2       Second Internal Assessment    30.09.2024 to 08.10.2024
3       Model Examination             26.10.2024 to 08.11.2024
4       End Semester Examination      Tentatively 11.11.2024
Text Book and Reference Book

Text Books:
1. David A. Patterson and John L. Hennessy, Computer Organization and Design: The
Hardware/Software Interface, Fifth Edition, Morgan Kaufmann / Elsevier, 2014.
2. Carl Hamacher, Zvonko Vranesic, Safwat Zaky and Naraig Manjikian, Computer
Organization and Embedded Systems, Sixth Edition, Tata McGraw Hill, 2012.

Reference Books:
1. William Stallings, Computer Organization and Architecture – Designing for Performance,
Eighth Edition, Pearson Education, 2010.
2. John P. Hayes, Computer Architecture and Organization, Third Edition, Tata McGraw Hill,
2012.
3. John L. Hennessy and David A. Patterson, Computer Architecture – A Quantitative
Approach, Morgan Kaufmann / Elsevier Publishers, Fifth Edition, 2012.

EBOOK LINKS:
https://drive.google.com/file/d/1ZxZ7d5dVERbiCwb5Md5L137fWoMwOFBh/view?usp=sharing

Mini Project Suggestions

A Study of Recent Advances in Cache Memory.

Refer recent (2020, 2021) research articles and give a report on
the latest technologies and advancements in Cache Memory used
in recent processors. A few references are listed below:

Reference:

1) https://ieeexplore.ieee.org/document/9431291

2) https://dl.acm.org/doi/abs/10.1145/3376920

3) https://ieeexplore.ieee.org/abstract/document/7019786

4) https://ieeexplore.ieee.org/document/839642
Thank you


