Section A

1. Explain nonlinear pipeline processor with suitable example.

2. Differentiate between multiprocessors and multi-computers.


3. Explain Flynn classifications of various computers based on
notions of instructions and data streams.
4. What are static and dynamic networks in multiprocessors?
Give two examples of each.
5. Differentiate between RISC and CISC processors. 

Ans 5
Differentiate between RISC and CISC processors

RISC
Reduced Instruction Set Computing or RISC is a form of microprocessor architecture that
uses a small set of simple instructions; complex operations are built from sequences of these
instructions. Most instructions complete in one clock (CLK) cycle. This increases the device's
power efficiency and makes it extremely useful in portable devices.

Architecture of RISC
This CPU design works on the principle of fast execution through a short set of instructions.
Every instruction performs a small task, and complex operations are composed of sequences
of these simple instructions, most of which complete in a single cycle. Instruction lengths and
formats are kept uniform.

Applications of RISC
High-end applications such as telecommunication, video processing, and image processing
use RISC. This is a result of the quick response time and power efficacy.

CISC
Complex Instruction Set Computing or CISC uses single instructions that can each carry out
several low-level operations, such as a memory load, an arithmetic operation and a memory
store. The various memory reference operations use a large number of addressing modes, so
CISC uses variable-length instruction formats. This reduces the number of instructions per
program and consequently the memory needed to store it.

Architecture of CISC
The CPU design in CISC architecture uses single commands that execute various operations
in several steps. These compound instructions take from two to ten cycles to complete and
thus require a longer execution time. The CISC architecture supports high-level language (HLL)
statements directly in microcode and uses a common path and cache for both data and
instructions.

Applications of CISC
CISC architecture usually finds application in low-end applications such as security systems,
home automation, etc.

1. Number Of Addressing Modes


RISC
RISC has fewer addressing modes and most of the instructions in the instruction set have register to
register addressing mode.

CISC
CISC has many different addressing modes and can thus be used to represent higher level
programming language statements more efficiently.  

2. Microprogramming Unit
RISC
It uses a hard-wired control unit.

CISC
It has a microprogrammed control unit.

3. Examples
RISC
DEC Alpha, AMD 29000, ARC, Atmel AVR, Blackfin, Intel i860 and i960, MIPS, Motorola 88000,
PA-RISC, Power (including PowerPC), SuperH, SPARC and ARM.

CISC
System/360, PDP-11, VAX, Motorola 68000, and the Intel x86 CPUs used in desktop PCs.

4. Encoding Of Instructions
RISC
Fixed-length encodings of the instructions are used. Example: MIPS and the original 32-bit ARM
ISA encode every instruction in 4 bytes.

CISC
Variable-length encodings of the instructions are used. Example: IA32 instruction size can range
from 1 to 15 bytes.
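The difference between the two encoding styles can be illustrated with a toy decoder sketch; the byte values and the opcode-to-length table below are invented for illustration and do not correspond to any real ISA:

```python
# Toy model: fixed-length (RISC-style) vs. variable-length (CISC-style) decode.

def decode_fixed(stream, width=4):
    """RISC-style decode: every instruction occupies `width` bytes."""
    return [stream[i:i + width] for i in range(0, len(stream), width)]

def decode_variable(stream, length_of):
    """CISC-style decode: the first byte (opcode) determines the length."""
    out, i = [], 0
    while i < len(stream):
        n = length_of[stream[i]]          # length depends on the opcode
        out.append(stream[i:i + n])
        i += n
    return out

program = bytes([0x01, 0x02, 0x03, 0x04, 0x05, 0x06, 0x07, 0x08])
print(decode_fixed(program))              # two 4-byte instructions
print(decode_variable(program, {0x01: 2, 0x03: 1, 0x04: 5}))
```

Note how the fixed-length decoder can locate every instruction boundary up front (which is what makes parallel fetch and decode easy), while the variable-length decoder must examine each opcode before it knows where the next instruction starts.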

5. Arithmetic And Logical Operations


RISC
Arithmetic and logical operations only use register operands.

CISC
Arithmetic and logical operations can be applied to both memory and register operands.  

6. The Instruction Set


RISC
The instruction set is reduced, i.e. it has only a few instructions, many of which are very
primitive.
CISC
The instruction set has a variety of different instructions that can be used for complex
operations.  

7. Application
RISC
It is used in high-end applications such as video processing, telecommunications and image
processing.  

CISC
It is used in low-end applications such as security systems, home automation etc.  

8. Processor
RISC
Its processors have simple instructions taking about one clock cycle.  

CISC
Its processor has complex instructions that take up multiple clocks for execution.

9. Processor Pipelining
RISC
Its processors are highly pipelined.  

CISC
Processors are normally less pipelined or not pipelined at all.

10. Implementation programs


RISC
Implementation programs are exposed to machine level programs.

CISC
Implementation programs are hidden from machine level programs.

11. Complex Addressing Modes


RISC
Complex addressing modes are synthesized using software.  

CISC
It already supports complex addressing modes.  

12. Complexity
RISC
The complexity lies in the compiler that translates the program.

CISC
The complexity lies in the microprogram.
13. Performance
RISC
Performance is optimized with more focus on software.

CISC
Performance is optimized with more focus on hardware.

14. External Memory


RISC
It does not require external memory for calculations.

CISC
It does require an external memory for calculations.  

15. Decoding Of Instructions


RISC
Decoding of instructions is simple.

CISC
Decoding of instructions is complex.

16. Code Expansion


RISC
Code expansion can be a problem.  

CISC
Code expansion is not a problem.  

17. Execution Time


RISC
Execution time is very low.

CISC
Execution time is very high.

18. Registers
RISC
Multiple register sets are present.  

CISC
Only a single register set is present.

19. Memory Unit


RISC
It has no memory unit of its own and uses separate hardware to implement instructions.
CISC
It has a memory unit to implement complex instructions.  

20. Program Size


RISC
RISC programs have a large code size.

CISC
CISC programs have a small code size.

Ans 1
Explain nonlinear pipeline processor with suitable example.

A dynamic pipeline can be reconfigured to carry out variable functions at different times. The
traditional linear pipelines are static pipelines, because they carry out fixed functions. A
dynamic pipeline permits feedforward and feedback connections besides the streamline
connections. For this reason, some authors call such a structure a nonlinear pipeline.

Input → S1 → S2 → S3 → Output

A three-stage pipeline with feedforward and feedback connections


This pipeline has three stages. Besides the streamline connections from S1 to S2 and from S2
to S3, there is a feedforward connection from S1 to S3 and two feedback connections, from S3
to S2 and from S3 to S1.
These feedforward and feedback connections make the scheduling of successive events into
the pipeline a nontrivial task. With these connections, the output of the pipeline is not
necessarily produced by the last stage. In fact, following different dataflow patterns, one can
use the same pipeline to evaluate different functions.

The utilization pattern of successive stages in a synchronous pipeline is specified by a
reservation table. The table is essentially a space-time diagram depicting the precedence
relationships in using the pipeline stages. For a K-stage linear pipeline, K clock cycles are
needed for data to flow through the pipeline.
      1   2   3   4
S1    X
S2        X
S3            X
S4                X

Reservation table of a 4-stage linear pipeline


Reservation tables for a dynamic pipeline become more complex and interesting because a
nonlinear pattern is followed. For a given nonlinear pipeline configuration, multiple reservation
tables can be generated; each reservation table shows the evaluation of a different function.

Each reservation table displays the space-time flow of data through the pipeline for one
function evaluation. Different functions may follow different paths through the reservation table.

      1   2   3   4   5   6   7   8
S1    X                   X       X
S2        X       X
S3            X       X       X

Processing sequence: S1 → S2 → S3 → S2 → S3 → S1 → S3 → S1
Reservation table for function X
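The scheduling difficulty can be made concrete by computing the forbidden latencies and the collision vector from a reservation table. The sketch below assumes the standard reservation table for function X (S1 marked at cycles 1, 6 and 8; S2 at 2 and 4; S3 at 3, 5 and 7); a latency d is forbidden when initiating a new evaluation d cycles after the previous one would put two evaluations in the same stage at the same time:

```python
# Deriving forbidden latencies and the collision vector for function X.

table = {
    "S1": [1, 6, 8],
    "S2": [2, 4],
    "S3": [3, 5, 7],
}

# A latency d is forbidden if some stage is marked at two cycles d apart,
# because a second initiation d cycles later would collide in that stage.
forbidden = sorted({b - a
                    for cycles in table.values()
                    for a in cycles for b in cycles if b > a})
print(forbidden)  # [2, 4, 5, 7]

# Collision vector: bit i (counted from the right) is 1 iff latency i is
# forbidden; its width is the evaluation time minus one.
n = max(max(c) for c in table.values()) - 1
cv = "".join("1" if d in forbidden else "0" for d in range(n, 0, -1))
print(cv)  # 1011010
```

Latencies 1, 3 and 6 are permissible here, so a scheduler may initiate a new evaluation of X at those distances without a collision.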

Differentiate between multiprocessors and multi-computers.

A computer system that has two or more processors built into it, working simultaneously, is
called a multiprocessor, while a multicomputer is built by connecting two or more independent
computers that work jointly to solve problems. Each type has its own pros and cons, discussed
briefly here. Multiprocessors are fast and easier to program, while multicomputers are harder
to program. Parallel computing is performed by a multiprocessor, while distributed computing
is performed by a multicomputer. A multiprocessor is more complex and costly to build, while
a multicomputer is less costly to build.

Definition
Multicomputer: a system that is a collection of several computers connected together through
a network.
Multiprocessor: when two or more CPUs are enclosed in a single computer in such a way that
they share the same system buses, memory and I/O devices, it becomes a multiprocessor
computer.

Programming of processors
Programming a multicomputer is quite difficult, as distributed computing is used in this type of
system.
Programming a multiprocessor is comparatively easy.

Type of computing
Distributed computing is used in multicomputers.
Parallel computing is used in multiprocessor systems.

Cost
Multicomputers cost less.
Multiprocessors cost more, as the parallel computing technique requires shared resources.

Disadvantages
Multicomputers take more time and effort to program.
Multiprocessors have a complex structure, so their setup is difficult compared with
multicomputers.

Key Differences between Multiprocessor and Multicomputer


1. A multiprocessor is a single computer in which many processors exist. As against this, a
multicomputer has multiple autonomous computers.
2. The many processing elements used in a multiprocessor do not have their own private
memories; instead they share a single memory. In contrast, a multicomputer has several
processing elements, each with its own memory and I/O resources; rather than sharing
memory, it implements distributed memory.
3. The multiprocessor model needs proper communication between the processing elements
and memory for the effective allocation of resources. Conversely, in a multicomputer no
interaction between the processing elements and a shared memory is required.
4. Multiprocessors use a dynamic network in which the communication links can be
rearranged by setting the active switching units of the system. On the contrary, a
multicomputer employs a static network, where the connections between switching units
are fixed and determined by direct point-to-point links.
5. A multiprocessor is referred to as a tightly coupled system, while a multicomputer is
known as a loosely coupled system.
Multiprocessors and multicomputers are both types of parallel computers: a multiprocessor
has numerous processing elements using shared memory, whereas in a multicomputer various
autonomous computers are connected with each other, each with its own distributed memory.

FLYNN'S TAXONOMY

 The most popular taxonomy of computer architecture was defined by Flynn in 1966.
Flynn's classification scheme is based on the notion of a stream of information. Two
types of information flow into a processor: instructions and data. The instruction stream is
defined as the sequence of instructions performed by the processing unit. The data
stream is defined as the data traffic exchanged between the memory and the processing
unit.
 According to Flynn's classification, either of the instruction or data streams can be single
or multiple. Computer architecture can be classified accordingly into the following four
distinct categories:

1. Single-instruction single-data streams (SISD)
2. Single-instruction multiple-data streams (SIMD)
3. Multiple-instruction single-data streams (MISD)
4. Multiple-instruction multiple-data streams (MIMD)

Single-Instruction Single-Data Streams (SISD):


Conventional single-processor von Neumann computers are classified as SISD systems. Each
arithmetic instruction initiates an operation on a data item taken from a single stream of data
elements. Vector processors such as the Cray-1 and its descendants are often classified as
SIMD machines, although they are more properly regarded as SISD machines: they achieve
their high performance by passing successive elements of vectors through separate pieces of
hardware dedicated to independent phases of a complex operation.

SISD Architecture

IS: Instruction Stream                                           DS: Data Stream


CU: Control Unit                                                   PU: Processing Unit
MU: Memory Unit

Single-Instruction Multiple-Data Streams (SIMD):


An SIMD computer consists of a single control unit (CU) fetching instructions from an instruction
store. Some instructions are executed by the CU, but most are sent to the PUs (processing
units) for simultaneous execution. A PU consists of a processing element (PE), which is an ALU
with registers, and a private data memory (the PEM). The basic features of the SIMD model are:

 Vector computers and special purpose computations


 One instruction stream, multiple data paths
 Distributed memory SIMD (MPP, DAP, CM-1 &2, Maspar)
 Shared memory SIMD (STARAN, vector computers)

Different Types of SIMD Models: There are two different types of SIMD models:

1. Distributed memory model
2. Shared memory model

Distributed Memory Model: In the distributed-memory SIMD model, there are a number of PEs
and each PE has a local memory. Data and instructions are loaded from the host computer into
the control memory. The instructions are then moved to the array control unit, and data are
transferred to the different local memories from the control memory through the data bus. In
the array control unit, the instructions are separated into scalar and vector instructions. The
scalar instructions are executed in the scalar processor. The vector instructions are broadcast
to the different processing elements over the broadcast bus. Data routing from one PE to
another is done by the data routing network.

SIMD architecture with distributed memory


PE: Processing Element                                                   LM: Local Memory

Shared Memory Model: In the shared-memory SIMD model, instructions are loaded by the host
into the control memory, and data are stored directly in the shared memory through the data
bus. The array control unit, attached to the control memory, separates scalar instructions from
vector instructions. Scalar instructions are transferred to the scalar processor for execution;
vector instructions are transferred to the different PEs over the broadcast bus. The basic
difference from the distributed memory model is that there is a common alignment network,
through which the PEs and shared memories (SMs) can transfer data.
SIMD architecture with shared memory model
PE: Processing Element                                                 SM: Shared Memory
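The lockstep behaviour common to both SIMD models (one instruction stream driving many data paths) can be sketched as follows. This is a toy Python model; the opcode names and the data values are invented for illustration:

```python
# Toy SIMD model: each list element stands in for the local data of one PE.

def simd_step(opcode, data, operand):
    """All PEs execute the same instruction on their own data in lockstep."""
    if opcode == "ADD":
        return [x + operand for x in data]
    if opcode == "MUL":
        return [x * operand for x in data]
    raise ValueError(opcode)

pe_data = [1, 2, 3, 4]                 # four PEs, one element each
pe_data = simd_step("ADD", pe_data, 10)
pe_data = simd_step("MUL", pe_data, 2)
print(pe_data)  # [22, 24, 26, 28]
```

There is one instruction stream (the two `simd_step` calls) but four data streams, which is exactly the single-instruction multiple-data pattern.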

Multiple-Instruction Single-Data Streams (MISD):


There is little practical or commercial use of the MISD architecture. A block diagram of the
MISD architecture is given in the figure below.

Block diagram of MISD architecture

There are few machines in this category, none that have been commercially successful or had
any impact on computational science. One type of system that fits the description of a MISD
computer is a systolic array, which is a network of small computing elements connected in a
regular grid. All the elements are controlled by a global clock. On each cycle, an element will
read a piece of data from one of its neighbours, perform a simple operation (e.g. add the
incoming element to a stored value), and prepare a value to be written to a neighbour on the
next step. One could make a case for pipelined vector processors fitting in this category as
well, since each step of the pipeline corresponds to a different operation being performed on
the data as it flows past that stage in the pipe. There have been pipelined processors with
programmable stages, i.e. the function applied at each location in the pipeline could vary;
however, a pipeline stage does not fetch its operation from a local control memory, so it would
be difficult to classify it as a "processor".

Multiple-Instruction Multiple-Data Streams (MIMD):


Multiple-instruction multiple-data streams (MIMD) parallel architectures are made of multiple
processors and multiple memory modules connected together via some interconnection
network. They fall into two broad categories: shared memory and message passing. Below two
figures illustrate the general architecture of these two categories.

Static interconnection networks

Static interconnection networks for elements of parallel systems (e.g. processors, memories)
are based on fixed connections that cannot be modified without physically redesigning the
system.
Static interconnection networks can have many structures such as a linear structure (pipeline), a
matrix, a ring, a torus, a complete connection structure, a tree, a star, a hyper-cube.

In linear and matrix structures, processors are interconnected with their neighbours in a regular
structure on a plane. A torus is a matrix structure in which elements at the matrix borders are
connected along the same rows and columns. In a complete connection structure, all elements
(e.g. processors) are directly interconnected (point-to-point); see the next three figures.

Linear structure (pipeline) (a) and matrix structure (b) of interconnections in a parallel system

A complete interconnection structure in a parallel system

In a tree structure, system elements are arranged in a hierarchy from the root to the leaves;
see the figure below. Either all nodes of the tree are processors, or only the leaves are
processors and the remaining nodes are linking elements that mediate transmissions. If two or
more connections go from one node towards the leaves, we speak of a binary or k-ary tree. If
more than one connection goes from one node to a neighbouring node, we speak of a fat tree.
A binary tree in which the number of connections between neighbouring nodes doubles towards
the root provides a uniform transmission throughput between the tree levels, a feature not
available in a standard tree.
Tree structures in a parallel system: a) binary tree, b) fat tree

In a hypercube structure, processors are interconnected in a network in which the connections
between processors correspond to the edges of an n-dimensional cube. The hypercube
structure is very advantageous since it provides a low network diameter, equal to the
dimension n of the cube. The network diameter is the number of edges between the most
distant nodes; it determines the number of intermediate transfers that have to be done to send
data between the most distant nodes of a network. In this respect hypercubes have very good
properties, especially for a very large number of constituent nodes, which is why hypercubes
are popular networks in existing parallel systems.

Cube dimension   Node number   Network diameter
0                1             0
1                2             1
2                4             2
3                8             3
4                16            4
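The hypercube properties above can be checked with a short sketch. Node labels are taken as n-bit integers, so two nodes are linked exactly when their labels differ in one bit; the function names are invented for illustration:

```python
# Hypercube topology: n-bit node labels, one edge per differing bit.

def neighbours(node, n):
    """Nodes reachable over one edge: flip each of the n label bits in turn."""
    return [node ^ (1 << i) for i in range(n)]

def distance(a, b):
    """Minimal hop count between two nodes = number of differing label bits."""
    return bin(a ^ b).count("1")

n = 3                                  # 3-cube: 8 nodes, diameter 3
print(neighbours(0b000, n))            # [1, 2, 4]
print(distance(0b000, 0b111))          # 3: the most distant node pair
```

Every node has exactly n neighbours and no pair of labels can differ in more than n bits, which is why both the node degree and the diameter equal the dimension n, matching the table above.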

3.2. Dynamic interconnection networks

Dynamic interconnection networks between processors enable changing (reconfiguring) of the


connection structure in a system. It can be done before or during parallel program execution. So,
we can speak about static or dynamic connection reconfiguration.

3.2.1. Bus networks

A bus is the simplest type of dynamic interconnection network. It constitutes a common data
transfer path for many devices. Depending on the type of implemented transmissions we have
serial busses and parallel busses. The devices connected to a bus can be processors,
memories, I/O units, as shown in the figure below.
A diagram of a system based on a single bus

Only one device connected to a bus can transmit data at a time. Many devices can receive the
data; in that case we speak of a multicast transmission. If the data are meant for all devices
connected to the bus, we speak of a broadcast transmission. Access to the bus must be
synchronized, which is done with one of two methods: a token method or a bus arbiter
method. With the token method, a token (a special control message or signal) circulates
between the devices connected to the bus and gives the right to transmit to a single device at
a time. With the arbiter method, the bus arbiter receives data transmission requests from the
devices connected to the bus. It selects one device according to a chosen strategy (e.g. using a
system of assigned priorities) and sends an acknowledgement message (signal) to the selected
device, granting it the transmission right. After the selected device completes its transmission,
it informs the arbiter, which can then select another request. The receiver address is usually
given in the header of the message; special header values are used for broadcasts and
multicasts. All receivers read and decode the headers, and the devices specified in the header
read in the data transmitted over the bus.
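The priority-based arbiter strategy described above can be sketched as follows. The device numbering and the rule that a lower index means higher priority are assumptions made for this illustration:

```python
# Toy fixed-priority bus arbiter: each round, grant the bus to the
# highest-priority requester, let it transmit, then arbitrate again.

def arbitrate(requests):
    """Grant the bus to the highest-priority requester (lowest index here)."""
    return min(requests) if requests else None

pending = {2, 0, 3}                    # devices currently requesting the bus
order = []
while pending:
    granted = arbitrate(pending)
    order.append(granted)              # arbiter sends the grant signal
    pending.discard(granted)           # transmission completes
print(order)  # [0, 2, 3]
```

A fixed-priority scheme like this is simple but can starve low-priority devices under load; real arbiters often rotate priorities (round-robin) to keep access fair.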

The throughput of the network based on a bus can be increased by the use of a multibus
network shown in the figure below. In this network, processors connected to the busses can
transmit data in parallel (one for each bus) and many processors can read data from many
busses at a time.
A diagram of a system based on a multibus

3.2.2. Crossbar switches

A crossbar switch is a circuit that enables many interconnections between elements of a parallel
system at a time. A crossbar switch has a number of input and output data pins and a number of
control pins. In response to control instructions set to its control input, the crossbar switch
implements a stable connection of a determined input with a determined output. The diagrams of
a typical crossbar switch are shown in the figure below.

Crossbar switch: a) general scheme, b) internal structure

Control instructions can request reading the state of specified input and output pins, i.e. their
current connections in the crossbar switch. Crossbar switches are built with the use of
multiplexer circuits controlled by latch registers, which are set by control instructions. Crossbar
switches implement direct, non-blocking connections: a requested connection succeeds on the
condition that the necessary input and output pins of the switch are free. Connections between
free pins can always be implemented independently of the status of other connections, and
new connections can be set up while data transmissions proceed through other connections.
These non-blocking connections are a big advantage of crossbar switches. Some crossbar
switches enable broadcast transmissions, but in a manner that blocks all other connections.
The disadvantage of crossbar switches is that extending their size, in the sense of the number
of input/output pins, is costly in terms of hardware. Because of that, crossbar switches are built
up to a size of about 100 input/output pins. Crossbar switches that contain hundreds of pins
are implemented using the technique of multistage interconnection networks, discussed in the
next section.
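A minimal model of the non-blocking property can be sketched in Python; the class and method names below are invented for illustration:

```python
# Toy crossbar model: a partial mapping from input pins to output pins.
# A new connection is non-blocking: it succeeds whenever both pins are free,
# regardless of which other connections are already set.

class Crossbar:
    def __init__(self, size):
        self.size = size
        self.out_of = {}               # input pin -> output pin

    def connect(self, inp, out):
        """Set inp -> out if both pins are free; report success."""
        if inp in self.out_of or out in self.out_of.values():
            return False               # pin already in use: request refused
        self.out_of[inp] = out
        return True

xb = Crossbar(4)
print(xb.connect(0, 2))   # True: both pins free
print(xb.connect(1, 3))   # True: independent of the existing 0 -> 2 link
print(xb.connect(2, 2))   # False: output pin 2 is already taken
```

The hardware cost the text mentions shows up here implicitly: a real n x n crossbar needs on the order of n squared crosspoints, which is why large switches move to multistage designs.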
3.2.3. Multistage interconnection networks

Multistage connection networks are designed with the use of small elementary crossbar
switches (usually with two inputs) connected in multiple layers. The elementary crossbar
switches can implement 4 types of connections: straight, crossed, upper broadcast and lower
broadcast. All elementary switches are controlled simultaneously. A network like this is an
alternative to a crossbar switch when a large number of connections, over 100, has to be
switched. The extension cost for such a network is relatively low.

In such networks, there is no full freedom in implementing arbitrary connections when some
connections have already been set in the switch. Because of this property, these networks
belong to the category of so called blocking networks.

However, if we increase the number of levels of elementary crossbar switches above the
number necessary to implement connections for all pairs of inputs and outputs, it is possible to
implement all requested connections at the same time but statically, before any communication
is started in the switch. It can be achieved at the cost of additional redundant hardware included
into the switch. The block diagram of such a network, called the Benes network, is shown in the
figure below.

A multistage connection network for parallel systems

To obtain nonblocking properties in a multistage connection network, the redundancy level in
the circuit has to be much increased. To build a nonblocking multistage n x n network, the
elementary two-input switches have to be replaced by 3 layers of switches of sizes n x m,
r x r and m x n, where m ≥ 2n - 1 and r is the number of switches in layers 1 and 3. Such a
switch was designed by Charles Clos and is called the Clos network. This switch is commonly
used to build large integrated crossbar switches. The block diagram of the Clos network is
shown in the figure below.
A nonblocking Clos interconnection network
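The strict-sense nonblocking condition for a three-stage Clos network can be checked with a one-line sketch; the parameter names follow the text, and the example sizes are made up:

```python
# Clos network check: a three-stage network built from n x m, r x r and
# m x n switches is strictly nonblocking iff m >= 2n - 1 middle-stage
# switches are provided.

def is_nonblocking(n, m):
    """Clos's condition: m middle-stage switches suffice iff m >= 2n - 1."""
    return m >= 2 * n - 1

print(is_nonblocking(n=4, m=7))   # True: 7 >= 2*4 - 1
print(is_nonblocking(n=4, m=6))   # False: one middle-stage switch short
```

The intuition behind m ≥ 2n - 1: in the worst case, n - 1 other inputs of a first-stage switch and n - 1 other outputs of a third-stage switch each occupy a distinct middle switch, so one more middle switch must remain free for the new connection.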
