Computer Architecture Unit 9
variation in compiler vectorisation level has been noted by various studies of
the performance of applications on vector processors. The hand-optimised
versions generally show significant gains in vectorisation level for codes
that the compiler could not vectorise well by itself, with all codes now
above 50% vectorisation. Interestingly, in some cases the faster code
produced by the Cray programmers had a lower vectorisation level. The
vectorisation level is not sufficient by itself to determine performance.
Alternative vectorisation techniques might execute fewer instructions,
keep more values in vector registers, or allow greater chaining and
overlap among vector operations, and thus improve performance even if the
vectorisation level stays the same or drops.
For instance, BDNA has approximately the same vectorisation level in the
two versions, but the hand-optimised code is more than 50% faster.
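The point that vectorisation level alone does not fix performance can be
illustrated with a simple Amdahl's-law model. This is only a rough sketch:
the function name overall_speedup and all numeric values are hypothetical,
not measurements from the benchmarks discussed here, and the model treats
the vectorisation level as the fraction of scalar run time that is
vectorised, which is a simplification.

```python
def overall_speedup(vector_fraction, vector_speedup):
    """Amdahl's-law estimate of speedup over purely scalar execution.

    vector_fraction: fraction of the scalar run time that is vectorised (0..1).
    vector_speedup:  factor by which the vectorised part runs faster (> 0).
    """
    if not 0.0 <= vector_fraction <= 1.0:
        raise ValueError("vector_fraction must lie between 0 and 1")
    if vector_speedup <= 0:
        raise ValueError("vector_speedup must be positive")
    scalar_part = 1.0 - vector_fraction
    vector_part = vector_fraction / vector_speedup
    return 1.0 / (scalar_part + vector_part)


# Hypothetical values chosen only for illustration.
compiler_version = overall_speedup(0.90, 4.0)    # same level, weaker vector code
hand_version = overall_speedup(0.90, 10.0)       # same level, better chaining/overlap
lower_level_version = overall_speedup(0.85, 10.0)  # lower level, still faster

print(f"compiler: {compiler_version:.2f}x")
print(f"hand-optimised, same level: {hand_version:.2f}x")
print(f"hand-optimised, lower level: {lower_level_version:.2f}x")
```

In this model, two versions with the same vectorisation level differ in
speed because the vectorised part itself runs faster in one of them, and a
version with a lower level can still win if its vector code is efficient
enough.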
There is also wide variation in how well different compilers vectorise
programs. As a summary of the state of vectorising compilers, consider
the data in Figure 9.5, which shows the extent of vectorisation achieved
for different processors on a test suite of 100 handwritten FORTRAN
kernels.
Figure 9.5: Result of applying Vectorising Compilers to the 100 FORTRAN
Test Kernels
The kernels were designed to test vectorisation capability, and all of
them can be vectorised by hand.
Manipal University of Jaipur B1648 Page No. 211
Self Assessment Questions
10. List two factors which enable a program to run successfully in vector
mode.
11. There is no variation in the ability of compilers to determine
whether a loop can be vectorised. (True/False)
Activity 2:
Visit your local computer vendor and get an expert opinion about vector
processors and their working.
9.6 Summary
There are several representative application areas where vector processing
is of the utmost importance. Depending on how the operands are fetched,
vector processors can be divided into two groups:
Memory-memory vector architecture: operands are streamed directly from
memory to the functional units, and results are written back to memory
as the vector operation proceeds.
Vector-register architecture: operands are read into vector registers,
from which they are fed to the functional units, and the results of
operations are written back to vector registers.
Vector register architectures have several advantages over vector
memory-memory architectures.
The vector unit of a register-register vector machine has several major
components, such as vector registers, vector functional units and scalar
registers.
The various types of vector instructions for a register-register vector
processor are:
Vector-scalar Instructions
Vector-vector Instructions
Vector-memory Instructions
Gather and Scatter Instructions
Masking Instructions
Vector Reduction Instructions
CRAY-1 is one of the earliest processors that implemented vector
processing.
Two issues arise in real programs: (i) the vector length in a program is
often not exactly 64; (ii) the elements of a vector may not be adjacent
in memory (vector stride).
The structure of the program and the capability of the compiler are two
factors that affect how successfully a program can run in vector mode.
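The two issues noted in the summary, vector length and vector stride, can
be sketched in a few lines of Python. This is a minimal illustration
rather than a model of any particular machine: the names MVL, strips,
load_strided and daxpy are hypothetical, and MVL = 64 is assumed to match
the 64-element strips used in this unit.

```python
MVL = 64  # assumed maximum vector length (elements per vector register)


def strips(n, mvl=MVL):
    """Yield (start, length) pairs that cover n elements in strips of at most mvl."""
    for start in range(0, n, mvl):
        yield start, min(mvl, n - start)


def load_strided(memory, base, stride, length):
    """Model a vector load of `length` elements spaced `stride` words apart."""
    return [memory[base + i * stride] for i in range(length)]


def daxpy(a, x, y):
    """Compute a*x + y one strip at a time, as a strip-mined loop would."""
    result = list(y)
    for start, length in strips(len(x)):
        vx = x[start:start + length]  # stands in for a vector load of x
        vy = y[start:start + length]  # stands in for a vector load of y
        result[start:start + length] = [a * xi + yi for xi, yi in zip(vx, vy)]
    return result


# A 150-element vector needs two full strips and one short strip.
print(list(strips(150)))

# Column 1 of a 4x4 matrix stored row by row: stride 4, starting at word 1.
matrix_memory = list(range(16))
print(load_strided(matrix_memory, base=1, stride=4, length=4))
```

Here the last strip is the short one; a real compiler may place the
odd-sized strip first instead, which does not change the result.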
9.7 Glossary
ASC: Advanced Scientific Computer
Data hazards: the conflicts in register accesses.
ETA-10: A later shared-memory multiprocessor version of the CDC
Cyber 205.
Functional hazards: the conflicts in functional units.
Gather: an operation that fetches the non-zero elements of a sparse
vector from memory.
Masking instructions: These instructions use a mask vector to expand
or compress a vector.
Scatter: an operation that stores the elements of a vector into a
sparse vector in memory.
SECDED: single-error correction, double-error detection.
Small-scale integration: a technology that packs 10 to 20 transistors
on a single chip.
Strip mining: the partitioning of a vector into strips of 64 elements.
Vector reduction instructions: These instructions accept one or two
vectors as input and produce a scalar as output.
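The gather, scatter, masking and reduction entries above can be made
concrete with a small Python sketch. It is illustrative only: the function
names and the use of an explicit index vector are assumptions made for
this example, not the instruction set of any specific processor.

```python
def gather(memory, indices):
    """Fetch the elements at the given indices into a dense vector."""
    return [memory[i] for i in indices]


def scatter(memory, indices, values):
    """Store a dense vector back into memory at the given indices."""
    for i, v in zip(indices, values):
        memory[i] = v


def compress(vector, mask):
    """Keep only the elements whose mask bit is set."""
    return [v for v, m in zip(vector, mask) if m]


def expand(packed, mask, fill=0):
    """Spread packed elements to the positions where the mask bit is set."""
    it = iter(packed)
    return [next(it) if m else fill for m in mask]


def reduce_sum(vector):
    """Vector reduction: one vector in, one scalar out."""
    total = 0
    for v in vector:
        total += v
    return total


sparse = [0, 7, 0, 0, 3, 0, 9, 0]
indices = [i for i, v in enumerate(sparse) if v != 0]  # index vector
dense = gather(sparse, indices)       # non-zero elements only
doubled = [2 * v for v in dense]      # work on the dense vector
scatter(sparse, indices, doubled)     # write results back in place

mask = [v != 0 for v in sparse]
packed = compress(sparse, mask)
restored = expand(packed, mask)
total = reduce_sum(doubled)
```

Operating on the gathered dense vector means the arithmetic touches only
the non-zero elements, and the scatter puts the results back in their
original positions.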
9.8 Terminal Questions
1. Explain the importance of Vector Processors.
2. What are the different types of Vector Processing?
3. What advantages does the vector-register architecture have over the
memory-memory vector architecture?
4. Write short notes on:
a) CDC Cyber 200 model 205 computer overview
b) CRAY-1
c) Vector Length
d) Vector Stride
5. List the various functional units of a vector processor and explain
each one in brief.
6. Explain the various types of vector instructions in detail.
7. How effective are compilers at vectorising programs for vector processors?
9.9 Answers
Self Assessment Questions
1. Vector processors
2. Data parallelism
3. ETA-10
4. True
5. Crossbars
6. Vector-memory instructions
7. False
8. Strip mining
9. Sequential words
10. Structure of the program & capability of the compiler
11. False
Terminal Questions
1. There are various application areas of vector processors which are of
considerable importance. Refer Section 9.2.
2. Depending upon the way the operands are fetched, vector processors
can be segregated into two groups: Memory-memory vector architecture
and Vector-register architecture. Refer Section 9.3.
3. Because they can overlap memory accesses and can reuse vector
registers, vector-register processors are normally more efficient than
memory-memory vector processors. Refer Section 9.3.
4. a. The CDC Cyber 205 is based on the concepts initiated for the CDC
Star 100; the first commercial model was produced in 1981. Refer
Section 9.4.
b. CRAY-1 is one of the earliest processors that implemented vector
processing. Refer Section 9.5.
c. The vector length in a program may be smaller or larger than the
vector register length. Refer Section 9.6.
d. Since vectors are one-dimensional arrays, storing a vector in memory
is straightforward: vector elements are stored as sequential words in
memory. Refer Section 9.6.
5. The major components of the vector unit of a register-register vector
machine are Vector Registers, Vector Functional Units, Scalar Registers
etc. Refer Section 9.5.
6. The various types of vector instructions for a register-register vector
processor are: (Refer Section 9.5.)
a. Vector-scalar Instructions
b. Vector-vector Instructions
c. Vector-memory Instructions
d. Gather and Scatter Instructions
e. Masking Instructions
f. Vector Reduction Instructions
7. As an indication of the vectorisation level that can be achieved in
scientific programs, consider the vectorisation levels observed for the
Perfect Club benchmarks. Refer Section 9.7.