Vector Processing & GPU Basics
MODULE III
Contents/Syllabus
Vector processing and array processing
CPU v/s GPU
GPU Architecture
Introduction to GPU programming – CUDA
Memory Hierarchy Design
Vector Processor
A vector processor is a central processing
unit that can operate on an entire vector of
input data with a single instruction.
Its hardware applies one instruction to a
sequential set of similar data items in memory.
Architecture & Working
Vectorized Code
Scalar Processing v/s Vector Processing

Scalar processing (loop of 10 iterations):
1. Read the ith instruction and decode
2. Fetch the A[i] element
3. Fetch the B[i] element
4. Add A[i] + B[i]
5. Store the result in C[i]
6. Increment i until i = 10

Vector processing (single instruction):
1. Read instruction and decode
2. Fetch all 10 elements of A[]
3. Fetch all 10 elements of B[]
4. Add A[] + B[]
5. Store the result in C[]
Classification of Vector Processors

Vector processor architectures fall into two classes:
1. Register-to-Register architecture
2. Memory-to-Memory architecture
Register to Register
Architecture
Widely used in vector computers.
Operands and previous results are fetched from
main memory indirectly, through vector registers.
The several vector pipelines in the vector
computer retrieve data from these registers and
store results back into the desired register.
These vector registers are programmable through
user instructions.
Data is fetched from, and results are stored
into, the register whose address appears in the
instruction.
Memory to Memory Architecture
Operands and results are fetched from and stored
to main memory directly, instead of through
registers.
The address of the desired data must be present
in the vector instruction.
This architecture can fetch data in blocks as
large as 512 bits from memory into the pipeline.
Because memory access time is high, the pipelines
of the vector computer require a longer startup
time to initiate a vector instruction.
Graphics Processing Unit
Highlights
What is a GPU?
What is the Difference between a CPU and a GPU?
Why should you use a GPU?
GPU - Introduction
A GPU accelerates applications running on the CPU
by offloading some of the compute-intensive,
time-consuming portions of the code.
The rest of the application still runs on the CPU.
This is known as "heterogeneous" or "hybrid"
computing.
Uses massive Parallel Processing Power
GPU - Introduction
A CPU consists of a few cores (typically two to
eight), while a GPU consists of hundreds of
smaller cores.
Together, they operate to crunch through the data
in the application.
This massively parallel architecture is what gives
the GPU its high compute performance.
CPU V/S GPU
CPU:
- Central Processing Unit
- Several cores
- Low latency
- Good for serial processing
- Can do a handful of operations at once

GPU:
- Graphics Processing Unit
- Many cores
- High throughput
- Good for parallel processing
- Can do thousands of operations at once
Best GPU Manufacturers
CUDA Architecture
CUDA (an acronym for Compute Unified Device
Architecture) is a parallel computing platform and
application programming interface (API) model
created by Nvidia.
It allows software developers and software
engineers to use a CUDA-enabled graphics
processing unit (GPU) for general-purpose
processing, an approach known as GPGPU.
The CUDA platform is designed to work with
programming languages such as C, C++, and Fortran.
This accessibility makes it easier for specialists
in parallel programming to use GPU resources.
CUDA - GPU PROCESS
1. Copy data from main memory to GPU memory
2. CPU initiates the GPU compute kernel
3. GPU's CUDA cores execute the kernel in parallel
4. Copy the resulting data from GPU memory back to main memory
CUDA ARCHITECTURE
The CUDA architecture consists of several
components:
1. Parallel compute engines inside NVIDIA GPUs
2. OS kernel-level support for hardware
initialization, configuration, etc.
3. User-mode driver, which provides a device-level
API for developers
4. PTX instruction set architecture (ISA) for
parallel computing kernels and functions