CUDA
A technology that can make supercomputers personal
Presented by
Kunal Garg
2507276
UIET KU
Kurukshetra, India
SUPERCOMPUTER
A supercomputer is a computer that is at the
frontline of current processing capacity, particularly
speed of calculation.
Supercomputers are used for highly calculation-
intensive tasks.
GPU
A graphics processing unit, or GPU
(also called a visual processing unit,
VPU), is a specialized processor that
offloads 3D or 2D graphics rendering
from the microprocessor.
Used in embedded systems,
mobile phones, personal computers,
workstations, and game consoles
GPU Computing
The excellent floating-point
performance of GPUs led to
the advent of General-Purpose
Computing on GPUs (GPGPU)
GPU computing is the use
of a GPU to do general
purpose scientific and
engineering computing
The model for GPU
computing is to use a CPU
and GPU together in a
heterogeneous computing
model.
Problems in
GPU Programming
Required graphics-oriented languages and APIs
Difficult for users to program general applications for the GPU
CUDA
CUDA is
an acronym for Compute Unified Device Architecture
a parallel computing architecture
the parallel computing engine in NVIDIA GPUs
CUDA
CUDA works with industry-standard C
Write a program for one thread
Instantiate it on many parallel threads
Familiar programming model and language
CUDA is a scalable parallel programming model
Program runs on any number of processors
without recompiling
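The "write a program for one thread" idea can be sketched with a minimal kernel (the `saxpy` name and launch parameters here are illustrative, not from the slides):

```cuda
// Each thread runs the same scalar code; the launch below
// instantiates it on many parallel threads at once.
__global__ void saxpy(int n, float a, const float *x, float *y)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;  // unique global index
    if (i < n)                                      // guard extra threads
        y[i] = a * x[i] + y[i];
}

// Launch: <<<blocks, threadsPerBlock>>> decides how many copies run.
// The same binary scales to any number of processors:
// saxpy<<<(n + 255) / 256, 256>>>(n, 2.0f, d_x, d_y);
```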
Advantages of CUDA
CUDA has the following advantages over
traditional GPGPU using graphics APIs:
Scattered reads
Shared memory
Faster downloads and readbacks to and from the GPU
Full support for integer and bitwise operations
CUDA Programming Model
Parallel code (kernel) is launched and executed
on a device by many threads
Threads are grouped into thread blocks
Parallel code is written for a thread
Each thread is free to execute a unique code path
Built-in thread and block ID variables
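The built-in ID variables and per-thread code paths above can be illustrated with a small sketch (the `whoAmI` kernel is a hypothetical example):

```cuda
// Built-in variables available inside every kernel:
//   threadIdx - thread index within its block
//   blockIdx  - block index within the grid
//   blockDim  - threads per block
//   gridDim   - blocks in the grid
__global__ void whoAmI(int *out)
{
    int gid = blockIdx.x * blockDim.x + threadIdx.x;
    // Each thread is free to follow its own code path:
    if (threadIdx.x == 0)
        out[gid] = -1;      // block leaders take one branch
    else
        out[gid] = gid;     // all other threads take another
}
```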
CUDA Architecture
The CUDA Architecture
Consists of several
components
Parallel compute engines
OS kernel-level support
User-mode driver
ISA
Tesla 10 Series
CUDA Computing with Tesla T10
240 SP processors at 1.45 GHz: 1 TFLOPS peak
30 DP processors at 1.44 GHz: 86 GFLOPS peak
128 threads per processor: 30,720 threads total
Thread Hierarchy
Threads launched for a parallel section
are partitioned into
thread blocks
Grid = all blocks for a given
launch
Thread block is a group of
threads that can
Synchronize their execution
Communicate via shared
memory
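Block-level cooperation through shared memory and barrier synchronization can be sketched as follows (the `blockSum` kernel and the fixed block size of 256 are illustrative assumptions):

```cuda
// Threads in one block cooperate via shared memory and
// __syncthreads(); threads in different blocks cannot.
__global__ void blockSum(const float *in, float *blockResults)
{
    __shared__ float buf[256];              // visible to the whole block
    int tid = threadIdx.x;
    buf[tid] = in[blockIdx.x * blockDim.x + tid];
    __syncthreads();                        // wait for all loads

    // Tree reduction: halve the active threads each step.
    for (int s = blockDim.x / 2; s > 0; s >>= 1) {
        if (tid < s)
            buf[tid] += buf[tid + s];
        __syncthreads();                    // barrier after each step
    }
    if (tid == 0)
        blockResults[blockIdx.x] = buf[0];  // one partial sum per block
}
```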
Execution Model
Warps and Half Warps
Threads execute in lockstep groups of 32 called warps;
a half warp is the first or second 16 threads of a warp
GPU Memory Allocation / Release
Host (CPU) manages device (GPU) memory:
cudaMalloc(void **pointer, size_t nbytes)
cudaMemset(void *pointer, int value, size_t count)
cudaFree(void *pointer)
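A typical host-side allocation and transfer flow, combining the calls above with `cudaMemcpy` (error checking omitted for brevity):

```cuda
// Host (CPU) code managing device (GPU) memory.
int n = 1024;
size_t nbytes = n * sizeof(float);
float *h_a = (float *)malloc(nbytes);   // host buffer
float *d_a = NULL;                      // device buffer

cudaMalloc((void **)&d_a, nbytes);      // allocate on the GPU
cudaMemset(d_a, 0, nbytes);             // zero the device buffer
cudaMemcpy(d_a, h_a, nbytes, cudaMemcpyHostToDevice);  // upload
// ... launch kernels that read/write d_a ...
cudaMemcpy(h_a, d_a, nbytes, cudaMemcpyDeviceToHost);  // readback
cudaFree(d_a);                          // release device memory
free(h_a);
```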
Next Generation CUDA Architecture
The next-generation CUDA architecture, codenamed
Fermi, is the most advanced GPU architecture ever
built. Its features include:
• 512 CUDA cores
• 3.2 billion transistors
• NVIDIA Parallel DataCache Technology
• NVIDIA GigaThread Engine
• ECC support
Applications
Accelerated rendering of 3D graphics
Video Forensics
Molecular Dynamics
Computational Chemistry
Life Sciences
Bioinformatics
Electrodynamics
Medical Imaging
Oil and gas
Weather and Ocean Modeling
Electronic Design Automation
Video Imaging
Video Acceleration
Why should I use a GPU as a Processor?
Compared to the latest quad-core CPU, Tesla 20-series
GPU computing processors deliver equivalent
performance at 1/20th the power consumption and
1/10th the cost
When a computational fluid dynamics problem is
solved, it takes
9 minutes on a Tesla S870 (4 GPUs)
12 hours on one 2.5 GHz CPU core
Double Precision Performance
Intel Core i7 980XE (CPU): 107.6 GFLOPS
AMD Hemlock 5970 (GPU): 928 GFLOPS
NVIDIA Tesla S2050 & S2070 (GPU): 2.1 to 2.5 TFLOPS
Tesla C1060 (GPU): 933 GFLOPS
GeForce 8800 GTX (GPU): 346 GFLOPS
Core 2 Duo E6600 (CPU): 38 GFLOPS
Athlon 64 X2 4600+ (CPU): 19 GFLOPS
After all, it’s your personal supercomputer