MPSoC Architectures
OpenMP
Alberto Bosio, Associate Professor – UM
Microelectronic Department
[email protected]
Introduction to OpenMP
• What is OpenMP?
• Open specification for Multi-Processing
• “Standard” API for defining multi-threaded shared-memory programs
– www.openmp.org – Talks, examples, forums, etc.
• High-level API
• Preprocessor (compiler) directives ( ~ 80% )
• Library calls ( ~ 19% )
• Environment variables ( ~ 1% )
A Programmer’s View of OpenMP
• OpenMP is a portable, threaded, shared-memory programming specification with “light” syntax
• Exact behavior depends on the OpenMP implementation!
• Requires compiler support (C or Fortran)
• OpenMP will:
– Allow a programmer to separate a program into serial regions and parallel regions, rather than T concurrently-executing threads
– Hide stack management
– Provide synchronization constructs
• OpenMP will not:
– Parallelize (or detect!) dependencies
– Guarantee speedup
– Provide freedom from data races
Outline
• Introduction
• Motivating example
– Parallel Programming is Hard
• OpenMP Programming Model
– Easier than PThreads
• Microbenchmark Performance Comparison
– vs. PThreads
• Discussion
– specOMP
Current Parallel Programming
1. Start with a parallel algorithm
2. Implement, keeping in mind:
• Data races
• Synchronization
• Threading Syntax
3. Test & Debug
4. Debug
5. Debug
Motivation – Threading Library
#include <stdio.h>
#include <pthread.h>

void* SayHello(void *foo) {
  printf( "Hello, world!\n" );
  return NULL;
}

int main() {
  pthread_attr_t attr;
  pthread_t threads[16];
  int tn;
  pthread_attr_init(&attr);
  pthread_attr_setscope(&attr, PTHREAD_SCOPE_SYSTEM);
  for(tn=0; tn<16; tn++) {
    pthread_create(&threads[tn], &attr, SayHello, NULL);
  }
  for(tn=0; tn<16; tn++) {
    pthread_join(threads[tn], NULL);
  }
  return 0;
}
Motivation
• Thread libraries are hard to use
– P-Threads/Solaris threads have many library calls
for initialization, synchronization, thread creation,
condition variables, etc.
– Programmer must code with multiple threads in
mind
• Synchronization between threads introduces a
new dimension of program correctness
Motivation
• Wouldn’t it be nice to write serial programs and somehow parallelize them “automatically”?
• OpenMP can parallelize many serial programs with relatively few annotations that specify parallelism and independence
• OpenMP is a small API that hides cumbersome threading calls with simpler directives
Better Parallel Programming
1. Start with some algorithm
• Embarrassing parallelism is helpful, but not
necessary
2. Implement serially, ignoring:
• Data Races
• Synchronization
• Threading Syntax
3. Test and Debug
4. Automatically (magically?) parallelize
• Expect linear speedup
Motivation – OpenMP
int main() {
// Do this part in parallel
printf( "Hello, World!\n" );
return 0;
}
Motivation – OpenMP
#include <stdio.h>
#include <omp.h>

int main() {
  omp_set_num_threads(16);
  // Do this part in parallel
  #pragma omp parallel
  {
    printf( "Hello, World!\n" );
  }
  return 0;
}
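To build and run this example, OpenMP support must be enabled at compile time; a minimal sketch with GCC (the file name is illustrative, other compilers use different flags):
gcc -fopenmp hello_omp.c -o hello_omp
./hello_omp
Alternatively, the thread count can be set with the OMP_NUM_THREADS environment variable instead of calling omp_set_num_threads() in the code.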
OpenMP Parallel Programming
1. Start with a parallelizable algorithm
• Embarrassing parallelism is good, loop-level
parallelism is necessary
2. Implement serially, mostly ignoring:
• Data Races
• Synchronization
• Threading Syntax
3. Test and Debug
4. Annotate the code with parallelization (and
synchronization) directives
• Hope for linear speedup
5. Test and Debug
Programming Model - Threading
• Serial regions by default, annotate to create parallel regions
– Generic parallel regions
– Parallelized loops
– Sectioned parallel regions
• Thread-like Fork/Join model
– Arbitrary number of logical thread creation/destruction events
[Figure: Fork/Join thread model]
Programming Model - Threading
int main() {
  // serial region
  printf("Hello…");
  // parallel region (Fork)
  #pragma omp parallel
  {
    printf("World");
  }
  // serial again (Join)
  printf("!");
}
Output (4 threads): Hello…WorldWorldWorldWorld!
Programming Model – Nested Threading
• Fork/Join can be nested
– Nesting complication handled “automagically” at compile-time
– Independent of the number of threads actually running
[Figure: nested Fork/Join]
Programming Model – Thread Identification
• Master Thread
– Thread with ID=0
– Only thread that exists in sequential regions
– Depending on implementation, may have special purpose inside parallel regions
– Some special directives affect only the master thread (like master)
[Figure: thread 0 forks a team of threads 0–7, which joins back to thread 0]
Example
#include <stdio.h>
#include <omp.h>

int main() {
  int tid, nthreads;
  omp_set_num_threads(16);
  // Do this part in parallel
  #pragma omp parallel private(nthreads, tid)
  {
    printf( "Hello, World!\n" );
    /* Obtain and print thread id */
    tid = omp_get_thread_num();
    if (tid == 0)
    {
      nthreads = omp_get_num_threads();
      printf("I'm the master, Number of threads = %d\n", nthreads);
    }
  }
  return 0;
}
Programming Model – Data/Control Parallelism
• Data parallelism
– Threads perform similar functions, guided by thread identifier
• Control parallelism
– Threads perform differing functions
- One thread for I/O, one for computation, etc…
Programming model: Summary
Memory Model
• Shared memory communication
– Threads cooperate by accessing shared variables
• The sharing is defined syntactically
– Any variable that is seen by two or more threads is shared
– Any variable that is seen by one thread only is private
• Race conditions are possible
– Use synchronization to protect against conflicts
– Change how data is stored to minimize the synchronization
Structure
Programming Model – Concurrent Loops
• OpenMP easily parallelizes loops
– No data dependencies between iterations!
• Preprocessor calculates loop bounds for each thread directly from serial source
#pragma omp parallel for
for( i=0; i < 25; i++ ) {
  printf("Foo");
}
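A slightly more realistic sketch (the function and array names are illustrative): any loop whose iterations touch disjoint data can be annotated the same way.
#include <omp.h>
#define N 1000

void scale(float a[N], float b[N]) {
  int i;
  // Iteration i only reads a[i] and writes b[i]: no dependencies
  // between iterations, so they can be divided among the threads.
  #pragma omp parallel for private(i) shared(a, b)
  for (i = 0; i < N; i++) {
    b[i] = 2.0f * a[i];
  }
}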
The problem
• Executes the same code as many times as there are threads
• How many threads do we have? Set with omp_set_num_threads(n)
• What is the use of repeating the same work n times in parallel? Use omp_get_thread_num() to distribute the work between threads (see the sketch below)
• In the example, D is shared between the threads; i and sum are private
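A minimal sketch of distributing a loop by hand inside a parallel region, following the slide's variable names (the array D, its size, and the summation itself are assumptions):
#include <omp.h>
#define N 1024

double sum_D(const double *D) {
  double total = 0.0;
  #pragma omp parallel shared(D, total)
  {
    int tid = omp_get_thread_num();   // this thread's id
    int nth = omp_get_num_threads();  // size of the team
    double sum = 0.0;                 // private partial sum
    int i;
    // Each thread handles the iterations with i % nth == tid.
    for (i = tid; i < N; i += nth)
      sum += D[i];
    #pragma omp atomic                // atomic update (covered later)
    total += sum;
  }
  return total;
}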
Programming Model – Concurrent Loops
• Load balancing
– If all the iterations execute at the same speed, the processors are used optimally. If some iterations are faster than others, some processors may become idle, reducing the speedup
– We don't always know the distribution of work; we may need to re-distribute it dynamically
• Granularity
– Thread creation and synchronization take time. Assigning work to threads at per-iteration resolution may take more time than the execution itself! The work needs to be coalesced into coarse chunks to overcome the threading overhead
• Trade-off between load balancing and granularity!
Controlling Granularity
• #pragma omp parallel if (expression)
– Can be used to disable parallelization in some cases (when the input is determined to be too small to be beneficially multithreaded)
• #pragma omp parallel num_threads (expression)
– Controls the number of threads used for this parallel region
Programming Model – Loop Scheduling
• The schedule clause determines how loop iterations are divided among the thread team (sketches below)
– static([chunk]) divides iterations statically between threads
- Each thread receives [chunk] iterations, rounding as necessary to account for all iterations
- Default [chunk] is ceil( # iterations / # threads )
– dynamic([chunk]) allocates [chunk] iterations per thread, allocating an additional [chunk] iterations when a thread finishes
- Forms a logical work queue, consisting of all loop iterations
- Default [chunk] is 1
– guided([chunk]) allocates dynamically, but [chunk] is exponentially reduced with each allocation
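Hedged sketches of the three policies on a loop (the loop body work(i) and the bound n are placeholders):
// static: chunks of 4 iterations pre-assigned to the threads round-robin
#pragma omp parallel for schedule(static, 4)
for (int i = 0; i < n; i++) work(i);

// dynamic: each thread grabs 8 iterations at a time from a shared queue
#pragma omp parallel for schedule(dynamic, 8)
for (int i = 0; i < n; i++) work(i);

// guided: like dynamic, but the chunk size shrinks as iterations run out
#pragma omp parallel for schedule(guided)
for (int i = 0; i < n; i++) work(i);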
Example
• The function TestForPrime (usually) takes little time, but it can take long if the number is indeed a prime
• Solution: use dynamic scheduling, but with chunks (sketch below)
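A minimal sketch, assuming TestForPrime(i) is the slide's (not shown) primality test and that start, end, primes[] and count are the surrounding variables:
#pragma omp parallel for schedule(dynamic, 100)
for (int i = start; i <= end; i++) {
  if (TestForPrime(i)) {     // cheap for most composites, slow for primes
    #pragma omp critical     // the result array is shared, so protect the update
    primes[count++] = i;
  }
}
Handing out chunks of 100 iterations keeps the scheduling overhead low while still letting fast threads take over work from slow ones.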
Work sharing: Sections
• The SECTIONS directive is a non-iterative work-sharing construct. It specifies that the enclosed section(s) of code are to be divided among the threads in the team.
• Independent SECTION directives are nested within a SECTIONS directive.
• Each SECTION is executed once by a thread in the team. Different sections may be executed by different threads. It is possible for a thread to execute more than one section if it is quick enough and the implementation permits it.
Example
#include <omp.h>
#define N 1000
int main ()
{
  int i;
  float a[N], b[N], c[N], d[N];
  /* Some initializations */
  for (i=0; i < N; i++) {
    a[i] = i * 1.5;
    b[i] = i + 22.35;
  }
Example (cont.)
#pragma omp parallel shared(a,b,c,d) private(i)
{
#pragma omp sections
{
#pragma omp section
for (i=0; i < N; i++)
c[i] = a[i] + b[i];
#pragma omp section
for (i=0; i < N; i++)
d[i] = a[i] * b[i];
} /* end of sections */
} /* end of parallel section */
}
Data Sharing
• Shared Memory programming model
– Most variables are shared by default
– We can define a variable as private
// Do this part in parallel
#pragma omp parallel private(nthreads, tid)
{
  printf( "Hello, World!\n" );
  if (tid == 0)
  {
    ….
  }
}
Programming Model – Data Sharing
• Parallel programs often employ two types of data
– Shared data, visible to all threads, similarly named
– Private data, visible to a single thread (often stack-allocated)
• PThreads:
– Global-scoped variables are shared
– Stack-allocated variables are private
• OpenMP:
– shared variables are shared
– private variables are private
int bigdata[1024];
void* foo(void* bar) {
  int tid;
  #pragma omp parallel \
      shared ( bigdata ) \
      private ( tid )
  {
    /* Calc. here */
  }
}
Programming Model – Data Sharing
• private:
– A copy of the variable is created for each thread
– No connection between the original variable and the private copies
– Can achieve the same using variables declared inside { }
int i;
#pragma omp parallel for private(i)
for (i=0; i<n; i++) { ... }
Programming Model – Data Sharing
• firstprivate:
– Same as private, but the initial value is copied from the main copy
• lastprivate:
– Same as private, but the last value is copied back to the main copy (sketch below)
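A minimal sketch of both clauses (the variable names are illustrative):
int x = 10, last = -1;
#pragma omp parallel for firstprivate(x) lastprivate(last)
for (int i = 0; i < 100; i++) {
  // each thread starts with its own copy of x initialized to 10 (firstprivate)
  int y = x + i;
  // after the loop, 'last' holds the value written by the sequentially
  // last iteration, i == 99 (lastprivate)
  last = y;
}
// here: x is still 10, last == 109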
Thread private
• Similar to private, but defined per variable
• Declaration immediately after the variable definition; must be visible in all translation units
• Persistent between parallel sections
• Can be initialized from the master's copy with #pragma omp copyin
• More efficient than private, but it is a global variable! (sketch below)
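A minimal sketch of threadprivate with copyin (the counter name is illustrative):
#include <stdio.h>
#include <omp.h>

int counter = 0;                    // one global declaration, but...
#pragma omp threadprivate(counter)  // ...each thread keeps its own persistent copy

int main() {
  counter = 42;                     // set the master's copy
  // copyin initializes every thread's copy from the master's value
  #pragma omp parallel copyin(counter)
  {
    counter += omp_get_thread_num();  // updates this thread's private copy only
    printf("thread %d: counter = %d\n", omp_get_thread_num(), counter);
  }
  return 0;
}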
Synchronization
• What should the result be (assuming 2 threads)?
X=0;
#pragma omp parallel
X = X+1;
Synchronization
• 2 is the expected answer, but it can be 1 with unfortunate interleaving
• OpenMP assumes that the programmer knows what he is doing
• Regions of code that are marked to run in parallel are independent; if access collisions are possible, it is the programmer's responsibility to insert protection
Synchronization
• Many of the existing mechanisms for shared-memory programming are available
• OpenMP synchronization constructs
– nowait (turn synchronization off!)
– Single/Master execution
– Critical sections, Atomic updates
– Ordered
– Barriers
– Flush (memory subsystem synchronization)
– Reduction (special case)
Single/Master
• #pragma omp single
– Only one of the threads will execute the following block of code
– The rest will wait for it to complete
– Good for non-thread-safe regions of code (such as I/O)
– Must be used in a parallel region
– Applicable to parallel for sections
Single/Master
• #pragma omp master
– The following block will be executed by the master thread
– No synchronization involved
– Applicable only to parallel sections
#pragma omp parallel
{
  do_preprocessing();
  #pragma omp single
  read_input();
  #pragma omp master
  notify_input_consumed();
  do_processing();
}
Critical Sections
• #pragma omp critical [name]
– Standard critical section functionality
• Critical sections are global in the program
– Can be used to protect a single resource in different functions
• Critical sections are identified by the name
– All the unnamed critical sections are mutually exclusive throughout the program
– All the critical sections having the same name are mutually exclusive with each other (named example below)
Critical Sections
int x=0;
#pragma omp parallel shared(x)
{
#pragma omp critical
x++;
}
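A sketch of named critical sections (the function and counter names are illustrative): two different functions can protect the same logical resource by using the same name.
int hits = 0, misses = 0;

void record_hit(void) {
  #pragma omp critical (stats)   // same name => mutually exclusive with record_miss
  hits++;
}

void record_miss(void) {
  #pragma omp critical (stats)
  misses++;
}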
Ordered
• #pragma omp ordered statement
– Executes the statement in the sequential order of iterations
• Example:
#pragma omp parallel for ordered
for (j=0; j<N; j++) {
  int result = j*j;
  #pragma omp ordered
  printf("computation(%d) = %d\n", j, result);
}
Barrier synchronization
• #pragma omp barrier
– Performs a barrier synchronization between all the threads in a team at the given point
• Example:
#pragma omp parallel
{
  int result = heavy_computation_part1();
  #pragma omp atomic
  sum += result;
  #pragma omp barrier
  heavy_computation_part2(sum);
}
Explicit Locking
• Can be used to pass lock variables around (unlike critical sections!)
• Can be used to implement more involved synchronization constructs
• Functions:
– omp_init_lock(), omp_destroy_lock(), omp_set_lock(), omp_unset_lock(), omp_test_lock()
– The usual semantics (sketch below)
• Use #pragma omp flush to synchronize memory
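A minimal sketch of the lock API guarding a shared counter (the names are illustrative):
#include <stdio.h>
#include <omp.h>

int main() {
  omp_lock_t lock;
  int counter = 0;

  omp_init_lock(&lock);           // create the lock
  #pragma omp parallel shared(counter, lock)
  {
    omp_set_lock(&lock);          // acquire (blocks until the lock is free)
    counter++;                    // protected update of the shared counter
    omp_unset_lock(&lock);        // release
  }
  omp_destroy_lock(&lock);        // free the lock's resources
  printf("counter = %d\n", counter);
  return 0;
}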
Reduction
for (j=0; j<N; j++) {
  sum = sum + a[j]*b[j];
}
• How to parallelize this code?
– sum is not private, but accessing it atomically is too expensive
– Have a private copy of sum in each thread, then add them up
• Use the reduction clause!
– #pragma omp parallel for reduction(+: sum)
– An operator must be used: +, -, *, ... (full sketch below)
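A complete sketch of this dot product with the reduction clause (the initialization values are illustrative):
#include <stdio.h>
#define N 1000

int main() {
  double a[N], b[N], sum = 0.0;
  int j;
  for (j = 0; j < N; j++) { a[j] = j; b[j] = 2.0; }  // illustrative init

  // Each thread accumulates into a private sum initialized to 0;
  // the private copies are combined with '+' at the end of the loop.
  #pragma omp parallel for reduction(+:sum)
  for (j = 0; j < N; j++)
    sum += a[j] * b[j];

  printf("sum = %f\n", sum);
  return 0;
}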
Synchronization Overhead
• Lost time waiting for locks
• Prefer to use structures that are as lock-free as possible!
Summary
• OpenMP is a compiler-based technique to create concurrent code from (mostly) serial code
• OpenMP can enable (easy) parallelization of loop-based code
– Lightweight syntactic language extensions
• OpenMP performs comparably to manually-coded threading
– Scalable
– Portable
• Not a silver bullet for all applications
More Information
• www.openmp.org
– OpenMP official site
• www.llnl.gov/computing/tutorials/openMP/
– A handy OpenMP tutorial
• www.nersc.gov/nusers/help/tutorials/openmp/
– Another OpenMP tutorial and reference
Backup Slides
Syntax, etc
OpenMP Syntax
• General syntax for OpenMP directives
#pragma omp directive [clause…] CR
• Directive specifies type of OpenMP operation
– Parallelization
– Synchronization
– Etc.
• Clauses (optional) modify semantics of the Directive
OpenMP Syntax
• PARALLEL syntax
#pragma omp parallel [clause…] CR
structured_block
Ex:
#pragma omp parallel
{
  printf("Hello!\n");
} // implicit barrier
Output (T=4):
Hello!
Hello!
Hello!
Hello!
OpenMP Syntax
• DO/for syntax (DO – Fortran, for – C)
#pragma omp for [clause…] CR
for_loop
Ex:
#pragma omp parallel
{
  #pragma omp for private(i) shared(x) \
      schedule(static,x/N)
  for(i=0;i<x;i++) printf("Hello!\n");
} // implicit barrier
Note: Must reside inside a parallel section
OpenMP Syntax
More on Clauses
• private() – A variable in the private list is private to each thread
• shared() – Variables in the shared list are visible to all threads
– Implies no synchronization, or even consistency!
• schedule() – Determines how iterations will be divided among threads
– schedule(static, C) – Each thread will be given C iterations
- Usually T*C = Number of total iterations
– schedule(dynamic) – Each thread will be given additional iterations as-needed
- Often less efficient than a carefully considered static allocation
• nowait – Removes the implicit barrier from the end of the block (sketch below)
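A minimal sketch of nowait (the arrays, bounds and the functions f and g are placeholders); it is safe only because the second loop does not depend on the first:
#pragma omp parallel shared(a, b)
{
  // No implicit barrier at the end of this loop...
  #pragma omp for nowait
  for (int i = 0; i < n; i++)
    a[i] = f(a[i]);

  // ...so a thread that finishes its share early starts here immediately.
  #pragma omp for
  for (int j = 0; j < m; j++)
    b[j] = g(b[j]);
}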
OpenMP Syntax
• PARALLEL FOR (combines parallel and for)
#pragma omp parallel for [clause…] CR
for_loop
Ex:
#pragma omp parallel for shared(x) \
    private(i) \
    schedule(dynamic)
for(i=0;i<x;i++) {
  printf("Hello!\n");
}
Example: AddMatrix
Files:
(Makefile)
addmatrix.c // omp-parallelized
matrixmain.c // non-omp
printmatrix.c // non-omp
OpenMP Syntax
• ATOMIC syntax
#pragma omp atomic CR
simple_statement
Ex:
#pragma omp parallel shared(x)
{
  #pragma omp atomic
  x++;
} // implicit barrier
OpenMP Syntax
• CRITICAL syntax
#pragma omp critical CR
structured_block
Ex:
#pragma omp parallel shared(x)
{
#pragma omp critical
{
// only one thread in here
}
} // implicit barrier
OpenMP Syntax
ATOMIC vs. CRITICAL
• Use ATOMIC for “simple statements”
– Can have lower overhead than CRITICAL if HW atomics are leveraged (implementation dep.)
• Use CRITICAL for larger expressions
– May involve an unseen implicit lock
OpenMP Syntax
• MASTER – only Thread 0 executes a block
#pragma omp master CR
structured_block
• SINGLE – only one thread executes a block
#pragma omp single CR
structured_block
• No implied synchronization
OpenMP Syntax
• BARRIER
#pragma omp barrier CR
• Locks
– Locks are provided through omp.h library calls
– omp_init_lock()
– omp_destroy_lock()
– omp_test_lock()
– omp_set_lock()
– omp_unset_lock()
OpenMP Syntax
• FLUSH
#pragma omp flush CR
• Guarantees that threads’ views of memory are consistent
• Why? Recall that with OpenMP directives…
– Code is generated by the directives at compile-time
– Variables are not always declared as volatile
– Keeping variables in registers instead of memory can look like a consistency violation
• Synchronization often has an implicit flush
– ATOMIC, CRITICAL
OpenMP Syntax
• Functions
omp_set_num_threads()
omp_get_num_threads()
omp_get_max_threads()
omp_get_num_procs()
omp_get_thread_num()
omp_set_dynamic()
omp_[init|destroy|test|set|unset]_lock()
Functions for the runtime environment
• omp_set_dynamic(int)
• omp_set_num_threads(int)
• omp_get_num_threads()
• omp_get_num_procs()
• omp_get_thread_num()
• omp_set_nested(int)
• omp_in_parallel()
• omp_get_wtime()
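A minimal sketch using a few of these runtime calls, including omp_get_wtime() for timing (the workload inside the region is a placeholder):
#include <stdio.h>
#include <omp.h>

int main() {
  printf("procs available: %d\n", omp_get_num_procs());
  omp_set_num_threads(4);              // request a team of 4 threads

  double t0 = omp_get_wtime();         // wall-clock time in seconds
  #pragma omp parallel
  {
    if (omp_get_thread_num() == 0)
      printf("team size: %d\n", omp_get_num_threads());
    // ... placeholder work ...
  }
  double t1 = omp_get_wtime();

  printf("parallel region took %f s\n", t1 - t0);
  return 0;
}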