The Portable Extensible Toolkit for Scientific Computing
Matthew Knepley
Mathematics and Computer Science Division
Argonne National Laboratory
Computation Institute
University of Chicago
PETSc Tutorial
Groupe Calcul, CNRS
University Paris-Sud 11
Orsay, France
June 11-13, 2013
Main Point
Never believe anything,
unless you can run it.
The PETSc Team
Bill Gropp
Barry Smith
Satish Balay
Jed Brown
Matt Knepley
Lisandro Dalcin
Hong Zhang
Mark Adams
Peter Brune
Timeline
(Figure: development timeline from 1991 to 2010, marking PETSc-1, MPI-1 (1995), PETSc-2, MPI-2 (2000), and PETSc-3, and showing when developers joined: Barry, Bill, Lois, Satish, Dinesh, Hong, Kris, Matt, Victor, Dmitry, Lisandro, Jed, Shri, Peter.)
What I Need From You
Tell me if you do not understand
Tell me if an example does not work
Suggest better wording or figures
Follow up with problems at [email protected]
Ask Questions!!!
Helps me understand what you are missing
Helps you clarify misunderstandings
Helps others with the same question
How We Can Help at the Tutorial
Point out relevant documentation
Quickly answer questions
Help install
Guide design of large scale codes
Answer email at [email protected]
Outline
DM
Structured Meshes (DMDA)
Unstructured Meshes (DMPlex)
Managing Discretized Data
Advanced Solvers
DM Interface
Allocation
DMCreateGlobalVector(DM, Vec *)
DMCreateLocalVector(DM, Vec *)
DMCreateMatrix(DM, MatType, Mat *)
Mapping
DMGlobalToLocalBegin/End(DM, Vec, InsertMode, Vec)
DMLocalToGlobalBegin/End(DM, Vec, InsertMode, Vec)
DMGetLocalToGlobalMapping(DM, IS *)
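As a minimal sketch of the allocation interface, assuming dm is an already configured DM (error checking elided):

  Vec u, r;
  Mat J;

  DMCreateGlobalVector(dm, &u);    /* parallel vector laid out by the DM */
  VecDuplicate(u, &r);
  DMCreateMatrix(dm, MATAIJ, &J);  /* matrix preallocated from the DM's layout */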
DM Interface
Geometry
DMGetCoordinateDM(DM, DM *)
DMGetCoordinates(DM, Vec *)
DMGetCoordinatesLocal(DM, Vec *)
Layout
DMGetDefaultSection(DM, PetscSection *)
DMGetDefaultGlobalSection(DM, PetscSection *)
DMGetDefaultSF(DM, PetscSF *)
DM Interface
Hierarchy
DMRefine(DM, MPI_Comm, DM *)
DMCoarsen(DM, MPI_Comm, DM *)
DMGetSubDM(DM, MPI_Comm, DM *)
Intergrid transfer
DMGetInterpolation(DM, DM, Mat *, Vec *)
DMGetAggregates(DM, DM, Mat *)
DMGetInjection(DM, DM, VecScatter *)
Multigrid Paradigm
The DM interface uses the local callback functions to
assemble global functions/operators from local pieces
assemble functions/operators on coarse grids
Then PCMG organizes
control flow for the multilevel solve, and
projection and smoothing operators at each level.
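As a sketch of the paradigm, assuming a DM with local callbacks has already been set up as above, the user never assembles coarse operators by hand; attaching the DM to the solver and selecting multigrid at runtime is enough:

  SNES snes;

  SNESCreate(PETSC_COMM_WORLD, &snes);
  SNESSetDM(snes, dm);        /* DM supplies coarsening, interpolation, and assembly callbacks */
  SNESSetFromOptions(snes);   /* e.g. run with -pc_type mg -da_refine 3 */
  SNESSolve(snes, NULL, u);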
Structured Meshes (DMDA)
What is a DMDA?
DMDA is a topology interface on structured grids
Handles parallel data layout
Handles local and global indices: DMDAGetGlobalIndices() and DMDAGetAO()
Provides local and global vectors: DMGetGlobalVector() and DMGetLocalVector()
Handles ghost value coherence: DMGlobalToLocalBegin/End() and DMLocalToGlobalBegin/End()
Residual Evaluation
The DM interface is based upon local callback functions
FormFunctionLocal()
FormJacobianLocal()
Callbacks are registered using
SNESSetDM(), TSSetDM()
DMSNESSetFunctionLocal(), DMTSSetJacobianLocal()
When PETSc needs to evaluate the nonlinear residual F(x),
Each process evaluates the local residual
PETSc assembles the global residual automatically
Uses DMLocalToGlobal() method
Ghost Values
To evaluate a local function f (x), each process requires
its local portion of the vector x
its ghost values, bordering portions of x owned by neighboring
processes
(Figure: a process's local patch, distinguishing local nodes from ghost nodes.)
DMDA Global Numberings
(Figure: a structured grid of 30 vertices split across 4 processes, shown with the natural (application) numbering and with the PETSc numbering, in which each process owns a contiguous range of indices.)
DMDA Global vs. Local Numbering
Global: each vertex has a unique id and belongs to a unique process
Local: Numbering includes vertices from neighboring processes
These are called ghost vertices
(Figure: the local numbering on each process, which includes ghost vertices marked X, compared with the global PETSc numbering.)
DMDA Local Function
User provided function calculates the nonlinear residual (in 2D)
(*lfunc)(DMDALocalInfo *info, PetscScalar **x, PetscScalar **r, void *ctx)
info: All layout and numbering information
x: The current solution (a multidimensional array)
r: The residual
ctx: The user context passed to DMDASNESSetFunctionLocal()
The local DMDA function is activated by calling
DMDASNESSetFunctionLocal(dm, INSERT_VALUES, lfunc, &ctx)
Bratu Residual Evaluation
-Δu - λe^u = 0
ResLocal(DMDALocalInfo *info, PetscScalar **x, PetscScalar **f, void *ctx)
for(j = info->ys; j < info->ys+info->ym; ++j) {
for(i = info->xs; i < info->xs+info->xm; ++i) {
u = x[j][i];
if (i==0 || j==0 || i == M || j == N) {
f[j][i] = u; continue;
}
u_xx
= (2.0*u - x[j][i-1] - x[j][i+1])*hydhx;
u_yy
= (2.0*u - x[j-1][i] - x[j+1][i])*hxdhy;
f[j][i] = u_xx + u_yy - hx*hy*lambda*exp(u);
}}}
$PETCS_DIR/src/snes/examples/tutorials/ex5.c
DMDA Local Jacobian
User provided function calculates the Jacobian (in 2D)
(*ljac)(DMDALocalInfo *info, PetscScalar **x, Mat J, void *ctx)
info: All layout and numbering information
x: The current solution
J: The Jacobian
ctx: The user context passed to DMDASNESSetJacobianLocal()
The local DMDA function is activated by calling
DMDASNESSetJacobianLocal(dm, ljac, &ctx)
Bratu Jacobian Evaluation
JacLocal(DMDALocalInfo *info, PetscScalar **x, Mat jac, void *ctx)
{
  MatStencil  row, col[5];
  PetscScalar v[5];
  PetscInt    i, j;

  for (j = info->ys; j < info->ys + info->ym; j++) {
    for (i = info->xs; i < info->xs + info->xm; i++) {
      row.j = j; row.i = i;
      if (i == 0 || j == 0 || i == info->mx-1 || j == info->my-1) {
        v[0] = 1.0;
        MatSetValuesStencil(jac, 1, &row, 1, &row, v, INSERT_VALUES);
      } else {
        v[0] = -(hx/hy); col[0].j = j-1; col[0].i = i;
        v[1] = -(hy/hx); col[1].j = j;   col[1].i = i-1;
        v[2] = 2.0*(hy/hx + hx/hy) - hx*hy*lambda*PetscExpScalar(x[j][i]);
                         col[2].j = j;   col[2].i = i;
        v[3] = -(hy/hx); col[3].j = j;   col[3].i = i+1;
        v[4] = -(hx/hy); col[4].j = j+1; col[4].i = i;
        MatSetValuesStencil(jac, 1, &row, 5, col, v, INSERT_VALUES);
      }
    }
  }
}

$PETSC_DIR/src/snes/examples/tutorials/ex5.c
DMDA Vectors
The DMDA object contains only layout (topology) information
All field data is contained in PETSc Vecs
Global vectors are parallel
Each process stores a unique local portion
DMCreateGlobalVector(DM da, Vec *gvec)
Local vectors are sequential (and usually temporary)
Each process stores its local portion plus ghost values
DMCreateLocalVector(DM da, Vec *lvec)
includes ghost and boundary values!
Updating Ghosts
Two-step process enables overlapping
computation and communication
DMGlobalToLocalBegin(da, gvec, mode, lvec)
gvec provides the data
mode is either INSERT_VALUES or ADD_VALUES
lvec holds the local and ghost values
DMGlobalToLocalEnd(da, gvec, mode, lvec)
Finishes the communication
The process can be reversed with DMLocalToGlobalBegin/End().
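A minimal sketch of the two-step update, assuming da, gvec, and lvec were created as above:

  /* Start communication, overlap it with local work, then finish */
  DMGlobalToLocalBegin(da, gvec, INSERT_VALUES, lvec);
  /* ... computation that does not need ghost values ... */
  DMGlobalToLocalEnd(da, gvec, INSERT_VALUES, lvec);
  /* lvec now holds the owned values plus the ghost values */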
DMDA Stencils
Both the box stencil and star stencil are available.
(Figure: the box stencil and the star stencil, each shown on a grid split between two processes.)
Setting Values on Regular Grids
PETSc provides
MatSetValuesStencil(Mat A, m, MatStencil idxm[], n, MatStencil idxn[], PetscScalar values[], InsertMode mode)
Each row or column is actually a MatStencil
This specifies grid coordinates and a component if necessary
For unstructured grids, one can think of them as vertices
The values form a logically dense block indexed by the rows and columns
Creating a DMDA
DMDACreate2d(comm, bdX, bdY, type, M, N, m, n, dof, s, lm[], ln[], DM *da)
bdX/bdY: Specifies boundary behavior
  DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_GHOSTED, or DMDA_BOUNDARY_PERIODIC
type: Specifies the stencil
  DMDA_STENCIL_BOX or DMDA_STENCIL_STAR
M/N: Number of grid points in the x/y-direction
m/n: Number of processes in the x/y-direction
dof: Degrees of freedom per node
s: The stencil width
lm/ln: Optional arrays of local sizes
  Use PETSC_NULL for the default
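For example, a sketch creating a 10x10 grid with one unknown per node and a star stencil of width 1, letting PETSc choose the process decomposition (the enum names follow this slide; newer releases spell them differently):

  DM da;

  DMDACreate2d(PETSC_COMM_WORLD,
               DMDA_BOUNDARY_NONE, DMDA_BOUNDARY_NONE,  /* no special boundary handling */
               DMDA_STENCIL_STAR,                       /* 5-point stencil */
               10, 10,                                  /* global grid size M x N */
               PETSC_DECIDE, PETSC_DECIDE,              /* process grid m x n */
               1,                                       /* dof per node */
               1,                                       /* stencil width s */
               PETSC_NULL, PETSC_NULL,                  /* default local sizes */
               &da);
  DMSetFromOptions(da);                                 /* honor -da_grid_x, -da_grid_y, ... */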
Viewing the DA
We use SNES ex5
ex5 -dm_view
Shows both the DA and coordinate DA:
ex5 -dm_view draw -draw_pause -1
ex5 -da_grid_x 10 -da_grid_y 10 -dm_view draw -draw_pause -1
${PETSC_ARCH}/bin/mpiexec -n 4 ex5 -da_grid_x 10 -da_grid_y 10 -dm_view draw -draw_pause -1
Shows PETSc numbering
DA Operators
Evaluate only the local portion
No nice local array form without copies
Use MatSetValuesStencil() to convert (i,j,k) to indices
Also use SNES ex48
mpiexec -n 2 ./ex5 -da_grid_x 10 -da_grid_y 10 -mat_view draw -draw_pause -1
mpiexec -n 3 ./ex48 -mat_view draw -draw_pause 1 -da_refine 3 -mat_type aij
Unstructured Meshes (DMPlex)
Problem
Traditional PDE codes cannot:
Compare different discretizations
Different orders, finite elements
finite volume vs. finite element
Compare different mesh types
Simplicial, hexahedral, polyhedral
Run 1D, 2D, and 3D problems
Enable an optimal solver
Fields, auxiliary operators
Why?
Impedance Mismatch in the Interface
Interface is Too General:
Solver not told about discretization data, e.g. fields
Cannot take advantage of problem structure
blocking
saddle point structure
Interface is Too Specific:
Assembly code specialized to each discretization
dimension, cell shape, hybrid
Explicit references to element type
getVertices(faceID), getAdjacency(edgeID, VERTEX),
getAdjacency(edgeID, dim = 0)
No interface for transitive closure
Awkward nested loops to handle different dimensions
Mesh Representation
We represent each mesh as a Hasse Diagram:
Can represent any CW complex
Can be implemented as a Directed Acyclic Graph
Reduces mesh information to a single covering relation
Can discover dimension, since meshes are ranked posets
We use an abstract topological interface to organize traversals for:
discretization integrals
solver size determination
computing communication patterns
Mesh geometry is treated as just another mesh function.
Sample Meshes
Interpolated triangular mesh
(Figure: Hasse diagram of an interpolated triangular mesh with two triangles; vertices at depth 0, edges at depth 1, cells at depth 2.)
Sample Meshes
Optimized triangular mesh
(Figure: Hasse diagram of the optimized (non-interpolated) triangular mesh, containing only cells (depth 1) and vertices (depth 0).)
Sample Meshes
Interpolated quadrilateral mesh
(Figure: Hasse diagram of an interpolated quadrilateral mesh; vertices at depth 0, edges at depth 1, cells at depth 2.)
Sample Meshes
Optimized quadrilateral mesh
(Figure: Hasse diagram of the optimized quadrilateral mesh, containing only cells and vertices.)
Sample Meshes
Interpolated tetrahedral mesh
(Figure: Hasse diagram of an interpolated tetrahedral mesh; vertices at depth 0, edges at depth 1, faces at depth 2, cells at depth 3.)
Mesh Interface
By focusing on the key topological relations,
the interface can be both concise and quite general
Single relation
Dual is obtained by reversing arrows
Can associate functions with DAG points
Dual operation gives the support of the function
Mesh Algorithms for PDE with Sieve I: Mesh Distribution, Sci. Prog., 2009.
New Unstructured Interface
NO explicit references to element type
A point may be any mesh element
getCone(point): adjacent (d-1)-elements
getSupport(point): adjacent (d+1)-elements
Transitive closure
closure(cell): The computational unit for FEM
Algorithms independent of mesh
dimension
shape (even hybrid)
global topology
finite element
Basic Operations
Cone
We begin with the basic covering relation: cone(0) = {2, 3, 4}
(Figure: the cone of cell 0 in the interpolated triangular mesh.)
Basic Operations
Support
Reversing the arrows gives the dual operation: support(9) = {3, 4, 6}
(Figure: the support of vertex 9.)
Basic Operations
Closure
Adding the transitive closure of the relation gives: closure(0) = {0, 2, 3, 4, 7, 8, 9}
(Figure: the closure of cell 0.)
Basic Operations
Star
and the transitive closure of the dual gives: star(7) = {7, 2, 3, 0}
(Figure: the star of vertex 7.)
Basic Operations
Meet
Finally, we augment these with lattice operations: meet(0, 1) = {4}
(Figure: the meet of cells 0 and 1 is their shared edge 4.)
Basic Operations
Join
and similarly: join(8, 9) = {4}
(Figure: the join of vertices 8 and 9 is the edge 4 connecting them.)
Mesh Creation
An empty mesh can be created using
DMPlexCreate(MPI_Comm comm, DM *dm);
and then filled in using the primitives
DMPlexSetConeSize(DM dm, PetscInt p, PetscInt coneSize);
DMPlexSetCone(DM dm, PetscInt p, PetscInt cone[]);
and then
DMPlexSetSupportSize(DM dm, PetscInt p, PetscInt supportSize);
DMPlexSetSupport(DM dm, PetscInt p, PetscInt support[]);
or
DMPlexSymmetrize(DM dm);
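A minimal sketch building a single triangle by hand (point 0 is the cell, points 1-3 the vertices; DMPlexSetChart() and DMPlexStratify() are not shown on the slide but are assumed to be part of the same low-level interface):

  DM       dm;
  PetscInt cone[3] = {1, 2, 3};

  DMPlexCreate(PETSC_COMM_WORLD, &dm);
  DMPlexSetChart(dm, 0, 4);        /* points 0..3: one cell and three vertices */
  DMPlexSetConeSize(dm, 0, 3);     /* the cell is bounded by three vertices    */
  DMSetUp(dm);                     /* allocate cone storage                    */
  DMPlexSetCone(dm, 0, cone);
  DMPlexSymmetrize(dm);            /* build supports from the cones            */
  DMPlexStratify(dm);              /* compute depths (vertices, cells, ...)    */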
Mesh Cloning
An existing mesh can be copied using
DMPlexClone(DM dm, DM *newdm);
so that the topology is shared, but the other DM data structures are not:
the data layout (PetscSection)
the communication pattern (PetscSF)
Mesh Input
DMPlex can read in an existing mesh using
DMPlexCreateFromCellList(MPI_Comm comm, PetscInt dim,
                         PetscInt numCells, PetscInt numVertices,
                         PetscInt numCorners, PetscBool interpolate,
                         const int cells[],
                         PetscInt spaceDim, const double vertexCoords[],
                         DM *dm)
This is a very common interchange format, used by Triangle and TetGen.
DMPlexCreateFromDAG() is similar, but uses a single numbering and PETSc types.
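For instance, a sketch reading two triangles on four vertices (the cell and coordinate arrays here are illustrative, not from the slides):

  const int    cells[]  = {0, 1, 2,   1, 3, 2};                  /* 2 cells x 3 corners   */
  const double coords[] = {0.0,0.0, 1.0,0.0, 0.0,1.0, 1.0,1.0};  /* 4 vertices x 2 coords */
  DM           dm;

  DMPlexCreateFromCellList(PETSC_COMM_WORLD,
                           2,           /* dim */
                           2, 4, 3,     /* cells, vertices, corners */
                           PETSC_TRUE,  /* interpolate: also create edges */
                           cells,
                           2,           /* spaceDim */
                           coords, &dm);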
Mesh Generation
DMPlex can generate a mesh given a boundary
DMPlexGenerate(DM boundary, const char name[],
PetscBool interpolate, DM *mesh);
which dispatches to 3rd party mesh generators.
It also has built in meshes,
DMPlexCreateBoxMesh(), which calls DMPlexGenerate() after
DMPlexCreateSquareBoundary()
DMPlexCreateCubeBoundary()
DMPlexCreateHexBoxMesh()
Mesh Refinement
DMPlex can refine a mesh using
DMRefine(DM dm, MPI_Comm comm, DM *refdm);
using 3rd party generators or parallel uniform refinement.
We view the mesh from SNES ex62, which makes ex62_sol.vtk:
./ex62 -refinement_limit 0.0625 -pc_type jacobi
./ex62 -refinement_limit 0.00625 -pc_type jacobi
mpiexec -n 3 ./ex62 -refinement_limit 0.00625 -dm_view_partition -pc_type jacobi -ksp_max_it 100
where we have generated the FEM header using
./bin/pythonscripts/PetscGenerateFEMQuadrature.py 2 1 2 1 laplacian 2 1 1 1 gradient src/snes/examples/tutorials/ex62.h
For the 3D version:
./ex62 -dim 3 -refinement_limit 0.0 -pc_type jacobi
./ex62 -dim 3 -refinement_limit 0.001 -pc_type jacobi
mpiexec -n 3 ./ex62 -dim 3 -refinement_limit 0.001 -dm_view_partition -pc_type jacobi -ksp_max_it 100
with the FEM header generated using
./bin/pythonscripts/PetscGenerateFEMQuadrature.py 3 1 3 1 laplacian 3 1 1 1 gradient src/snes/examples/tutorials/ex62.h
Mesh Partitioning
DMPlex can partition an existing mesh
DMPlexCreatePartition(DM dm, PetscInt height, PetscBool enlarge,
PetscSection *partSection, IS *partition,
PetscSection *origPartSection, IS *origPartition);
which dispatches to 3rd party mesh partitioners.
This is normally not called directly by users; it needs DMPlexCreatePartitionClosure(), and it runs in serial.
Mesh Distribution
DMPlex can distribute an existing mesh
DMPlexDistribute(DM dm, const char partitioner[],
PetscInt overlap, DM *dmParallel)
Calls DMPlexCreatePartition()
Distributes coordinates and labels
Creates PetscSF for the point distribution
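A typical sketch of the call (the partitioner name is an assumption; it must be one PETSc was configured with, e.g. "chaco" or "parmetis"):

  DM dmDist = NULL;

  DMPlexDistribute(dm, "chaco", 0 /* overlap */, &dmDist);
  if (dmDist) {            /* NULL when running on a single process */
    DMDestroy(&dm);
    dm = dmDist;           /* use the distributed mesh from now on */
  }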
Mesh Labels
DMLabel marks mesh points
Markers are PetscInt
Bi-directional queries
DMLabelGetValue() for a point
DMLabelGetStratumIS() for a marker
Search optimization using DMLabelCreateIndex()
They can be used to:
Define submeshes, perhaps of lower dimension
Set material properties
Mark ghost elements
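A small sketch using the DMPlex label wrappers that appear later in the examples (the label name "boundary" and the point number are placeholders):

  PetscInt value;
  IS       markedIS;

  DMPlexSetLabelValue(dm, "boundary", point, 1);       /* mark one mesh point          */
  DMPlexGetLabelValue(dm, "boundary", point, &value);  /* -1 if the point is unmarked  */
  DMPlexGetStratumIS(dm, "boundary", 1, &markedIS);    /* all points with marker 1     */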
Managing Discretized Data
Outline
DM
Managing Discretized Data
FD
PetscSection
FEM
FVM
Advanced Solvers
FD
Raw Array Access
You can get a multidimensional array from vector data:
DMDAVecGetArray(DM da, Vec v, void *array);
where the array dimension is taken from the DM
For a multicomponent DM, using
DMDAVecGetArrayDOF(DM da, Vec v, void *array);
will add one extra dimension for components.
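A minimal sketch, assuming da and a global vector gvec created from it:

  DMDALocalInfo info;
  PetscScalar **a;
  PetscInt      i, j;

  DMDAGetLocalInfo(da, &info);
  DMDAVecGetArray(da, gvec, &a);                  /* indexed by global (i,j); owned part only */
  for (j = info.ys; j < info.ys + info.ym; ++j) {
    for (i = info.xs; i < info.xs + info.xm; ++i) {
      a[j][i] = 1.0;                              /* set the value at grid point (i,j) */
    }
  }
  DMDAVecRestoreArray(da, gvec, &a);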
Dirichlet conditions
Manual Method:
Set rhs values to boundary values
MatZeroRows(), MatZeroRowsIS(), MatZeroRowsLocal(), MatZeroRowsStencil()
or the variants that also zero the columns:
MatZeroRowsColumns(), MatZeroRowsColumnsIS(), MatZeroRowsColumnsLocal(), MatZeroRowsColumnsStencil()
DM Method:
Check the stencil for boundary points, and set rhs values to boundary values:
/* Test whether we are on the top edge of the global array */
if (info->ys+info->ym == info->my) {
j = info->my - 1;
/* top edge */
for (i = info->xs; i < info->xs+info->xm; ++i) {
f[j][i].u     = x[j][i].u - lid;
f[j][i].v     = x[j][i].v;
f[j][i].omega = x[j][i].omega + (x[j][i].u - x[j-1][i].u)*dhy;
f[j][i].temp  = x[j][i].temp - x[j-1][i].temp;
}
}
Check the stencil for boundary points, and set Jacobian rows to the identity:
for (j=info->ys; j<info->ys+info->ym; ++j) {
for (i=info->xs; i<info->xs+info->xm; ++i) {
row.j = j; row.i = i;
/* boundary points */
if (i == 0 || j == 0 || i == info->mx-1 || j == info->my-1) {
v[0] = 2.0*(hydhx + hxdhy);
MatSetValuesStencil(jac,1,&row,1,&row,v,INSERT_VALUES);
}
}
}
SNES Example
Driven Cavity
Velocity-vorticity formulation
Flow driven by lid and/or buoyancy
Logically regular grid
Parallelized with DMDA
Finite difference discretization
Authored by David Keyes
$PETSC_DIR/src/snes/examples/tutorials/ex19.c
Driven Cavity Application Context
typedef struct {
  /*----- basic application data -----*/
  PetscReal lid_velocity;
  PetscReal prandtl;
  PetscReal grashof;
  PetscBool draw_contours;
} AppCtx;

$PETSC_DIR/src/snes/examples/tutorials/ex19.c
Driven Cavity Residual Evaluation
Residual(SNES snes, Vec X, Vec F, void *ptr) {
  AppCtx        *user = (AppCtx *) ptr;
  PetscInt       istart, iend, jstart, jend;  /* local starting and ending grid points */
  PetscScalar   *f;                           /* local vector data */
  PetscReal      grashof = user->grashof;
  PetscReal      prandtl = user->prandtl;
  PetscErrorCode ierr;

  /* Code to communicate nonlocal ghost point data */
  VecGetArray(F, &f);
  /* Code to compute local function components */
  VecRestoreArray(F, &f);
  return 0;
}

$PETSC_DIR/src/snes/examples/tutorials/ex19.c
Better Driven Cavity Residual Evaluation
ResLocal(DMDALocalInfo *info, PetscScalar **x, PetscScalar **f, void *ctx)
{
  for (j = info->ys; j < info->ys+info->ym; ++j) {
    for (i = info->xs; i < info->xs+info->xm; ++i) {
      u   = x[j][i];
      uxx = (2.0*u - x[j][i-1] - x[j][i+1])*hydhx;
      uyy = (2.0*u - x[j-1][i] - x[j+1][i])*hxdhy;
      f[j][i].u = uxx + uyy - .5*(x[j+1][i].omega - x[j-1][i].omega)*hx;
      f[j][i].v = uxx + uyy + .5*(x[j][i+1].omega - x[j][i-1].omega)*hy;
      f[j][i].omega = uxx + uyy +
        (vxp*(u - x[j][i-1].omega) + vxm*(x[j][i+1].omega - u))*hy +
        (vyp*(u - x[j-1][i].omega) + vym*(x[j+1][i].omega - u))*hx -
        .5*grashof*(x[j][i+1].temp - x[j][i-1].temp)*hy;
      f[j][i].temp = uxx + uyy + prandtl*
        ((vxp*(u - x[j][i-1].temp) + vxm*(x[j][i+1].temp - u))*hy +
         (vyp*(u - x[j-1][i].temp) + vym*(x[j+1][i].temp - u))*hx);
    }
  }
}

$PETSC_DIR/src/snes/examples/tutorials/ex19.c
PetscSection
What Is It?
Similar to PetscLayout: maps point ⟶ (size, offset)
Processes are replaced by points
Also what we might use for multicore PetscLayout
Boundary conditions are just another PetscSection
Map points to number of constrained dofs
Offsets into integer array of constrained local dofs
Fields are just another PetscSection
Map points to number of field dofs
Offsets into array with all fields
Usable by all DM subclasses
Structured grids with DMDA
Unstructured grids with DMPlex
PetscSection
Why Use It?
PETSc Solvers only understand Integers
Decouples Mesh From Discretization
Mesh does not need to know how dofs are generated,
just how many are attached to each point.
It does not matter whether you use FD, FVM, FEM, etc.
Decouples Mesh from Solver
Solver gets the data layout and partitioning from Vec and Mat,
nothing else from the mesh.
Solver gets restriction/interpolation matrices from DM.
Decouples Discretization from Solver
Solver only gets the field division, nothing else from discretization.
PetscSection
How Do I Build One?
High Level Interface
DMPlexCreateSection(
DM dm, PetscInt dim, PetscInt numFields,
PetscInt numComp[], PetscInt numDof[],
PetscInt numBC, PetscInt bcField[], IS bcPoints[],
PetscSection *section);
Discretization    Dof/Dimension
P1/P0             [2 0 0 0 | 0 0 0 1]
Q2/Q1             [2 2 0 0 | 1 0 0 0]
Q2/P1disc         [2 2 0 0 | 0 0 0 3]
Data Layout
PetscSection defines a data layout
maps p ⟶ (off, off + 1, . . . , off + dof): PetscSectionGetDof(), PetscSectionGetOffset()
where p ∈ [pStart, pEnd), called the chart: PetscSectionGetChart()
ranges can be divided into parts, called fields: PetscSectionGetFieldDof(), PetscSectionGetFieldOffset()
prefix sums are calculated automatically on setup: PetscSectionSetUp()
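A minimal sketch of building and querying a layout by hand (the chart bounds and dof counts here are made up for illustration):

  PetscSection s;
  PetscInt     p, pStart = 0, pEnd = 10, dof, off;

  PetscSectionCreate(PETSC_COMM_WORLD, &s);
  PetscSectionSetChart(s, pStart, pEnd);
  for (p = pStart; p < pEnd; ++p) PetscSectionSetDof(s, p, 3);  /* 3 dofs per point */
  PetscSectionSetUp(s);                                         /* compute the prefix sums */
  for (p = pStart; p < pEnd; ++p) {
    PetscSectionGetDof(s, p, &dof);
    PetscSectionGetOffset(s, p, &off);
    /* the dofs of point p occupy indices off .. off+dof-1 */
  }
  PetscSectionDestroy(&s);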
Using PetscSection
PetscSection can be used to segment data
Use Vec and IS to store data
Use point p instead of index i
Maps to a set of values instead of just one
We provide a convenience method for extraction
VecGetValuesSection(Vec v, PetscSection s, PetscInt p, PetscScalar **a);
which works in an analogous way to
MatSetValuesStencil(Mat A, PetscInt nr, const MatStencil rs[],
PetscInt nc, const MatStencil cs[],
const PetscScalar v[], InsertMode m);
We can get the layout of coordinates over the mesh
DMPlexGetCoordinateSection(DM dm, PetscSection *s);
where the data is stored in a Vec
DMGetCoordinates(DM dm, Vec *coords);
We can retrieve FEM data from a vector without complicated indexing,
DMPlexVecGetClosure(DM dm, PetscSection s, Vec v,
PetscInt cell, PetscInt *, PetscScalar *a[]);
and the same thing works for matrices
DMPlexMatSetClosure(DM dm, PetscSection rs, PetscSection cs, Mat A,
PetscInt p, const PetscScalar v[], InsertMode m);
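A sketch of pulling out the element vector for one cell (section, locX, and cell are assumed to exist; the values are returned with the matching Restore call):

  PetscScalar *vals = NULL;
  PetscInt     nvals;

  DMPlexVecGetClosure(dm, section, locX, cell, &nvals, &vals);
  /* vals[0..nvals-1] holds every dof on the closure of the cell, in section order */
  DMPlexVecRestoreClosure(dm, section, locX, cell, &nvals, &vals);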
Constraints
PetscSection allows some dofs to be marked as constrained:
specify the number of constraints for each point: PetscSectionGetConstraintDof(), PetscSectionGetFieldConstraintDof()
and their pointwise offsets: PetscSectionGetConstraintIndices(), PetscSectionGetFieldConstraintIndices()
typically used for Dirichlet conditions, and removed from the global system by PetscSectionCreateGlobalSection()
Global Sections
A global section
has no constrained dofs
has only shared dofs which are owned
is layout for DM global vectors
and can be created using
PetscSectionCreateGlobalSection(PetscSection s, PetscSF pointSF,
PetscBool includeConstraints,
PetscSection *gs);
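A sketch, assuming s is the local section and pointSF is the PetscSF describing shared mesh points (e.g. the one attached to the DM):

  PetscSection gs;

  PetscSectionCreateGlobalSection(s, pointSF,
                                  PETSC_FALSE,  /* leave constrained dofs out of the global layout */
                                  &gs);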
Interaction with PetscSF
We use PetscSF to describe shared points
Composing a point PetscSF and a PetscSection, we can build
a global section: PetscSectionCreateGlobalSection()
a PetscSF for shared dofs: PetscSFCreateSectionSF()
This composability means we can build hierarchies of sections and
pieces of sections.
Subsections
A PetscSection can also be broken apart to represent smaller pieces
of the problem for
subsolves
output
postprocessing
A subsection can be extracted
from a subset of fields: PetscSectionCreateSubsection()
from a subset of points: PetscSectionCreateSubmeshSection()
Residual Evaluation
I developed a single residual evaluation routine independent of
spatial dimension, cell geometry, and finite element:
F(u⃗) = 0

Dimensions: 1, 2, 3
Cell types: simplex, tensor product, polyhedral, prism
Discretizations: Lagrange FEM, H(div) FEM, H(curl) FEM, DG FEM
(with contributions from Peter Brune (ANL), the FEniCS Project, and Blaise Bourdin (LSU))

We have also implemented a polyhedral FVM, but this required changes to the residual evaluation.
FEM
SNES ex62: a P2-P1 Stokes example (DMPlex, formerly DMComplex)
(Figure: a P2-P1 triangle, with velocity unknowns on the vertices v0, v1, v2 and edge midpoints e0, e1, e2, and pressure unknowns on the vertices.)

Naively, we have
  cl(cell) = [ f0 e0 e1 e2 v0 v1 v2 ]
  x(cell)  = [ u_e0 v_e0  u_e1 v_e1  u_e2 v_e2  u_v0 v_v0 p_v0  u_v1 v_v1 p_v1  u_v2 v_v2 p_v2 ]

We reorder so that fields are contiguous:
  x'(cell) = [ u_e0 v_e0  u_e1 v_e1  u_e2 v_e2  u_v0 v_v0  u_v1 v_v1  u_v2 v_v2  p_v0 p_v1 p_v2 ]
FEM Integration Model
Proposed by Jed Brown
We consider weak forms dependent only on fields and gradients,

  ∫_Ω φ · f_0(u, ∇u) + ∇φ : f⃗_1(u, ∇u) = 0.    (1)

Discretizing, we have

  ∑_e E_e^T [ B^T W^q f_0(u^q, ∇u^q) + ∑_k D_k^T W^q f⃗_1^k(u^q, ∇u^q) ] = 0    (2)

where
  f_0, f_1 : pointwise physics functions
  u^q      : field at a quadrature point
  W^q      : diagonal matrix of quadrature weights
  B, D     : basis function matrices which reduce over quadrature points
  E        : assembly operator
Batch Integration
DMPlexComputeResidualFEM(dm, X, F, user)
{
VecSet(F, 0.0);
<Put boundary conditions into local input vector>
<Extract coefficients and geometry for batch>
<Integrate batch of elements>
<Insert batch of element vectors into global vector>
}
Batch Integration
Set boundary conditions
DMPlexComputeResidualFEM(dm, X, F, user)
{
VecSet(F, 0.0);
DMPlexProjectFunctionLocal(dm, numComponents,
bcFuncs, INSERT_BC_VALUES, X);
<Extract coefficients and geometry for batch>
<Integrate batch of elements>
<Insert batch of element vectors into global vector>
}
Batch Integration
Extract coefficients and geometry
DMPlexComputeResidualFEM(dm, X, F, user)
{
VecSet(F, 0.0);
<Put boundary conditions into local input vector>
DMPlexGetHeightStratum(dm, 0, &cStart, &cEnd);
for (c = cStart; c < cEnd; ++c) {
DMPlexComputeCellGeometry(dm, c, &v0[c*dim],
&J[c*dim*dim], &invJ[c*dim*dim], &detJ[c]);
DMPlexVecGetClosure(dm, NULL, X, c, NULL, &x);
for (i = 0; i < cellDof; ++i) u[c*cellDof+i] = x[i];
DMPlexVecRestoreClosure(dm, NULL, X, c, NULL, &x);
}
<Integrate batch of elements>
<Insert batch of element vectors into global vector>
}
Batch Integration
Integrate element batch
DMPlexComputeResidualFEM(dm, X, F, user)
{
VecSet(F, 0.0);
<Put boundary conditions into local input vector>
<Extract coefficients and geometry for batch>
for (field = 0; field < numFields; ++field) {
(*mesh->integrateResidualFEM)(Ne, numFields, field,
quad, u,
v0, J, invJ, detJ,
f0, f1, elemVec);
(*mesh->integrateResidualFEM)(Nr, ...);
}
<Insert batch of element vectors into global vector>
}
Batch Integration
Insert element vectors
DMPlexComputeResidualFEM(dm, X, F, user)
{
VecSet(F, 0.0);
<Put boundary conditions into local input vector>
<Extract coefficients and geometry for batch>
<Integrate batch of elements>
for (c = cStart; c < cEnd; ++c) {
DMPlexVecSetClosure(dm, NULL, F, c,
&elemVec[c*cellDof], ADD_VALUES);
}
}
Element Integration
FEMIntegrateResidualBatch(Ne, numFields, field,
quad[], coefficients[],
v0s[], jacobians[], jacobianInv[], jacobianDet[],
f0_func, f1_func)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
<Make x_q>
<Make u_q and gradU_q>
<Call f_0 and f_1>
<Loop over element vector entries (f, fc)>
<Add contributions from f_0 and f_1>
}
Element Integration
Calculate x_q
FEMIntegrateResidualBatch(...)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
for (d = 0; d < dim; ++d) {
x[d] = v0[d];
for (d2 = 0; d2 < dim; ++d2) {
x[d] += J[d*dim+d2]*(quadPoints[q*dim+d2]+1);
}
}
<Make x_q>
<Make u_q and gradU_q>
<Call f_0 and f_1>
<Loop over element vector entries (f, fc)>
<Add contributions from f_0 and f_1>
}
Element Integration
Calculate u_q and ∇u_q
FEMIntegrateResidualBatch(...)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
<Make x_q>
for (f = 0; f < numFields; ++f) {
for (b = 0; b < Nb; ++b) {
for (comp = 0; comp < Ncomp; ++comp) {
u[comp] += coefficients[cidx]*basis[q+cidx];
for (d = 0; d < dim; ++d) {
<Transform derivative to real space>
gradU[comp*dim+d] +=
coefficients[cidx]*realSpaceDer[d];
}
}
}
}
<Call f_0 and f_1>
    <Loop over element vector entries (f, fc)>
      <Add contributions from f_0 and f_1>
}
Element Integration
Calculate u_q and ∇u_q
FEMIntegrateResidualBatch(...)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
<Make x_q>
for (f = 0; f < numFields; ++f) {
for (b = 0; b < Nb; ++b) {
for (comp = 0; comp < Ncomp; ++comp) {
u[comp] += coefficients[cidx]*basis[q+cidx];
for (d = 0; d < dim; ++d) {
realSpaceDer[d] = 0.0;
for (g = 0; g < dim; ++g) {
realSpaceDer[d] +=
invJ[g*dim+d]*basisDer[(q+cidx)*dim+g];
}
gradU[comp*dim+d] +=
coefficients[cidx]*realSpaceDer[d];
}
}
}
Element Integration
Call f0 and f1
FEMIntegrateResidualBatch(...)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
<Make x_q>
<Make u_q and gradU_q>
f0_func(u, gradU, x, &f0[q*Ncomp]);
for (i = 0; i < Ncomp; ++i) {
f0[q*Ncomp+i] *= detJ*quadWeights[q];
}
f1_func(u, gradU, x, &f1[q*Ncomp*dim]);
for (i = 0; i < Ncomp*dim; ++i) {
f1[q*Ncomp*dim+i] *= detJ*quadWeights[q];
}
<Loop over element vector entries (f, fc)>
<Add contributions from f_0 and f_1>
}
Element Integration
Update element vector
FEMIntegrateResidualBatch(...)
{
<Loop over batch of elements (e)>
<Loop over quadrature points (q)>
<Make x_q>
<Make u_q and gradU_q>
<Call f_0 and f_1>
<Loop over element vector entries (f, fc)>
for (q = 0; q < Nq; ++q) {
elemVec[cidx] += basis[q+cidx]*f0[q+comp];
for (d = 0; d < dim; ++d) {
<Transform derivative to real space>
elemVec[cidx] +=
realSpaceDer[d]*f1[(q+comp)*dim+d];
}
}
}
FEM Infrastructure
DMPlex provides support for FEM in the Brown model:
Specify f_0 and f⃗_1, a quadrature rule, and an element tabulation (PetscQuadrature and PetscFEM structs)
Specify integration methods for single-field element batches (DMPlexSetFEMIntegration())
Have initial implementations for CPU and GPU
Compute a parallel, multifield residual (DMPlexComputeResidualFEM())
Compute a parallel, multifield Jacobian (DMPlexComputeJacobianFEM() and DMPlexComputeJacobianActionFEM())
Compute an L2 norm (DMPlexComputeL2Diff())
Compute an L2 projection into the element space (DMPlexProjectFunction())
FEM Geometry
The FEM infrastructure depends on DMPlexComputeCellGeometry()
Quantity   Description
v0         translation part of the affine map
J          Jacobian of the map from the reference element
invJ       inverse of the Jacobian
detJ       Jacobian determinant
It is likely that we will expand this set of geometric quantities.
FIAT
Finite Element Integrator And Tabulator by Rob Kirby
http://fenicsproject.org/
FIAT understands
Reference element shapes (line, triangle, tetrahedron)
Quadrature rules
Polynomial spaces
Functionals over polynomials (dual spaces)
Derivatives
Can build arbitrary elements by specifying the Ciarlet triple (K, P, P′)
FIAT is part of the FEniCS project
Condition of the Laplacian
2D P1 Lagrange Elements
Num. Elements   Longest edge (h)   Condition number   L2 error
64              1/4                12.6               0.0174
128             √2/8               25.2               0.00607
256             1/8                51.5               0.00434
512             √2/16              103.1              0.00153
256             1/16               207.2              0.00109
1024            √2/32              414.3              0.000381
2048            1/32               829.7              0.000271
4096            √2/64              1659.4             0.0000952
8192            1/64               3319.8             0.0000678

so we have κ ≈ 0.8 h⁻²    (3)
Condition of the Laplacian
2D P2 Lagrange Elements
Num. Elements   Longest edge (h)   Condition number   L2 error
64              1/4                68.1               2.73e-11
128             √2/8               137.2              1.64e-10
256             1/8                275.6              1.04e-09
512             √2/16              552.2              7.74e-10
256             1/16               1105.6             3.26e-09
1024            √2/32              2212.3             3.22e-09
2048            1/32               4425.7             1.02e-08
4096            √2/64              8852.6             1.13e-08
8192            1/64               17708.1            3.19e-08

so we have κ ≈ 4.3 h⁻²    (4)
The Stokes Problem Strong Form
  −Δu + ∇p = f
  ∇·u = 0
  u|_Γ = g
  ∫_Ω p = 0
The Stokes Problem Weak Form
For u, v ∈ V and p, q ∈ Π:
  ⟨∇v, ∇u⟩ − ⟨∇·v, p⟩ = ⟨v, f⟩
  ⟨q, ∇·u⟩ = 0
  u|_Γ = g
  ∫_Ω p = 0
2D Exact Solution
u = x² + y²
v = 2x² − 2xy
p = x + y − 1
f_i = 3
3D Exact Solution
u = x² + y²
v = y² + z²
w = x² + y² − 2(x + y)z
p = x + y + z − 3/2
f_i = 3
Condition of the Stokes Operator
2D P2 /P1 Lagrange Elements
Num. Elements   Longest edge (h)   Condition number   L2 error
64              1/4                7909               6.96e-07
128             √2/8               29522              1.41e-07
256             1/8                32300              7.82e-07
512             √2/16              119053             1.27e-06
256             1/16               129883             2.28e-06
1024            √2/32              466023             4.99e-06
2048            1/32               520163             6.66e-06
4096            √2/64              1121260            2.97e-05
8192            1/64               2075950            1.97e-05

so we have κ ≈ 700 h⁻²    (5)
Jacobian
(Figure: Jacobian sparsity pattern for P2/P1 elements.)
FVM
FVM Geometry
The FVM infrastructure depends on DMPlexComputeCellGeometryFVM(),
which computes
Quantity   Description
vol        cell volume or face area
centroid   cell or face centroid
normal     face normal or 0
Second Order TVD Finite Volume Method
Physics
TS ex11.c
Advection
Shallow Water
Euler
Second Order TVD Finite Volume Method
Limiters
TS ex11.c
Minmod
van Leer
van Albada
Sin
Superbee
MC (Barth-Jespersen)
Second Order TVD Finite Volume Method
Physics Creation
PetscErrorCode PhysicsCreate_Advect(Model mod, Physics phys)
{
Physics_Advect *advect = (Physics_Advect *) phys->data;
const PetscInt inflowids[] = {100,200,300},outflowids[] = {101};
phys->field_desc = PhysicsFields_Advect;
phys->riemann    = PhysicsRiemann_Advect;
/* Register "canned" boundary conditions and defaults ids */
ModelBoundaryRegister(mod, "inflow", PhysicsBoundary_Advect_Inflow,
phys, ALEN(inflowids), inflowids);
ModelBoundaryRegister(mod, "outflow", PhysicsBoundary_Advect_Outflow,
phys, ALEN(outflowids), outflowids);
/* Initial/transient solution with default boundary conditions */
ModelSolutionSetDefault(mod, PhysicsSolution_Advect, phys);
/* Register "canned" functionals */
ModelFunctionalRegister(mod, "Error", &advect->functional.Error,
PhysicsFunctional_Advect, phys);
}
Second Order TVD Finite Volume Method
Physics Creation
static PetscErrorCode PhysicsCreate_SW(Model mod,Physics phys)
{
Physics_SW *sw = (Physics_SW *) phys->data;
const PetscInt wallids[] = {100,101,200,300};
phys->field_desc = PhysicsFields_SW;
phys->riemann  = PhysicsRiemann_SW;
phys->maxspeed = PetscSqrtReal(2.0*sw->gravity); /* Mach 1 at depth 2 */
ModelBoundaryRegister(mod, "wall", PhysicsBoundary_SW_Wall,
phys, ALEN(wallids), wallids);
ModelSolutionSetDefault(mod, PhysicsSolution_SW, phys);
ModelFunctionalRegister(mod, "Height", &sw->functional.Height,
PhysicsFunctional_SW, phys);
ModelFunctionalRegister(mod, "Speed", &sw->functional.Speed,
PhysicsFunctional_SW, phys);
ModelFunctionalRegister(mod, "Energy", &sw->functional.Energy,
PhysicsFunctional_SW, phys);
}
Second Order TVD Finite Volume Method
Physics Creation
PetscErrorCode PhysicsCreate_Euler(Model mod, Physics phys)
{
PhysicsEuler *eu = (PhysicsEuler *) phys->data;
const PetscInt wallids[] = {100,101,200,300};
phys->field_desc = PhysicsFields_Euler;
phys->riemann  = PhysicsRiemann_Euler_Rusanov;
phys->maxspeed = 1.0;
ModelBoundaryRegister(mod, "wall", PhysicsBoundary_Euler_Wall,
phys, ALEN(wallids), wallids);
ModelSolutionSetDefault(mod, PhysicsSolution_Euler, phys);
ModelFunctionalRegister(mod, "Speed", &eu->monitor.Speed,
PhysicsFunctional_Euler, phys);
ModelFunctionalRegister(mod, "Energy", &eu->monitor.Energy,
PhysicsFunctional_Euler, phys);
ModelFunctionalRegister(mod, "Density", &eu->monitor.Density,
PhysicsFunctional_Euler, phys);
ModelFunctionalRegister(mod, "Momentum", &eu->monitor.Momentum,
PhysicsFunctional_Euler, phys);
ModelFunctionalRegister(mod, "Pressure", &eu->monitor.Pressure,
PhysicsFunctional_Euler, phys);
}
Second Order TVD Finite Volume Method
Residual
We begin by localizing and applying boundary conditions:
RHSFunction(TS ts, PetscReal time, Vec X, Vec F, void *ctx) {
TSGetDM(ts, &dm);
DMGetLocalVector(dm, &locX);
DMGlobalToLocalBegin(dm, X, INSERT_VALUES, locX);
DMGlobalToLocalEnd(dm, X, INSERT_VALUES, locX);
ApplyBC(dm, time, locX, user);
VecZeroEntries(F);
(*user->RHSFunctionLocal)(dm, dmFace, dmCell, time, locX, F, user);
DMRestoreLocalVector(dm, &locX);
}
M. Knepley (UC)
PETSc
CNRS 12
106 / 156
Managing Discretized Data
FVM
Second Order TVD Finite Volume Method
Residual
By default, we call the Riemann solver for local faces:
RHSFunctionLocal_Upwind(DM dm, DM dmFace, DM dmCell, PetscReal time,
Vec locX, Vec F, User user) {
DMPlexGetHeightStratum(dm, 1, &fStart, &fEnd);
for (face = fStart; face < fEnd; ++face) {
DMPlexGetLabelValue(dm, "ghost", face, &ghost);
if (ghost >= 0) continue;
DMPlexGetSupport(dm, face, &cells);
DMPlexPointLocalRead(dmFace, face, facegeom, &fg);
DMPlexPointLocalRead(dmCell, cells[0], cellgeom, &cgL);
DMPlexPointLocalRead(dmCell, cells[1], cellgeom, &cgR);
DMPlexPointLocalRead(dm, cells[0], x, &xL);
DMPlexPointLocalRead(dm, cells[1], x, &xR);
DMPlexPointGlobalRef(dm, cells[0], f, &fL);
DMPlexPointGlobalRef(dm, cells[1], f, &fR);
(*phys->riemann)(phys, fg->centroid, fg->normal, xL, xR, flux);
for (i = 0; i < phys->dof; ++i) {
if (fL) fL[i] -= flux[i] / cgL->volume;
if (fR) fR[i] += flux[i] / cgR->volume;
}
}
}
M. Knepley (UC)
PETSc
CNRS 12
106 / 156
Managing Discretized Data
FVM
Second Order TVD Finite Volume Method
Boundary Conditions
Boundary conditions are applied on marked faces
PetscErrorCode ApplyBC(DM dm, PetscReal time, Vec locX, User user) {
VecGetArrayRead(user->facegeom, &facegeom);
VecGetArray(locX, &x);
for (fs = 0; fs < numFS; ++fs) {
ModelBoundaryFind(mod,ids[fs],&bcFunc,&bcCtx);
DMPlexGetStratumIS(dm, name, ids[fs], &faceIS);
ISGetLocalSize(faceIS, &numFaces);
ISGetIndices(faceIS, &faces);
for (f = 0; f < numFaces; ++f) {
const PetscInt face = faces[f], *cells;
DMPlexPointLocalRead(dmFace, face, facegeom, &fg);
DMPlexGetSupport(dm, face, &cells);
DMPlexPointLocalRead(dm, cells[0], x, &xI);
DMPlexPointLocalRef(dm, cells[1], x, &xG);
(*bcFunc)(mod, time, fg->centroid, fg->normal, xI, xG, bcCtx);
}
}
}
M. Knepley (UC)
PETSc
CNRS 12
107 / 156
Advanced Solvers
Outline
DM
Managing Discretized Data
Advanced Solvers
Fieldsplit
Multigrid
Nonlinear Solvers
Timestepping
M. Knepley (UC)
PETSc
CNRS 12
108 / 156
Advanced Solvers
The Great Solver Schism: Monolithic or Split?
Monolithic
Direct solvers
Split
Coupled Schwarz
Physics-split Schwarz
(based on relaxation)
Coupled Neumann-Neumann
(need unassembled matrices)
Physics-split Schur
(based on factorization)
approximate commutators
SIMPLE, PCD, LSC
segregated smoothers
Augmented Lagrangian
parabolization for stiff
waves
Coupled multigrid
X Need to understand local
spectral and compatibility
properties of the coupled
system
X Need to understand global
coupling strengths
Preferred data structures depend on which method is used.
Interplay with geometric multigrid.
M. Knepley (UC)
PETSc
CNRS 12
109 / 156
Advanced Solvers
User Solve
MPI_Comm comm;
SNES snes;
DM dm;
Vec u;
SNESCreate(comm, &snes);
SNESSetDM(snes, dm);
SNESSetFromOptions(snes);
DMCreateGlobalVector(dm, &u);
SNESSolve(snes, NULL, u);
M. Knepley (UC)
PETSc
CNRS 12
110 / 156
Advanced Solvers
Fieldsplit
Outline
Advanced Solvers
Fieldsplit
Multigrid
Nonlinear Solvers
Timestepping
M. Knepley (UC)
PETSc
CNRS 12
111 / 156
Advanced Solvers
Fieldsplit
FieldSplit Preconditioner
Analysis
Use ISes to define fields
Decouples PC from problem definition
Synthesis
Additive, Multiplicative, Schur
Commutes with Multigrid
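A minimal sketch of the analysis step, assuming index sets isU and isP for the velocity and pressure unknowns have already been built (names are illustrative):
#include <petscksp.h>

PetscErrorCode SetupStokesFieldSplit(Mat A, IS isU, IS isP, KSP *ksp)
{
  PC pc;

  KSPCreate(PetscObjectComm((PetscObject) A), ksp);
  KSPSetOperators(*ksp, A, A);              /* older PETSc versions take an extra MatStructure flag */
  KSPGetPC(*ksp, &pc);
  PCSetType(pc, PCFIELDSPLIT);
  PCFieldSplitSetIS(pc, "velocity", isU);   /* split names give -fieldsplit_velocity_* options */
  PCFieldSplitSetIS(pc, "pressure", isP);
  KSPSetFromOptions(*ksp);                  /* choose additive/multiplicative/schur at run time */
  return 0;
}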
M. Knepley (UC)
PETSc
CNRS 12
112 / 156
Advanced Solvers
Fieldsplit
FieldSplit Customization
Analysis
-pc_fieldsplit_<split num>_fields 2,1,5
-pc_fieldsplit_detect_saddle_point
Synthesis
-pc_fieldsplit_type <additive,multiplicative,symmetric_multiplicative,schur>
-pc_fieldsplit_real_diagonal
Use diagonal blocks of operator to build PC
Schur complements
-pc_fieldsplit_schur_precondition
<self,user,diag>
How to build preconditioner for S
-pc_fieldsplit_schur_factorization_type
<diag,lower,upper,full>
Which off-diagonal parts of the block factorization to use
M. Knepley (UC)
PETSc
CNRS 12
113 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
M. Knepley (UC)
[ A  B^T ]
[ B  0   ]
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Block-Jacobi (Exact)
-ksp_type gmres -pc_type fieldsplit -pc_fieldsplit_type additive
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_type preonly -fieldsplit_pressure_pc_type jacobi
[ A  0 ]
[ 0  I ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Block-Jacobi (Inexact)
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type additive
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type preonly -fieldsplit_pressure_pc_type jacobi
[ A  0 ]
[ 0  I ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Gauss-Seidel (Inexact)
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type multiplicative
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type preonly -fieldsplit_pressure_pc_type jacobi
[ A  0 ]
[ B  I ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Gauss-Seidel (Inexact)
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type multiplicative
-pc_fieldsplit_0_fields 1 -pc_fieldsplit_1_fields 0
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type preonly -fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
[ I  B^T ]
[ 0  A   ]
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Diagonal Schur Complement
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type diag
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none
[ A  0 ]
[ 0  S ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Lower Schur Complement
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type lower
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none
[ A  0 ]
[ B  S ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Upper Schur Complement
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type upper
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type gamg
-fieldsplit_pressure_ksp_type minres -fieldsplit_pressure_pc_type none
[ A  B^T ]
[ 0  S   ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Uzawa
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type upper
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_type richardson
-fieldsplit_pressure_ksp_max_its 1
[ A  B^T ]
[ 0  S   ]
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Full Schur Complement
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
[ I 0 ; B A^{-1} I ] [ A 0 ; 0 S ] [ I A^{-1} B^T ; 0 I ]
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
SIMPLE
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-fieldsplit_velocity_ksp_type preonly -fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10 -fieldsplit_pressure_pc_type jacobi
-fieldsplit_pressure_inner_ksp_type preonly
-fieldsplit_pressure_inner_pc_type jacobi
-fieldsplit_pressure_upper_ksp_type preonly
-fieldsplit_pressure_upper_pc_type jacobi
M. Knepley (UC)
[ I 0 ; B A^{-1} I ] [ A 0 ; 0 B D_A^{-1} B^T ] [ I D_A^{-1} B^T ; 0 I ]   (D_A = diag(A))
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex62: P2 /P1 Stokes Problem on Unstructured Mesh
Least-Squares Commutator
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-pc_fieldsplit_schur_precondition self
-fieldsplit_velocity_ksp_type gmres -fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-5 -fieldsplit_pressure_pc_type lsc
M. Knepley (UC)
[ I 0 ; B A^{-1} I ] [ A 0 ; 0 S_LSC ] [ I A^{-1} B^T ; 0 I ]
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex31: P2 /P1 Stokes Problem with Temperature on Unstructured Mesh
Additive Schwarz + Full Schur Complement
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type additive
-pc_fieldsplit_0_fields 0,1 -pc_fieldsplit_1_fields 2
-fieldsplit_0_ksp_type fgmres -fieldsplit_0_pc_type fieldsplit
-fieldsplit_0_pc_fieldsplit_type schur
-fieldsplit_0_pc_fieldsplit_schur_factorization_type full
-fieldsplit_0_fieldsplit_velocity_ksp_type preonly
-fieldsplit_0_fieldsplit_velocity_pc_type lu
-fieldsplit_0_fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_0_fieldsplit_pressure_pc_type jacobi
-fieldsplit_temperature_ksp_type preonly
-fieldsplit_temperature_pc_type lu
(block preconditioner: full Schur factorization on the (velocity, pressure) block, combined additively with a direct solve on the temperature block)
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
Solver Configuration: No New Code
ex31: P2 /P1 Stokes Problem with Temperature on Unstructured Mesh
Upper Schur Comp. + Full Schur Comp. + Least-Squares Comm.
-ksp_type fgmres -pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_0_fields 0,1 -pc_fieldsplit_1_fields 2
-pc_fieldsplit_schur_factorization_type upper
-fieldsplit_0_ksp_type fgmres -fieldsplit_0_pc_type fieldsplit
-fieldsplit_0_pc_fieldsplit_type schur
-fieldsplit_0_pc_fieldsplit_schur_factorization_type full
-fieldsplit_0_fieldsplit_velocity_ksp_type preonly
-fieldsplit_0_fieldsplit_velocity_pc_type lu
-fieldsplit_0_fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_0_fieldsplit_pressure_pc_type jacobi
-fieldsplit_temperature_ksp_type gmres
-fieldsplit_temperature_pc_type lsc!
(block preconditioner: upper Schur factorization coupling the (velocity, pressure) block, itself solved with a full Schur factorization, to the temperature block with an LSC-preconditioned Schur complement)
M. Knepley (UC)
PETSc
CNRS 12
114 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
FEM Setup
./bin/pythonscripts/PetscGenerateFEMQuadrature.py
2 2 2 1 laplacian
2 1 1 1 gradient
src/snes/examples/tutorials/ex62.h
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Jacobi
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_gmres_restart 100 -ksp_rtol 1.0e-9
-ksp_monitor_short
-pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Block diagonal
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type additive
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Block triangular
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type multiplicative
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Diagonal Schur complement
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type diag
-fieldsplit_velocity_ksp_type gmres
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Upper triangular Schur complement
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type upper
-fieldsplit_velocity_ksp_type gmres
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Lower triangular Schur complement
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type lower
-fieldsplit_velocity_ksp_type gmres
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
SNES ex62
Preconditioning
Full Schur complement
ex62
-run_type full -bc_type dirichlet -show_solution 0
-refinement_limit 0.00625 -interpolate 1
-snes_monitor_short -snes_converged_reason
-snes_view
-ksp_type fgmres -ksp_gmres_restart 100
-ksp_rtol 1.0e-9 -ksp_monitor_short
-pc_type fieldsplit -pc_fieldsplit_type schur
-pc_fieldsplit_schur_factorization_type full
-fieldsplit_velocity_ksp_type gmres
-fieldsplit_velocity_pc_type lu
-fieldsplit_pressure_ksp_rtol 1e-10
-fieldsplit_pressure_pc_type jacobi
M. Knepley (UC)
PETSc
CNRS 12
115 / 156
Advanced Solvers
Fieldsplit
Programming with Options
ex55: Allen-Cahn problem in 2D
constant mobility
triangular elements
Geometric multigrid method for saddle point variational inequalities:
./ex55 -ksp_type fgmres -pc_type mg -mg_levels_ksp_type fgmres
-mg_levels_pc_type fieldsplit -mg_levels_pc_fieldsplit_detect_saddle_point
-mg_levels_pc_fieldsplit_type schur -da_grid_x 65 -da_grid_y 65
-mg_levels_pc_fieldsplit_factorization_type full
-mg_levels_pc_fieldsplit_schur_precondition user
-mg_levels_fieldsplit_1_ksp_type gmres -mg_coarse_ksp_type preonly
-mg_levels_fieldsplit_1_pc_type none
-mg_coarse_pc_type svd
-mg_levels_fieldsplit_0_ksp_type preonly
-mg_levels_fieldsplit_0_pc_type sor
-pc_mg_levels 5
-mg_levels_fieldsplit_0_pc_sor_forward -pc_mg_galerkin
-snes_vi_monitor -ksp_monitor_true_residual -snes_atol 1.e-11
-mg_levels_ksp_monitor -mg_levels_fieldsplit_ksp_monitor
-mg_levels_ksp_max_it 2 -mg_levels_fieldsplit_ksp_max_it 5
M. Knepley (UC)
PETSc
CNRS 12
116 / 156
Advanced Solvers
Fieldsplit
Programming with Options
ex55: Allen-Cahn problem in 2D
Run flexible GMRES with 5 levels of multigrid as the preconditioner
./ex55 -ksp_type fgmres -pc_type mg -pc_mg_levels 5
-da_grid_x 65 -da_grid_y 65
Use the Galerkin process to compute the coarse grid operators
-pc_mg_galerkin
Use SVD as the coarse grid saddle point solver
-mg_coarse_ksp_type preonly -mg_coarse_pc_type svd
M. Knepley (UC)
PETSc
CNRS 12
117 / 156
Advanced Solvers
Fieldsplit
Programming with Options
ex55: Allen-Cahn problem in 2D
Smoother: Flexible GMRES (2 iterates) with a Schur complement PC
-mg_levels_ksp_type fgmres -mg_levels_pc_fieldsplit_detect_saddle_point
-mg_levels_ksp_max_it 2 -mg_levels_pc_type fieldsplit
-mg_levels_pc_fieldsplit_type schur
-mg_levels_pc_fieldsplit_factorization_type full
-mg_levels_pc_fieldsplit_schur_precondition diag
Schur complement solver: GMRES (5 iterates) with no preconditioner
-mg_levels_fieldsplit_1_ksp_type gmres
-mg_levels_fieldsplit_1_pc_type none -mg_levels_fieldsplit_ksp_max_it 5
Schur complement action: Use only the lower triangular part of A00
-mg_levels_fieldsplit_0_ksp_type preonly
-mg_levels_fieldsplit_0_pc_type sor
-mg_levels_fieldsplit_0_pc_sor_forward
M. Knepley (UC)
PETSc
CNRS 12
118 / 156
Advanced Solvers
Fieldsplit
Null spaces
For a single matrix, use
MatSetNullSpace(J, nullSpace);
to alter the KSP, and
MatSetNearNullSpace(J, nearNullSpace);
to set the coarse basis for AMG.
But this will not work for dynamically created operators.
M. Knepley (UC)
PETSc
CNRS 12
119 / 156
Advanced Solvers
Fieldsplit
Null spaces
Field Split
Can attach a nullspace to the IS that creates a split,
PetscObjectCompose(pressureIS, "nullspace",
(PetscObject) nullSpacePres);
If the DM makes the IS, use
PetscObject pressure;
DMGetField(dm, 1, &pressure);
PetscObjectCompose(pressure, "nullspace",
(PetscObject) nullSpacePres);
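The constant pressure null space attached above could be created, for example, with:
MatNullSpace nullSpacePres;
MatNullSpaceCreate(PETSC_COMM_WORLD, PETSC_TRUE /* contains constants */, 0, NULL, &nullSpacePres);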
M. Knepley (UC)
PETSc
CNRS 12
120 / 156
Advanced Solvers
Multigrid
Outline
Advanced Solvers
Fieldsplit
Multigrid
Nonlinear Solvers
Timestepping
M. Knepley (UC)
PETSc
CNRS 12
121 / 156
Advanced Solvers
Multigrid
AMG
Why not use AMG?
Of course we will try AMG
GAMG, -pc_type gamg
ML, -download-ml, -pc_type ml
BoomerAMG, -download-hypre, -pc_type hypre
-pc_hypre_type boomeramg
Problems with
vector character
anisotropy
scalability of setup time
M. Knepley (UC)
PETSc
CNRS 12
122 / 156
Advanced Solvers
Multigrid
Multigrid with DM
Allows multigrid with some simple command line options
-pc_type mg, -pc_mg_levels
-pc_mg_type, -pc_mg_cycle_type, -pc_mg_galerkin
-mg_levels_1_ksp_type, -mg_levels_1_pc_type
-mg_coarse_ksp_type, -mg_coarse_pc_type
-da_refine, -ksp_view
Interface also works with GAMG and 3rd party packages like ML
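An illustrative invocation combining these options (for a DMDA-based example such as SNES ex5; the exact options depend on the problem):
./ex5 -da_refine 4 -pc_type mg -pc_mg_levels 3
      -mg_levels_ksp_type chebyshev -mg_levels_pc_type sor
      -mg_coarse_pc_type lu -ksp_view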
M. Knepley (UC)
PETSc
CNRS 12
123 / 156
Advanced Solvers
Multigrid
A 2D Problem
Problem has:
1,640,961 unknowns (on the fine level)
8,199,681 nonzeros
./ex5
Options                         Explanation
-da_grid_x 21 -da_grid_y 21     Original grid is 21x21
-ksp_rtol 1.0e-9                Solver tolerance
-da_refine 6                    6 levels of refinement
-pc_type mg -pc_mg_levels 4     4 levels of multigrid
-snes_monitor -snes_view        Describe solver
M. Knepley (UC)
PETSc
CNRS 12
124 / 156
Advanced Solvers
Multigrid
A 3D Problem
Problem has:
1,689,600 unknowns (on the fine level)
89,395,200 nonzeros
./ex48
Options                         Explanation
-M 5 -N 5                       Coarse problem size
-da_refine 5                    5 levels of refinement
-ksp_rtol 1.0e-9                Solver tolerance
-thi_mat_type baij              Needs SOR
-pc_type mg -pc_mg_levels 4     4 levels of multigrid
-snes_monitor -snes_view        Describe solver
M. Knepley (UC)
PETSc
CNRS 12
125 / 156
Advanced Solvers
Nonlinear Solvers
Outline
Advanced Solvers
Fieldsplit
Multigrid
Nonlinear Solvers
Timestepping
M. Knepley (UC)
PETSc
CNRS 12
126 / 156
Advanced Solvers
Nonlinear Solvers
3rd Party Solvers in PETSc
Complete table of solvers
1
Sequential LU
ILUDT (SPARSEKIT2, Yousef Saad, U of MN)
EUCLID & PILUT (Hypre, David Hysom, LLNL)
ESSL (IBM)
SuperLU (Jim Demmel and Sherry Li, LBNL)
Matlab
UMFPACK (Tim Davis, U. of Florida)
LUSOL (MINOS, Michael Saunders, Stanford)
2
Parallel LU
MUMPS (Patrick Amestoy, IRIT)
SPOOLES (Cleve Ashcroft, Boeing)
SuperLU_Dist (Jim Demmel and Sherry Li, LBNL)
Parallel Cholesky
DSCPACK (Padma Raghavan, Penn. State)
MUMPS (Patrick Amestoy, Toulouse)
CHOLMOD (Tim Davis, Florida)
XYTlib - parallel direct solver (Paul Fischer and Henry Tufo, ANL)
M. Knepley (UC)
PETSc
CNRS 12
127 / 156
Advanced Solvers
Nonlinear Solvers
3rd Party Preconditioners in PETSc
Complete table of solvers
1
Parallel ICC
BlockSolve95 (Mark Jones and Paul Plassman, ANL)
Parallel ILU
PaStiX (Faverge Mathieu, INRIA)
Parallel Sparse Approximate Inverse
Parasails (Hypre, Edmund Chow, LLNL)
SPAI 3.0 (Marcus Grote and Barnard, NYU)
Sequential Algebraic Multigrid
RAMG (John Ruge and Klaus Steuben, GMD)
SAMG (Klaus Steuben, GMD)
Parallel Algebraic Multigrid
Prometheus (Mark Adams, PPPL)
BoomerAMG (Hypre, LLNL)
ML (Trilinos, Ray Tuminaro and Jonathan Hu, SNL)
M. Knepley (UC)
PETSc
CNRS 12
127 / 156
Advanced Solvers
Nonlinear Solvers
Always use SNES
Always use SNES instead of KSP:
No more costly than linear solver
Can accommodate unanticipated nonlinearities
Automatic iterative refinement
Callback interface can take advantage of problem structure
Jed actually recommends TS...
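Even a purely linear problem can be driven through SNES with essentially no overhead; a minimal sketch, assuming the residual callback computes b - Ax:
SNESSetType(snes, SNESKSPONLY);   /* one linearization, no Newton loop */
SNESSolve(snes, NULL, u);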
M. Knepley (UC)
PETSc
CNRS 12
128 / 156
Advanced Solvers
Nonlinear Solvers
Flow Control for a PETSc Application
(diagram: the Main Routine drives the PETSc solver stack, Timestepping Solvers (TS) -> Nonlinear Solvers (SNES) -> Linear Solvers (KSP) -> Preconditioners (PC), while the Application supplies Initialization, Function Evaluation, Jacobian Evaluation, and Postprocessing)
M. Knepley (UC)
PETSc
CNRS 12
129 / 156
Advanced Solvers
Nonlinear Solvers
SNES Paradigm
The SNES interface is based upon callback functions
FormFunction(), set by SNESSetFunction()
FormJacobian(), set by SNESSetJacobian()
When PETSc needs to evaluate the nonlinear residual F (x),
Solver calls the users function
User function gets application state through the ctx variable
PETSc never sees application data
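The wiring is two calls (r and J are a vector and matrix you provide; FormFunction and FormJacobian are described on the next slides):
SNESSetFunction(snes, r, FormFunction, &user);
SNESSetJacobian(snes, J, J, FormJacobian, &user);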
M. Knepley (UC)
PETSc
CNRS 12
130 / 156
Advanced Solvers
Nonlinear Solvers
SNES Function
User provided function calculates the nonlinear residual:
PetscErrorCode (*func)(SNES snes,Vec x,Vec r,void *ctx)
x: The current solution
r: The residual
ctx: The user context passed to SNESSetFunction()
Use this to pass application information, e.g. physical constants
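A minimal sketch of such a callback for the illustrative residual F(x)_i = x_i^2 - b, with b carried in a hypothetical AppCtx (not one of the tutorial examples):
typedef struct { PetscReal b; } AppCtx;

PetscErrorCode FormFunction(SNES snes, Vec x, Vec r, void *ctx)
{
  AppCtx            *user = (AppCtx *) ctx;
  const PetscScalar *xx;
  PetscScalar       *rr;
  PetscInt           i, n;

  VecGetLocalSize(x, &n);
  VecGetArrayRead(x, &xx);
  VecGetArray(r, &rr);
  for (i = 0; i < n; ++i) rr[i] = xx[i]*xx[i] - user->b;   /* local residual entries */
  VecRestoreArrayRead(x, &xx);
  VecRestoreArray(r, &rr);
  return 0;
}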
M. Knepley (UC)
PETSc
CNRS 12
131 / 156
Advanced Solvers
Nonlinear Solvers
SNES Jacobian
User provided function calculates the Jacobian:
(*func)(SNES snes,Vec x,Mat *J,Mat *M,MatStructure *flag,void *ctx)
x: The current solution
J: The Jacobian
M: The Jacobian preconditioning matrix (possibly J itself)
ctx: The user context passed to SNESSetJacobian()
Use this to pass application information, e.g. physical constants
Possible MatStructure values are:
SAME_NONZERO_PATTERN
DIFFERENT_NONZERO_PATTERN
Alternatively, you can use
matrix-free finite difference approximation, -snes_mf
finite difference approximation with coloring, -snes_fd
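For completeness, a matching Jacobian callback for the residual sketched on the previous slide (diagonal J = 2 diag(x)), using the signature shown above; an illustration, not one of the tutorial examples:
PetscErrorCode FormJacobian(SNES snes, Vec x, Mat *J, Mat *M, MatStructure *flag, void *ctx)
{
  const PetscScalar *xx;
  PetscInt           i, rstart, rend;

  MatGetOwnershipRange(*M, &rstart, &rend);
  VecGetArrayRead(x, &xx);
  for (i = rstart; i < rend; ++i) {
    PetscScalar v = 2.0*xx[i - rstart];             /* d(x_i^2 - b)/dx_i */
    MatSetValues(*M, 1, &i, 1, &i, &v, INSERT_VALUES);
  }
  VecRestoreArrayRead(x, &xx);
  MatAssemblyBegin(*M, MAT_FINAL_ASSEMBLY);
  MatAssemblyEnd(*M, MAT_FINAL_ASSEMBLY);
  if (*J != *M) {
    MatAssemblyBegin(*J, MAT_FINAL_ASSEMBLY);
    MatAssemblyEnd(*J, MAT_FINAL_ASSEMBLY);
  }
  *flag = SAME_NONZERO_PATTERN;
  return 0;
}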
M. Knepley (UC)
PETSc
CNRS 12
132 / 156
Advanced Solvers
Nonlinear Solvers
SNES Variants
Picard iteration
Line search/Trust region strategies
Quasi-Newton
Nonlinear CG/GMRES
Nonlinear GS/ASM
Nonlinear Multigrid (FAS)
Variational inequality approaches
M. Knepley (UC)
PETSc
CNRS 12
133 / 156
Advanced Solvers
Nonlinear Solvers
New methods in SNES
LS, TR Newton-type with line search and trust region
NRichardson Nonlinear Richardson, usually preconditioned
VIRS, VISS reduced space and semi-smooth methods
for variational inequalities
QN Quasi-Newton methods like BFGS
NGMRES Nonlinear GMRES
NCG Nonlinear Conjugate Gradients
SORQN SOR quasi-Newton
GS Nonlinear Gauss-Seidel sweeps
FAS Full approximation scheme (nonlinear multigrid)
MS Multi-stage smoothers (in FAS for hyperbolic problems)
Shell Your method, often used as a (nonlinear) preconditioner
M. Knepley (UC)
PETSc
CNRS 12
134 / 156
Advanced Solvers
Nonlinear Solvers
Finite Difference Jacobians
PETSc can compute and explicitly store a Jacobian via 1st-order FD
Dense
Activated by -snes_fd
Computed by SNESDefaultComputeJacobian()
Sparse via colorings (default)
Coloring is created by MatFDColoringCreate()
Computed by SNESDefaultComputeJacobianColor()
Can also use Matrix-free Newton-Krylov via 1st-order FD
Activated by -snes_mf without preconditioning
Activated by -snes_mf_operator with user-defined
preconditioning
Uses preconditioning matrix from SNESSetJacobian()
M. Knepley (UC)
PETSc
CNRS 12
135 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e2
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e2
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
lid velocity = 100, prandtl # = 1, grashof # = 100
0 SNES Function norm 768.116
1 SNES Function norm 658.288
2 SNES Function norm 529.404
3 SNES Function norm 377.51
4 SNES Function norm 304.723
5 SNES Function norm 2.59998
6 SNES Function norm 0.00942733
7 SNES Function norm 5.20667e-08
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 7
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e4
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e4
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
lid velocity = 100, prandtl # = 1, grashof # = 10000
0 SNES Function norm 785.404
1 SNES Function norm 663.055
2 SNES Function norm 519.583
3 SNES Function norm 360.87
4 SNES Function norm 245.893
5 SNES Function norm 1.8117
6 SNES Function norm 0.00468828
7 SNES Function norm 4.417e-08
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 7
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e5
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e5
-da_grid_x 16 -da_grid_y 16 -da_refine 2
-snes_monitor_short -snes_converged_reason -snes_view
lid velocity = 100, prandtl # = 1, grashof # = 100000
0 SNES Function norm 1809.96
Nonlinear solve did not converge due to DIVERGED_LINEAR_SOLVE iterations 0
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Driven Cavity Problem
SNES ex19.c
./ex19 -lidvelocity 100 -grashof 1e5
-da_grid_x 16 -da_grid_y 16 -da_refine 2 -pc_type lu
-snes_monitor_short -snes_converged_reason -snes_view
lid velocity = 100, prandtl # = 1, grashof # = 100000
0 SNES Function norm 1809.96
1 SNES Function norm 1678.37
2 SNES Function norm 1643.76
3 SNES Function norm 1559.34
4 SNES Function norm 1557.6
5 SNES Function norm 1510.71
6 SNES Function norm 1500.47
7 SNES Function norm 1498.93
8 SNES Function norm 1498.44
9 SNES Function norm 1498.27
10 SNES Function norm 1498.18
11 SNES Function norm 1498.12
12 SNES Function norm 1498.11
13 SNES Function norm 1498.11
14 SNES Function norm 1498.11
...
M. Knepley (UC)
PETSc
CNRS 12
136 / 156
Advanced Solvers
Nonlinear Solvers
Why isn't SNES converging?
The Jacobian is wrong (maybe only in parallel)
Check with -snes_type test and -snes_mf_operator
-pc_type lu
The linear system is not solved accurately enough
Check with -pc_type lu
Check -ksp_monitor_true_residual, try right preconditioning
The Jacobian is singular with inconsistent right side
Use MatNullSpace to inform the KSP of a known null space
Use a different Krylov method or preconditioner
The nonlinearity is just really strong
Run with -info or -snes_ls_monitor to see line search
Try using trust region instead of line search -snes_type tr
Try grid sequencing if possible -snes_grid_sequence
Use a continuation method
M. Knepley (UC)
PETSc
CNRS 12
137 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
PC preconditions KSP
SNES preconditions SNES
-ksp_type gmres
-snes_type ngmres
-pc_type richardson
-npc_snes_type nrichardson
M. Knepley (UC)
PETSc
CNRS 12
138 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Use Cases
Warm start Newton
-snes_type newtonls
-npc_snes_type nrichardson -npc_snes_max_it 5
Cleanup noisy Jacobian
-snes_type ngmres -snes_ngmres_m 5
-npc_snes_type newtonls
Additive-Schwarz Preconditioned Inexact Newton
-snes_type aspin -snes_npc_side left
-npc_snes_type nasm -npc_snes_nasm_type restrict
M. Knepley (UC)
PETSc
CNRS 12
139 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type newtonls -snes_converged_reason
-pc_type lu
lid velocity = 100, prandtl # = 1, grashof # = 50000
  0 SNES Function norm 1228.95
  1 SNES Function norm 1132.29
  2 SNES Function norm 1026.17
  3 SNES Function norm 925.717
  4 SNES Function norm 924.778
  5 SNES Function norm 836.867
  ...
 21 SNES Function norm 585.143
 22 SNES Function norm 585.142
 23 SNES Function norm 585.142
 24 SNES Function norm 585.142
  ...
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type fas -snes_converged_reason
-fas_levels_snes_type gs -fas_levels_snes_max_it 6
lid velocity = 100, prandtl # = 1, grashof # = 50000
0 SNES Function norm 1228.95
1 SNES Function norm 574.793
2 SNES Function norm 513.02
3 SNES Function norm 216.721
4 SNES Function norm 85.949
Nonlinear solve did not converge due to DIVERGED_INNER iterations 4
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type fas -snes_converged_reason
-fas_levels_snes_type gs -fas_levels_snes_max_it 6
-fas_coarse_snes_converged_reason
lid velocity = 100, prandtl # = 1, grashof # = 50000
0 SNES Function norm 1228.95
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 12
1 SNES Function norm 574.793
Nonlinear solve did not converge due to DIVERGED_MAX_IT its 50
2 SNES Function norm 513.02
Nonlinear solve did not converge due to DIVERGED_MAX_IT its 50
3 SNES Function norm 216.721
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 22
4 SNES Function norm 85.949
Nonlinear solve did not converge due to DIVERGED_LINE_SEARCH its 42
Nonlinear solve did not converge due to DIVERGED_INNER iterations 4
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type fas -snes_converged_reason
-fas_levels_snes_type gs -fas_levels_snes_max_it 6
-fas_coarse_snes_linesearch_type basic
-fas_coarse_snes_converged_reason
lid velocity = 100, prandtl # = 1, grashof # = 50000
  0 SNES Function norm 1228.95
    Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 6
  ...
 47 SNES Function norm 78.8401
    Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 5
 48 SNES Function norm 73.1185
    Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 6
 49 SNES Function norm 78.834
    Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 5
 50 SNES Function norm 73.1176
    Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 6
  ...
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type nrichardson -npc_snes_max_it 1 -snes_converged_reason
-npc_snes_type fas -npc_fas_coarse_snes_converged_reason
-npc_fas_levels_snes_type gs -npc_fas_levels_snes_max_it 6
-npc_fas_coarse_snes_linesearch_type basic
lid velocity = 100, prandtl # = 1, grashof # = 50000
0 SNES Function norm 1228.95
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 6
1 SNES Function norm 552.271
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 27
2 SNES Function norm 173.45
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 45
.
.
.
43 SNES Function norm 3.45407e-05
Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE its 2
44 SNES Function norm 1.6141e-05
Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE its 2
45 SNES Function norm 9.13386e-06
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 45
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type ngmres -npc_snes_max_it 1 -snes_converged_reason
-npc_snes_type fas -npc_fas_coarse_snes_converged_reason
-npc_fas_levels_snes_type gs -npc_fas_levels_snes_max_it 6
-npc_fas_coarse_snes_linesearch_type basic
lid velocity = 100, prandtl # = 1, grashof # = 50000
0 SNES Function norm 1228.95
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 6
1 SNES Function norm 538.605
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 13
2 SNES Function norm 178.005
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 24
.
.
.
27 SNES Function norm 0.000102487
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE its 2
28 SNES Function norm 4.2744e-05
Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE its 2
29 SNES Function norm 1.01621e-05
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 29
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Nonlinear Preconditioning
Also called globalization
./ex19 -lidvelocity 100 -grashof 5e4 -da_refine 4 -snes_monitor_short
-snes_type ngmres -npc_snes_max_it 1 -snes_converged_reason
-npc_snes_type fas -npc_fas_coarse_snes_converged_reason
-npc_fas_levels_snes_type newtonls -npc_fas_levels_snes_max_it 6
-npc_fas_levels_snes_linesearch_type basic
-npc_fas_levels_snes_max_linear_solve_fail 30
-npc_fas_levels_ksp_max_it 20 -npc_fas_levels_snes_converged_reason
-npc_fas_coarse_snes_linesearch_type basic
lid velocity = 100, prandtl # = 1, grashof # = 50000
0 SNES Function norm 1228.95
Nonlinear solve did not converge due to DIVERGED_MAX_IT its 6
.
.
.
Nonlinear solve converged due to CONVERGED_SNORM_RELATIVE its 1
.
.
.
1 SNES Function norm 0.1935
2 SNES Function norm 0.0179938
3 SNES Function norm 0.00223698
4 SNES Function norm 0.000190461
5 SNES Function norm 1.6946e-06
Nonlinear solve converged due to CONVERGED_FNORM_RELATIVE iterations 5
M. Knepley (UC)
PETSc
CNRS 12
140 / 156
Advanced Solvers
Nonlinear Solvers
Hierarchical Krylov
This tests a hierarchical Krylov method
mpiexec -n 4 ./ex19 -da_refine 4 -snes_view
-ksp_type fgmres -pc_type bjacobi -pc_bjacobi_blocks 2
-sub_ksp_type gmres -sub_pc_type bjacobi -sub_ksp_max_it 2
-sub_sub_ksp_type preonly -sub_sub_pc_type ilu
SNES Object: 4 MPI processes
type: newtonls
KSP Object:
4 MPI processes
type: fgmres
PC Object:
4 MPI processes
type: bjacobi
block Jacobi: number of blocks = 2
KSP Object:(sub_) 2 MPI processes
type: gmres
PC Object:(sub_) 2 MPI processes
type: bjacobi
block Jacobi: number of blocks = 2
KSP Object: (sub_sub_)
1 MPI processes
type: preonly
PC Object: (sub_sub_)
1 MPI processes
type: ilu
ILU: out-of-place factorization
M. Knepley (UC)
PETSc
CNRS 12
141 / 156
Advanced Solvers
Nonlinear Solvers
Hierarchical Krylov
This tests a hierarchical Krylov method
mpiexec -n 4 ./ex19 -da_refine 4 -snes_view
-ksp_type fgmres -pc_type bjacobi -pc_bjacobi_blocks 2
-sub_ksp_type gmres -sub_pc_type bjacobi -sub_ksp_max_it 2
-sub_sub_ksp_type preonly -sub_sub_pc_type ilu
PC Object:
4 MPI processes
type: bjacobi
block Jacobi: number of blocks = 2
PC Object:(sub_) 2 MPI processes
type: bjacobi
block Jacobi: number of blocks = 2
PC Object: (sub_sub_)
1 MPI processes
type: ilu
ILU: out-of-place factorization
Mat Object:
1 MPI processes
type: seqaij
rows=2500, cols=2500, bs=4
Mat Object:
2 MPI processes
type: mpiaij
rows=4900, cols=4900, bs=4
Mat Object:
4 MPI processes
type: mpiaij
rows=9604, cols=9604, bs=4
M. Knepley (UC)
PETSc
CNRS 12
141 / 156
Advanced Solvers
Nonlinear Solvers
Visualizing Solvers
This shows how to visualize a nested solver configuration:
./ex19 -da_refine 1 -pc_type fieldsplit -fieldsplit_x_velocity_pc_type mg
-fieldsplit_x_velocity_mg_coarse_pc_type svd
-snes_view draw -draw_pause -2 -geometry 0,0,600,600
M. Knepley (UC)
PETSc
CNRS 12
142 / 156
Advanced Solvers
Timestepping
Outline
Advanced Solvers
Fieldsplit
Multigrid
Nonlinear Solvers
Timestepping
M. Knepley (UC)
PETSc
CNRS 12
143 / 156
Advanced Solvers
Timestepping
What about TS?
Didn't Time Integration Suck in PETSc?
Yes, it did ...
until Jed, Emil, and Peter rewrote it
M. Knepley (UC)
PETSc
CNRS 12
144 / 156
Advanced Solvers
Timestepping
Some TS methods
TSSSPRK104 10-stage, fourth order, low-storage, optimal explicit
SSP Runge-Kutta, c_eff = 0.6 (Ketcheson 2008)
TSARKIMEX2E second order, one explicit and two implicit stages,
L-stable, optimal (Constantinescu)
TSARKIMEX3 (and 4 and 5), L-stable (Kennedy and Carpenter, 2003)
TSROSWRA3PW three stage, third order, for index-1 PDAE, A-stable,
R(∞) = 0.73, second order strongly A-stable embedded
method (Rang and Angermann, 2005)
TSROSWRA34PW2 four stage, third order, L-stable, for index 1
PDAE, second order strongly A-stable embedded method
(Rang and Angermann, 2005)
TSROSWLLSSP3P4S2C four stage, third order, L-stable implicit, SSP
explicit, L-stable embedded method (Constantinescu)
M. Knepley (UC)
PETSc
CNRS 12
145 / 156
Advanced Solvers
Timestepping
IMEX time integration in PETSc
Additive Runge-Kutta IMEX methods
G(t, x, ẋ) = F(t, x)
Jacobian:  J = α G_ẋ + G_x
User provides:
FormRHSFunction(ts,t,x,F,void *ctx)
FormIFunction(ts,t,x,xdot,G,void *ctx)
FormIJacobian(ts,t,x,xdot,alpha,J,J_p,mstr,void *ctx)
Single step interface so user can have own time loop
Choice of explicit method, e.g. SSP
L-stable DIRK for stiff part G
Orders 2 through 5, embedded error estimates
Dense output, hot starts for Newton
More accurate methods if G is linear, also Rosenbrock-W
Can use preconditioner from classical semi-implicit methods
Extensible adaptive controllers, can change order within a family
Easy to register new methods: TSARKIMEXRegister()
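A sketch of the corresponding TS setup (names mirror the callbacks listed above; dm, J, user, and the solution vector X are assumed to exist):
TS ts;
TSCreate(PETSC_COMM_WORLD, &ts);
TSSetDM(ts, dm);                                     /* e.g. a DMDA */
TSSetType(ts, TSARKIMEX);
TSSetRHSFunction(ts, NULL, FormRHSFunction, &user);  /* explicit part F */
TSSetIFunction(ts, NULL, FormIFunction, &user);      /* implicit part G */
TSSetIJacobian(ts, J, J, FormIJacobian, &user);
TSSetFromOptions(ts);
TSSolve(ts, X);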
M. Knepley (UC)
PETSc
CNRS 12
146 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Equations
TS ex22.c
u_t + a_1 u_x = -k_1 u + k_2 v + s_1
v_t + a_2 v_x =  k_1 u - k_2 v + s_2
Upstream boundary condition:
u(0, t) = 1 - sin(12t)^4
M. Knepley (UC)
PETSc
CNRS 12
147 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Equations
TS ex22.c
u_t + a_1 u_x = -k_1 u + k_2 v + s_1
v_t + a_2 v_x =  k_1 u - k_2 v + s_2
FormIFunction(TS ts, PetscReal t, Vec X, Vec Xdot, Vec F, void *ptr) {
TSGetDM(ts, &da);
DMDAGetLocalInfo(da, &info);
DMDAVecGetArray(da, X, &x);
DMDAVecGetArray(da, Xdot, &xdot);
DMDAVecGetArray(da, F, &f);
/* Compute function over the locally owned part of the grid */
for (i = info.xs; i < info.xs+info.xm; ++i) {
f[i][0] = xdot[i][0] + k[0]*x[i][0] - k[1]*x[i][1] - s[0];
f[i][1] = xdot[i][1] - k[0]*x[i][0] + k[1]*x[i][1] - s[1];
}
DMDAVecRestoreArray(da, X, &x);
DMDAVecRestoreArray(da, Xdot, &xdot);
DMDAVecRestoreArray(da, F, &f);
}
M. Knepley (UC)
PETSc
CNRS 12
147 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Equations
TS ex22.c
u_t + a_1 u_x = -k_1 u + k_2 v + s_1
v_t + a_2 v_x =  k_1 u - k_2 v + s_2
FormIJacobian(TS ts, PetscReal t, Vec X, Vec Xdot, PetscReal a, Mat *J,
Mat *Jpre, MatStructure *str, void *ptr) {
for (i = info.xs; i < info.xs+info.xm; ++i) {
PetscScalar v[2][2];
v[0][0] = a + k[0]; v[0][1] = -k[1];
v[1][0] = -k[0];    v[1][1] = a + k[1];
MatSetValuesBlocked(*Jpre, 1, &i, 1, &i, &v[0][0], INSERT_VALUES);
}
MatAssemblyBegin(*Jpre, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(*Jpre, MAT_FINAL_ASSEMBLY);
if (*J != *Jpre) {
MatAssemblyBegin(*J, MAT_FINAL_ASSEMBLY);
MatAssemblyEnd(*J, MAT_FINAL_ASSEMBLY);
}
}
M. Knepley (UC)
PETSc
CNRS 12
147 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Equations
TS ex22.c
u_t + a_1 u_x = -k_1 u + k_2 v + s_1
v_t + a_2 v_x =  k_1 u - k_2 v + s_2
FormRHSFunction(TS ts, PetscReal t, Vec X, Vec F, void *ptr) {
PetscReal u0t[2] = {1. - PetscPowScalar(sin(12*t),4.),0};
DMGetLocalVector(da, &Xloc);
DMGlobalToLocalBegin(da, X, INSERT_VALUES, Xloc);
DMGlobalToLocalEnd(da, X, INSERT_VALUES, Xloc);
for (i = info.xs; i < info.xs+info.xm; ++i) {
/* CALCULATE RESIDUAL f[i][j] */
}
}
M. Knepley (UC)
PETSc
CNRS 12
147 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Equations
TS ex22.c
u_t + a_1 u_x = -k_1 u + k_2 v + s_1
v_t + a_2 v_x =  k_1 u - k_2 v + s_2
for (i = info.xs; i < info.xs+info.xm; ++i) {
  for (j = 0; j < 2; ++j) {
    const PetscReal aj = a[j]/hx;   /* advection speed over cell width */
    if (i == 0)
      f[i][j] = aj*(1./3*u0t[j] + 1./2*x[i][j] - x[i+1][j] + 1./6*x[i+2][j]);
    else if (i == 1)
      f[i][j] = aj*(-1./12*u0t[j] + 2./3*x[i-1][j] - 2./3*x[i+1][j] + 1./12*x[i+2][j]);
    else if (i == info.mx-2)
      f[i][j] = aj*(-1./6*x[i-2][j] + x[i-1][j] - 1./2*x[i][j] - 1./3*x[i+1][j]);
    else if (i == info.mx-1)
      f[i][j] = aj*(-x[i][j] + x[i-1][j]);
    else
      f[i][j] = aj*(-1./12*x[i-2][j] + 2./3*x[i-1][j] - 2./3*x[i+1][j] + 1./12*x[i+2][j]);
  }
}
M. Knepley (UC)
PETSc
CNRS 12
147 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Parameters
TS ex22.c
a_1 = 1,   a_2 = 0,   k_1 = 10^6,   k_2 = 2 k_1,   s_1 = 0,   s_2 = 1
M. Knepley (UC)
PETSc
CNRS 12
148 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Initial conditions
TS ex22.c
u(x, 0) = 1 + s_2 x
v(x, 0) = (k_1/k_2) u(x, 0) + s_2/k_2
PetscErrorCode FormInitialSolution(TS ts, Vec X, void *ctx) {
TSGetDM(ts, &da);
DMDAGetLocalInfo(da, &info);
DMDAVecGetArray(da, X, &x);
/* Compute function over the locally owned part of the grid */
for (i = info.xs; i < info.xs+info.xm; ++i) {
PetscReal r = (i+1)*hx;
PetscReal ik = user->k[1] != 0.0 ? 1.0/user->k[1] : 1.0;
x[i][0] = 1 + user->s[1]*r;
x[i][1] = user->k[0]*ik*x[i][0] + user->s[1]*ik;
}
DMDAVecRestoreArray(da, X, &x);
}
M. Knepley (UC)
PETSc
CNRS 12
149 / 156
Advanced Solvers
Timestepping
Stiff linear advection-reaction test problem
Examples
TS ex22.c
./ex22 -da_grid_x 200 -ts_monitor_draw_solution -ts_arkimex_type 4
-ts_adapt_type none
./ex22 -da_grid_x 200 -ts_monitor_draw_solution -ts_type rosw
-ts_dt 1e-3 -ts_adapt_type none
./ex22 -da_grid_x 200 -ts_monitor_draw_solution -ts_type rosw
-ts_rosw_type sandu3 -ts_dt 5e-3 -ts_adapt_type none
./ex22 -da_grid_x 200 -ts_monitor_draw_solution -ts_type rosw
-ts_rosw_type ra34pw2 -ts_adapt_monitor
M. Knepley (UC)
PETSc
CNRS 12
150 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Equations
TS ex25.c
u_t - α u_xx = A - (B + 1) u + u² v
v_t - α v_xx = B u - u² v
Boundary conditions:
u(0, t) = u(1, t) = 1
v(0, t) = v(1, t) = 3
M. Knepley (UC)
PETSc
CNRS 12
151 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Equations
TS ex25.c
u_t - α u_xx = A - (B + 1) u + u² v
v_t - α v_xx = B u - u² v
FormIFunction(TS ts, PetscReal t, Vec X, Vec Xdot, Vec F, void *ptr) {
DMGlobalToLocalBegin(da, X, INSERT_VALUES, Xloc);
DMGlobalToLocalEnd(da, X, INSERT_VALUES, Xloc);
for (i = info.xs; i < info.xs+info.xm; ++i) {
if (i == 0) {
f[i].u = hx * (x[i].u - uleft);
f[i].v = hx * (x[i].v - vleft);
} else if (i == info.mx-1) {
f[i].u = hx * (x[i].u - uright);
f[i].v = hx * (x[i].v - vright);
} else {
f[i].u = hx * xdot[i].u - alpha * (x[i-1].u - 2.*x[i].u + x[i+1].u) / hx;
f[i].v = hx * xdot[i].v - alpha * (x[i-1].v - 2.*x[i].v + x[i+1].v) / hx;
}
}
}
M. Knepley (UC)
PETSc
CNRS 12
151 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Equations
TS ex25.c
u_t - α u_xx = A - (B + 1) u + u² v
v_t - α v_xx = B u - u² v
FormIJacobian(TS ts, PetscReal t, Vec X, Vec Xdot, PetscReal a, Mat *J,
Mat *Jpre, MatStructure *str, void *ptr) {
for (i = info.xs; i < info.xs+info.xm; ++i) {
if (i == 0 || i == info.mx-1) {
const PetscInt    row = i, col = i;
const PetscScalar vals[2][2] = {{hx,0},{0,hx}};
MatSetValuesBlocked(*Jpre,1,&row,1,&col,&vals[0][0],INSERT_VALUES);
} else {
const PetscInt    row = i, col[] = {i-1, i, i+1};
const PetscScalar dL = -alpha/hx,dC = 2*alpha/hx,dR = -alpha/hx;
const PetscScalar v[2][3][2] = {{{dL,0},{a*hx+dC,0},{dR,0}},
{{0,dL},{0,a*hx+dC},{0,dR}}};
MatSetValuesBlocked(*Jpre,1,&row,3,col,&v[0][0][0],INSERT_VALUES);
}
}
}
M. Knepley (UC)
PETSc
CNRS 12
151 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Equations
TS ex25.c
u_t - α u_xx = A - (B + 1) u + u² v
v_t - α v_xx = B u - u² v
FormRHSFunction(TS ts, PetscReal t, Vec X, Vec F, void *ptr) {
TSGetDM(ts, &da);
DMDAGetLocalInfo(da, &info);
DMDAVecGetArray(da, X, &x);
DMDAVecGetArray(da, F, &f);
for (i = info.xs; i < info.xs+info.xm; ++i) {
PetscScalar u = x[i].u, v = x[i].v;
f[i].u = hx*(A - (B+1)*u + u*u*v);
f[i].v = hx*(B*u - u*u*v);
}
DMDAVecRestoreArray(da, X, &x);
DMDAVecRestoreArray(da, F, &f);
}
M. Knepley (UC)
PETSc
CNRS 12
151 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Parameters
TS ex25.c
A = 1,   B = 3,   α = 1/50
M. Knepley (UC)
PETSc
CNRS 12
152 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Initial conditions
TS ex25.c
u(x, 0) = 1 + sin(2πx)
v(x, 0) = 3
PetscErrorCode FormInitialSolution(TS ts, Vec X, void *ctx) {
TSGetDM(ts, &da);
DMDAGetLocalInfo(da, &info);
DMDAVecGetArray(da, X, &x);
/* Compute function over the locally owned part of the grid */
for (i = info.xs; i < info.xs+info.xm; ++i) {
PetscReal xi = i*hx;
x[i].u = uleft*(1-xi) + uright*xi + sin(2*PETSC_PI*xi);
x[i].v = vleft*(1-xi) + vright*xi;
}
DMDAVecRestoreArray(da, X, &x);
}
M. Knepley (UC)
PETSc
CNRS 12
153 / 156
Advanced Solvers
Timestepping
1D Brusselator reaction-diffusion
Examples
TS ex25.c
./ex25 -da_grid_x 20 -ts_monitor_draw_solution -ts_type rosw
-ts_dt 5e-2 -ts_adapt_type none
./ex25 -da_grid_x 20 -ts_monitor_draw_solution -ts_type rosw
-ts_rosw_type 2p -ts_dt 5e-2
./ex25 -da_grid_x 20 -ts_monitor_draw_solution -ts_type rosw
-ts_rosw_type 2p -ts_dt 5e-2 -ts_adapt_type none
M. Knepley (UC)
PETSc
CNRS 12
154 / 156
Advanced Solvers
Timestepping
Second Order TVD Finite Volume Method
Example
TS ex11.c
./ex11 -f $PETSC_DIR/share/petsc/datafiles/meshes/sevenside.exo
./ex11 -f
$PETSC_DIR/share/petsc/datafiles/meshes/sevenside-quad-15.exo
./ex11 -f $PETSC_DIR/share/petsc/datafiles/meshes/sevenside.exo
-ts_type rosw
M. Knepley (UC)
PETSc
CNRS 12
155 / 156
Conclusions
Conclusions
PETSc can help you:
easily construct a code to test your ideas
Lots of code construction, management, and debugging tools
scale an existing code to large or distributed machines
Using FormFunctionLocal() and scalable linear algebra
incorporate more scalable or higher performance algorithms
Such as domain decomposition, fieldsplit, and multigrid
tune your code to new architectures
Using profiling tools and specialized implementations
M. Knepley (UC)
PETSc
CNRS 12
156 / 156