Independent University, Bangladesh
Department of Computer Science and Engineering
Course Title: Introduction to High Performance Computing
Course Code: Autumn-2024-CSC471
SECTION 1: (T) 06:30 PM --- 9:30 PM
Presented by
Dr. Rubaiyat Islam
Adjunct Faculty, IUB.
Omdena Bangladesh Chapter Lead
Crypto-economist Consultant
Sifchain Finance, USA.
HPC ARCHITECTURE
• Purpose: To ensure a basic understanding of computer
hardware and infrastructure
• Brief discussion of a typical supercomputer
• Components discussed: processors, memory, network,
storage
• Focus: processors and memory
ARCHITECTURE OF A SUPERCOMPUTER
• Highest level to lowest level architecture:
• System
• Nodes
• Cores/memory/storage
• Network interconnects the components of the system as a
whole
RACKS
• Supercomputers can encompass entire
rooms
• Components of system mounted in racks
• Nice cabinets with rails
• Can purchase standard racks or customize
• RMACC Summit – 10 racks
• Nine compute and one storage
• Racks are black metal cabinets that
enclose infrastructure
RACKS (2)
• Behind the rack is a cooling unit
• Moves hot air through heat exchangers
• Keeps optimal temperature
• Infrastructure is compute, networking,
or storage
• Storage – disks
• Network – high speed switches and cable
• Compute is of interest here
• 8 chassis of 4 nodes each (32 total)
COMPUTE NODE – CPU SOCKETS
• CPUs sit within sockets on the node, under large silver heat sinks (2 sockets, 2 CPUs)
• Within the CPUs are the cores (the main processing power) and cache memory
• This node has 12 cores per socketed CPU, 24 total
• Processing power scales with the number of cores
COMPUTE NODE – MEMORY
• Memory (RAM) is on eight thin green cards
• Shared memory on node
• Eight 16 GB memory cards per node (128 GB total)
• Also memory in the socketed CPUs (cache, shared between cores on one socket)
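The per-node and per-rack figures quoted on the last few slides can be tallied with a little shell arithmetic. This is only a sketch: the numbers come from the slides, and it assumes every node in a compute rack matches the node shown.

```shell
# Figures quoted in the slides.
sockets_per_node=2
cores_per_socket=12
cards_per_node=8
gb_per_card=16
chassis_per_rack=8
nodes_per_chassis=4

# Derived totals (assumes all nodes in the rack are identical).
cores_per_node=$((sockets_per_node * cores_per_socket))
mem_per_node_gb=$((cards_per_node * gb_per_card))
nodes_per_rack=$((chassis_per_rack * nodes_per_chassis))
cores_per_rack=$((nodes_per_rack * cores_per_node))

echo "Cores per node:  $cores_per_node"
echo "Memory per node: $mem_per_node_gb GB"
echo "Nodes per rack:  $nodes_per_rack"
echo "Cores per rack:  $cores_per_rack"
```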
INTERCONNECT
• Supercomputers work together as one big unit to solve larger
problems
• Provide large processing power
• In theory use entire system! Much bigger than laptop!
• To work together must have an interconnect
• Access to memory and computing power
• Nodes talk to each other
• OmniPath, Infiniband
• How would you use this system to solve a larger problem?
SOFTWARE INSTALLATION
• Managing loaded software can be a headache
• Make sure that correct versions are available
• Make sure that software dependencies for package A don’t
interfere with Package B
• If you simply install software in a directory, you can run into these
issues on a shared system
• Want to use a package manager
• However, HPC systems usually use a combination of both
MODULES
• Environment modules allow centers to provide multiple
versions of software and load dependencies seamlessly
• A module is a package that contains all of the files required
to run the software, including libraries
• Will load required dependencies
• Users can access software using a few simple commands
WORKING WITH MODULES
• See a list of available modules
module avail
• Load a module
• Adds software to your $PATH
• May also load dependencies
• May also unload other versions or dependencies that would conflict
• module load <name_of_module>/<version>
• Example: module load hdf5
WORKING WITH MODULES (2)
• See a list of available modules
module avail
• See a list of loaded modules
module list
• Unload a module
module unload <name_of_module>
• Unload all modules
module purge
• Discover information about module
• module spider <name_of_module>/<version>
• Example: module spider mpich
• Tells you about dependencies, the package, etc.
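The "adds software to your $PATH" step can be imitated by hand to see what loading a module changes. This is a rough illustration, not how the module command is implemented; the tool name mytool and the temporary directory are made up for the demo.

```shell
# Hypothetical stand-in for an installed package: a directory holding one executable.
pkg_dir="$(mktemp -d)"
printf '#!/bin/sh\necho "hello from mytool 1.0"\n' > "$pkg_dir/mytool"
chmod +x "$pkg_dir/mytool"

# Before "loading", the shell cannot find the command.
command -v mytool >/dev/null 2>&1 || echo "mytool: not found yet"

# "Loading": prepend the package directory to PATH, much as a module file would.
old_path="$PATH"
export PATH="$pkg_dir:$PATH"
load_output="$(mytool)"
echo "$load_output"

# "Unloading": restore the previous PATH and clean up.
export PATH="$old_path"
rm -rf "$pkg_dir"
```

On a real cluster the same effect comes from module load and module unload, which may also adjust other variables such as LD_LIBRARY_PATH.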
ALLOCATION DEFINITION
HOW ARE ALLOCATIONS USED?
INSTALLING YOUR OWN
SOFTWARE
• Sometimes the cluster you are working on does not have
the software you need
• General process:
• Download software
• Install software
• Read instructions
• Install dependencies
• Compile
• Use
INSTALLING YOUR OWN
SOFTWARE (2)
• What might this look like?
• Clone some files from Git
• Download a Docker or Singularity image
• Install from a file
• Olden days – install from a disk
• Just get the files on the compute system you’re installing on
• Install additional software
• Run make
• ./install_file
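For source packages, the "install dependencies, compile, use" steps often reduce to a few commands. The helper below is a sketch that assumes the package follows the common ./configure, make, make install convention; many packages do not, so read the package's own instructions first. The function name and the default prefix $HOME/software are hypothetical.

```shell
# Hypothetical helper: build an already-downloaded source tree into a
# user-owned prefix, so no administrator rights are needed.
install_from_source() {
    src_dir="$1"                      # directory containing the unpacked source
    prefix="${2:-$HOME/software}"     # where the built files should go (assumed default)
    (
        cd "$src_dir" || exit 1
        ./configure --prefix="$prefix" || exit 1   # check dependencies, set install location
        make || exit 1                             # compile
        make install                               # copy results into the prefix
    )
}

echo "Usage: install_from_source <source_dir> [install_prefix]"
```

After installing, add the prefix's bin directory to PATH (for example export PATH="$HOME/software/bin:$PATH") so the shell can find the new commands.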
NODE TYPES
• HPC infrastructure is composed of several components
• Different node types
• Specific nodes vary by center
• Three general types:
• Login nodes
• Compile nodes
• Compute nodes
LOGIN NODES
• Where you typically land when logging into the system
• Not a place for heavy computation
• Not a place for running memory intensive applications
• Great for:
• Script editing
• Job submission
COMPILE NODES
• Place to compile code
• Same software stack and compilers as compute nodes
• Code compiled on this node should run on the compute
nodes
• Only certain languages require compiling
• C, C++, Fortran
• Not for Python, R, MATLAB
• We do not have compile nodes for this course
COMPUTE NODES
• Where the submitted jobs run
• Accessible indirectly through job scheduler
• Heavy computational load
RUNNING JOBS
• What is a “job”?
• Batch jobs
• Submit job that will be executed in background
• Can create a text file containing information about the job
• Submit the job file to a queue
• Interactive jobs
• Work interactively at the command line of a compute node
• Log in to a compute node
JOB SCHEDULING
• On a supercomputer, jobs are scheduled rather than just run
instantly at the command line
• Shared system
• Jobs are put in a queue until resources are available
• Need software that will distribute the jobs appropriately and
manage the resources
• Simple Linux Utility for Resource Management (Slurm)
• Keeps track of what nodes are busy/available, and what jobs are
queued or running
• Tells the resource manager when to run which job on the
available resources
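A batch job file for Slurm is an ordinary shell script whose #SBATCH comment lines describe the resources requested. The sketch below is a minimal example; the job name, time limit, and output file name are illustrative choices, and a real cluster may also require a partition or account option.

```shell
#!/bin/bash
#SBATCH --job-name=hello_hpc          # name shown in the queue (hypothetical)
#SBATCH --nodes=1                     # number of nodes requested
#SBATCH --ntasks=1                    # number of tasks (processes)
#SBATCH --time=00:05:00               # wall-clock limit, hh:mm:ss
#SBATCH --output=hello_hpc.%j.out     # %j is replaced by the job ID

# Everything below runs on the compute node once the job starts.
job_message="Running on $(uname -n)"
echo "$job_message"
```

Saved as hello_hpc.sh, it would be submitted with sbatch hello_hpc.sh, and squeue -u $USER lists your queued and running jobs. Run directly with bash, the #SBATCH lines are treated as plain comments.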
LINUX 6 COMMANDS
REMOTE LOGIN
REMOTE SYSTEMS
• A remote system is one that you are accessing from
another computer
• Unless you have built a cluster at home, or work in an HPC
center, most HPC systems will require remote access
• Two ways one interacts with a remote system
• Logging in
• File transfer
LOGGING IN
• Generally, one uses the SSH protocol to log in to a remote
system
• Provides a secure channel over which one can remotely
connect
• Authenticate connection through keys, public and private
• Example:
ssh <username>@<hostname>
• Flags may follow the ssh command
FILE TRANSFER
• Several recommended ways
• Depends on your needs and size of data
• scp, sftp, wget, rsync, Globus file transfer
• scp and sftp are good because they are secure
• Example (several ways to do this):
scp /home/username/file.txt <username>@<hostname>:/home/username
scp <username>@<hostname>:/home/username/file.txt .
TYPICAL TYPES OF FILE
• The three types of storage spaces users are typically
allocated on HPC infrastructure:
• Home
• Projects or Work
• Scratch
• Each space is important for different reasons, and
understanding the difference between each is imperative
HOME
• /home is intended for the use of the owner of this space
only
• It is found at /home/$USER or ~
• Usually this space is backed up
• Also generally allocated a small amount of space – on the
order of 5 GB, varies
• Usually where you land when you log in
• Test: log in, type pwd
PROJECTS OR WORK
• Generally a space for mid-level size data
• Might have approximately 250-500 GB of space available
• Sometimes backed up
• For us: /projects/$USER
• Type: cd /projects/$USER
SCRATCH
• Scratch space is provided on most HPC systems
• Usually a much larger quota available
• Temporary space
• Usually not backed up
• Type: cd /scratch/$USER
WHAT BELONGS WHERE?
• /home
• Scripts
• Code
• Very small files
• Inappropriate for sharing files with others
• Inappropriate for job output
• /projects
• Code/files/libraries relevant for any software you are installing (if you want to share files with others)
• Mid-level size input files
• Appropriate for sharing files with others
• Inappropriate for job output
• /scratch
• Output from running jobs
• Large files
• Appropriate for sharing files with others
• THIS IS NOT APPROPRIATE FOR LONG TERM STORAGE
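Before deciding where files go, it helps to see which of the three spaces exist and how full they are. The loop below is a sketch that assumes the /projects/$USER and /scratch/$USER layout used in these slides; other centers use different paths, and many provide their own quota command instead.

```shell
# Fall back to id -un if USER is not set in this shell.
user_name="${USER:-$(id -un)}"
checked=0

for space in "${HOME:-}" "/projects/$user_name" "/scratch/$user_name"; do
    if [ -d "$space" ]; then
        echo "== $space =="
        df -k "$space"        # size and usage of the filesystem holding this space
    else
        echo "== $space == (not present on this system)"
    fi
    checked=$((checked + 1))
done
```

Note that df reports on the whole filesystem, not on your personal quota.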
OUTLINE
• Part 1: Intro to Linux
• Linux Overview
• Shells and environments
• Commands
• Files, Directories, Filesystems
• Part 2: Job Submission
• General Info
• Simple batch jobs
• Running programs, MPI
• Interactive jobs
PART 1: LINUX
LINUX OVERVIEW
• Part of the Unix-like family of operating systems.
• Started in the early 1990s by Linus Torvalds.
• Strictly, "Linux" refers only to the kernel; software from the GNU
project and elsewhere is layered on top to form a complete OS.
Most of it is open source.
• Several distributions are available; from enterprise-grade, like
RHEL or SUSE, to more consumer-focused, like Ubuntu.
• Runs on everything from embedded systems to
supercomputers.
WHY USE LINUX
• Default operating system on virtually all HPC systems
• Extremely flexible and not overbearing
• Fast and powerful
• Many potent tools for software development
• You can get started with a few basic commands and build from
there
SECURE SHELL (SSH)
• To connect to a remote system, use Secure Shell (SSH)
• From Windows
• Non-GUI SSH application: Windows PowerShell
• GUI SSH application: PuTTY
• PuTTY is the preferred method.
• Hostname: login.rc.colorado.edu
or…
• Hostname: tlogin1.rc.colorado.edu
• From a Linux or Mac OS X terminal: use ssh on the command line
RC ACCESS: LOGGING IN
• If you have an RMACC RC account already, log in as follows from a terminal:
$ ssh <username>@login.rc.colorado.edu
# Where username is your identikey
• If you do not have an RMACC RC account use one of our temporary accounts:
$ ssh user<XXXX>@tlogin1.rc.colorado.edu
# Where user<XXXX> is your temporary username
USEFUL SSH OPTIONS
• -X or -Y
• Allows X-windows to be forwarded back to your local display
• -o TCPKeepAlive=yes
• Sends occasional communication to the SSH server even when you’re
not typing, so firewalls along the network path won’t drop your “idle”
connection
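The options above can be combined on one command line. The sketch below only prints the command rather than connecting. The username is a placeholder, and ServerAliveInterval is an addition not on the slide: it is a standard OpenSSH client option that sends a keepalive through the encrypted channel after the given number of seconds without data from the server.

```shell
# Placeholder account name; the hostname is the one given in these slides.
remote_user="your_username"
remote_host="login.rc.colorado.edu"

# -X forwards X11 windows; TCPKeepAlive is the option from the slide;
# ServerAliveInterval=60 (an addition here) sends a keepalive every 60 s of silence.
ssh_cmd="ssh -X -o TCPKeepAlive=yes -o ServerAliveInterval=60 ${remote_user}@${remote_host}"

# Dry run: show the command instead of opening a connection.
echo "$ssh_cmd"
```

To connect for real, run the printed command with your own username.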
The Shell
• Parses and interprets typed input
• Passes results to the OS and returns results as appropriate.
• Shells
• Bourne-Again (bash) – Widely used, user-friendly shell. Default on Summit.
• T (tcsh) – C shell with extended features and C-like syntax. Also very common.
• Features
• Tab completion
• History and command-line editing
• Scripting and programming
• Built-in utilities
Shells
• User space: User, Shell, Commands, Applications
• Kernel space: Linux Kernel
• Hardware
Command Anatomy
tar -c -f archive.tar mydir
• Command: tar
• Flags: -c, -f
• Parameter: archive.tar
• Target: mydir
• Case-sensitive
• Order of flags may be important
• Flags may not mean the same thing when used with different commands
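The tar command from the slide can be tried safely in a throwaway directory. This sketch creates a small mydir, archives it, and lists the archive; the file name data.txt is made up for the demo.

```shell
start_dir="$PWD"
work_dir="$(mktemp -d)"
cd "$work_dir" || exit 1

# Something to archive.
mkdir mydir
echo "sample data" > mydir/data.txt

# command: tar | flags: -c (create), -f (use archive file)
# parameter to -f: archive.tar | target: mydir
tar -c -f archive.tar mydir

# Same command, different flag: -t lists the contents of an archive.
archive_listing="$(tar -t -f archive.tar)"
echo "$archive_listing"

# Clean up.
cd "$start_dir" || exit 1
rm -rf "$work_dir"
```

Swapping -c for -t changes what the same command does, which is why the flags matter as much as the command name.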
The most important Linux command:
man
$ man <command>
$ man -k <keyword>
Note: You can google commands too!
https://man7.org/linux/man-pages/man1/man.1.html
Filesystem Commands
Command      Description
pwd          prints full path to current directory
cd           changes directory; can use full or relative path as target
mkdir        creates a subdirectory in the current directory
rmdir        removes an empty directory
rm           removes a file (rm -r removes a directory and all its contents)
cp           copies a file
mv           moves (or renames) a file or directory
ls           lists the contents of a directory (ls -l gives detailed listing)
chmod/chown  change permissions or ownership
df           displays filesystems and their sizes
du           shows disk usage (du -sk shows size of a directory and its contents in KB)
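A short walk-through ties several of these commands together. It is a sketch that runs in a temporary directory so nothing of yours is touched; the directory and file names are arbitrary.

```shell
start_dir="$PWD"
sandbox="$(mktemp -d)"
cd "$sandbox" || exit 1

mkdir project                        # create a subdirectory
cd project || exit 1                 # move into it (relative path)
pwd                                  # show where we are
echo "draft" > notes.txt             # make a small file
cp notes.txt notes_backup.txt        # copy it
mv notes_backup.txt old_notes.txt    # rename the copy
ls -l                                # detailed listing
file_count="$(ls | wc -l | tr -d ' ')"

rm old_notes.txt notes.txt           # remove both files
cd ..                                # back up one level
rmdir project                        # succeeds only because project is now empty
remaining="$(ls -A | wc -l | tr -d ' ')"

cd "$start_dir" || exit 1
rmdir "$sandbox"
```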
File Editing Commands
Command  Description
less     displays a file one screen at a time
cat      prints entire file to the screen
head     prints the first few lines of a file
tail     prints the last few lines of a file (with -f shows in real time the end of a file that may be changing)
diff     shows differences between two files
grep     prints lines containing a string or other regular expression
tee      prints the output of a command and copies the output to a file
sort     sorts lines in a file
find     searches for files that meet specified criteria
wc       counts words, lines, or characters in a file
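Several of these commands are easiest to understand on a small sample file. The sketch below builds a four-line file of made-up node states and inspects it.

```shell
data_file="$(mktemp)"
printf '%s\n' "node03 busy" "node01 idle" "node02 busy" "node04 idle" > "$data_file"

head -n 2 "$data_file"                            # first two lines
tail -n 1 "$data_file"                            # last line

busy_lines="$(grep -c "busy" "$data_file")"       # count lines containing "busy"
line_count="$(wc -l < "$data_file" | tr -d ' ')"  # total number of lines
first_sorted="$(sort "$data_file" | head -n 1)"   # first line after sorting

# tee shows the sorted output on screen and saves a copy to a file.
sort "$data_file" | tee "$data_file.sorted"
sorted_copy_lines="$(wc -l < "$data_file.sorted" | tr -d ' ')"

echo "busy=$busy_lines lines=$line_count first=$first_sorted"
rm -f "$data_file" "$data_file.sorted"
```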
Environments
• Set up using shell and environment variables
• shell: only effective in the current shell itself
• environment: carry forward to subsequent commands or shells
• Set default values at login time using .bash_profile
(or .profile). Non-login interactive shells will read
.bashrc instead.
• var_name[=value] (shell)
• export VAR_NAME[=value] (environment)
• env (shows current variables)
• $VAR_NAME (refers to value of variable)
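The difference between a shell variable and an environment variable shows up as soon as a child process is started. In this sketch both variable names are invented for the demo.

```shell
scratch_hint="shell only"         # shell variable: visible in this shell only
export PROJECT_NAME="hpc_demo"    # environment variable: passed on to child processes

# Start a child shell and ask it what it can see.
child_sees_shell_var="$(sh -c 'echo "${scratch_hint:-unset}"')"
child_sees_env_var="$(sh -c 'echo "${PROJECT_NAME:-unset}"')"

echo "In this shell: scratch_hint=$scratch_hint PROJECT_NAME=$PROJECT_NAME"
echo "In a child:    scratch_hint=$child_sees_shell_var PROJECT_NAME=$child_sees_env_var"
```

The child reports scratch_hint as unset because it was never exported.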
Important variables
• PATH: directories to search for commands
• HOME: home directory
• DISPLAY: screen where graphical output will appear
• MANPATH: directories to search for manual pages
• LANG: current locale (language and character encoding)
• PWD: current working directory
• USER: username
• LD_LIBRARY_PATH: directories to search for shared objects
(dynamically-loaded libs)
• LM_LICENSE_FILE: files to search for FlexLM software licenses
The Linux Filesystem
• System of arranging files on disk
• Consists of directories (folders) that can contain files or other
directories
• Levels in full paths separated by forward slashes, e.g.
/home/nunez/scripts/analyze_data.sh
• Case-sensitive; spaces in names discouraged
• Some shorthand:
Symbol  Description
.       Current directory
..      The directory one level above
~       The home directory
-       Previous directory when used with cd
Filesystem (multiple users)
/
    bin
    usr
        local
            bin          (/usr/local/bin)
    home
        <username>
            documents
                hpc
                    notes.txt
• Absolute path: /home/<username>/documents/hpc/notes.txt
• Relative path (e.g. from /home/<username>): ../../usr/local
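Absolute and relative paths, and the shorthand symbols, can be tried on a miniature copy of the tree. In this sketch a temporary directory stands in for / and the user name student is made up.

```shell
start_dir="$PWD"
root_dir="$(mktemp -d)"     # stands in for / in the diagram
mkdir -p "$root_dir/usr/local/bin" "$root_dir/home/student/documents/hpc"

cd "$root_dir/home/student" || exit 1     # absolute path
cd ../../usr/local || exit 1              # relative path: up two levels, then down
relative_result="${PWD#"$root_dir"}"      # location with the stand-in root stripped off

cd - >/dev/null || exit 1                 # "-" returns to the previous directory
previous_result="${PWD#"$root_dir"}"

cd ./documents/hpc || exit 1              # "." is the current directory
cd .. || exit 1                           # ".." is one level up
up_result="${PWD#"$root_dir"}"

echo "$relative_result $previous_result $up_result"
cd "$start_dir" || exit 1
rm -rf "$root_dir"
```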
Navigating the Filesystem
• Examples:
• ls
• mkdir
• cd
• rm
• Permissions (modes)
File Editing
• nano – simple and intuitive to get started with; limited features;
keyboard driven
• vi/vim – universal; keyboard driven; powerful but some learning
curve required
• emacs – keyboard or GUI versions; helpful extensions for
programmers; well-documented
• LibreOffice – for WYSIWYG
• Use a local editor via an SFTP program to remotely edit files.
Modes/Permissions
• 3 classes of users:
• User (u) aka “owner”
• Group (g)
• Other (o)
• 3 types of permissions:
• Read (r)
• Write (w)
• Execute (x)
Modes
• chmod changes modes:
To add write and execute permission for your group:
chmod g+wx filename
To remove execute permission for others:
chmod o-x filename
To set only read and execute for your group and others:
chmod go=rx filename
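The effect of each chmod command is visible in the first column of ls -l. This sketch applies the three commands above to a temporary file; the starting mode is chosen only so that every step produces a visible change.

```shell
demo_file="$(mktemp)"

# Helper (hypothetical name): print just the 10-character mode string from ls -l.
show_mode() { ls -l "$1" | cut -c1-10; }

chmod u=rwx,g=,o=x "$demo_file"     # starting point: rwx for user, nothing for group, x for others
chmod g+wx "$demo_file"             # add write and execute for the group
mode_after_add="$(show_mode "$demo_file")"

chmod o-x "$demo_file"              # remove execute for others
mode_after_remove="$(show_mode "$demo_file")"

chmod go=rx "$demo_file"            # set group and others to exactly read and execute
mode_final="$(show_mode "$demo_file")"

echo "$mode_after_add -> $mode_after_remove -> $mode_final"
rm -f "$demo_file"
```

Note that = replaces the existing permissions for the named classes, while + and - only add or remove the listed ones.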
THANK YOU