Overview of MOSIX2
Prof. Amnon Barak, Department of Computer Science, The Hebrew University
http://www.MOSIX.org
July 2009
Copyright Amnon Barak 2009
Background
Clusters, multi-clusters (intra-organizational Grids) and Clouds are popular platforms for HPC. Typically, users need to run multiple jobs with minimal concern for how the resources are managed.
They prefer not to:
Modify applications
Copy files or log in to different nodes
Lose jobs when some nodes are disconnected
Users don't know (and don't care) about:
The configuration, status and locations of the nodes
The availability of resources, e.g. CPU speed, load, free memory, etc.
Traditional management packages
Most cluster management packages are batch dispatchers that place the burden of management on the users. For example, these packages:
Use static assignment of jobs to nodes
May lose jobs when nodes are disconnected
Are not transparent to applications
May require linking applications with special libraries
View the cluster as a set of independent nodes
Allow one user per node; the cluster must be partitioned for multiple users
Traditional management packages
[Diagram: a dispatcher performs one-way assignment of jobs (no feedback) from applications to independent Linux workstations and servers - dual 4-core, 2-core and 4-core nodes - one of which has failed]
What is MOSIX (Multi-computer OS)
An operating system-like management system for distributed-memory architectures, such as clusters and multi-clusters, including remote clusters on Clouds
Main feature: Single-System Image (SSI) - users can log in on any node and need not know where their programs run
Automatic resource discovery
Continuous monitoring of the state of the resources
Dynamic workload distribution by process migration
Automatic load-balancing
Automatic migration from slower to faster nodes and from nodes that run out of free memory
MOSIX is a unifying management layer
[Diagram: applications run on top of the transparent MOSIX management layer, which provides a SSI and continuous feedback about the state of the resources; all the active nodes run like one server with many CPUs]
MOSIX Version 1
Can manage a single cluster
Main features:
Provides a SSI by process migration
Supports scalable file systems
9 major releases, developed for Unix, BSD, BSDI, Linux-2.2 and Linux-2.4
Production installations since 1989
Based on Linux since 1998
MOSIX Version 2 (MOSIX2)
Can manage clusters and multi-clusters, with some tools for running applications on Clouds
Developed for Linux-2.6
Geared for High Performance Computing (HPC), especially for applications with moderate amounts of I/O
Main features:
Provides a SSI by process migration
Process migration within a cluster and among different clusters
Secure run-time environment (sandbox) for guest processes
Live queuing - queued jobs preserve their full generic Linux environment
Supports batch jobs, checkpoint and recovery
Running applications in a MOSIX cluster
MOSIX recognizes 2 types of processes:
Linux processes - not affected by MOSIX
Usually administrative tasks that are not suitable for migration, or processes that use features not supported by MOSIX, e.g. threads
MOSIX processes - usually applications that can benefit from migration
All such processes are created by the ``mosrun'' command
They are started from standard Linux executables, but run in an environment that allows each process to migrate from one node to another
Each MOSIX process has a unique home-node, which is usually the node in which the process was created
Linux processes created by the ``mosrun -E'' command can still benefit from MOSIX, e.g. be assigned to the least-loaded nodes
Examples: running interactive jobs
Possible ways to run myprog (a small script sketch follows the list):
> myprog - run as a Linux process on the local node
> mosrun myprog - run as a MOSIX process in the local cluster
> mosrun -b myprog - assign the process to the least-loaded node
> mosrun -b -m700 myprog - assign the process only to nodes with at least 700MB of free memory
> mosrun -E -b -m700 myprog - run as a native Linux job
> mosrun -M -b -m700 myprog - run a MOSIX job whose home-node can be any node in the local cluster
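A minimal shell sketch of launching several copies this way (the number of copies, the program name and the "input.$i" arguments are assumptions for illustration; the mosrun flags are those shown above):

#!/bin/bash
# Launch several copies of myprog as MOSIX processes: -b lets MOSIX pick
# the least-loaded node, -m700 restricts the choice to nodes with at least
# 700MB of free memory.
for i in $(seq 1 8); do
    mosrun -b -m700 ./myprog "input.$i" &   # "input.$i" is a hypothetical argument
done
wait   # wait for all the background copies to finish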
Running batch jobs
To run 2000 instances of myprog on a multi-cluster
> mosrun -G -b -m700 -q -S64 myfile
-G - assign the job to a node in another cluster
-S64 - run up to 64 jobs at a time from the queue
myfile - a file with a list of 2000 jobs (a sketch of preparing and submitting it follows)
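A minimal sketch of preparing and submitting such a batch run, assuming (as the slide implies) that myfile simply lists one command line per job; the program name and its inputs are hypothetical:

#!/bin/bash
# Build a job file with 2000 command lines, one per job instance.
rm -f myfile
for i in $(seq 1 2000); do
    echo "./myprog input.$i" >> myfile
done
# Queue the whole set on the multi-cluster: -G allows nodes in other clusters,
# -b picks the least-loaded eligible node, -m700 requires 700MB of free memory,
# -q queues the jobs and -S64 runs at most 64 of them at a time.
mosrun -G -b -m700 -q -S64 myfile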
How does it work
Automatic resource discovery by a gossip algorithm
Provides each node with the latest info about the cluster/multi-cluster resources (e.g. free nodes)
All the nodes disseminate information about relevant resources: speed, load, memory, local/remote I/O, IPC
Info is exchanged in a random fashion - to support scalable configurations and overcome failures
Useful for high-volume transaction processing
Example: a compilation farm - assign the next compilation to the least-loaded node (a toy sketch of the gossip idea follows)
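A toy shell sketch of the gossip idea - not MOSIX code: each node periodically sends its own load and free-memory figures to one randomly chosen peer, so fresh information spreads without a central server (the peer host names, the port and the use of nc are assumptions for illustration):

#!/bin/bash
# Toy randomized dissemination loop.
PEERS=(node01 node02 node03 node04)                      # assumed peer host names
while true; do
    load=$(cut -d' ' -f1 /proc/loadavg)                  # 1-minute load average
    freemem=$(awk '/MemFree/ {print $2}' /proc/meminfo)  # free memory in KB
    peer=${PEERS[RANDOM % ${#PEERS[@]}]}                 # choose one peer at random
    echo "$(hostname) load=$load freemem_kb=$freemem" | nc -w 1 "$peer" 4567
    sleep 1
done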
Dynamic workload distribution
A set of algorithms that match required and available resources
Geared to maximize performance
Initial allocation of processes to the best available nodes in the user's private cluster - not to nodes outside the private cluster
Automatic load-balancing
Automatic migration from slower to faster nodes
Authorized processes move to idle nodes in other clusters
Multi-cluster-wide process migration
Outcome: users need not know the current state of the cluster and the multi-cluster resources
Core technologies
Process migration - move the process context to a remote node
OS virtualization layer - allows migrated processes to run in remote nodes, away from their creation (home) nodes
[Diagram: a migrated process - local processes run on the home node while the migrated process runs as a guest on the remote node; each node runs an OS virtualization layer on top of Linux, and the MOSIX link reroutes system calls from the remote node back to the home node]
The OS virtualization layer
Provides the necessary support for migrated processes
By intercepting and forwarding most system-calls to the home node
Result: migrated processes seem to be running in their respective home nodes
The user's home-node environment is preserved
No need to change applications, copy files, log in to remote nodes or link applications with any library
Migrated processes run in a sandbox
Outcome: users get the illusion of running on one node
Drawback: increased communication and virtualization overheads
The overhead is reasonable relative to the added cluster/multi-cluster services (see the next slide and the illustration below)
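Not MOSIX code, but one way to see the class of system calls that the virtualization layer intercepts and forwards home for a migrated process (strace is a standard Linux tool; myprog is a hypothetical program):

# Trace the file-related system calls issued by myprog and its children;
# for a migrated MOSIX process, calls like these are the ones rerouted to the home node.
strace -f -e trace=open,read,write,close ./myprog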
Reasonable overhead: Linux vs. migrated MOSIX process times (sec.), 1Gbit-Ethernet

Application                                            RC              SW              JEL             BLAT
Total I/O (MB)                                         0               90              206             476
Local - Linux process (sec)                            723.4           627.9           601.2           611.6
Migrated process - same cluster (sec / slowdown)       725.7 / 0.32%   637.1 / 1.47%   608.2 / 1.16%   620.1 / 1.39%
Migrated process - across 1Km campus (sec / slowdown)  727.0 / 0.5%    639.5 / 1.85%   608.3 / 1.18%   621.8 / 1.67%

Sample applications: RC = CPU-bound job, SW = protein sequences, JEL = electron motion, BLAT = protein alignments
Main multi-cluster features
Administrating a multi-cluster
Priorities among different clusters
Scheduling and monitoring
Supports batch jobs, checkpoint and recovery
Supports disruptive configurations
MOSIX Reach the Clouds (MRC)
Administrating a multi-cluster
A federation of x86 (both 32-bit and 64-bit) clusters, servers and workstations whose owners wish to cooperate from time to time
Collectively administered:
Each owner maintains its private cluster and determines its priorities vs. other clusters
Clusters can join or leave the multi-cluster at any time
Dynamic partition of nodes to private virtual clusters
Users of a group access the multi-cluster via their private clusters and workstations
Process migration among different clusters
Outcome: each cluster and the multi-cluster performs like a single computer with multiple processors
Why an intra-organizational Grid: due to trust
The priority scheme
Cluster owners can assign priorities to processes from other clusters
Local and higher priority processes force out lower priority processes
Pairs of clusters can be shared symmetrically (C1-C2) or asymmetrically (C3-C4)
A cluster can be shared (C6) among other clusters (C5, C7) or blocked for migration from other clusters (C7)
Dynamic partition of nodes to private virtual clusters
Outcome: flexible use of nodes in shared clusters
[Diagram: clusters C1-C7 illustrating symmetric, asymmetric, shared and blocked sharing relationships]
When priorities are needed
Scenario 1: one cluster; some users run many jobs, depriving other users of their fair share
Solution: partition the cluster into several sub-clusters and allow each user to log in to only one sub-cluster
Users in each sub-cluster can still benefit from idle nodes in the other sub-clusters
Processes of local users (in each sub-cluster) have higher priority than guest processes from other sub-clusters
Scenario 2: some users run long jobs while other users need to run (from time to time) short jobs
Scenario 3: several groups using a shared cluster
Sysadmin can assign different priorities to each group
Scheduling and monitoring
Batch jobs run as Linux processes in different nodes
Checkpoint & recovery - on a time basis, manually or by the program
Live queuing - queued jobs maintain an organic connection with their Unix environment
Queue management provides means for tracing jobs, changing priorities and the order of execution, and for running parallel (e.g. MPI) jobs
Queued jobs are released gradually, in a manner that prevents flooding the local cluster or other clusters
Built-in on-line monitor for the local cluster resources
On-line web monitor of the multi-cluster and each cluster
http://www.mosix.org/webmon
Example: queuing
With the -q flag, mosrun places the job in a queue
Jobs from all the nodes in each cluster share one queue
Queue policy: first-come-first-served, with several exceptions
Users can assign priorities to their jobs, using the -q{pri} option
The lower the value of pri, the higher the priority
The default priority is 50; it can be changed by the sysadmin
Running jobs with pri < 50 should be coordinated with the cluster's manager
Out-of-order and fair-share
These options allow a fixed number of jobs per user to start instantly, overriding the queue
Examples:
> mosrun -q -b -m1000 myprog (queue a MOSIX program to run in the cluster)
> mosrun -q60 -G -b -J1 myprog (queue a low-priority job to run in a different cluster)
> mosrun -q30 -E -m500 myprog (queue a high-priority batch job)
mosq - view and control the queue
mosq list - list the jobs waiting in the queue
mosq listall - list jobs already running from the queue and jobs waiting in the queue
mosq delete {pid} - delete a waiting job from the queue
mosq run {pid} - run a waiting process now
mosq cngpri {newpri} {pid} - change the priority of a waiting job
mosq advance {pid} - move a waiting job to the head of its priority group within the queue
mosq retard {pid} - move a waiting job to the end of its priority group within the queue
More options are described in the mosq manual (a short example session follows)
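A short example session using only the mosq subcommands listed above (the job PID 12345 and the new priority 40 are hypothetical values):

# Inspect the queue, then adjust one waiting job.
mosq list               # jobs still waiting in the queue
mosq listall            # waiting jobs plus jobs already running from the queue
mosq cngpri 40 12345    # give waiting job 12345 priority 40 (lower value = higher priority)
mosq advance 12345      # move it to the head of its priority group
mosq run 12345          # or start it immediately, ahead of the queue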
Disruptive configurations
When a cluster is disconnected:
All guest processes move out - to available remote nodes or to the home cluster
All migrated processes from that cluster move back
Returning processes are frozen (image stored) on disks
Frozen processes are reactivated gradually
Outcome:
Long-running processes are preserved
No overloading of nodes
MOSIX Reach the Clouds (MRC)
MRC is a tool that allows applications to run in remote nodes on Clouds, without pre-copying files to these nodes
Main features:
Runs on both MOSIX clusters and Linux computers (with unmodified kernel)
No need to pre-copy files to remote clusters
Applications can access both local and remote files
Supports file sharing among different computers
Stdin/out/err are preserved locally
Can be combined with "mosrun" on remote MOSIX clusters
Hebrew University multi-cluster campus Grid (HUGI)
17 production MOSIX clusters: ~350 nodes, ~750 CPUs
In Life Sciences, the Medical School, Chemistry and Computer Science
Sample applications that our users are running:
Nano-technology
Molecular dynamics
Protein folding, Genomics (BLAT, SW)
Weather forecasting
Navier-Stokes equations and turbulence (CFD)
CPU simulator of new hardware designs (SimpleScalar)
Priorities among HUGI clusters
[Diagram: priority values (e.g. 20, 50, 70, 100) assigned between the HUGI clusters - CS Student Farm, CS Theory group cluster, CS General cluster, Biology1 and Biology2]

CS General - priority for accepting processes, by originating cluster:
Theory: 20, Student Farm: blocked, Biology1: blocked, Biology2: 50

Biology2 - priority for accepting processes, by originating cluster:
Theory: 50, Student Farm: blocked, CS General: 70, Biology1: 20
Day use: idle shared nodes allocated to users
[Diagram: the HUGI clusters - Chemistry, Computer Science, Life Sciences, student farms, Group 1 and Group 2 clusters - during the day: student and guest processes, including guest processes from Group 1, run on idle shared nodes]
Night use: most nodes are allocated to one group
[Diagram: at night most of the HUGI nodes - Computer Science, the student farms and the Group 2 clusters - are allocated to the Group 1 cluster]
Web monitor: www.MOSIX.org/webmon
Display:
Total number of nodes/CPUs
Number of nodes in each cluster
Average load
Zooming on each cluster
Display:
Load
Free/used memory
Swap space
Uptime
Users
Conclusions
MOSIX2 is a comprehensive set of tools for automatic management of Linux clusters and multi-clusters
Self-management algorithms for dynamic allocation of system-wide resources
Cross-cluster performance is nearly identical to that of a single cluster
Many supporting tools for ease of use
MRC for running applications on Clouds
Includes an installation script and manuals
Can run in native mode or on top of virtual-machine packages, e.g. VMware, Xen, MS Virtual Server, over an unmodified OS (Linux, Windows, OS X)
How to obtain a copy of MOSIX
A free, unlimited trial copy is provided to faculty, staff and researchers for use in academic, research and non-profit organizations
A free, limited evaluation copy is provided for non-profit use
Non-academic copies are also available
Details at
http://www.MOSIX.org