Lecture8 Resource Management


FECE

Resource Management

Dr. Laeeq Ahmed


Department of Computer Science
Jalozai Campus

5th Jan, 2023

Laeeq Ahmed (FECE) Resource Management Jan 5th, 2023



Motivation

► Rapid innovation in cloud computing.

► No single framework optimal for all applications.

► Running each framework on its dedicated cluster:


• Expensive
• Hard to share data


Proposed Solution

Running multiple frameworks on a single cluster

Maximize utilization
Share data between frameworks


Two Resource Management Systems ...

► Mesos

► YARN


Mesos

Mesos
A common resource sharing layer, over which diverse
frameworks can run



Mesos Goals

► High utilization of resources

► Support diverse frameworks (current and future)

► Scalability to 10,000’s of nodes

► Reliability in the face of failures


Computation Model

► A framework (e.g., Hadoop, Spark) manages and runs one or more jobs.

► A job consists of one or more tasks.

► A task (e.g., map, reduce) consists of one or more processes running on the same machine.


Mesos Design Elements

► Fine-grained sharing

► Resource offers



Fine-Grained Sharing

► Allocation at the level of tasks within a job.

► Improves utilization, latency, and data locality.

[Figure: coarse-grained sharing vs. fine-grained sharing]



Resource Offer

► Offer available resources to frameworks, let them pick which resources to use and which tasks to launch.

► Keeps Mesos simple, lets it support future frameworks.



Question?
How to schedule resource offering among frameworks?



Schedule Frameworks

► Global scheduler

► Distributed scheduler



Global Scheduler (1/2)

► Job requirements
• Response time
• Throughput
• Availability

► Job execution plan


• Task DAG
• Inputs/outputs

► Estimates
• Task duration
• Input sizes
• Transfer sizes



Global Scheduler (2/2)

► Advantages
• Can achieve optimal schedule.

► Disadvantages
• Complexity: hard to scale and ensure resilience.
• Hard to anticipate the requirements of future frameworks.
• Need to refactor existing frameworks.



Distributed Scheduler (1/3)



Distributed Scheduler (2/3)

► Unit of allocation: resource offer


• Vector of available resources on a node
• For example, node1: <1 CPU, 1 GB>, node2: <4 CPU, 16 GB>

► Master sends resource offers to frameworks.

► Frameworks select which offers to accept and which tasks to run.
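
The offer cycle described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions, not the Mesos API: the names `Offer`, `Task`, `FrameworkScheduler`, `on_offer` and `offer_round` are hypothetical, each offer goes to frameworks in a fixed order, and the unused part of an accepted offer is not re-offered.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A resource offer: the free resources on one node."""
    node: str
    cpus: float
    mem_gb: float


@dataclass
class Task:
    name: str
    cpus: float
    mem_gb: float


class FrameworkScheduler:
    """Hypothetical framework-side scheduler that accepts or declines offers."""

    def __init__(self, name, pending):
        self.name = name
        self.pending = list(pending)  # tasks still waiting to run

    def on_offer(self, offer):
        """Return the tasks to launch on this offer (empty list means decline)."""
        launched = []
        cpus, mem = offer.cpus, offer.mem_gb
        for task in list(self.pending):
            if task.cpus <= cpus and task.mem_gb <= mem:
                launched.append(task)
                self.pending.remove(task)
                cpus -= task.cpus
                mem -= task.mem_gb
        return launched


def offer_round(offers, frameworks):
    """Master side: send each offer to frameworks in turn until one accepts."""
    placement = {}
    for offer in offers:
        for fw in frameworks:
            tasks = fw.on_offer(offer)
            if tasks:
                placement[offer.node] = (fw.name, [t.name for t in tasks])
                break
    return placement


# The two example nodes from the slide: <1 CPU, 1 GB> and <4 CPU, 16 GB>.
offers = [Offer("node1", 1, 1), Offer("node2", 4, 16)]
frameworks = [
    FrameworkScheduler("spark", [Task("s1", 2, 8)]),
    FrameworkScheduler("hadoop", [Task("h1", 1, 1), Task("h2", 1, 1)]),
]
placement = offer_round(offers, frameworks)
print(placement)
```

The master never inspects task requirements here; each framework decides for itself whether an offer is useful, which is what keeps the master simple.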



Distributed Scheduler (3/3)

► Advantages
• Simple: easier to scale and make resilient.
• Easy to port existing frameworks, support new ones.

► Disadvantages
• Distributed scheduling decision: not optimal.



Mesos Architecture (1/4)

► Slaves continuously send status updates about resources to the Master.



Mesos Architecture (2/4)

► Pluggable scheduler picks framework to send an offer to.



Mesos Architecture (3/4)

► Framework scheduler selects resources and provides tasks.



Mesos Architecture (4/4)

► Framework executors launch tasks.



Question?
How to allocate resources of different types?



Single Resource: Fair Sharing

► n users want to share a resource, e.g., CPU.
• Solution: allocate each user 1/n of the shared resource.

► Generalized by max-min fairness.
• Handles the case where a user wants less than its fair share.
• E.g., user 1 wants no more than 20%.

► Generalized by weighted max-min fairness.
• Give weights to users according to importance.
• E.g., user 1 gets weight 1, user 2 weight 2.


Max-Min Fairness

► 1 resource: CPU

► Total resources: 20 CPU

► User 1 has x tasks and wants <1 CPU> per task

► User 2 has y tasks and wants <2 CPU> per task

max (x, y) (maximize allocation)
subject to
x + 2y ≤ 20 (CPU constraint)
x = 2y (both users get the same number of CPUs)
so
4y ≤ 20, which gives
x = 10
y = 5


Why is Fair Sharing Useful?

► Proportional allocation: user 1 gets weight 2, user 2 weight 1.

► Priorities: give user 1 weight 1000, user 2 weight 1.

► Reservations: ensure user 1 gets 10% of a resource, so give user 1 weight 10, sum weights ≤ 100.

► Isolation policy: users cannot affect others beyond their fair share.



Properties of Max-Min Fairness

► Share guarantee
• Each user can get at least 1/n of the resource.
• But will get less if her demand is less.

► Strategy proof
• Users are not better off by asking for more than they need.
• Users have no reason to lie.

► Max-Min fairness is the only reasonable mechanism with these two properties.

► Widely used: OS, networking, datacenters, ...


Example of Max-Min Fairness

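Total number of available CPUs = 20, shared by four users:

User 1 asks for 5 CPUs
User 2 asks for 3 CPUs
User 3 asks for 9 CPUs
User 4 asks for 7 CPUs

Initial fair share: 20/4 = 5 CPUs each.

User 2 asks for less than 5, so it gets its 3 CPUs; 17 CPUs remain for three users: 17/3 ≈ 5.67 each.

User 1 asks for less than 5.67, so it gets its 5 CPUs; 12 CPUs remain for two users: 12/2 = 6 each.

Users 3 and 4 both ask for more than 6, so they get 6 CPUs each.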
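
The example above follows a progressive-filling procedure, which can be sketched as below. The function name `max_min_fair` and the optional `weights` parameter are illustrative assumptions, not part of the slides. With equal weights the sketch reproduces the four-user example; unequal weights give the weighted variant mentioned earlier.

```python
def max_min_fair(capacity, demands, weights=None):
    """Progressive filling: repeatedly satisfy users whose demand fits
    within their (weighted) share of what is left, then split the rest."""
    n = len(demands)
    weights = weights or [1.0] * n
    alloc = [0.0] * n
    active = set(range(n))          # users not yet fully satisfied
    remaining = float(capacity)
    eps = 1e-9                      # tolerance for float comparisons

    while active and remaining > eps:
        total_w = sum(weights[i] for i in active)
        satisfied = {
            i for i in active
            if demands[i] <= remaining * weights[i] / total_w + eps
        }
        if not satisfied:
            # Everyone left wants more than their share: split what remains.
            for i in active:
                alloc[i] = remaining * weights[i] / total_w
            break
        for i in satisfied:
            alloc[i] = float(demands[i])
            remaining -= demands[i]
        active -= satisfied
    return alloc


# The slide example: 20 CPUs, demands of 5, 3, 9 and 7 CPUs.
equal = max_min_fair(20, [5, 3, 9, 7])
print(equal)

# Weighted variant: user 2 has twice the weight of user 1.
weighted = max_min_fair(30, [100, 100], weights=[1, 2])
print(weighted)
```

Note that the sketch satisfies users 1 and 2 in the same round, whereas the worked example handles them one at a time; the final allocation is the same.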


Question?
When is Max-Min Fairness NOT Enough?

Need to schedule multiple, heterogeneous resources, e.g., CPU, memory, etc.


Problem

► Single resource example
• 1 resource: CPU
• User 1 wants <1 CPU> per task
• User 2 wants <2 CPU> per task

► Multi-resource example
• 2 resources: CPUs and memory
• User 1 wants <1 CPU, 4 GB> per task
• User 2 wants <2 CPU, 1 GB> per task
• What is a fair allocation?


A Natural Policy (1/2)

► Asset fairness: give weights to resources (e.g., 1 CPU = 1 GB) and equalize total value given to each user.

► Total resources: 28 CPU and 56 GB RAM (e.g., 1 CPU = 2 GB)
• User 1 has x tasks and wants <1 CPU, 2 GB> per task
• User 2 has y tasks and wants <1 CPU, 4 GB> per task

► Asset fairness yields:

max (x, y)
subject to
x + y ≤ 28 (CPU constraint)
2x + 4y ≤ 56 (RAM constraint)
4x = 6y (equal asset value, with 1 CPU = 2 GB)

User 1: x = 12: <43% CPU, 43% RAM> (Σ = 86%)
User 2: y = 8: <28% CPU, 57% RAM> (Σ = 86%)

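
The asset-fairness numbers can be checked with a small search over integer task counts. This is a sketch under stated assumptions: the function name `asset_fair_tasks` and the dictionary layout are made up for illustration, and `price` encodes the slide's valuation of 1 CPU = 2 GB.

```python
def asset_fair_tasks(total, demand1, demand2, price):
    """Largest integer task counts (x, y) giving both users equal asset value."""
    value1 = sum(price[r] * demand1[r] for r in total)  # value of one user-1 task
    value2 = sum(price[r] * demand2[r] for r in total)  # value of one user-2 task
    max_x = min(total[r] // demand1[r] for r in total if demand1[r] > 0)
    best = (0, 0)
    for x in range(max_x + 1):
        if (x * value1) % value2:
            continue  # no integer y gives user 2 the same asset value
        y = (x * value1) // value2
        if all(x * demand1[r] + y * demand2[r] <= total[r] for r in total):
            best = (x, y)  # x only grows, so the last feasible pair is the largest
    return best


total = {"cpu": 28, "ram_gb": 56}
user1 = {"cpu": 1, "ram_gb": 2}
user2 = {"cpu": 1, "ram_gb": 4}
price = {"cpu": 2, "ram_gb": 1}  # assumption from the slide: 1 CPU is worth 2 GB

x, y = asset_fair_tasks(total, user1, user2, price)
shares1 = {r: x * user1[r] / total[r] for r in total}
shares2 = {r: y * user2[r] / total[r] for r in total}
print(x, y)
print(shares1)
print(shares2)
```

Both of user 1's shares come out below one half, which is the weakness of asset fairness that the next slide points out.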
A Natural Policy (2/2)

► Problem: violates share guarantee.

► User 1 gets less than 50% of both CPU and RAM.

► Better off in a separate cluster with half the resources.



Challenge

► Can we find a fair sharing policy that provides:


• Share guarantee
• Strategy-proofness

► Can we generalize max-min fairness to multiple resources?



Proposed Solution

Dominant Resource Fairness (DRF)



Dominant Resource Fairness (DRF) (1/2)

► Dominant resource of a user: the resource that user has the biggest share of.
• Total resources: <8 CPU, 5 GB>
• User 1 allocation: <2 CPU, 1 GB>
  2/8 = 25% CPU and 1/5 = 20% RAM
• Dominant resource of User 1 is CPU (25% > 20%)

► Dominant share of a user: the fraction of the dominant resource she is allocated.
• User 1 dominant share is 25%.
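
The two definitions above amount to taking the largest per-resource fraction. A minimal sketch, assuming resources are held in plain dictionaries (the function name `dominant_share` is illustrative):

```python
def dominant_share(total, allocation):
    """Return (dominant resource, dominant share) for one user's allocation."""
    shares = {r: allocation[r] / total[r] for r in total}
    resource = max(shares, key=shares.get)  # resource with the largest fraction
    return resource, shares[resource]


total = {"cpu": 8, "ram_gb": 5}
user1_alloc = {"cpu": 2, "ram_gb": 1}
resource, share = dominant_share(total, user1_alloc)
print(resource, share)  # cpu 0.25
```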



Dominant Resource Fairness (DRF) (2/2)

► Apply max-min fairness to dominant shares: give every user an equal share of her dominant resource.

► Equalize the dominant share of the users.
• Total resources: <9 CPU, 18 GB>
• User 1 wants <1 CPU, 4 GB>; Dominant resource: RAM (1/9 < 4/18)
• User 2 wants <3 CPU, 1 GB>; Dominant resource: CPU (3/9 > 1/18)

► max (x, y)
subject to
x + 3y ≤ 9 (CPU constraint)
4x + y ≤ 18 (RAM constraint)
4x/18 = 3y/9 (equal dominant shares)

User 1: x = 3: <33% CPU, 66% RAM>
User 2: y = 2: <66% CPU, 11% RAM>


Online DRF Scheduler

► Whenever there are available resources and tasks to run: schedule a task to the user with the smallest dominant share.
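
The online rule above can be sketched as a loop that repeatedly picks the user with the smallest dominant share. The function name `drf_schedule` and the data layout are assumptions for illustration; the sketch also assumes every user has unlimited identical tasks, and it stops offering to a user once her next task no longer fits.

```python
def drf_schedule(total, demands):
    """Launch tasks one at a time, always for the user with the smallest
    dominant share, until no user's next task fits."""
    used = {r: 0 for r in total}
    alloc = {u: {r: 0 for r in total} for u in demands}
    tasks = {u: 0 for u in demands}
    active = set(demands)

    def dom_share(user):
        return max(alloc[user][r] / total[r] for r in total)

    while active:
        # sorted() makes tie-breaking deterministic (alphabetical by user name)
        user = min(sorted(active), key=dom_share)
        need = demands[user]
        if all(used[r] + need[r] <= total[r] for r in total):
            for r in total:
                used[r] += need[r]
                alloc[user][r] += need[r]
            tasks[user] += 1
        else:
            active.remove(user)  # this user's next task does not fit
    return tasks


# The example from the DRF slide: <9 CPU, 18 GB> in total.
total = {"cpu": 9, "ram_gb": 18}
demands = {
    "user1": {"cpu": 1, "ram_gb": 4},
    "user2": {"cpu": 3, "ram_gb": 1},
}
tasks = drf_schedule(total, demands)
print(tasks)
```

On this input the loop ends with 3 tasks for user 1 and 2 tasks for user 2, the same allocation as the x = 3, y = 2 solution of the optimization problem.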



Two Resource Management Systems ...

► Mesos

► YARN



YARN

YARN
Yet Another Resource Negotiator



YARN Architecture

► Resource Manager (RM)

► Application Master (AM)

► Node Manager (NM)



YARN Architecture - Resource Manager (1/2)
► One per cluster
• Central: global view
• Enable global properties
• Fairness, capacity, locality

► Job requests are submitted to RM.


• To start a job (application), RM finds a container to spawn AM.

► Container
• Logical bundle of resources (CPU/memory).

► No static resource partitioning.



YARN Architecture - Resource Manager (2/2)

► Only handles an overall resource profile for each application.


• Local optimization is up to the application.

► Preemption
• Request resources back from an application.
• Checkpoint snapshot instead of explicitly killing jobs / migrate
computation to other containers.



YARN Architecture - Application Master (1/2)

► The head of a job.

► Runs as a container.

► Requests resources from RM.
• # of containers/resource per container/locality ...

► Dynamically changes its resource consumption, based on the containers it receives from the RM.


YARN Architecture - Application Master (2/2)

► Requests are late-binding.
• The process spawned is not bound to the request, but to the lease.
• The conditions that caused the AM to issue the request may not remain true when it receives its resources.

► Can run any user code, e.g., MapReduce, Spark, etc.

► AM determines the semantics of the success or failure of the container.


YARN Architecture - Node Manager (1/2)

► The worker daemon.

► Registers with RM.

► One per node.

► Reports resources to RM: memory, CPU, ...

► Containers are described by a Container Launch Context (CLC).


• The command necessary to create the process
• Environment variables
• Security tokens
• ...



YARN Architecture - Node Manager (2/2)

► Configure the environment for task execution.

► Garbage collection.

► Auxiliary services.
• A process may produce data that persist beyond the life of the
container.
• Output intermediate data between map and reduce tasks.



YARN Framework (1/2)

► Submitting the application: passing a CLC for the AM to the RM.

► When RM starts the AM, the AM registers with the RM.
• The AM periodically advertises its liveness and requirements over the heartbeat protocol.



YARN Framework (2/2)

► Once the RM allocates the required resources, AM can construct a CLC to launch the container on the corresponding NM.
• It monitors the status of the running container and stops it when the resource should be reclaimed.

► Once the AM is done with its work, it should unregister from the
RM and exit cleanly.
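
The AM lifecycle on these two slides (register, heartbeat with an ask, use the granted containers, unregister) can be sketched as a toy simulation. This is not the Hadoop YARN API: the classes `ResourceManager` and `ApplicationMaster` and all of their methods are hypothetical stand-ins, node managers and the CLC are left out, and each container is assumed to run exactly one task and then finish.

```python
class ResourceManager:
    """Hypothetical stand-in for the RM: tracks free containers and live AMs."""

    def __init__(self, free_containers):
        self.free = free_containers
        self.live_ams = set()

    def register(self, am_id):
        self.live_ams.add(am_id)

    def heartbeat(self, am_id, ask):
        """AM signals liveness and its current ask; RM grants what it can."""
        assert am_id in self.live_ams, "AM must register first"
        granted = min(ask, self.free)
        self.free -= granted
        return granted

    def release(self, count):
        self.free += count

    def unregister(self, am_id):
        self.live_ams.discard(am_id)


class ApplicationMaster:
    """Hypothetical AM that needs one container per task."""

    def __init__(self, am_id, rm, num_tasks):
        self.am_id, self.rm, self.remaining = am_id, rm, num_tasks

    def run(self):
        self.rm.register(self.am_id)
        grants = []
        while self.remaining > 0:
            granted = self.rm.heartbeat(self.am_id, ask=self.remaining)
            if granted == 0:
                break  # nothing free; a real AM would keep heartbeating
            grants.append(granted)
            self.remaining -= granted  # each granted container runs one task
            self.rm.release(granted)   # finished containers go back to the RM
        self.rm.unregister(self.am_id)
        return grants


rm = ResourceManager(free_containers=2)
grants = ApplicationMaster("app-1", rm, num_tasks=5).run()
print(grants)
```

The ask shrinks on every heartbeat because it is recomputed from the work still outstanding, a simplified view of how an AM's requirements change over time.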



Mesos vs. YARN

► Similarities:
• Both have schedulers at two levels.

► Differences:
• Mesos is an offer-based resource manager, whereas YARN has a
request-based approach.
• Mesos uses framework schedulers for inter-job scheduling, whereas YARN uses per-job optimization through AM (however, per-job AM has higher overhead compared to Mesos).



Summary

► Resource management: Mesos and YARN

► Mesos
• Offer-based (push-based scheduling)
• Max-Min fairness: DRF

► YARN
• Request-based (pull-based scheduling)
• RM, AM, NM



Questions?
Acknowledgements
Some slides were derived from Ion Stoica and Ali Ghodsi slides
(Berkeley University), and Wei-Chiu Chuang slides (Purdue University).
