Commit 74d6cf5

describe routing algo changes
1 parent 9c1cf85 commit 74d6cf5

1 file changed

+63
-22
lines changed

source/advanced_guides/in_depth/shared/_request_load_balancing.html.md.erb

Lines changed: 63 additions & 22 deletions
@@ -1,7 +1,7 @@
11
# Request load balancing
22
<%= render_partial("/shared/current_selection", locals: { disabled_selections: [:integration] }) %>
33

4-
At its core, Passenger is a process manager and HTTP request router. In order to minimize response times, and to distribute load over multiple CPU cores for optimal performance, Passenger load balances requests over processes in a "least busy process first" manner. This article explains the implications and details of our request load balancing mechanism.
4+
At its core, Passenger is a process manager and HTTP request router. In order to minimize response times, and to distribute load over multiple CPU cores for optimal performance, Passenger load balances requests over processes in an intelligent manner. This article explains the implications and details of our request load balancing mechanism.
55

66
**Table of contents**
77

@@ -15,15 +15,15 @@ At its core, Passenger is a process manager and HTTP request router. In order to
1515
<% else %>
1616
<%= language_name %> applications can only handle 1 request at the same time.
1717
<% end %>
18-
With Passenger this is improved by running multiple processes of the application using pooled application groups. For each request from the incoming request queue, a non-busy (free) application process is selected to handle the request.
18+
With Passenger this is improved by running multiple processes of the application using pooled application groups. For each request from the incoming request queue, a non-completely-busy (free) application process is selected to handle the request.
1919

2020
<div><img class="request-load-balancing" src="<%= url_for "/images/request_load_balancing.png" %>" alt="request load balancing"></div>
2121

2222
<% if language_type == :ruby %>
23-
For thread-safe Ruby apps it is also possible to enable [multithreading](<%= url_for "/references/config_reference/nginx/#passenger_concurrency_model" %>), which allows the application processes to concurrently handle multiple requests at the same time -- up to the amount of threads configured. In this case, Passenger forwards the request to the instance that is currently handling the least number of requests.
23+
For thread-safe Ruby apps it is also possible to enable [multithreading](<%= url_for "/references/config_reference/nginx/#passenger_concurrency_model" %>), which allows the application processes to concurrently handle multiple requests at the same time -- up to the amount of threads configured. In this case, Passenger forwards the request to the instance that meets the criteria described [below](#intelligent-routing).
2424
<% end %>
2525

26-
If all application processes and threads are busy, Passenger spawns a new instance, up to the [process limit](<%= url_for "/references/config_reference/nginx/#passenger_concurrency_model" %>) for the group. The amount of processes of the groups combined may also not exceed the limit for the application pool. If either limit is reached, the request remains in the queue (which has its own [limit](#request-queue-overflow)).
26+
If all application processes and threads are completely busy, Passenger spawns a new instance, up to the [process limit](<%= url_for "/references/config_reference/nginx/#passenger_concurrency_model" %>) for the group. The amount of processes of the groups combined may also not exceed the limit for the application pool. If either limit is reached, the request remains in the queue (which has its own [limit](#request-queue-overflow)).
2727

2828
<% elsif language_type == :nodejs || language_type == :meteor %>
2929
<%= language_name %> applications normally execute in a single thread/process, using a single CPU core. Passenger enables running multiple instances of the application (multiple processes) using pooled application groups, distributing requests to the process that is currently handling the least amount of requests.
@@ -58,31 +58,70 @@ A core concept in the load balancing algorithm is that of the **maximum process
5858

5959
For this reason, load balancing requests between multiple processes is beneficial.
6060

61-
## Least-busy-process-first routing
61+
### App Generations
62+
63+
Another core concept in the load balancing algorithm is that of **app generations**. Every time Passenger is asked to [restart an app group](<%= url_for "/advanced_guides/troubleshooting/nginx/restart_app.html" %>), a counter is incremented. Processes created after the restart are labeled with a generation which is taken from this counter. Passenger usually performs a blocking restart, which means that there is only one generation of processes alive at a time. However, if you initiate a [rolling restart](<%= url_for "/advanced_guides/deployment_and_scaling/nginx/zero_downtime_redeployments/ruby/index.html" %>), multiple generations can be alive at the same time, and Passenger will take this into account when load balancing requests.
64+
65+
## Intelligent routing
6266

6367
### Algorithm summary
6468

65-
Passenger keeps a list of application processes. For each application process, Passenger keeps track of how many requests it is currently handling. When a new request comes in, Passenger routes the request to the process that is handling the least number of requests (the one that is "least busy").
69+
Passenger keeps a list of application processes. For each application process, Passenger keeps track of: the app generation that resulted in it being started, when it was started, and how many requests it is currently handling. When a new request comes in, Passenger routes the request to the process that meets the following criteria:
70+
71+
<ol>
72+
<li>is not completely busy,</li>
73+
<li>is part of the newest available app generation<sup>1</sup> meeting the previous condition,</li>
74+
<li>is the oldest process meeting the previous conditions,</li>
75+
<li>is the least busy process meeting the previous conditions.</li>
76+
</ol>
77+
78+
[1] Unless blocked by <a href="<%= url_for "/advanced_guides/deployment_and_scaling/standalone/deployment_error_resistance.html" %>">Deployment Error Resistance</a>.
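
The four criteria above can be sketched in Ruby as successive filters. This is only an illustration under stated assumptions, not Passenger's actual implementation: `AppProcess`, its fields, and `pick_process` are hypothetical names, and the Deployment Error Resistance exception from the footnote is not modeled.

```ruby
# Hypothetical model of an application process (not Passenger's own data structure).
# spawned_at is a timestamp in seconds: a smaller value means an older process.
AppProcess = Struct.new(:name, :generation, :spawned_at, :sessions, :max_concurrency) do
  def completely_busy?
    sessions >= max_concurrency
  end
end

# Applies the four criteria in order and returns the chosen process,
# or nil when every process is completely busy.
def pick_process(processes)
  # 1. Only processes that are not completely busy qualify.
  candidates = processes.reject(&:completely_busy?)
  return nil if candidates.empty?

  # 2. Keep the newest app generation among the remaining candidates.
  newest = candidates.map(&:generation).max
  candidates = candidates.select { |p| p.generation == newest }

  # 3. Keep the oldest process(es) within that generation.
  oldest = candidates.map(&:spawned_at).min
  candidates = candidates.select { |p| p.spawned_at == oldest }

  # 4. Break any remaining tie by picking the least busy process.
  candidates.min_by(&:sessions)
end

pool = [
  AppProcess.new("A", 2, 0, 0, 1),
  AppProcess.new("B", 3, 10, 0, 1),
  AppProcess.new("C", 3, 20, 0, 1),
]
puts pick_process(pool).name
```

With this pool the sketch prints `B`: process A belongs to an older generation, and B was spawned before C.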
6679

6780
<a name="algorithm_ordered"></a>
6881

69-
### First available process in the list has highest priority
82+
### Process with the highest app generation in the list has highest priority
83+
84+
If there are multiple processes that are not completely busy, then Passenger will pick one from the list with the highest app generation.
85+
86+
For example, suppose that there are 3 application processes:
87+
88+
Process A: gen 1, spawned 1s ago, handling 0 requests
89+
Process B: gen 1, spawned 1s ago, handling 0 requests
90+
Process C: gen 2, spawned 1s ago, handling 0 requests
91+
92+
Process C will be chosen for the next incoming request.
93+
94+
This speeds the rate at which requests are handled by the newer generation of app processes.
95+
96+
### Oldest available process in a generation has highest priority
7097

71-
If there are multiple processes that have the least busyness, then Passenger will pick the first one in the list. For example, suppose that there are 3 application processes:
98+
If there are multiple processes from the newest generation that are not completely busy, then Passenger will pick the one from the list that was started first.
7299

73-
Process A: handling 1 request
74-
Process B: handling 0 requests
75-
Process C: handling 0 requests
100+
For example, suppose that there are 3 application processes:
76101

77-
On the next request, Passenger will always pick B, never C.
102+
Process A: gen 2, spawned 30s ago, handling 0 requests
103+
Process B: gen 3, spawned 20s ago, handling 0 requests
104+
Process C: gen 3, spawned 10s ago, handling 0 requests
78105

79-
This property is used by the [dynamic process scaling](<%= url_for "/advanced_guides/in_depth/ruby/dynamic_scaling_of_app_processes/index.html" %>) algorithm. Dynamic process scaling works by shutting down processes that haven't received requests for a while (processes that are "idle"). By routing to the first process with least busyiness (instead of, say, a random one, using round-robin), Passenger gives other processes the chance to become idle and thus eligible for shutdown.
106+
Process B will be chosen for the next incoming request.
80107

81-
Another advantage of picking the first process is that it improves application-level caching. Since the first process is the most likely candidate for load balancing, it will have the most chance to keep its cache warm. Examples of such caches include: in-memory hash tables, JIT caches, etc.
108+
This property is used by the [dynamic process scaling](<%= url_for "/advanced_guides/in_depth/ruby/dynamic_scaling_of_app_processes/index.html" %>) algorithm. Dynamic process scaling works by shutting down processes that haven't received requests for a while (processes that are "idle"). By routing to the oldest process that is not completely busy (instead of, say, a random one, using round-robin), Passenger gives other processes the chance to become idle and thus eligible for shutdown.
109+
110+
Another advantage of picking the oldest process is that it improves caching. Since the oldest process is the most likely candidate for load balancing, it will have the most chance to keep its cache warm, and therefore process requests more quickly.
111+
112+
### Least busy process as tie breaker
113+
114+
If there are multiple processes that have met all the previous conditions, then Passenger will pick the least busy one. For example, suppose that there are 3 application processes:
115+
116+
Process A: gen 3, spawned 10s ago, handling 1 request
117+
Process B: gen 3, spawned 10s ago, handling 0 requests
118+
Process C: gen 3, spawned 10s ago, handling 2 requests
119+
120+
Process B will be chosen for the next incoming request.
82121

83122
### Traffic may appear unbalanced between processes
84123

85-
Because Passenger [prefers to load balance to the first request](#algorithm_ordered), traffic may appear unbalanced between processes. Here is an example from `passenger-status`:
124+
Because Passenger [prefers to load balance to the oldest process in the generation](#algorithm_ordered), traffic may appear unbalanced between processes. Here is an example from `passenger-status`:
86125

87126
~~~
88127
/var/www/phusion_blog/current/public:
@@ -113,7 +152,7 @@ Instead, the Passenger implementation can be compared to using a single (shared)
113152
<% if language_min_concurrency == 1 %>
114153
### Example with maximum concurrency 1
115154

116-
Suppose that you have 3 application processes, and each process's maximum concurrency is 1. When the application is idle, none of the processes are handling any requests:
155+
Suppose that you have 3 application processes, spawned in order, and each process's maximum concurrency is 1. When the application is idle, none of the processes are handling any requests:
117156

118157
Process A [ ]
119158
Process B [ ]
@@ -125,7 +164,7 @@ When a new request comes in (let's call this α), Passenger will decide to route
125164
Process B [ ]
126165
Process C [ ]
127166

128-
Suppose that, while α is still in progress, a new requests comes in (which we call β). That request will be load balanced to process B because it is the least busy one:
167+
Suppose that, while α is still in progress, a new request comes in (which we call β). That request will be load balanced to process B because it is the oldest not-completely-busy process:
129168

130169
Process A [α]
131170
Process B [β]
@@ -143,6 +182,8 @@ If another request comes in (which we call ɣ), that request will be routed to A
143182
Process B [β]
144183
Process C [ ]
145184

185+
This keeps process A's caches warm, and may allow process C to idle and be shut down sooner.
186+
146187
<% end %>
147188
<% if language_max_concurrency != 1 %>
148189
### Example with maximum concurrency 4
@@ -163,15 +204,15 @@ When a new request comes in (which we call α, Passenger will decide to route th
163204
Process A [α ]
164205
Process B [ ]
165206

166-
Suppose that, while α is still in progress, 1 more request comes in (which we call β). That request will be load balanced to process B because it is the least busy one:
207+
Suppose that, while α is still in progress, 1 more request comes in (which we call β). That request will be load balanced to process A because it is the oldest not-completely-busy one:
167208

168-
Process A [α ]
169-
Process B [β ]
209+
Process A [αβ ]
210+
Process B [ ]
170211

171212
Suppose that another request comes in (which we call ɣ). That will be load balanced to process A again, not to B:
172213

173-
Process A [αɣ ]
174-
Process B [β ]
214+
Process A [αβɣ ]
215+
Process B [ ]
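
The walkthrough above can be replayed with a small Ruby simulation. It is a sketch under assumptions rather than Passenger's code: `PoolProcess` and `route` are hypothetical names, both processes are assumed to share one app generation, and A is assumed to have been spawned before B. The routing criteria are expressed as a single lexicographic sort key (newest generation, then oldest spawn time, then fewest requests).

```ruby
# Hypothetical process record; spawned_at is a timestamp (smaller means older).
PoolProcess = Struct.new(:name, :generation, :spawned_at, :sessions, :max_concurrency)

# Routes one request: picks a process that is not completely busy,
# increments its request count, and returns it (nil if all are full).
def route(processes)
  free = processes.select { |p| p.sessions < p.max_concurrency }
  return nil if free.empty?

  # Lexicographic key: highest generation first, then earliest spawn time,
  # then lowest number of requests in progress.
  chosen = free.min_by { |p| [-p.generation, p.spawned_at, p.sessions] }
  chosen.sessions += 1
  chosen
end

pool = [
  PoolProcess.new("A", 1, 0, 0, 4),
  PoolProcess.new("B", 1, 1, 0, 4),
]

# Three requests arrive while the earlier ones are still in progress.
routed = 3.times.map { route(pool).name }
p routed  # => ["A", "A", "A"]
```

Under these assumptions process B only starts receiving requests once A is handling 4 requests, that is, once A is completely busy.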
175216

176217
<% end %>
177218
