| 1 | +# High Availability |
| 2 | + |
| 3 | +High Availability (HA) mode provides horizontal scalability and automatic |
| 4 | +failover within a single region. When in HA mode, Coder continues using a single |
| 5 | +Postgres endpoint. |
| 6 | +[GCP](https://cloud.google.com/sql/docs/postgres/high-availability), |
| 7 | +[AWS](https://docs.aws.amazon.com/prescriptive-guidance/latest/saas-multitenant-managed-postgresql/availability.html), |
| 8 | +and other cloud vendors offer fully-managed HA Postgres services that pair |
| 9 | +nicely with Coder. |
| 10 | + |
| 11 | +For Coder to operate correctly, Coderd instances should have low-latency |
| 12 | +connections to each other so that they can effectively relay traffic between |
| 13 | +users and workspaces no matter which Coderd instance users or workspaces connect |
| 14 | +to. We make a best-effort attempt to warn the user when inter-Coderd latency is |
| 15 | +too high, but if requests start dropping, this is one metric to investigate. |
| 16 | + |
| 17 | +We also recommend that you deploy all Coderd instances such that they have |
| 18 | +low-latency connections to Postgres. Coderd often makes several database |
| 19 | +round-trips while processing a single API request, so prioritizing low latency |
| 20 | +between Coderd and Postgres is more important than low latency between users and |
| 21 | +Coderd. |
| 22 | + |
| 23 | +Note that this latency requirement applies _only_ to Coder services. Coder will |
| 24 | +operate correctly even with a few seconds of latency on workspace <-> Coder and |
| 25 | +user <-> Coder connections. |
| 26 | + |
| 27 | +## Setup |
| 28 | + |
| 29 | +Coder automatically enters HA mode when multiple instances simultaneously |
| 30 | +connect to the same Postgres endpoint. |
| 31 | + |
| 32 | +HA adds one configuration variable to set on each Coderd node: |
| 33 | +`CODER_DERP_SERVER_RELAY_URL`. The HA nodes use these URLs to communicate with |
| 34 | +each other. Inter-node communication is only required while using the embedded |
| 35 | +relay (default). If you're using [custom relays](./README.md#custom-relays), |
| 36 | +Coder ignores `CODER_DERP_SERVER_RELAY_URL` since Postgres is the sole |
| 37 | +rendezvous for the Coder nodes. |
| 38 | + |
| 39 | +`CODER_DERP_SERVER_RELAY_URL` must never be set to `CODER_ACCESS_URL`, because |
| 40 | +`CODER_ACCESS_URL` points to a load balancer in front of all Coder nodes. |
| 41 | + |
| 42 | +Here's an example 3-node network configuration setup: |
| 43 | + |
| 44 | +| Name | `CODER_HTTP_ADDRESS` | `CODER_DERP_SERVER_RELAY_URL` | `CODER_ACCESS_URL` | |
| 45 | +| --------- | -------------------- | ----------------------------- | ------------------------ | |
| 46 | +| `coder-1` | `*:80` | `http://10.0.0.1:80` | `https://coder.big.corp` | |
| 47 | +| `coder-2` | `*:80` | `http://10.0.0.2:80` | `https://coder.big.corp` | |
| 48 | +| `coder-3` | `*:80` | `http://10.0.0.3:80` | `https://coder.big.corp` | |
| 49 | + |
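As a rough sketch, the `coder-1` row of the table could be expressed as the environment below. The `env:` list layout is an assumption borrowed from the Kubernetes container spec; on other platforms, set the same three variables through whatever mechanism that platform provides. The values come straight from the table.

```yaml
# Sketch of the environment for coder-1 only (values taken from the table above).
# coder-2 and coder-3 would differ only in CODER_DERP_SERVER_RELAY_URL.
env:
  - name: CODER_HTTP_ADDRESS
    # Quoted because a bare leading * is parsed by YAML as an alias.
    value: "*:80"
  - name: CODER_DERP_SERVER_RELAY_URL
    # This node's own address, reachable by the other Coder nodes.
    value: "http://10.0.0.1:80"
  - name: CODER_ACCESS_URL
    # The shared load balancer URL, identical on every node.
    value: "https://coder.big.corp"
```

Only the relay URL varies per node; the access URL stays the same everywhere because it addresses the load balancer rather than any single instance.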
| 50 | +## Kubernetes |
| 51 | + |
| 52 | +If you installed Coder via |
| 53 | +[our Helm Chart](../../install/kubernetes.md#4-install-coder-with-helm), just |
| 54 | +increase `coder.replicaCount` in `values.yaml`. |
| 55 | + |
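For illustration, a minimal `values.yaml` fragment for the Helm case might look like the following. The replica count of 3 is an assumption chosen to match the 3-node example above; any value greater than 1 results in multiple instances sharing the same Postgres endpoint.

```yaml
# Minimal sketch of a values.yaml override; 3 is an illustrative value.
coder:
  replicaCount: 3
```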
| 56 | +If you installed Coder into Kubernetes by some other means, insert the relay URL |
| 57 | +via the environment like so: |
| 58 | + |
| 59 | +```yaml |
| 60 | +env: |
| 61 | +  - name: POD_IP |
| 62 | +    valueFrom: |
| 63 | +      fieldRef: |
| 64 | +        fieldPath: status.podIP |
| 65 | +  - name: CODER_DERP_SERVER_RELAY_URL |
| 66 | +    value: http://$(POD_IP) |
| 67 | +``` |
| 68 | + |
| 69 | +Then, increase the number of pods. |
| 70 | + |
| 71 | +## Up next |
| 72 | + |
| 73 | +- [Read more on Coder's networking stack](./README.md) |
| 74 | +- [Install on Kubernetes](../../install/kubernetes.md) |