chore(docs): tweak replica verbiage on reference architectures #16076

Merged 1 commit on Jan 14, 2025
20 changes: 10 additions & 10 deletions docs/admin/infrastructure/validated-architectures/1k-users.md
@@ -12,9 +12,9 @@ tech startups, educational units, or small to mid-sized enterprises.

### Coderd nodes

-| Users       | Node capacity       | Replicas            | GCP             | AWS        | Azure             |
-|-------------|---------------------|---------------------|-----------------|------------|-------------------|
-| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 / 1 coderd each | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |
+| Users       | Node capacity       | Replicas                 | GCP             | AWS        | Azure             |
+|-------------|---------------------|--------------------------|-----------------|------------|-------------------|
+| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 nodes, 1 coderd each | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |
Member:
Suggested change:
-| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 nodes, 1 coderd each | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |
+| Up to 1,000 | 2 vCPU, 8 GB memory | 1-2 nodes                | `n1-standard-2` | `t3.large` | `Standard_D2s_v3` |

Is it technically possible to run more than 1 coderd on each node? If so, does this benefit any of the use cases or customers? Why would someone run multiple coderd on a single node?

Member:

> Is it technically possible to run more than 1 coderd on each node?

Yes, this can happen automatically during a rollout or during node unavailability.
Note that we do set a pod anti-affinity rule [1] in our Helm chart to prefer spreading out replicas across multiple nodes.

> If yes, does this benefit any of the use cases or customers?
> Why would someone run multiple coderd on a single node?

As far as I'm aware, the main reason to do this would be redundancy in case one or more pods become unavailable for whatever reason.

The only other reason I could imagine for running multiple replicas on a single node is to spread out connections across more coderd replicas to minimize the user-facing impact of a single pod failing. However, this won't protect against a failure of the underlying node.

I'll defer to @spikecurtis to weigh in more on the pros and cons of running multiple replicas per node.

[1] https://github.com/coder/coder/blob/main/helm/coder/values.yaml#L223-L237
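For readers unfamiliar with the rule being referenced, a soft anti-affinity preference has roughly the following shape. This is an illustrative sketch of the standard Kubernetes fields involved, not a verbatim copy of the chart; the label key and value are assumptions, so check the linked values.yaml for the exact selector:

```yaml
# Preferred (soft) anti-affinity: the scheduler tries to place coderd
# replicas on different nodes, but will still co-locate them if it has
# no other option (e.g. mid-rollout or when a node is unavailable).
affinity:
  podAntiAffinity:
    preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 1
        podAffinityTerm:
          topologyKey: kubernetes.io/hostname   # one failure domain per node
          labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/instance # assumed label; see values.yaml
                operator: In
                values: ["coder"]
```

Because this is `preferred` rather than `required`, two replicas can legitimately land on one node, which is exactly the situation described above.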

Contributor:

In any reference architecture we should always recommend 1 coderd per node.

There are generally 2 reasons for multiple replicas: fault tolerance and scale.

For fault tolerance, you want the replicas spread across different failure domains. Having all replicas on the same node means you aren't tolerant of node-level faults. There might still be some residual value in being tolerant to replica-level faults (e.g. software crashes, OOM kills), but most people would rather have the stronger node-level fault tolerance.

For scale, coderd is written to take advantage of multiple CPU cores in one process, so there is no scale advantage to putting multiple coderd instances on a single node. In fact, it's likely bad for scale, since you have multiple processes competing for resources, plus the extra overhead of coderd-to-coderd communication.
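If an operator wanted to enforce the one-coderd-per-node recommendation rather than merely prefer it, the soft rule could be swapped for a hard one. This is a hypothetical override using standard Kubernetes scheduling fields, not the chart's default (the label selector is an assumption):

```yaml
# Hard (required) anti-affinity: the scheduler refuses to place two
# coderd replicas on the same node, trading scheduling flexibility
# for guaranteed node-level fault isolation. If no eligible node is
# free, the extra replica stays Pending instead of co-locating.
affinity:
  podAntiAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      - topologyKey: kubernetes.io/hostname
        labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/instance   # assumed label; verify against the chart
              operator: In
              values: ["coder"]
```

The trade-off is the one described above: a hard rule can block rollouts on small clusters, which is presumably why the chart prefers the soft variant.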


**Footnotes**:

@@ -23,19 +23,19 @@ tech startups, educational units, or small to mid-sized enterprises.

### Provisioner nodes

-| Users       | Node capacity        | Replicas                       | GCP              | AWS          | Azure             |
-|-------------|----------------------|--------------------------------|------------------|--------------|-------------------|
-| Up to 1,000 | 8 vCPU, 32 GB memory | 2 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                      | GCP              | AWS          | Azure             |
+|-------------|----------------------|-------------------------------|------------------|--------------|-------------------|
+| Up to 1,000 | 8 vCPU, 32 GB memory | 2 nodes, 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

- An external provisioner is deployed as a Kubernetes pod.

### Workspace nodes

-| Users       | Node capacity        | Replicas                | GCP              | AWS          | Azure             |
-|-------------|----------------------|-------------------------|------------------|--------------|-------------------|
-| Up to 1,000 | 8 vCPU, 32 GB memory | 64 / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                     | GCP              | AWS          | Azure             |
+|-------------|----------------------|------------------------------|------------------|--------------|-------------------|
+| Up to 1,000 | 8 vCPU, 32 GB memory | 64 nodes, 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

@@ -48,4 +48,4 @@ tech startups, educational units, or small to mid-sized enterprises.

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
|-------------|---------------------|----------|---------|--------------------|---------------|-------------------|
-| Up to 1,000 | 2 vCPU, 8 GB memory | 1        | 512 GB  | `db-custom-2-7680` | `db.t3.large` | `Standard_D2s_v3` |
+| Up to 1,000 | 2 vCPU, 8 GB memory | 1 node   | 512 GB  | `db-custom-2-7680` | `db.t3.large` | `Standard_D2s_v3` |
20 changes: 10 additions & 10 deletions docs/admin/infrastructure/validated-architectures/2k-users.md
@@ -17,15 +17,15 @@ deployment reliability under load.

### Coderd nodes

-| Users       | Node capacity        | Replicas                | GCP             | AWS         | Azure             |
-|-------------|----------------------|-------------------------|-----------------|-------------|-------------------|
-| Up to 2,000 | 4 vCPU, 16 GB memory | 2 nodes / 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |
+| Users       | Node capacity        | Replicas               | GCP             | AWS         | Azure             |
+|-------------|----------------------|------------------------|-----------------|-------------|-------------------|
+| Up to 2,000 | 4 vCPU, 16 GB memory | 2 nodes, 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |

### Provisioner nodes

-| Users       | Node capacity        | Replicas                       | GCP              | AWS          | Azure             |
-|-------------|----------------------|--------------------------------|------------------|--------------|-------------------|
-| Up to 2,000 | 8 vCPU, 32 GB memory | 4 nodes / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                      | GCP              | AWS          | Azure             |
+|-------------|----------------------|-------------------------------|------------------|--------------|-------------------|
+| Up to 2,000 | 8 vCPU, 32 GB memory | 4 nodes, 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

@@ -36,9 +36,9 @@ deployment reliability under load.

### Workspace nodes

-| Users       | Node capacity        | Replicas                 | GCP              | AWS          | Azure             |
-|-------------|----------------------|--------------------------|------------------|--------------|-------------------|
-| Up to 2,000 | 8 vCPU, 32 GB memory | 128 / 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                      | GCP              | AWS          | Azure             |
+|-------------|----------------------|-------------------------------|------------------|--------------|-------------------|
+| Up to 2,000 | 8 vCPU, 32 GB memory | 128 nodes, 16 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

@@ -51,7 +51,7 @@ deployment reliability under load.

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
|-------------|----------------------|----------|---------|---------------------|----------------|-------------------|
-| Up to 2,000 | 4 vCPU, 16 GB memory | 1        | 1 TB    | `db-custom-4-15360` | `db.t3.xlarge` | `Standard_D4s_v3` |
+| Up to 2,000 | 4 vCPU, 16 GB memory | 1 node   | 1 TB    | `db-custom-4-15360` | `db.t3.xlarge` | `Standard_D4s_v3` |

**Footnotes**:

20 changes: 10 additions & 10 deletions docs/admin/infrastructure/validated-architectures/3k-users.md
@@ -18,15 +18,15 @@ continuously improve the reliability and performance of the platform.

### Coderd nodes

-| Users       | Node capacity        | Replicas          | GCP             | AWS         | Azure             |
-|-------------|----------------------|-------------------|-----------------|-------------|-------------------|
-| Up to 3,000 | 8 vCPU, 32 GB memory | 4 / 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |
+| Users       | Node capacity        | Replicas               | GCP             | AWS         | Azure             |
+|-------------|----------------------|------------------------|-----------------|-------------|-------------------|
+| Up to 3,000 | 8 vCPU, 32 GB memory | 4 nodes, 1 coderd each | `n1-standard-4` | `t3.xlarge` | `Standard_D4s_v3` |

### Provisioner nodes

-| Users       | Node capacity        | Replicas                 | GCP              | AWS          | Azure             |
-|-------------|----------------------|--------------------------|------------------|--------------|-------------------|
-| Up to 3,000 | 8 vCPU, 32 GB memory | 8 / 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                      | GCP              | AWS          | Azure             |
+|-------------|----------------------|-------------------------------|------------------|--------------|-------------------|
+| Up to 3,000 | 8 vCPU, 32 GB memory | 8 nodes, 30 provisioners each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

@@ -38,9 +38,9 @@ continuously improve the reliability and performance of the platform.

### Workspace nodes

-| Users       | Node capacity        | Replicas                       | GCP              | AWS          | Azure             |
-|-------------|----------------------|--------------------------------|------------------|--------------|-------------------|
-| Up to 3,000 | 8 vCPU, 32 GB memory | 256 nodes / 12 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |
+| Users       | Node capacity        | Replicas                      | GCP              | AWS          | Azure             |
+|-------------|----------------------|-------------------------------|------------------|--------------|-------------------|
+| Up to 3,000 | 8 vCPU, 32 GB memory | 256 nodes, 12 workspaces each | `t2d-standard-8` | `t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:

@@ -54,7 +54,7 @@ continuously improve the reliability and performance of the platform.

| Users | Node capacity | Replicas | Storage | GCP | AWS | Azure |
|-------------|----------------------|----------|---------|---------------------|-----------------|-------------------|
-| Up to 3,000 | 8 vCPU, 32 GB memory | 2        | 1.5 TB  | `db-custom-8-30720` | `db.t3.2xlarge` | `Standard_D8s_v3` |
+| Up to 3,000 | 8 vCPU, 32 GB memory | 2 nodes  | 1.5 TB  | `db-custom-8-30720` | `db.t3.2xlarge` | `Standard_D8s_v3` |

**Footnotes**:
