From 301cf48b5fdb0ef4e359b47075cca0b854850f9a Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Mon, 18 Mar 2024 13:46:36 +0100 Subject: [PATCH 01/21] deduplicate knowledge scale.md --- docs/admin/scale.md | 102 ++------------------------------------------ 1 file changed, 3 insertions(+), 99 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index 58fcd93373dad..024983bb7a528 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -2,104 +2,8 @@ We scale-test Coder with [a built-in utility](#scale-testing-utility) that can be used in your environment for insights into how Coder scales with your infrastructure. -## General concepts - -Coder runs workspace operations in a queue. The number of concurrent builds will -be limited to the number of provisioner daemons across all coderd replicas. - -- **coderd**: Coder’s primary service. Learn more about - [Coder’s architecture](../about/architecture.md) -- **coderd replicas**: Replicas (often via Kubernetes) for high availability, - this is an [enterprise feature](../enterprise.md) -- **concurrent workspace builds**: Workspace operations (e.g. - create/stop/delete/apply) across all users -- **concurrent connections**: Any connection to a workspace (e.g. SSH, web - terminal, `coder_app`) -- **provisioner daemons**: Coder runs one workspace build per provisioner - daemon. One coderd replica can host many daemons -- **scaletest**: Our scale-testing utility, built into the `coder` command line. - -```text -2 coderd replicas * 30 provisioner daemons = 60 max concurrent workspace builds -``` - -## Infrastructure recommendations - -> Note: The below are guidelines for planning your infrastructure. Your mileage -> may vary depending on your templates, workflows, and users. - -When planning your infrastructure, we recommend you consider the following: - -1. CPU and memory requirements for `coderd`. We recommend allocating 1 CPU core - and 2 GB RAM per `coderd` replica at minimum. See - [Concurrent users](#concurrent-users) for more details. -1. CPU and memory requirements for - [external provisioners](../admin/provisioners.md#running-external-provisioners), - if required. We recommend allocating 1 CPU core and 1 GB RAM per 5 concurrent - workspace builds to external provisioners. Note that this may vary depending - on the template used. See - [Concurrent workspace builds](#concurrent-workspace-builds) for more details. - By default, `coderd` runs 3 integrated provisioners. -1. CPU and memory requirements for the database used by `coderd`. We recommend - allocating an additional 1 CPU core to the database used by Coder for every - 1000 active users. -1. CPU and memory requirements for workspaces created by Coder. This will vary - depending on users' needs. However, the Coder agent itself requires at - minimum 0.1 CPU cores and 256 MB to run inside a workspace. - -### Concurrent users - -We recommend allocating 2 CPU cores and 4 GB RAM per `coderd` replica per 1000 -active users. We also recommend allocating an additional 1 CPU core to the -database used by Coder for every 1000 active users. Inactive users do not -consume Coder resources, although workspaces configured to auto-start will -consume resources when they are built. - -Users' primary mode of accessing Coder will also affect resource requirements. -If users will be accessing workspaces primarily via Coder's HTTP interface, we -recommend doubling the number of cores and RAM allocated per user. 
For example, -if you expect 1000 users accessing workspaces via the web, we recommend -allocating 4 CPU cores and 8 GB RAM. - -Users accessing workspaces via SSH will consume fewer resources, as SSH -connections are not proxied through Coder. - -### Concurrent workspace builds - -Workspace builds are CPU-intensive, as it relies on Terraform. Various -[Terraform providers](https://registry.terraform.io/browse/providers) have -different resource requirements. When tested with our -[kubernetes](https://github.com/coder/coder/tree/main/examples/templates/kubernetes) -template, `coderd` will consume roughly 0.25 cores per concurrent workspace -build. For effective provisioning, our helm chart prefers to schedule -[one coderd replica per-node](https://github.com/coder/coder/blob/main/helm/coder/values.yaml#L188-L202). - -We recommend: - -- Running `coderd` on a dedicated set of nodes. This will prevent other - workloads from interfering with workspace builds. You can use - [node selectors](https://kubernetes.io/docs/concepts/scheduling-eviction/assign-pod-node/#nodeselector), - or - [taints and tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration/) - to achieve this. -- Disabling autoscaling for `coderd` nodes. Autoscaling can cause interruptions - for users, see [Autoscaling](#autoscaling) for more details. -- (Enterprise-only) Running external provisioners instead of Coder's built-in - provisioners (`CODER_PROVISIONER_DAEMONS=0`) will separate the load caused by - workspace provisioning on the `coderd` nodes. For more details, see - [External provisioners](../admin/provisioners.md#running-external-provisioners). -- Alternatively, if increasing the number of integrated provisioner daemons in - `coderd` (`CODER_PROVISIONER_DAEMONS>3`), allocate additional resources to - `coderd` to compensate (approx. 0.25 cores and 256 MB per provisioner daemon). - -For example, to support 120 concurrent workspace builds: - -- Create a cluster/nodepool with 4 nodes, 8-core each (AWS: `t3.2xlarge` GCP: - `e2-highcpu-8`) -- Run coderd with 4 replicas, 30 provisioner daemons each. - (`CODER_PROVISIONER_DAEMONS=30`) -- Ensure Coder's [PostgreSQL server](./configure.md#postgresql-database) can use - up to 2 cores and 4 GB RAM +Learn more about [Coder’s architecture](../about/architecture.md) and our +[scale-testing methodology](architectures/index.md#scale-testing-methodology). ## Recent scale tests @@ -228,6 +132,6 @@ an annotation on the coderd deployment. ## Troubleshooting If a load test fails or if you are experiencing performance issues during -day-to-day use, you can leverage Coder's [prometheus metrics](./prometheus.md) +day-to-day use, you can leverage Coder's [Prometheus metrics](./prometheus.md) to identify bottlenecks during scale tests. Additionally, you can use your existing cloud monitoring stack to measure load, view server logs, etc. 
From fa60b7c6a46a12b80739e6c5668fe0ea99872361 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Tue, 19 Mar 2024 12:29:26 +0100 Subject: [PATCH 02/21] Upload coder templates --- examples/scaletests/kubernetes-large/main.tf | 82 ++ .../kubernetes-medium-greedy/main.tf | 196 ++++ examples/scaletests/kubernetes-medium/main.tf | 82 ++ .../scaletests/kubernetes-minimal/main.tf | 164 +++ examples/scaletests/kubernetes-small/main.tf | 82 ++ .../kubernetes-with-podmonitor/README.md | 98 ++ .../kubernetes-with-podmonitor/main.tf | 362 +++++++ .../scaletests/scaletest-runner/Dockerfile | 36 + .../scaletests/scaletest-runner/README.md | 9 + examples/scaletests/scaletest-runner/main.tf | 961 ++++++++++++++++++ .../scaletest-runner/metadata_phase.sh | 6 + .../metadata_previous_phase.sh | 6 + .../scaletest-runner/metadata_status.sh | 6 + .../scaletest-runner/scripts/cleanup.sh | 62 ++ .../scaletest-runner/scripts/lib.sh | 313 ++++++ .../scaletest-runner/scripts/prepare.sh | 67 ++ .../scaletest-runner/scripts/report.sh | 109 ++ .../scaletest-runner/scripts/run.sh | 369 +++++++ .../scaletests/scaletest-runner/shutdown.sh | 30 + .../scaletests/scaletest-runner/startup.sh | 181 ++++ 20 files changed, 3221 insertions(+) create mode 100644 examples/scaletests/kubernetes-large/main.tf create mode 100644 examples/scaletests/kubernetes-medium-greedy/main.tf create mode 100644 examples/scaletests/kubernetes-medium/main.tf create mode 100644 examples/scaletests/kubernetes-minimal/main.tf create mode 100644 examples/scaletests/kubernetes-small/main.tf create mode 100644 examples/scaletests/kubernetes-with-podmonitor/README.md create mode 100644 examples/scaletests/kubernetes-with-podmonitor/main.tf create mode 100644 examples/scaletests/scaletest-runner/Dockerfile create mode 100644 examples/scaletests/scaletest-runner/README.md create mode 100644 examples/scaletests/scaletest-runner/main.tf create mode 100755 examples/scaletests/scaletest-runner/metadata_phase.sh create mode 100755 examples/scaletests/scaletest-runner/metadata_previous_phase.sh create mode 100755 examples/scaletests/scaletest-runner/metadata_status.sh create mode 100755 examples/scaletests/scaletest-runner/scripts/cleanup.sh create mode 100644 examples/scaletests/scaletest-runner/scripts/lib.sh create mode 100755 examples/scaletests/scaletest-runner/scripts/prepare.sh create mode 100755 examples/scaletests/scaletest-runner/scripts/report.sh create mode 100755 examples/scaletests/scaletest-runner/scripts/run.sh create mode 100755 examples/scaletests/scaletest-runner/shutdown.sh create mode 100755 examples/scaletests/scaletest-runner/startup.sh diff --git a/examples/scaletests/kubernetes-large/main.tf b/examples/scaletests/kubernetes-large/main.tf new file mode 100644 index 0000000000000..98d5c552f9eaf --- /dev/null +++ b/examples/scaletests/kubernetes-large/main.tf @@ -0,0 +1,82 @@ + terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" + } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } + } + + provider "coder" {} + + provider "kubernetes" { + config_path = null # always use host + } + + data "coder_workspace" "me" {} + + resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" + } + + resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + 
"app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-minimal:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" + } + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "4" + "memory" = "4Gi" + } + limits = { + "cpu" = "4" + "memory" = "4Gi" + } + } + } + + affinity { + node_affinity { + required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] + } + } + } + } + } + } + } diff --git a/examples/scaletests/kubernetes-medium-greedy/main.tf b/examples/scaletests/kubernetes-medium-greedy/main.tf new file mode 100644 index 0000000000000..45f5b970d73c7 --- /dev/null +++ b/examples/scaletests/kubernetes-medium-greedy/main.tf @@ -0,0 +1,196 @@ +terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" + } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } +} + +provider "coder" {} + +provider "kubernetes" { + config_path = null # always use host +} + +data "coder_workspace" "me" {} + +resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" + + # Greedy metadata (3072 bytes base64 encoded is 4097 bytes). + metadata { + display_name = "Meta 01" + key = "01_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 02" + key = "0_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 03" + key = "03_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 04" + key = "04_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 05" + key = "05_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 06" + key = "06_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 07" + key = "07_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 08" + key = "08_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 09" + key = "09_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 10" + key = "10_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 11" + key = "11_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 12" + key = "12_meta" + script = "dd if=/dev/urandom 
bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 13" + key = "13_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 14" + key = "14_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 15" + key = "15_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } + metadata { + display_name = "Meta 16" + key = "16_meta" + script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" + interval = 1 + timeout = 10 + } +} + +resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + "app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-minimal:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" + } + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "2" + "memory" = "2Gi" + } + limits = { + "cpu" = "2" + "memory" = "2Gi" + } + } + } + + affinity { + node_affinity { + required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] + } + } + } + } + } + } +} diff --git a/examples/scaletests/kubernetes-medium/main.tf b/examples/scaletests/kubernetes-medium/main.tf new file mode 100644 index 0000000000000..b8ce10b4bdb8a --- /dev/null +++ b/examples/scaletests/kubernetes-medium/main.tf @@ -0,0 +1,82 @@ + terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" + } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } + } + + provider "coder" {} + + provider "kubernetes" { + config_path = null # always use host + } + + data "coder_workspace" "me" {} + + resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" + } + + resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + "app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-minimal:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" + } + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "2" + "memory" = "2Gi" + } + limits = { + "cpu" = "2" + "memory" = "2Gi" + } + } + } + + affinity { + node_affinity { + 
required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] + } + } + } + } + } + } + } diff --git a/examples/scaletests/kubernetes-minimal/main.tf b/examples/scaletests/kubernetes-minimal/main.tf new file mode 100644 index 0000000000000..6d04fb68a33ed --- /dev/null +++ b/examples/scaletests/kubernetes-minimal/main.tf @@ -0,0 +1,164 @@ +terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.12.0" + } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } +} + +provider "coder" {} + +provider "kubernetes" { + config_path = null # always use host +} + +data "coder_workspace" "me" {} + +resource "coder_agent" "m" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" + metadata { + display_name = "CPU Usage" + key = "0_cpu_usage" + script = "coder stat cpu" + interval = 10 + timeout = 1 + } + + metadata { + display_name = "RAM Usage" + key = "1_ram_usage" + script = "coder stat mem" + interval = 10 + timeout = 1 + } +} + +resource "coder_script" "websocat" { + agent_id = coder_agent.m.id + display_name = "websocat" + script = </tmp/code-server.log 2>&1 & + EOT + + # The following metadata blocks are optional. They are used to display + # information about your workspace in the dashboard. You can remove them + # if you don't want to display any information. + # For basic resources, you can use the `coder stat` command. + # If you need more control, you can write your own script. + metadata { + display_name = "CPU Usage" + key = "0_cpu_usage" + script = "coder stat cpu" + interval = 10 + timeout = 1 + } + + metadata { + display_name = "RAM Usage" + key = "1_ram_usage" + script = "coder stat mem" + interval = 10 + timeout = 1 + } + + metadata { + display_name = "Home Disk" + key = "3_home_disk" + script = "coder stat disk --path $${HOME}" + interval = 60 + timeout = 1 + } + + metadata { + display_name = "CPU Usage (Host)" + key = "4_cpu_usage_host" + script = "coder stat cpu --host" + interval = 10 + timeout = 1 + } + + metadata { + display_name = "Memory Usage (Host)" + key = "5_mem_usage_host" + script = "coder stat mem --host" + interval = 10 + timeout = 1 + } + + metadata { + display_name = "Load Average (Host)" + key = "6_load_host" + # get load avg scaled by number of cores + script = </dev/null || return + +get_previous_phase diff --git a/examples/scaletests/scaletest-runner/metadata_status.sh b/examples/scaletests/scaletest-runner/metadata_status.sh new file mode 100755 index 0000000000000..8ec45f0875c1d --- /dev/null +++ b/examples/scaletests/scaletest-runner/metadata_status.sh @@ -0,0 +1,6 @@ +#!/bin/bash + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. "${SCRIPTS_DIR}/lib.sh" 2>/dev/null || return + +get_status diff --git a/examples/scaletests/scaletest-runner/scripts/cleanup.sh b/examples/scaletests/scaletest-runner/scripts/cleanup.sh new file mode 100755 index 0000000000000..c80982497b5e9 --- /dev/null +++ b/examples/scaletests/scaletest-runner/scripts/cleanup.sh @@ -0,0 +1,62 @@ +#!/bin/bash +set -euo pipefail + +[[ $VERBOSE == 1 ]] && set -x + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. 
"${SCRIPTS_DIR}/lib.sh" + +event=${1:-} + +if [[ -z $event ]]; then + event=manual +fi + +do_cleanup() { + start_phase "Cleanup (${event})" + coder exp scaletest cleanup \ + --cleanup-job-timeout 2h \ + --cleanup-timeout 5h | + tee "${SCALETEST_RESULTS_DIR}/cleanup-${event}.txt" + end_phase +} + +do_scaledown() { + start_phase "Scale down provisioners (${event})" + maybedryrun "$DRY_RUN" kubectl scale deployment/coder-provisioner --replicas 1 + maybedryrun "$DRY_RUN" kubectl rollout status deployment/coder-provisioner + end_phase +} + +case "${event}" in +manual) + echo -n 'WARNING: This will clean up all scaletest resources, continue? (y/n) ' + read -r -n 1 + if [[ $REPLY != [yY] ]]; then + echo $'\nAborting...' + exit 1 + fi + echo + + do_cleanup + do_scaledown + + echo 'Press any key to continue...' + read -s -r -n 1 + ;; +prepare) + do_cleanup + ;; +on_stop) ;; # Do nothing, handled by "shutdown". +always | on_success | on_error | shutdown) + do_cleanup + do_scaledown + ;; +shutdown_scale_down_only) + do_scaledown + ;; +*) + echo "Unknown event: ${event}" >&2 + exit 1 + ;; +esac diff --git a/examples/scaletests/scaletest-runner/scripts/lib.sh b/examples/scaletests/scaletest-runner/scripts/lib.sh new file mode 100644 index 0000000000000..868dd5c078d2e --- /dev/null +++ b/examples/scaletests/scaletest-runner/scripts/lib.sh @@ -0,0 +1,313 @@ +#!/bin/bash +set -euo pipefail + +# Only source this script once, this env comes from sourcing +# scripts/lib.sh from coder/coder below. +if [[ ${SCRIPTS_LIB_IS_SOURCED:-0} == 1 ]]; then + return 0 +fi + +# Source scripts/lib.sh from coder/coder for common functions. +# shellcheck source=scripts/lib.sh +. "${HOME}/coder/scripts/lib.sh" + +# Make shellcheck happy. +DRY_RUN=${DRY_RUN:-0} + +# Environment variables shared between scripts. +SCALETEST_STATE_DIR="${SCALETEST_RUN_DIR}/state" +SCALETEST_PHASE_FILE="${SCALETEST_STATE_DIR}/phase" +# shellcheck disable=SC2034 +SCALETEST_RESULTS_DIR="${SCALETEST_RUN_DIR}/results" +SCALETEST_LOGS_DIR="${SCALETEST_RUN_DIR}/logs" +SCALETEST_PPROF_DIR="${SCALETEST_RUN_DIR}/pprof" +# https://github.com/kubernetes/kubernetes/issues/72501 :-( +SCALETEST_CODER_BINARY="/tmp/coder-full-${SCALETEST_RUN_ID}" + +mkdir -p "${SCALETEST_STATE_DIR}" "${SCALETEST_RESULTS_DIR}" "${SCALETEST_LOGS_DIR}" "${SCALETEST_PPROF_DIR}" + +coder() { + if [[ ! -x "${SCALETEST_CODER_BINARY}" ]]; then + log "Fetching full coder binary..." + fetch_coder_full + fi + maybedryrun "${DRY_RUN}" "${SCALETEST_CODER_BINARY}" "${@}" +} + +show_json() { + maybedryrun "${DRY_RUN}" jq 'del(.. | .logs?)' "${1}" +} + +set_status() { + dry_run= + if [[ ${DRY_RUN} == 1 ]]; then + dry_run=" (dry-run)" + fi + prev_status=$(get_status) + if [[ ${prev_status} != *"Not started"* ]]; then + annotate_grafana_end "status" "Status: ${prev_status}" + fi + echo "$(date -Ins) ${*}${dry_run}" >>"${SCALETEST_STATE_DIR}/status" + + annotate_grafana "status" "Status: ${*}" + + status_lower=$(tr '[:upper:]' '[:lower:]' <<<"${*}") + set_pod_status_annotation "${status_lower}" +} +lock_status() { + chmod 0440 "${SCALETEST_STATE_DIR}/status" +} +get_status() { + # Order of importance (reverse of creation). + if [[ -f "${SCALETEST_STATE_DIR}/status" ]]; then + tail -n1 "${SCALETEST_STATE_DIR}/status" | cut -d' ' -f2- + else + echo "Not started" + fi +} + +phase_num=0 +start_phase() { + # This may be incremented from another script, so we read it every time. 
+ if [[ -f "${SCALETEST_PHASE_FILE}" ]]; then + phase_num=$(grep -c START: "${SCALETEST_PHASE_FILE}") + fi + phase_num=$((phase_num + 1)) + log "Start phase ${phase_num}: ${*}" + echo "$(date -Ins) START:${phase_num}: ${*}" >>"${SCALETEST_PHASE_FILE}" + + GRAFANA_EXTRA_TAGS="${PHASE_TYPE:-phase-default}" annotate_grafana "phase" "Phase ${phase_num}: ${*}" +} +end_phase() { + phase=$(tail -n 1 "${SCALETEST_PHASE_FILE}" | grep "START:${phase_num}:" | cut -d' ' -f3-) + if [[ -z ${phase} ]]; then + log "BUG: Could not find start phase ${phase_num} in ${SCALETEST_PHASE_FILE}" + return 1 + fi + log "End phase ${phase_num}: ${phase}" + echo "$(date -Ins) END:${phase_num}: ${phase}" >>"${SCALETEST_PHASE_FILE}" + + GRAFANA_EXTRA_TAGS="${PHASE_TYPE:-phase-default}" GRAFANA_ADD_TAGS="${PHASE_ADD_TAGS:-}" annotate_grafana_end "phase" "Phase ${phase_num}: ${phase}" +} +get_phase() { + if [[ -f "${SCALETEST_PHASE_FILE}" ]]; then + phase_raw=$(tail -n1 "${SCALETEST_PHASE_FILE}") + phase=$(echo "${phase_raw}" | cut -d' ' -f3-) + if [[ ${phase_raw} == *"END:"* ]]; then + phase+=" [done]" + fi + echo "${phase}" + else + echo "None" + fi +} +get_previous_phase() { + if [[ -f "${SCALETEST_PHASE_FILE}" ]] && [[ $(grep -c START: "${SCALETEST_PHASE_FILE}") -gt 1 ]]; then + grep START: "${SCALETEST_PHASE_FILE}" | tail -n2 | head -n1 | cut -d' ' -f3- + else + echo "None" + fi +} + +annotate_grafana() { + local tags=${1} text=${2} start=${3:-$(($(date +%s) * 1000))} + local json resp id + + if [[ -z $tags ]]; then + tags="scaletest,runner" + else + tags="scaletest,runner,${tags}" + fi + if [[ -n ${GRAFANA_EXTRA_TAGS:-} ]]; then + tags="${tags},${GRAFANA_EXTRA_TAGS}" + fi + + log "Annotating Grafana (start=${start}): ${text} [${tags}]" + + json="$( + jq \ + --argjson time "${start}" \ + --arg text "${text}" \ + --arg tags "${tags}" \ + '{time: $time, tags: $tags | split(","), text: $text}' <<<'{}' + )" + if [[ ${DRY_RUN} == 1 ]]; then + echo "FAKEID:${tags}:${text}:${start}" >>"${SCALETEST_STATE_DIR}/grafana-annotations" + log "Would have annotated Grafana, data=${json}" + return 0 + fi + if ! resp="$( + curl -sSL \ + --insecure \ + -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \ + -H "Content-Type: application/json" \ + -d "${json}" \ + "${GRAFANA_URL}/api/annotations" + )"; then + # Don't abort scaletest just because we couldn't annotate Grafana. + log "Failed to annotate Grafana: ${resp}" + return 0 + fi + + if [[ $(jq -r '.message' <<<"${resp}") != "Annotation added" ]]; then + log "Failed to annotate Grafana: ${resp}" + return 0 + fi + + log "Grafana annotation added!" + + id="$(jq -r '.id' <<<"${resp}")" + echo "${id}:${tags}:${text}:${start}" >>"${SCALETEST_STATE_DIR}/grafana-annotations" +} +annotate_grafana_end() { + local tags=${1} text=${2} start=${3:-} end=${4:-$(($(date +%s) * 1000))} + local id json resp + + if [[ -z $tags ]]; then + tags="scaletest,runner" + else + tags="scaletest,runner,${tags}" + fi + if [[ -n ${GRAFANA_EXTRA_TAGS:-} ]]; then + tags="${tags},${GRAFANA_EXTRA_TAGS}" + fi + + if ! id=$(grep ":${tags}:${text}:${start}" "${SCALETEST_STATE_DIR}/grafana-annotations" | sort -n | tail -n1 | cut -d: -f1); then + log "NOTICE: Could not find Grafana annotation to end: '${tags}:${text}:${start}', skipping..." 
+ return 0 + fi + + log "Updating Grafana annotation (end=${end}): ${text} [${tags}, add=${GRAFANA_ADD_TAGS:-}]" + + if [[ -n ${GRAFANA_ADD_TAGS:-} ]]; then + json="$( + jq -n \ + --argjson timeEnd "${end}" \ + --arg tags "${tags},${GRAFANA_ADD_TAGS}" \ + '{timeEnd: $timeEnd, tags: $tags | split(",")}' + )" + else + json="$( + jq -n \ + --argjson timeEnd "${end}" \ + '{timeEnd: $timeEnd}' + )" + fi + if [[ ${DRY_RUN} == 1 ]]; then + log "Would have patched Grafana annotation: id=${id}, data=${json}" + return 0 + fi + if ! resp="$( + curl -sSL \ + --insecure \ + -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \ + -H "Content-Type: application/json" \ + -X PATCH \ + -d "${json}" \ + "${GRAFANA_URL}/api/annotations/${id}" + )"; then + # Don't abort scaletest just because we couldn't annotate Grafana. + log "Failed to annotate Grafana end: ${resp}" + return 0 + fi + + if [[ $(jq -r '.message' <<<"${resp}") != "Annotation patched" ]]; then + log "Failed to annotate Grafana end: ${resp}" + return 0 + fi + + log "Grafana annotation patched!" +} + +wait_baseline() { + s=${1:-2} + PHASE_TYPE="phase-wait" start_phase "Waiting ${s}m to establish baseline" + maybedryrun "$DRY_RUN" sleep $((s * 60)) + PHASE_TYPE="phase-wait" end_phase +} + +get_appearance() { + session_token=$CODER_USER_TOKEN + if [[ -f "${CODER_CONFIG_DIR}/session" ]]; then + session_token="$(<"${CODER_CONFIG_DIR}/session")" + fi + curl -sSL \ + -H "Coder-Session-Token: ${session_token}" \ + "${CODER_URL}/api/v2/appearance" +} +set_appearance() { + local json=$1 color=$2 message=$3 + + session_token=$CODER_USER_TOKEN + if [[ -f "${CODER_CONFIG_DIR}/session" ]]; then + session_token="$(<"${CODER_CONFIG_DIR}/session")" + fi + newjson="$( + jq \ + --arg color "${color}" \ + --arg message "${message}" \ + '. | .service_banner.message |= $message | .service_banner.background_color |= $color' <<<"${json}" + )" + maybedryrun "${DRY_RUN}" curl -sSL \ + -X PUT \ + -H 'Content-Type: application/json' \ + -H "Coder-Session-Token: ${session_token}" \ + --data "${newjson}" \ + "${CODER_URL}/api/v2/appearance" +} + +namespace() { + cat /var/run/secrets/kubernetes.io/serviceaccount/namespace +} +coder_pods() { + kubectl get pods \ + --namespace "$(namespace)" \ + --selector "app.kubernetes.io/name=coder,app.kubernetes.io/part-of=coder" \ + --output jsonpath='{.items[*].metadata.name}' +} + +# fetch_coder_full fetches the full (non-slim) coder binary from one of the coder pods +# running in the same namespace as the current pod. +fetch_coder_full() { + if [[ -x "${SCALETEST_CODER_BINARY}" ]]; then + log "Full Coder binary already exists at ${SCALETEST_CODER_BINARY}" + return 0 + fi + ns=$(namespace) + if [[ -z "${ns}" ]]; then + log "Could not determine namespace!" + return 1 + fi + log "Namespace from serviceaccount token is ${ns}" + pods=$(coder_pods) + if [[ -z ${pods} ]]; then + log "Could not find coder pods!" + return 1 + fi + pod=$(cut -d ' ' -f 1 <<<"${pods}") + if [[ -z ${pod} ]]; then + log "Could not find coder pod!" 
+ return 1 + fi + log "Fetching full Coder binary from ${pod}" + # We need --retries due to https://github.com/kubernetes/kubernetes/issues/60140 :( + maybedryrun "${DRY_RUN}" kubectl \ + --namespace "${ns}" \ + cp \ + --container coder \ + --retries 10 \ + "${pod}:/opt/coder" "${SCALETEST_CODER_BINARY}" + maybedryrun "${DRY_RUN}" chmod +x "${SCALETEST_CODER_BINARY}" + log "Full Coder binary downloaded to ${SCALETEST_CODER_BINARY}" +} + +# set_pod_status_annotation annotates the currently running pod with the key +# com.coder.scaletest.status. It will overwrite the previous status. +set_pod_status_annotation() { + if [[ $# -ne 1 ]]; then + log "BUG: Must specify an annotation value" + return 1 + else + maybedryrun "${DRY_RUN}" kubectl --namespace "$(namespace)" annotate pod "$(hostname)" "com.coder.scaletest.status=$1" --overwrite + fi +} diff --git a/examples/scaletests/scaletest-runner/scripts/prepare.sh b/examples/scaletests/scaletest-runner/scripts/prepare.sh new file mode 100755 index 0000000000000..90b2dd05f945f --- /dev/null +++ b/examples/scaletests/scaletest-runner/scripts/prepare.sh @@ -0,0 +1,67 @@ +#!/bin/bash +set -euo pipefail + +[[ $VERBOSE == 1 ]] && set -x + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. "${SCRIPTS_DIR}/lib.sh" + +mkdir -p "${SCALETEST_STATE_DIR}" +mkdir -p "${SCALETEST_RESULTS_DIR}" + +log "Preparing scaletest workspace environment..." +set_status Preparing + +log "Compressing previous run logs (if applicable)..." +mkdir -p "${HOME}/archive" +for dir in "${HOME}/scaletest-"*; do + if [[ ${dir} = "${SCALETEST_RUN_DIR}" ]]; then + continue + fi + if [[ -d ${dir} ]]; then + name="$(basename "${dir}")" + ( + cd "$(dirname "${dir}")" + ZSTD_CLEVEL=12 maybedryrun "$DRY_RUN" tar --zstd -cf "${HOME}/archive/${name}.tar.zst" "${name}" + ) + maybedryrun "$DRY_RUN" rm -rf "${dir}" + fi +done + +log "Creating coder CLI token (needed for cleanup during shutdown)..." + +mkdir -p "${CODER_CONFIG_DIR}" +echo -n "${CODER_URL}" >"${CODER_CONFIG_DIR}/url" + +set +x # Avoid logging the token. +# Persist configuration for shutdown script too since the +# owner token is invalidated immediately on workspace stop. +export CODER_SESSION_TOKEN=${CODER_USER_TOKEN} +coder tokens delete scaletest_runner >/dev/null 2>&1 || true +# TODO(mafredri): Set TTL? This could interfere with delayed stop though. +token=$(coder tokens create --name scaletest_runner) +if [[ $DRY_RUN == 1 ]]; then + token=${CODER_SESSION_TOKEN} +fi +unset CODER_SESSION_TOKEN +echo -n "${token}" >"${CODER_CONFIG_DIR}/session" +[[ $VERBOSE == 1 ]] && set -x # Restore logging (if enabled). + +if [[ ${SCALETEST_PARAM_CLEANUP_PREPARE} == 1 ]]; then + log "Cleaning up from previous runs (if applicable)..." + "${SCRIPTS_DIR}/cleanup.sh" prepare +fi + +log "Preparation complete!" + +PROVISIONER_REPLICA_COUNT="${SCALETEST_PARAM_CREATE_CONCURRENCY:-0}" +if [[ "${PROVISIONER_REPLICA_COUNT}" -eq 0 ]]; then + # TODO(Cian): what is a good default value here? + echo "Setting PROVISIONER_REPLICA_COUNT to 10 since SCALETEST_PARAM_CREATE_CONCURRENCY is 0" + PROVISIONER_REPLICA_COUNT=10 +fi +log "Scaling up provisioners to ${PROVISIONER_REPLICA_COUNT}..." +maybedryrun "$DRY_RUN" kubectl scale deployment/coder-provisioner \ + --replicas "${PROVISIONER_REPLICA_COUNT}" +log "Waiting for provisioners to scale up..." 
+maybedryrun "$DRY_RUN" kubectl rollout status deployment/coder-provisioner diff --git a/examples/scaletests/scaletest-runner/scripts/report.sh b/examples/scaletests/scaletest-runner/scripts/report.sh new file mode 100755 index 0000000000000..0c6a5059ba37d --- /dev/null +++ b/examples/scaletests/scaletest-runner/scripts/report.sh @@ -0,0 +1,109 @@ +#!/bin/bash +set -euo pipefail + +[[ $VERBOSE == 1 ]] && set -x + +status=$1 +shift + +case "${status}" in +started) ;; +completed) ;; +failed) ;; +*) + echo "Unknown status: ${status}" >&2 + exit 1 + ;; +esac + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. "${SCRIPTS_DIR}/lib.sh" + +# NOTE(mafredri): API returns HTML if we accidentally use `...//api` vs `.../api`. +# https://github.com/coder/coder/issues/9877 +CODER_URL="${CODER_URL%/}" +buildinfo="$(curl -sSL "${CODER_URL}/api/v2/buildinfo")" +server_version="$(jq -r '.version' <<<"${buildinfo}")" +server_version_commit="$(jq -r '.external_url' <<<"${buildinfo}")" + +# Since `coder show` doesn't support JSON output, we list the workspaces instead. +# Use `command` here to bypass dry run. +workspace_json="$( + command coder list --all --output json | + jq --arg workspace "${CODER_WORKSPACE}" --arg user "${CODER_USER}" 'map(select(.name == $workspace) | select(.owner_name == $user)) | .[0]' +)" +owner_name="$(jq -r '.latest_build.workspace_owner_name' <<<"${workspace_json}")" +workspace_name="$(jq -r '.latest_build.workspace_name' <<<"${workspace_json}")" +initiator_name="$(jq -r '.latest_build.initiator_name' <<<"${workspace_json}")" + +bullet='•' +app_urls_raw="$(jq -r '.latest_build.resources[].agents[]?.apps | map(select(.external == true)) | .[] | .display_name, .url' <<<"${workspace_json}")" +app_urls=() +while read -r app_name; do + read -r app_url + bold= + if [[ ${status} != started ]] && [[ ${app_url} = *to=now* ]]; then + # Update Grafana URL with end stamp and make bold. + app_url="${app_url//to=now/to=$(($(date +%s) * 1000))}" + bold='*' + fi + app_urls+=("${bullet} ${bold}${app_name}${bold}: ${app_url}") +done <<<"${app_urls_raw}" + +params=() +header= + +case "${status}" in +started) + created_at="$(jq -r '.latest_build.created_at' <<<"${workspace_json}")" + params=("${bullet} Options:") + while read -r param; do + params+=(" ${bullet} ${param}") + done <<<"$(jq -r '.latest_build.resources[].agents[]?.environment_variables | to_entries | map(select(.key | startswith("SCALETEST_PARAM_"))) | .[] | "`\(.key)`: `\(.value)`"' <<<"${workspace_json}")" + + header="New scaletest started at \`${created_at}\` by \`${initiator_name}\` on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." + ;; +completed) + completed_at=$(date -Iseconds) + header="Scaletest completed at \`${completed_at}\` (started by \`${initiator_name}\`) on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." + ;; +failed) + failed_at=$(date -Iseconds) + header="Scaletest failed at \`${failed_at}\` (started by \`${initiator_name}\`) on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." 
+ ;; +*) + echo "Unknown status: ${status}" >&2 + exit 1 + ;; +esac + +text_arr=( + "${header}" + "" + "${bullet} *Comment:* ${SCALETEST_COMMENT}" + "${bullet} Workspace (runner): ${CODER_URL}/@${owner_name}/${workspace_name}" + "${bullet} Run ID: ${SCALETEST_RUN_ID}" + "${app_urls[@]}" + "${params[@]}" +) + +text= +for field in "${text_arr[@]}"; do + text+="${field}"$'\n' +done + +json=$( + jq -n --arg text "${text}" '{ + blocks: [ + { + "type": "section", + "text": { + "type": "mrkdwn", + "text": $text + } + } + ] + }' +) + +maybedryrun "${DRY_RUN}" curl -X POST -H 'Content-type: application/json' --data "${json}" "${SLACK_WEBHOOK_URL}" diff --git a/examples/scaletests/scaletest-runner/scripts/run.sh b/examples/scaletests/scaletest-runner/scripts/run.sh new file mode 100755 index 0000000000000..47a6042a18598 --- /dev/null +++ b/examples/scaletests/scaletest-runner/scripts/run.sh @@ -0,0 +1,369 @@ +#!/bin/bash +set -euo pipefail + +[[ $VERBOSE == 1 ]] && set -x + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. "${SCRIPTS_DIR}/lib.sh" + +mapfile -t scaletest_load_scenarios < <(jq -r '. | join ("\n")' <<<"${SCALETEST_PARAM_LOAD_SCENARIOS}") +export SCALETEST_PARAM_LOAD_SCENARIOS=("${scaletest_load_scenarios[@]}") + +log "Running scaletest..." +set_status Running + +start_phase "Creating workspaces" +if [[ ${SCALETEST_PARAM_SKIP_CREATE_WORKSPACES} == 0 ]]; then + # Note that we allow up to 5 failures to bring up the workspace, since + # we're creating a lot of workspaces at once and some of them may fail + # due to network issues or other transient errors. + coder exp scaletest create-workspaces \ + --retry 5 \ + --count "${SCALETEST_PARAM_NUM_WORKSPACES}" \ + --template "${SCALETEST_PARAM_TEMPLATE}" \ + --concurrency "${SCALETEST_PARAM_CREATE_CONCURRENCY}" \ + --timeout 5h \ + --job-timeout 5h \ + --no-cleanup \ + --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces.json" + show_json "${SCALETEST_RESULTS_DIR}/create-workspaces.json" +fi +end_phase + +wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" + +non_greedy_agent_traffic_args=() +if [[ ${SCALETEST_PARAM_GREEDY_AGENT} != 1 ]]; then + greedy_agent_traffic() { :; } +else + echo "WARNING: Greedy agent enabled, this may cause the load tests to fail." >&2 + non_greedy_agent_traffic_args=( + # Let the greedy agent traffic command be scraped. + # --scaletest-prometheus-address 0.0.0.0:21113 + # --trace=false + ) + + annotate_grafana greedy_agent "Create greedy agent" + + coder exp scaletest create-workspaces \ + --count 1 \ + --template "${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE}" \ + --concurrency 1 \ + --timeout 5h \ + --job-timeout 5h \ + --no-cleanup \ + --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces-greedy-agent.json" + + wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" + + greedy_agent_traffic() { + local timeout=${1} scenario=${2} + # Run the greedy test for ~1/3 of the timeout. + delay=$((timeout * 60 / 3)) + + local type=web-terminal + args=() + if [[ ${scenario} == "SSH Traffic" ]]; then + type=ssh + args+=(--ssh) + fi + + sleep "${delay}" + annotate_grafana greedy_agent "${scenario}: Greedy agent traffic" + + # Produce load at about 1000MB/s (25MB/40ms). 
+ set +e + coder exp scaletest workspace-traffic \ + --template "${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE}" \ + --bytes-per-tick $((1024 * 1024 * 25)) \ + --tick-interval 40ms \ + --timeout "$((delay))s" \ + --job-timeout "$((delay))s" \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" \ + --scaletest-prometheus-address 0.0.0.0:21113 \ + --trace=false \ + "${args[@]}" + status=${?} + show_json "${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" + + export GRAFANA_ADD_TAGS= + if [[ ${status} != 0 ]]; then + GRAFANA_ADD_TAGS=error + fi + annotate_grafana_end greedy_agent "${scenario}: Greedy agent traffic" + + return "${status}" + } +fi + +run_scenario_cmd() { + local scenario=${1} + shift + local command=("$@") + + set +e + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then + annotate_grafana scenario "Load scenario: ${scenario}" + fi + "${command[@]}" + status=${?} + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then + export GRAFANA_ADD_TAGS= + if [[ ${status} != 0 ]]; then + GRAFANA_ADD_TAGS=error + fi + annotate_grafana_end scenario "Load scenario: ${scenario}" + fi + exit "${status}" +} + +declare -a pids=() +declare -A pid_to_scenario=() +declare -A failed=() +target_start=0 +target_end=-1 + +if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then + start_phase "Load scenarios: ${SCALETEST_PARAM_LOAD_SCENARIOS[*]}" +fi +for scenario in "${SCALETEST_PARAM_LOAD_SCENARIOS[@]}"; do + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + start_phase "Load scenario: ${scenario}" + fi + + set +e + status=0 + case "${scenario}" in + "SSH Traffic") + greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}" "${scenario}" & + greedy_agent_traffic_pid=$! + + target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') + target_end=$((target_start + target_count)) + if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then + log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." + target_start=0 + target_end=${target_count} + fi + run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ + --template "${SCALETEST_PARAM_TEMPLATE}" \ + --ssh \ + --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_BYTES_PER_TICK}" \ + --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_TICK_INTERVAL}ms" \ + --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}m" \ + --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}m30s" \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-ssh.json" \ + --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ + --target-workspaces "${target_start}:${target_end}" \ + "${non_greedy_agent_traffic_args[@]}" & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + show_json "${SCALETEST_RESULTS_DIR}/traffic-ssh.json" + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + wait "${greedy_agent_traffic_pid}" + status2=$? + if [[ ${status} == 0 ]]; then + status=${status2} + fi + ;; + "Web Terminal Traffic") + greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}" "${scenario}" & + greedy_agent_traffic_pid=$! 
+ + target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') + target_end=$((target_start + target_count)) + if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then + log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." + target_start=0 + target_end=${target_count} + fi + run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ + --template "${SCALETEST_PARAM_TEMPLATE}" \ + --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_BYTES_PER_TICK}" \ + --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_TICK_INTERVAL}ms" \ + --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}m" \ + --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}m30s" \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-web-terminal.json" \ + --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ + --target-workspaces "${target_start}:${target_end}" \ + "${non_greedy_agent_traffic_args[@]}" & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + show_json "${SCALETEST_RESULTS_DIR}/traffic-web-terminal.json" + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + wait "${greedy_agent_traffic_pid}" + status2=$? + if [[ ${status} == 0 ]]; then + status=${status2} + fi + ;; + "App Traffic") + greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}" "${scenario}" & + greedy_agent_traffic_pid=$! + + target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') + target_end=$((target_start + target_count)) + if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then + log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." + target_start=0 + target_end=${target_count} + fi + run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ + --template "${SCALETEST_PARAM_TEMPLATE}" \ + --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_BYTES_PER_TICK}" \ + --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_TICK_INTERVAL}ms" \ + --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}m" \ + --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}m30s" \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-app.json" \ + --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ + --app "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_MODE}" \ + --target-workspaces "${target_start}:${target_end}" \ + "${non_greedy_agent_traffic_args[@]}" & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + show_json "${SCALETEST_RESULTS_DIR}/traffic-app.json" + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + wait "${greedy_agent_traffic_pid}" + status2=$? 
+ if [[ ${status} == 0 ]]; then + status=${status2} + fi + ;; + "Dashboard Traffic") + target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') + target_end=$((target_start + target_count)) + if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then + log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." + target_start=0 + target_end=${target_count} + fi + # TODO: Remove this once the dashboard traffic command is fixed, + # (i.e. once images are no longer dumped into PWD). + mkdir -p dashboard + pushd dashboard + run_scenario_cmd "${scenario}" coder exp scaletest dashboard \ + --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_DURATION}m" \ + --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_DURATION}m30s" \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-dashboard.json" \ + --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ + --target-users "${target_start}:${target_end}" \ + >"${SCALETEST_RESULTS_DIR}/traffic-dashboard-output.log" & + pids+=($!) + popd + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + show_json "${SCALETEST_RESULTS_DIR}/traffic-dashboard.json" + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + ;; + + # Debug scenarios, for testing the runner. + "debug:greedy_agent_traffic") + greedy_agent_traffic 10 "${scenario}" & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + ;; + "debug:success") + { + maybedryrun "$DRY_RUN" sleep 10 + true + } & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + ;; + "debug:error") + { + maybedryrun "$DRY_RUN" sleep 10 + false + } & + pids+=($!) + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then + wait "${pids[-1]}" + status=$? + else + SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) + fi + ;; + + *) + log "WARNING: Unknown load scenario: ${scenario}, skipping..." + ;; + esac + set -e + + # Allow targeting to be distributed evenly across workspaces when each + # scenario is run concurrently and all percentages add up to 100. + target_start=${target_end} + + if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then + pid_to_scenario+=(["${pids[-1]}"]="${scenario}") + # Stagger the start of each scenario to avoid a burst of load and deted + # problematic scenarios. + sleep $((SCALETEST_PARAM_LOAD_SCENARIO_CONCURRENCY_STAGGER_DELAY_MINS * 60)) + continue + fi + + if ((status > 0)); then + log "Load scenario failed: ${scenario} (exit=${status})" + failed+=(["${scenario}"]="${status}") + PHASE_ADD_TAGS=error end_phase + else + end_phase + fi + + wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" +done +if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then + wait "${pids[@]}" + # Wait on all pids will wait until all have exited, but we need to + # check their individual exit codes. 
+ for pid in "${pids[@]}"; do + wait "${pid}" + status=${?} + scenario=${pid_to_scenario[${pid}]} + if ((status > 0)); then + log "Load scenario failed: ${scenario} (exit=${status})" + failed+=(["${scenario}"]="${status}") + fi + done + if ((${#failed[@]} > 0)); then + PHASE_ADD_TAGS=error end_phase + else + end_phase + fi +fi + +if ((${#failed[@]} > 0)); then + log "Load scenarios failed: ${!failed[*]}" + for scenario in "${!failed[@]}"; do + log " ${scenario}: exit=${failed[$scenario]}" + done + exit 1 +fi + +log "Scaletest complete!" +set_status Complete diff --git a/examples/scaletests/scaletest-runner/shutdown.sh b/examples/scaletests/scaletest-runner/shutdown.sh new file mode 100755 index 0000000000000..9e75864d73120 --- /dev/null +++ b/examples/scaletests/scaletest-runner/shutdown.sh @@ -0,0 +1,30 @@ +#!/bin/bash +set -e + +[[ $VERBOSE == 1 ]] && set -x + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. "${SCRIPTS_DIR}/lib.sh" + +cleanup() { + coder tokens remove scaletest_runner >/dev/null 2>&1 || true + rm -f "${CODER_CONFIG_DIR}/session" +} +trap cleanup EXIT + +annotate_grafana "workspace" "Agent stopping..." + +shutdown_event=shutdown_scale_down_only +if [[ ${SCALETEST_PARAM_CLEANUP_STRATEGY} == on_stop ]]; then + shutdown_event=shutdown +fi +"${SCRIPTS_DIR}/cleanup.sh" "${shutdown_event}" + +annotate_grafana_end "workspace" "Agent running" + +appearance_json="$(get_appearance)" +service_banner_message=$(jq -r '.service_banner.message' <<<"${appearance_json}") +service_banner_message="${service_banner_message/% | */}" +service_banner_color="#4CD473" # Green. + +set_appearance "${appearance_json}" "${service_banner_color}" "${service_banner_message}" diff --git a/examples/scaletests/scaletest-runner/startup.sh b/examples/scaletests/scaletest-runner/startup.sh new file mode 100755 index 0000000000000..3e4eb94f41810 --- /dev/null +++ b/examples/scaletests/scaletest-runner/startup.sh @@ -0,0 +1,181 @@ +#!/bin/bash +set -euo pipefail + +[[ $VERBOSE == 1 ]] && set -x + +if [[ ${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE} == "${SCALETEST_PARAM_TEMPLATE}" ]]; then + echo "ERROR: Greedy agent template must be different from the scaletest template." >&2 + exit 1 +fi + +if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]] && [[ ${SCALETEST_PARAM_GREEDY_AGENT} == 1 ]]; then + echo "ERROR: Load scenario concurrency and greedy agent test cannot be enabled at the same time." >&2 + exit 1 +fi + +# Unzip scripts and add to path. +# shellcheck disable=SC2153 +echo "Extracting scaletest scripts into ${SCRIPTS_DIR}..." +base64 -d <<<"${SCRIPTS_ZIP}" >/tmp/scripts.zip +rm -rf "${SCRIPTS_DIR}" || true +mkdir -p "${SCRIPTS_DIR}" +unzip -o /tmp/scripts.zip -d "${SCRIPTS_DIR}" +# Chmod to work around https://github.com/coder/coder/issues/10034 +chmod +x "${SCRIPTS_DIR}"/*.sh +rm /tmp/scripts.zip + +echo "Cloning coder/coder repo..." +if [[ ! -d "${HOME}/coder" ]]; then + git clone https://github.com/coder/coder.git "${HOME}/coder" +fi +(cd "${HOME}/coder" && git fetch -a && git checkout "${SCALETEST_PARAM_REPO_BRANCH}" && git pull) + +# Store the input parameters (for debugging). +env | grep "^SCALETEST_" | sort >"${SCALETEST_RUN_DIR}/environ.txt" + +# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh +. 
"${SCRIPTS_DIR}/lib.sh" + +appearance_json="$(get_appearance)" +service_banner_message=$(jq -r '.service_banner.message' <<<"${appearance_json}") +service_banner_message="${service_banner_message/% | */}" +service_banner_color="#D65D0F" # Orange. + +annotate_grafana "workspace" "Agent running" # Ended in shutdown.sh. + +{ + pids=() + ports=() + declare -A pods=() + next_port=6061 + for pod in $(kubectl get pods -l app.kubernetes.io/name=coder -o jsonpath='{.items[*].metadata.name}'); do + maybedryrun "${DRY_RUN}" kubectl -n coder-big port-forward "${pod}" "${next_port}:6060" & + pids+=($!) + ports+=("${next_port}") + pods[${next_port}]="${pod}" + next_port=$((next_port + 1)) + done + + trap 'trap - EXIT; kill -INT "${pids[@]}"; exit 1' INT EXIT + + while :; do + # Sleep for short periods of time so that we can exit quickly. + # This adds up to ~300 when accounting for profile and trace. + for ((i = 0; i < 285; i++)); do + sleep 1 + done + log "Grabbing pprof dumps" + start="$(date +%s)" + annotate_grafana "pprof" "Grab pprof dumps (start=${start})" + for type in allocs block heap goroutine mutex 'profile?seconds=10' 'trace?seconds=5'; do + for port in "${ports[@]}"; do + tidy_type="${type//\?/_}" + tidy_type="${tidy_type//=/_}" + maybedryrun "${DRY_RUN}" curl -sSL --output "${SCALETEST_PPROF_DIR}/pprof-${tidy_type}-${pods[${port}]}-${start}.gz" "http://localhost:${port}/debug/pprof/${type}" + done + done + annotate_grafana_end "pprof" "Grab pprof dumps (start=${start})" + done +} & +pprof_pid=$! + +logs_gathered=0 +gather_logs() { + if ((logs_gathered == 1)); then + return + fi + logs_gathered=1 + + # Gather logs from all coderd and provisioner instances, and all workspaces. + annotate_grafana "logs" "Gather logs" + podsraw="$( + kubectl -n coder-big get pods -l app.kubernetes.io/name=coder -o name + kubectl -n coder-big get pods -l app.kubernetes.io/name=coder-provisioner -o name || true + kubectl -n coder-big get pods -l app.kubernetes.io/name=coder-workspace -o name | grep "^pod/scaletest-" || true + )" + mapfile -t pods <<<"${podsraw}" + for pod in "${pods[@]}"; do + pod_name="${pod#pod/}" + kubectl -n coder-big logs "${pod}" --since-time="${SCALETEST_RUN_START_TIME}" >"${SCALETEST_LOGS_DIR}/${pod_name}.txt" + done + annotate_grafana_end "logs" "Gather logs" +} + +set_appearance "${appearance_json}" "${service_banner_color}" "${service_banner_message} | Scaletest running: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE})!" + +# Show failure in the UI if script exits with error. +on_exit() { + code=${?} + trap - ERR EXIT + set +e + + kill -INT "${pprof_pid}" + + message_color="#4CD473" # Green. + message_status=COMPLETE + if ((code > 0)); then + message_color="#D94A5D" # Red. + message_status=FAILED + fi + + # In case the test failed before gathering logs, gather them before + # cleaning up, whilst the workspaces are still present. + gather_logs + + case "${SCALETEST_PARAM_CLEANUP_STRATEGY}" in + on_stop) + # Handled by shutdown script. + ;; + on_success) + if ((code == 0)); then + set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." 
+ "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" + fi + ;; + on_error) + if ((code > 0)); then + set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." + "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" + fi + ;; + *) + set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." + "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" + ;; + esac + + set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE})!" + + annotate_grafana_end "" "Start scaletest: ${SCALETEST_COMMENT}" + + wait "${pprof_pid}" + exit "${code}" +} +trap on_exit EXIT + +on_err() { + code=${?} + trap - ERR + set +e + + log "Scaletest failed!" + GRAFANA_EXTRA_TAGS=error set_status "Failed (exit=${code})" + "${SCRIPTS_DIR}/report.sh" failed + lock_status # Ensure we never rewrite the status after a failure. + + exit "${code}" +} +trap on_err ERR + +# Pass session token since `prepare.sh` has not yet run. +CODER_SESSION_TOKEN=$CODER_USER_TOKEN "${SCRIPTS_DIR}/report.sh" started +annotate_grafana "" "Start scaletest: ${SCALETEST_COMMENT}" + +"${SCRIPTS_DIR}/prepare.sh" + +"${SCRIPTS_DIR}/run.sh" + +# Gather logs before ending the test. +gather_logs + +"${SCRIPTS_DIR}/report.sh" completed From 358f99025eb0ff75cb2601fe150f261150f130ae Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Tue, 19 Mar 2024 15:05:11 +0100 Subject: [PATCH 03/21] make fmt --- examples/scaletests/kubernetes-large/main.tf | 134 +++++++++--------- .../kubernetes-medium-greedy/main.tf | 6 +- examples/scaletests/kubernetes-medium/main.tf | 134 +++++++++--------- examples/scaletests/kubernetes-small/main.tf | 134 +++++++++--------- .../kubernetes-with-podmonitor/main.tf | 18 +-- 5 files changed, 213 insertions(+), 213 deletions(-) diff --git a/examples/scaletests/kubernetes-large/main.tf b/examples/scaletests/kubernetes-large/main.tf index 98d5c552f9eaf..352db67bbcf22 100644 --- a/examples/scaletests/kubernetes-large/main.tf +++ b/examples/scaletests/kubernetes-large/main.tf @@ -1,82 +1,82 @@ - terraform { - required_providers { - coder = { - source = "coder/coder" - version = "~> 0.7.0" - } - kubernetes = { - source = "hashicorp/kubernetes" - version = "~> 2.18" - } - } +terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } +} - provider "coder" {} +provider "coder" {} - provider "kubernetes" { - config_path = null # always use host - } +provider "kubernetes" { + config_path = null # always use host +} - data "coder_workspace" "me" {} +data "coder_workspace" "me" {} - resource "coder_agent" "main" { - os = "linux" - arch = "amd64" - startup_script_timeout = 180 - startup_script = "" - } +resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" +} - resource "kubernetes_pod" "main" { - count = data.coder_workspace.me.start_count - metadata { - name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - namespace = "coder-big" - labels = { - 
"app.kubernetes.io/name" = "coder-workspace" - "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - } +resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + "app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-minimal:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" } - spec { - security_context { - run_as_user = "1000" - fs_group = "1000" + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "4" + "memory" = "4Gi" } - container { - name = "dev" - image = "docker.io/codercom/enterprise-minimal:ubuntu" - image_pull_policy = "Always" - command = ["sh", "-c", coder_agent.main.init_script] - security_context { - run_as_user = "1000" - } - env { - name = "CODER_AGENT_TOKEN" - value = coder_agent.main.token - } - resources { - requests = { - "cpu" = "4" - "memory" = "4Gi" - } - limits = { - "cpu" = "4" - "memory" = "4Gi" - } - } + limits = { + "cpu" = "4" + "memory" = "4Gi" } + } + } - affinity { - node_affinity { - required_during_scheduling_ignored_during_execution { - node_selector_term { - match_expressions { - key = "cloud.google.com/gke-nodepool" - operator = "In" - values = ["big-workspaces"] - } - } + affinity { + node_affinity { + required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] } } } } } + } +} diff --git a/examples/scaletests/kubernetes-medium-greedy/main.tf b/examples/scaletests/kubernetes-medium-greedy/main.tf index 45f5b970d73c7..a0a5dd8742c56 100644 --- a/examples/scaletests/kubernetes-medium-greedy/main.tf +++ b/examples/scaletests/kubernetes-medium-greedy/main.tf @@ -137,7 +137,7 @@ resource "coder_agent" "main" { script = "dd if=/dev/urandom bs=3072 count=1 status=none | base64" interval = 1 timeout = 10 - } + } } resource "kubernetes_pod" "main" { @@ -184,9 +184,9 @@ resource "kubernetes_pod" "main" { required_during_scheduling_ignored_during_execution { node_selector_term { match_expressions { - key = "cloud.google.com/gke-nodepool" + key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = ["big-workspaces"] } } } diff --git a/examples/scaletests/kubernetes-medium/main.tf b/examples/scaletests/kubernetes-medium/main.tf index b8ce10b4bdb8a..5dcd9588c1b33 100644 --- a/examples/scaletests/kubernetes-medium/main.tf +++ b/examples/scaletests/kubernetes-medium/main.tf @@ -1,82 +1,82 @@ - terraform { - required_providers { - coder = { - source = "coder/coder" - version = "~> 0.7.0" - } - kubernetes = { - source = "hashicorp/kubernetes" - version = "~> 2.18" - } - } +terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } +} - provider "coder" {} +provider "coder" {} - provider "kubernetes" { - config_path = null # always 
use host - } +provider "kubernetes" { + config_path = null # always use host +} - data "coder_workspace" "me" {} +data "coder_workspace" "me" {} - resource "coder_agent" "main" { - os = "linux" - arch = "amd64" - startup_script_timeout = 180 - startup_script = "" - } +resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" +} - resource "kubernetes_pod" "main" { - count = data.coder_workspace.me.start_count - metadata { - name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - namespace = "coder-big" - labels = { - "app.kubernetes.io/name" = "coder-workspace" - "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - } +resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + "app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-minimal:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" } - spec { - security_context { - run_as_user = "1000" - fs_group = "1000" + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "2" + "memory" = "2Gi" } - container { - name = "dev" - image = "docker.io/codercom/enterprise-minimal:ubuntu" - image_pull_policy = "Always" - command = ["sh", "-c", coder_agent.main.init_script] - security_context { - run_as_user = "1000" - } - env { - name = "CODER_AGENT_TOKEN" - value = coder_agent.main.token - } - resources { - requests = { - "cpu" = "2" - "memory" = "2Gi" - } - limits = { - "cpu" = "2" - "memory" = "2Gi" - } - } + limits = { + "cpu" = "2" + "memory" = "2Gi" } + } + } - affinity { - node_affinity { - required_during_scheduling_ignored_during_execution { - node_selector_term { - match_expressions { - key = "cloud.google.com/gke-nodepool" - operator = "In" - values = ["big-workspaces"] - } - } + affinity { + node_affinity { + required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] } } } } } + } +} diff --git a/examples/scaletests/kubernetes-small/main.tf b/examples/scaletests/kubernetes-small/main.tf index b11308b4a2ccf..b59e4989544f5 100644 --- a/examples/scaletests/kubernetes-small/main.tf +++ b/examples/scaletests/kubernetes-small/main.tf @@ -1,82 +1,82 @@ - terraform { - required_providers { - coder = { - source = "coder/coder" - version = "~> 0.7.0" - } - kubernetes = { - source = "hashicorp/kubernetes" - version = "~> 2.18" - } - } +terraform { + required_providers { + coder = { + source = "coder/coder" + version = "~> 0.7.0" } + kubernetes = { + source = "hashicorp/kubernetes" + version = "~> 2.18" + } + } +} - provider "coder" {} +provider "coder" {} - provider "kubernetes" { - config_path = null # always use host - } +provider "kubernetes" { + config_path = null # always use host +} - data "coder_workspace" "me" {} +data "coder_workspace" "me" {} - resource 
"coder_agent" "main" { - os = "linux" - arch = "amd64" - startup_script_timeout = 180 - startup_script = "" - } +resource "coder_agent" "main" { + os = "linux" + arch = "amd64" + startup_script_timeout = 180 + startup_script = "" +} - resource "kubernetes_pod" "main" { - count = data.coder_workspace.me.start_count - metadata { - name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - namespace = "coder-big" - labels = { - "app.kubernetes.io/name" = "coder-workspace" - "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - } +resource "kubernetes_pod" "main" { + count = data.coder_workspace.me.start_count + metadata { + name = "coder-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + namespace = "coder-big" + labels = { + "app.kubernetes.io/name" = "coder-workspace" + "app.kubernetes.io/instance" = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + } + } + spec { + security_context { + run_as_user = "1000" + fs_group = "1000" + } + container { + name = "dev" + image = "docker.io/codercom/enterprise-base:ubuntu" + image_pull_policy = "Always" + command = ["sh", "-c", coder_agent.main.init_script] + security_context { + run_as_user = "1000" } - spec { - security_context { - run_as_user = "1000" - fs_group = "1000" + env { + name = "CODER_AGENT_TOKEN" + value = coder_agent.main.token + } + resources { + requests = { + "cpu" = "1" + "memory" = "1Gi" } - container { - name = "dev" - image = "docker.io/codercom/enterprise-base:ubuntu" - image_pull_policy = "Always" - command = ["sh", "-c", coder_agent.main.init_script] - security_context { - run_as_user = "1000" - } - env { - name = "CODER_AGENT_TOKEN" - value = coder_agent.main.token - } - resources { - requests = { - "cpu" = "1" - "memory" = "1Gi" - } - limits = { - "cpu" = "1" - "memory" = "1Gi" - } - } + limits = { + "cpu" = "1" + "memory" = "1Gi" } + } + } - affinity { - node_affinity { - required_during_scheduling_ignored_during_execution { - node_selector_term { - match_expressions { - key = "cloud.google.com/gke-nodepool" - operator = "In" - values = ["big-workspaces"] - } - } + affinity { + node_affinity { + required_during_scheduling_ignored_during_execution { + node_selector_term { + match_expressions { + key = "cloud.google.com/gke-nodepool" + operator = "In" + values = ["big-workspaces"] } } } } } + } +} diff --git a/examples/scaletests/kubernetes-with-podmonitor/main.tf b/examples/scaletests/kubernetes-with-podmonitor/main.tf index 1c6c732377728..722cbe71f7692 100644 --- a/examples/scaletests/kubernetes-with-podmonitor/main.tf +++ b/examples/scaletests/kubernetes-with-podmonitor/main.tf @@ -289,8 +289,8 @@ resource "kubernetes_pod" "main" { } port { container_port = 21112 - name = "prometheus-http" - protocol = "TCP" + name = "prometheus-http" + protocol = "TCP" } } @@ -325,9 +325,9 @@ resource "kubernetes_pod" "main" { required_during_scheduling_ignored_during_execution { node_selector_term { match_expressions { - key = "cloud.google.com/gke-nodepool" + key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-misc"] # avoid placing on the same nodes as scaletest workspaces + values = ["big-misc"] # avoid placing on the same nodes as scaletest workspaces } } } @@ -339,21 +339,21 @@ resource "kubernetes_pod" "main" { resource "kubernetes_manifest" "pod_monitor" { count = data.coder_workspace.me.start_count manifest = { - 
apiVersion = "monitoring.coreos.com/v1" - kind = "PodMonitor" + apiVersion = "monitoring.coreos.com/v1" + kind = "PodMonitor" metadata = { namespace = var.namespace - name = "podmonitor-${local.workspace_pod_name}" + name = "podmonitor-${local.workspace_pod_name}" } spec = { selector = { matchLabels = { - "app.kubernetes.io/instance": "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" + "app.kubernetes.io/instance" : "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" } } podMetricsEndpoints = [ { - port = "prometheus-http" + port = "prometheus-http" interval = "15s" } ] From 51f9f0cf35d69ee0480aad2d904fa42be3527ff5 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Tue, 19 Mar 2024 15:29:04 +0100 Subject: [PATCH 04/21] Use mock Grafana url --- examples/scaletests/scaletest-runner/main.tf | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/examples/scaletests/scaletest-runner/main.tf b/examples/scaletests/scaletest-runner/main.tf index 2a6eb8ca21ed5..256b79d96320b 100644 --- a/examples/scaletests/scaletest-runner/main.tf +++ b/examples/scaletests/scaletest-runner/main.tf @@ -44,7 +44,7 @@ locals { scaletest_run_id = "scaletest-${replace(time_static.start_time.rfc3339, ":", "-")}" scaletest_run_dir = "/home/coder/${local.scaletest_run_id}" scaletest_run_start_time = time_static.start_time.rfc3339 - grafana_url = "https://stats.dev.c8s.io" + grafana_url = "https://grafana.corp.tld" grafana_dashboard_uid = "qLVSTR-Vz" grafana_dashboard_name = "coderv2-loadtest-dashboard" } @@ -736,8 +736,7 @@ resource "coder_app" "prometheus" { agent_id = coder_agent.main.id slug = "01-prometheus" display_name = "Prometheus" - // https://stats.dev.c8s.io:9443/classic/graph?g0.range_input=2h&g0.end_input=2023-09-08%2015%3A58&g0.stacked=0&g0.expr=rate(pg_stat_database_xact_commit%7Bcluster%3D%22big%22%2Cdatname%3D%22big-coder%22%7D%5B1m%5D)&g0.tab=0 - url = "https://stats.dev.c8s.io:9443" + url = "https://grafana.corp.tld:9443" icon = "https://prometheus.io/assets/favicons/favicon-32x32.png" external = true } From 4bafa5c64739ab05bffef66ef0c1c70e7add5848 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Wed, 20 Mar 2024 12:47:21 +0100 Subject: [PATCH 05/21] READMEs --- examples/scaletests/kubernetes-large/README.md | 5 +++++ examples/scaletests/kubernetes-medium-greedy/README.md | 5 +++++ examples/scaletests/kubernetes-medium/README.md | 5 +++++ examples/scaletests/kubernetes-minimal/README.md | 5 +++++ examples/scaletests/kubernetes-small/README.md | 5 +++++ 5 files changed, 25 insertions(+) create mode 100644 examples/scaletests/kubernetes-large/README.md create mode 100644 examples/scaletests/kubernetes-medium-greedy/README.md create mode 100644 examples/scaletests/kubernetes-medium/README.md create mode 100644 examples/scaletests/kubernetes-minimal/README.md create mode 100644 examples/scaletests/kubernetes-small/README.md diff --git a/examples/scaletests/kubernetes-large/README.md b/examples/scaletests/kubernetes-large/README.md new file mode 100644 index 0000000000000..2b0ae5cc296be --- /dev/null +++ b/examples/scaletests/kubernetes-large/README.md @@ -0,0 +1,5 @@ +# kubernetes-large + +Provisions a large-sized workspace with no persistent storage. 
+
+_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces`
diff --git a/examples/scaletests/kubernetes-medium-greedy/README.md b/examples/scaletests/kubernetes-medium-greedy/README.md
new file mode 100644
index 0000000000000..22e94bb262616
--- /dev/null
+++ b/examples/scaletests/kubernetes-medium-greedy/README.md
@@ -0,0 +1,5 @@
+# kubernetes-medium-greedy
+
+Provisions a medium-sized workspace with no persistent storage. Greedy agent variant.
+
+_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces`
diff --git a/examples/scaletests/kubernetes-medium/README.md b/examples/scaletests/kubernetes-medium/README.md
new file mode 100644
index 0000000000000..e2d5eae983114
--- /dev/null
+++ b/examples/scaletests/kubernetes-medium/README.md
@@ -0,0 +1,5 @@
+# kubernetes-medium
+
+Provisions a medium-sized workspace with no persistent storage.
+
+_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces`
diff --git a/examples/scaletests/kubernetes-minimal/README.md b/examples/scaletests/kubernetes-minimal/README.md
new file mode 100644
index 0000000000000..c56d3d477f821
--- /dev/null
+++ b/examples/scaletests/kubernetes-minimal/README.md
@@ -0,0 +1,5 @@
+# kubernetes-minimal
+
+Provisions a minimal-sized workspace with no persistent storage.
+
+_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces`
diff --git a/examples/scaletests/kubernetes-small/README.md b/examples/scaletests/kubernetes-small/README.md
new file mode 100644
index 0000000000000..56efbb98c3cb3
--- /dev/null
+++ b/examples/scaletests/kubernetes-small/README.md
@@ -0,0 +1,5 @@
+# kubernetes-small
+
+Provisions a small-sized workspace with no persistent storage.
+
+_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces`

From 7dc0d5e27f7a9c1e68dbb1272ef2ee7d547c9bfd Mon Sep 17 00:00:00 2001
From: Marcin Tojek
Date: Wed, 20 Mar 2024 13:05:09 +0100
Subject: [PATCH 06/21] More todos

---
 docs/admin/scale.md | 16 ++++++++++++++++
 examples/scaletests/scaletest-runner/main.tf | 6 +++---
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/docs/admin/scale.md b/docs/admin/scale.md
index 024983bb7a528..1ecd4a9b2f781 100644
--- a/docs/admin/scale.md
+++ b/docs/admin/scale.md
@@ -91,6 +91,22 @@ coder exp scaletest cleanup
 
 This will delete all workspaces and users with the prefix `scaletest-`.
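After cleanup it can be worth verifying that nothing with the `scaletest-` prefix is left behind. A minimal sketch of such a check, assuming the standard `coder list` and `coder users list` commands of the Coder CLI:

```shell
# List any remaining scaletest workspaces across all owners (requires admin rights).
coder list --all --search "scaletest-"

# List any remaining scaletest users; grep exits non-zero when nothing matches.
coder users list | grep "scaletest-" || true
```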
+## Scale testing template
+
+TODO
+
+### Parameters
+
+TODO
+
+### Kubernetes cluster
+
+TODO
+
+### Observability
+
+TODO Grafana and logs
+
 ## Autoscaling
 
 We generally do not recommend using an autoscaler that modifies the number of
diff --git a/examples/scaletests/scaletest-runner/main.tf b/examples/scaletests/scaletest-runner/main.tf
index 256b79d96320b..2d17c66435f62 100644
--- a/examples/scaletests/scaletest-runner/main.tf
+++ b/examples/scaletests/scaletest-runner/main.tf
@@ -736,9 +736,9 @@ resource "coder_app" "prometheus" {
   agent_id     = coder_agent.main.id
   slug         = "01-prometheus"
   display_name = "Prometheus"
-  url          = "https://grafana.corp.tld:9443"
-  icon         = "https://prometheus.io/assets/favicons/favicon-32x32.png"
-  external     = true
+  url      = "https://grafana.corp.tld:9443"
+  icon     = "https://prometheus.io/assets/favicons/favicon-32x32.png"
+  external = true
 }
 
 resource "coder_app" "manual_cleanup" {

From 7df6afe6b3e786ebd4496f48348a3d9d01678c1d Mon Sep 17 00:00:00 2001
From: Marcin Tojek
Date: Wed, 20 Mar 2024 13:36:02 +0100
Subject: [PATCH 07/21] Move scaletests

---
 docs/admin/scale.md | 4 +++-
 .../templates}/kubernetes-large/README.md | 0
 .../templates}/kubernetes-large/main.tf | 0
 .../templates}/kubernetes-medium-greedy/README.md | 0
 .../templates}/kubernetes-medium-greedy/main.tf | 0
 .../templates}/kubernetes-medium/README.md | 0
 .../templates}/kubernetes-medium/main.tf | 0
 .../templates}/kubernetes-minimal/README.md | 0
 .../templates}/kubernetes-minimal/main.tf | 0
 .../templates}/kubernetes-small/README.md | 0
 .../templates}/kubernetes-small/main.tf | 0
 .../templates}/kubernetes-with-podmonitor/README.md | 0
 .../templates}/kubernetes-with-podmonitor/main.tf | 0
 13 files changed, 3 insertions(+), 1 deletion(-)
 rename {examples/scaletests => scaletest/templates}/kubernetes-large/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-large/main.tf (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-medium-greedy/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-medium-greedy/main.tf (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-medium/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-medium/main.tf (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-minimal/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-minimal/main.tf (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-small/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-small/main.tf (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-with-podmonitor/README.md (100%)
 rename {examples/scaletests => scaletest/templates}/kubernetes-with-podmonitor/main.tf (100%)

diff --git a/docs/admin/scale.md b/docs/admin/scale.md
index 1ecd4a9b2f781..dd018b98562ec 100644
--- a/docs/admin/scale.md
+++ b/docs/admin/scale.md
@@ -1,6 +1,8 @@
 We scale-test Coder with [a built-in utility](#scale-testing-utility) that can
 be used in your environment for insights into how Coder scales with your
-infrastructure.
+infrastructure. For scale-testing Kubernetes clusters, we recommend installing
+and using the dedicated Coder template,
+[scaletest-runner](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner).
 
 Learn more about [Coder’s architecture](../about/architecture.md) and our
 [scale-testing methodology](architectures/index.md#scale-testing-methodology).
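The recommended scaletest-runner template is imported like any other Coder template. A minimal sketch, assuming a local checkout of coder/coder and the standard `coder templates push` command (on older releases, `coder templates create` may be needed for the first import):

```shell
# Import the runner template from a coder/coder checkout.
cd scaletest/templates/scaletest-runner
coder templates push scaletest-runner --directory . --yes
```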
diff --git a/examples/scaletests/kubernetes-large/README.md b/scaletest/templates/kubernetes-large/README.md similarity index 100% rename from examples/scaletests/kubernetes-large/README.md rename to scaletest/templates/kubernetes-large/README.md diff --git a/examples/scaletests/kubernetes-large/main.tf b/scaletest/templates/kubernetes-large/main.tf similarity index 100% rename from examples/scaletests/kubernetes-large/main.tf rename to scaletest/templates/kubernetes-large/main.tf diff --git a/examples/scaletests/kubernetes-medium-greedy/README.md b/scaletest/templates/kubernetes-medium-greedy/README.md similarity index 100% rename from examples/scaletests/kubernetes-medium-greedy/README.md rename to scaletest/templates/kubernetes-medium-greedy/README.md diff --git a/examples/scaletests/kubernetes-medium-greedy/main.tf b/scaletest/templates/kubernetes-medium-greedy/main.tf similarity index 100% rename from examples/scaletests/kubernetes-medium-greedy/main.tf rename to scaletest/templates/kubernetes-medium-greedy/main.tf diff --git a/examples/scaletests/kubernetes-medium/README.md b/scaletest/templates/kubernetes-medium/README.md similarity index 100% rename from examples/scaletests/kubernetes-medium/README.md rename to scaletest/templates/kubernetes-medium/README.md diff --git a/examples/scaletests/kubernetes-medium/main.tf b/scaletest/templates/kubernetes-medium/main.tf similarity index 100% rename from examples/scaletests/kubernetes-medium/main.tf rename to scaletest/templates/kubernetes-medium/main.tf diff --git a/examples/scaletests/kubernetes-minimal/README.md b/scaletest/templates/kubernetes-minimal/README.md similarity index 100% rename from examples/scaletests/kubernetes-minimal/README.md rename to scaletest/templates/kubernetes-minimal/README.md diff --git a/examples/scaletests/kubernetes-minimal/main.tf b/scaletest/templates/kubernetes-minimal/main.tf similarity index 100% rename from examples/scaletests/kubernetes-minimal/main.tf rename to scaletest/templates/kubernetes-minimal/main.tf diff --git a/examples/scaletests/kubernetes-small/README.md b/scaletest/templates/kubernetes-small/README.md similarity index 100% rename from examples/scaletests/kubernetes-small/README.md rename to scaletest/templates/kubernetes-small/README.md diff --git a/examples/scaletests/kubernetes-small/main.tf b/scaletest/templates/kubernetes-small/main.tf similarity index 100% rename from examples/scaletests/kubernetes-small/main.tf rename to scaletest/templates/kubernetes-small/main.tf diff --git a/examples/scaletests/kubernetes-with-podmonitor/README.md b/scaletest/templates/kubernetes-with-podmonitor/README.md similarity index 100% rename from examples/scaletests/kubernetes-with-podmonitor/README.md rename to scaletest/templates/kubernetes-with-podmonitor/README.md diff --git a/examples/scaletests/kubernetes-with-podmonitor/main.tf b/scaletest/templates/kubernetes-with-podmonitor/main.tf similarity index 100% rename from examples/scaletests/kubernetes-with-podmonitor/main.tf rename to scaletest/templates/kubernetes-with-podmonitor/main.tf From 03279ae6f429ba74be752fdf5afd303d9effc075 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Wed, 20 Mar 2024 13:37:07 +0100 Subject: [PATCH 08/21] Move around templates --- .../scaletests/scaletest-runner/Dockerfile | 36 - .../scaletests/scaletest-runner/README.md | 9 - examples/scaletests/scaletest-runner/main.tf | 960 ------------------ .../scaletest-runner/metadata_phase.sh | 6 - .../metadata_previous_phase.sh | 6 - .../scaletest-runner/metadata_status.sh | 6 
- .../scaletest-runner/scripts/cleanup.sh | 62 -- .../scaletest-runner/scripts/lib.sh | 313 ------ .../scaletest-runner/scripts/prepare.sh | 67 -- .../scaletest-runner/scripts/report.sh | 109 -- .../scaletest-runner/scripts/run.sh | 369 ------- .../scaletests/scaletest-runner/shutdown.sh | 30 - .../scaletests/scaletest-runner/startup.sh | 181 ---- scaletest/templates/scaletest-runner/main.tf | 11 +- 14 files changed, 6 insertions(+), 2159 deletions(-) delete mode 100644 examples/scaletests/scaletest-runner/Dockerfile delete mode 100644 examples/scaletests/scaletest-runner/README.md delete mode 100644 examples/scaletests/scaletest-runner/main.tf delete mode 100755 examples/scaletests/scaletest-runner/metadata_phase.sh delete mode 100755 examples/scaletests/scaletest-runner/metadata_previous_phase.sh delete mode 100755 examples/scaletests/scaletest-runner/metadata_status.sh delete mode 100755 examples/scaletests/scaletest-runner/scripts/cleanup.sh delete mode 100644 examples/scaletests/scaletest-runner/scripts/lib.sh delete mode 100755 examples/scaletests/scaletest-runner/scripts/prepare.sh delete mode 100755 examples/scaletests/scaletest-runner/scripts/report.sh delete mode 100755 examples/scaletests/scaletest-runner/scripts/run.sh delete mode 100755 examples/scaletests/scaletest-runner/shutdown.sh delete mode 100755 examples/scaletests/scaletest-runner/startup.sh diff --git a/examples/scaletests/scaletest-runner/Dockerfile b/examples/scaletests/scaletest-runner/Dockerfile deleted file mode 100644 index 9aa016b534a17..0000000000000 --- a/examples/scaletests/scaletest-runner/Dockerfile +++ /dev/null @@ -1,36 +0,0 @@ -# This image is used to run scaletest jobs and, although it is inside -# the template directory, it is built separately and pushed to -# gcr.io/coder-dev-1/scaletest-runner:latest. -# -# Future improvements will include versioning and including the version -# in the template push. - -FROM codercom/enterprise-base:ubuntu - -ARG DEBIAN_FRONTEND=noninteractive - -USER root - -# TODO(mafredri): Remove unneeded dependencies once we have a clear idea of what's needed. -RUN wget --quiet -O /tmp/terraform.zip https://releases.hashicorp.com/terraform/1.5.7/terraform_1.5.7_linux_amd64.zip \ - && unzip /tmp/terraform.zip -d /usr/local/bin \ - && rm /tmp/terraform.zip \ - && terraform --version - -RUN wget --quiet -O /tmp/envsubst "https://github.com/a8m/envsubst/releases/download/v1.2.0/envsubst-$(uname -s)-$(uname -m)" \ - && chmod +x /tmp/envsubst \ - && mv /tmp/envsubst /usr/local/bin - -RUN echo "deb [signed-by=/usr/share/keyrings/cloud.google.gpg] https://packages.cloud.google.com/apt cloud-sdk main" | tee -a /etc/apt/sources.list.d/google-cloud-sdk.list \ - && curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key --keyring /usr/share/keyrings/cloud.google.gpg add - \ - && apt-get update \ - && apt-get install --yes \ - google-cloud-cli \ - jq \ - kubectl \ - zstd \ - && gcloud --version \ - && kubectl version --client \ - && rm -rf /var/lib/apt/lists/* - -USER coder diff --git a/examples/scaletests/scaletest-runner/README.md b/examples/scaletests/scaletest-runner/README.md deleted file mode 100644 index 6c048211e1ad4..0000000000000 --- a/examples/scaletests/scaletest-runner/README.md +++ /dev/null @@ -1,9 +0,0 @@ ---- -name: Scaletest Runner -description: Run a scaletest. -tags: [local] ---- - -# Scaletest Runner - -Run a scaletest. 
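For reference, the comment at the top of the Dockerfile above notes that the runner image is built and pushed out-of-band. A minimal sketch of that step, assuming Docker and push access to the gcr.io/coder-dev-1 registry:

```shell
# Build and publish the scaletest runner image described in the Dockerfile.
docker build -t gcr.io/coder-dev-1/scaletest-runner:latest .
docker push gcr.io/coder-dev-1/scaletest-runner:latest
```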
diff --git a/examples/scaletests/scaletest-runner/main.tf b/examples/scaletests/scaletest-runner/main.tf deleted file mode 100644 index 2d17c66435f62..0000000000000 --- a/examples/scaletests/scaletest-runner/main.tf +++ /dev/null @@ -1,960 +0,0 @@ -terraform { - required_providers { - coder = { - source = "coder/coder" - version = "~> 0.12" - } - kubernetes = { - source = "hashicorp/kubernetes" - version = "~> 2.22" - } - } -} - -resource "time_static" "start_time" { - # We don't set `count = data.coder_workspace.me.start_count` here because then - # we can't use this value in `locals`, but we want to trigger recreation when - # the scaletest is restarted. - triggers = { - count : data.coder_workspace.me.start_count - token : data.coder_workspace.me.owner_session_token # Rely on this being re-generated every start. - } -} - -resource "null_resource" "permission_check" { - count = data.coder_workspace.me.start_count - - # Limit which users can create a workspace in this template. - # The "default" user and workspace are present because they are needed - # for the plan, and consequently, updating the template. - lifecycle { - precondition { - condition = can(regex("^(default/default|scaletest/runner)$", "${data.coder_workspace.me.owner}/${data.coder_workspace.me.name}")) - error_message = "User and workspace name is not allowed, expected 'scaletest/runner'." - } - } -} - -locals { - workspace_pod_name = "coder-scaletest-runner-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - workspace_pod_instance = "coder-workspace-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - workspace_pod_termination_grace_period_seconds = 5 * 60 * 60 # 5 hours (cleanup timeout). - service_account_name = "scaletest-sa" - home_disk_size = 10 - scaletest_run_id = "scaletest-${replace(time_static.start_time.rfc3339, ":", "-")}" - scaletest_run_dir = "/home/coder/${local.scaletest_run_id}" - scaletest_run_start_time = time_static.start_time.rfc3339 - grafana_url = "https://grafana.corp.tld" - grafana_dashboard_uid = "qLVSTR-Vz" - grafana_dashboard_name = "coderv2-loadtest-dashboard" -} - -data "coder_provisioner" "me" { -} - -data "coder_workspace" "me" { -} - -data "coder_parameter" "verbose" { - order = 1 - type = "bool" - name = "Verbose" - default = false - description = "Show debug output." - mutable = true - ephemeral = true -} - -data "coder_parameter" "dry_run" { - order = 2 - type = "bool" - name = "Dry-run" - default = true - description = "Perform a dry-run to see what would happen." - mutable = true - ephemeral = true -} - -data "coder_parameter" "repo_branch" { - order = 3 - type = "string" - name = "Branch" - default = "main" - description = "Branch of coder/coder repo to check out (only useful for developing the runner)." - mutable = true -} - -data "coder_parameter" "comment" { - order = 4 - type = "string" - name = "Comment" - default = "" - description = "Describe **what** you're testing and **why** you're testing it." - mutable = true - ephemeral = true -} - -data "coder_parameter" "create_concurrency" { - order = 10 - type = "number" - name = "Create concurrency" - default = 10 - description = "The number of workspaces to create concurrently." - mutable = true - - # Setting zero = unlimited, but perhaps not a good idea, - # we can raise this limit instead. 
- validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "job_concurrency" { - order = 11 - type = "number" - name = "Job concurrency" - default = 0 - description = "The number of concurrent jobs (e.g. when producing workspace traffic)." - mutable = true - - # Setting zero = unlimited, but perhaps not a good idea, - # we can raise this limit instead. - validation { - min = 0 - } -} - -data "coder_parameter" "cleanup_concurrency" { - order = 12 - type = "number" - name = "Cleanup concurrency" - default = 10 - description = "The number of concurrent cleanup jobs." - mutable = true - - # Setting zero = unlimited, but perhaps not a good idea, - # we can raise this limit instead. - validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "cleanup_strategy" { - order = 13 - name = "Cleanup strategy" - default = "always" - description = "The strategy used to cleanup workspaces after the scaletest is complete." - mutable = true - ephemeral = true - option { - name = "Always" - value = "always" - description = "Automatically cleanup workspaces after the scaletest ends." - } - option { - name = "On stop" - value = "on_stop" - description = "Cleanup workspaces when the workspace is stopped." - } - option { - name = "On success" - value = "on_success" - description = "Automatically cleanup workspaces after the scaletest is complete if no error occurs." - } - option { - name = "On error" - value = "on_error" - description = "Automatically cleanup workspaces after the scaletest is complete if an error occurs." - } -} - -data "coder_parameter" "cleanup_prepare" { - order = 14 - type = "bool" - name = "Cleanup before scaletest" - default = true - description = "Cleanup existing scaletest users and workspaces before the scaletest starts (prepare phase)." - mutable = true - ephemeral = true -} - - -data "coder_parameter" "workspace_template" { - order = 20 - name = "workspace_template" - display_name = "Workspace Template" - description = "The template used for workspace creation." - default = "kubernetes-minimal" - icon = "/emojis/1f4dc.png" # Scroll. - mutable = true - option { - name = "Minimal" - value = "kubernetes-minimal" # Feather. - icon = "/emojis/1fab6.png" - description = "Sized to fit approx. 32 per t2d-standard-8 instance." - } - option { - name = "Small" - value = "kubernetes-small" - icon = "/emojis/1f42d.png" # Mouse. - description = "Provisions a small-sized workspace with no persistent storage." - } - option { - name = "Medium" - value = "kubernetes-medium" - icon = "/emojis/1f436.png" # Dog. - description = "Provisions a medium-sized workspace with no persistent storage." - } - option { - name = "Medium (Greedy)" - value = "kubernetes-medium-greedy" - icon = "/emojis/1f436.png" # Dog. - description = "Provisions a medium-sized workspace with no persistent storage. Greedy agent variant." - } - option { - name = "Large" - value = "kubernetes-large" - icon = "/emojis/1f434.png" # Horse. - description = "Provisions a large-sized workspace with no persistent storage." - } -} - -data "coder_parameter" "num_workspaces" { - order = 21 - type = "number" - name = "Number of workspaces to create" - default = 100 - description = "The scaletest suite will create this number of workspaces." 
- mutable = true - - validation { - min = 0 - max = 2000 - } -} - -data "coder_parameter" "skip_create_workspaces" { - order = 22 - type = "bool" - name = "DEBUG: Skip creating workspaces" - default = false - description = "Skip creating workspaces (for resuming failed scaletests or debugging)" - mutable = true -} - - -data "coder_parameter" "load_scenarios" { - order = 23 - name = "Load Scenarios" - type = "list(string)" - description = "The load scenarios to run." - mutable = true - ephemeral = true - default = jsonencode([ - "SSH Traffic", - "Web Terminal Traffic", - "App Traffic", - "Dashboard Traffic", - ]) -} - -data "coder_parameter" "load_scenario_run_concurrently" { - order = 24 - name = "Run Load Scenarios Concurrently" - type = "bool" - default = false - description = "Run all load scenarios concurrently, this setting enables the load scenario percentages so that they can be assigned a percentage of 1-100%." - mutable = true -} - -data "coder_parameter" "load_scenario_concurrency_stagger_delay_mins" { - order = 25 - name = "Load Scenario Concurrency Stagger Delay" - type = "number" - default = 3 - description = "The number of minutes to wait between starting each load scenario when run concurrently." - mutable = true -} - -data "coder_parameter" "load_scenario_ssh_traffic_duration" { - order = 30 - name = "SSH Traffic Duration" - type = "number" - description = "The duration of the SSH traffic load scenario in minutes." - mutable = true - default = 30 - validation { - min = 1 - max = 1440 // 24 hours. - } -} - -data "coder_parameter" "load_scenario_ssh_bytes_per_tick" { - order = 31 - name = "SSH Bytes Per Tick" - type = "number" - description = "The number of bytes to send per tick in the SSH traffic load scenario." - mutable = true - default = 1024 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_ssh_tick_interval" { - order = 32 - name = "SSH Tick Interval" - type = "number" - description = "The number of milliseconds between each tick in the SSH traffic load scenario." - mutable = true - default = 100 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_ssh_traffic_percentage" { - order = 33 - name = "SSH Traffic Percentage" - type = "number" - description = "The percentage of workspaces that should be targeted for SSH traffic." - mutable = true - default = 100 - validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "load_scenario_web_terminal_traffic_duration" { - order = 40 - name = "Web Terminal Traffic Duration" - type = "number" - description = "The duration of the web terminal traffic load scenario in minutes." - mutable = true - default = 30 - validation { - min = 1 - max = 1440 // 24 hours. - } -} - -data "coder_parameter" "load_scenario_web_terminal_bytes_per_tick" { - order = 41 - name = "Web Terminal Bytes Per Tick" - type = "number" - description = "The number of bytes to send per tick in the web terminal traffic load scenario." - mutable = true - default = 1024 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_web_terminal_tick_interval" { - order = 42 - name = "Web Terminal Tick Interval" - type = "number" - description = "The number of milliseconds between each tick in the web terminal traffic load scenario." 
- mutable = true - default = 100 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_web_terminal_traffic_percentage" { - order = 43 - name = "Web Terminal Traffic Percentage" - type = "number" - description = "The percentage of workspaces that should be targeted for web terminal traffic." - mutable = true - default = 100 - validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "load_scenario_app_traffic_duration" { - order = 50 - name = "App Traffic Duration" - type = "number" - description = "The duration of the app traffic load scenario in minutes." - mutable = true - default = 30 - validation { - min = 1 - max = 1440 // 24 hours. - } -} - -data "coder_parameter" "load_scenario_app_bytes_per_tick" { - order = 51 - name = "App Bytes Per Tick" - type = "number" - description = "The number of bytes to send per tick in the app traffic load scenario." - mutable = true - default = 1024 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_app_tick_interval" { - order = 52 - name = "App Tick Interval" - type = "number" - description = "The number of milliseconds between each tick in the app traffic load scenario." - mutable = true - default = 100 - validation { - min = 1 - } -} - -data "coder_parameter" "load_scenario_app_traffic_percentage" { - order = 53 - name = "App Traffic Percentage" - type = "number" - description = "The percentage of workspaces that should be targeted for app traffic." - mutable = true - default = 100 - validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "load_scenario_app_traffic_mode" { - order = 54 - name = "App Traffic Mode" - default = "wsec" - description = "The mode of the app traffic load scenario." - mutable = true - option { - name = "WebSocket Echo" - value = "wsec" - description = "Send traffic to the workspace via the app websocket and read it back." - } - option { - name = "WebSocket Read (Random)" - value = "wsra" - description = "Read traffic from the workspace via the app websocket." - } - option { - name = "WebSocket Write (Discard)" - value = "wsdi" - description = "Send traffic to the workspace via the app websocket." - } -} - -data "coder_parameter" "load_scenario_dashboard_traffic_duration" { - order = 60 - name = "Dashboard Traffic Duration" - type = "number" - description = "The duration of the dashboard traffic load scenario in minutes." - mutable = true - default = 30 - validation { - min = 1 - max = 1440 // 24 hours. - } -} - -data "coder_parameter" "load_scenario_dashboard_traffic_percentage" { - order = 61 - name = "Dashboard Traffic Percentage" - type = "number" - description = "The percentage of users that should be targeted for dashboard traffic." - mutable = true - default = 100 - validation { - min = 1 - max = 100 - } -} - -data "coder_parameter" "load_scenario_baseline_duration" { - order = 100 - name = "Baseline Wait Duration" - type = "number" - description = "The duration to wait before starting a load scenario in minutes." - mutable = true - default = 5 - validation { - min = 0 - max = 60 - } -} - -data "coder_parameter" "greedy_agent" { - order = 200 - type = "bool" - name = "Greedy Agent" - default = false - description = "If true, the agent will attempt to consume all available resources." 
- mutable = true - ephemeral = true -} - -data "coder_parameter" "greedy_agent_template" { - order = 201 - name = "Greedy Agent Template" - display_name = "Greedy Agent Template" - description = "The template used for the greedy agent workspace (must not be same as workspace template)." - default = "kubernetes-medium-greedy" - icon = "/emojis/1f4dc.png" # Scroll. - mutable = true - option { - name = "Minimal" - value = "kubernetes-minimal" # Feather. - icon = "/emojis/1fab6.png" - description = "Sized to fit approx. 32 per t2d-standard-8 instance." - } - option { - name = "Small" - value = "kubernetes-small" - icon = "/emojis/1f42d.png" # Mouse. - description = "Provisions a small-sized workspace with no persistent storage." - } - option { - name = "Medium" - value = "kubernetes-medium" - icon = "/emojis/1f436.png" # Dog. - description = "Provisions a medium-sized workspace with no persistent storage." - } - option { - name = "Medium (Greedy)" - value = "kubernetes-medium-greedy" - icon = "/emojis/1f436.png" # Dog. - description = "Provisions a medium-sized workspace with no persistent storage. Greedy agent variant." - } - option { - name = "Large" - value = "kubernetes-large" - icon = "/emojis/1f434.png" # Horse. - description = "Provisions a large-sized workspace with no persistent storage." - } -} - -data "coder_parameter" "namespace" { - order = 999 - type = "string" - name = "Namespace" - default = "coder-big" - description = "The Kubernetes namespace to create the scaletest runner resources in." -} - -data "archive_file" "scripts_zip" { - type = "zip" - output_path = "${path.module}/scripts.zip" - source_dir = "${path.module}/scripts" -} - -resource "coder_agent" "main" { - arch = data.coder_provisioner.me.arch - dir = local.scaletest_run_dir - os = "linux" - env = { - VERBOSE : data.coder_parameter.verbose.value ? "1" : "0", - DRY_RUN : data.coder_parameter.dry_run.value ? "1" : "0", - CODER_CONFIG_DIR : "/home/coder/.config/coderv2", - CODER_USER_TOKEN : data.coder_workspace.me.owner_session_token, - CODER_URL : data.coder_workspace.me.access_url, - CODER_USER : data.coder_workspace.me.owner, - CODER_WORKSPACE : data.coder_workspace.me.name, - - # Global scaletest envs that may affect each `coder exp scaletest` invocation. - CODER_SCALETEST_PROMETHEUS_ADDRESS : "0.0.0.0:21112", - CODER_SCALETEST_PROMETHEUS_WAIT : "60s", - CODER_SCALETEST_CONCURRENCY : "${data.coder_parameter.job_concurrency.value}", - CODER_SCALETEST_CLEANUP_CONCURRENCY : "${data.coder_parameter.cleanup_concurrency.value}", - - # Expose as params as well, for reporting (TODO(mafredri): refactor, only have one). - SCALETEST_PARAM_SCALETEST_CONCURRENCY : "${data.coder_parameter.job_concurrency.value}", - SCALETEST_PARAM_SCALETEST_CLEANUP_CONCURRENCY : "${data.coder_parameter.cleanup_concurrency.value}", - - # Local envs passed as arguments to `coder exp scaletest` invocations. - SCALETEST_RUN_ID : local.scaletest_run_id, - SCALETEST_RUN_DIR : local.scaletest_run_dir, - SCALETEST_RUN_START_TIME : local.scaletest_run_start_time, - SCALETEST_PROMETHEUS_START_PORT : "21112", - - # Comment is a scaletest param, but we want to surface it separately from - # the rest, so we use a different name. - SCALETEST_COMMENT : data.coder_parameter.comment.value != "" ? 
data.coder_parameter.comment.value : "No comment provided", - - SCALETEST_PARAM_TEMPLATE : data.coder_parameter.workspace_template.value, - SCALETEST_PARAM_REPO_BRANCH : data.coder_parameter.repo_branch.value, - SCALETEST_PARAM_NUM_WORKSPACES : data.coder_parameter.num_workspaces.value, - SCALETEST_PARAM_SKIP_CREATE_WORKSPACES : data.coder_parameter.skip_create_workspaces.value ? "1" : "0", - SCALETEST_PARAM_CREATE_CONCURRENCY : "${data.coder_parameter.create_concurrency.value}", - SCALETEST_PARAM_CLEANUP_STRATEGY : data.coder_parameter.cleanup_strategy.value, - SCALETEST_PARAM_CLEANUP_PREPARE : data.coder_parameter.cleanup_prepare.value ? "1" : "0", - SCALETEST_PARAM_LOAD_SCENARIOS : data.coder_parameter.load_scenarios.value, - SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY : data.coder_parameter.load_scenario_run_concurrently.value ? "1" : "0", - SCALETEST_PARAM_LOAD_SCENARIO_CONCURRENCY_STAGGER_DELAY_MINS : "${data.coder_parameter.load_scenario_concurrency_stagger_delay_mins.value}", - SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION : "${data.coder_parameter.load_scenario_ssh_traffic_duration.value}", - SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_BYTES_PER_TICK : "${data.coder_parameter.load_scenario_ssh_bytes_per_tick.value}", - SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_TICK_INTERVAL : "${data.coder_parameter.load_scenario_ssh_tick_interval.value}", - SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_PERCENTAGE : "${data.coder_parameter.load_scenario_ssh_traffic_percentage.value}", - SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION : "${data.coder_parameter.load_scenario_web_terminal_traffic_duration.value}", - SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_BYTES_PER_TICK : "${data.coder_parameter.load_scenario_web_terminal_bytes_per_tick.value}", - SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_TICK_INTERVAL : "${data.coder_parameter.load_scenario_web_terminal_tick_interval.value}", - SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_PERCENTAGE : "${data.coder_parameter.load_scenario_web_terminal_traffic_percentage.value}", - SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION : "${data.coder_parameter.load_scenario_app_traffic_duration.value}", - SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_BYTES_PER_TICK : "${data.coder_parameter.load_scenario_app_bytes_per_tick.value}", - SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_TICK_INTERVAL : "${data.coder_parameter.load_scenario_app_tick_interval.value}", - SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_PERCENTAGE : "${data.coder_parameter.load_scenario_app_traffic_percentage.value}", - SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_MODE : data.coder_parameter.load_scenario_app_traffic_mode.value, - SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_DURATION : "${data.coder_parameter.load_scenario_dashboard_traffic_duration.value}", - SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_PERCENTAGE : "${data.coder_parameter.load_scenario_dashboard_traffic_percentage.value}", - SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION : "${data.coder_parameter.load_scenario_baseline_duration.value}", - SCALETEST_PARAM_GREEDY_AGENT : data.coder_parameter.greedy_agent.value ? 
"1" : "0", - SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE : data.coder_parameter.greedy_agent_template.value, - - GRAFANA_URL : local.grafana_url, - - SCRIPTS_ZIP : filebase64(data.archive_file.scripts_zip.output_path), - SCRIPTS_DIR : "/tmp/scripts", - } - display_apps { - vscode = false - ssh_helper = false - } - startup_script_timeout = 86400 - shutdown_script_timeout = 7200 - startup_script_behavior = "blocking" - startup_script = file("startup.sh") - shutdown_script = file("shutdown.sh") - - # IDEA(mafredri): It would be pretty cool to define metadata to expect JSON output, each field/item could become a separate metadata item. - # Scaletest metadata. - metadata { - display_name = "Scaletest status" - key = "00_scaletest_status" - script = file("metadata_status.sh") - interval = 1 - timeout = 1 - } - - metadata { - display_name = "Scaletest phase" - key = "01_scaletest_phase" - script = file("metadata_phase.sh") - interval = 1 - timeout = 1 - } - - metadata { - display_name = "Scaletest phase (previous)" - key = "02_scaletest_previous_phase" - script = file("metadata_previous_phase.sh") - interval = 1 - timeout = 1 - } - - # Misc workspace metadata. - metadata { - display_name = "CPU Usage" - key = "80_cpu_usage" - script = "coder stat cpu" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "RAM Usage" - key = "81_ram_usage" - script = "coder stat mem" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Home Disk" - key = "82_home_disk" - script = "coder stat disk --path $${HOME}" - interval = 60 - timeout = 1 - } - - metadata { - display_name = "CPU Usage (Host)" - key = "83_cpu_usage_host" - script = "coder stat cpu --host" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Memory Usage (Host)" - key = "84_mem_usage_host" - script = "coder stat mem --host" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Load Average (Host)" - key = "85_load_host" - # Get load avg scaled by number of cores. 
- script = <<-EOS - echo "`cat /proc/loadavg | awk '{ print $1 }'` `nproc`" | awk '{ printf "%0.2f", $1/$2 }' - EOS - interval = 60 - timeout = 1 - } -} - -module "code-server" { - source = "https://registry.coder.com/modules/code-server" - agent_id = coder_agent.main.id - install_version = "4.8.3" - folder = local.scaletest_run_dir -} - -module "filebrowser" { - source = "https://registry.coder.com/modules/filebrowser" - agent_id = coder_agent.main.id - folder = local.scaletest_run_dir -} - -resource "coder_app" "grafana" { - agent_id = coder_agent.main.id - slug = "00-grafana" - display_name = "Grafana" - url = "${local.grafana_url}/d/${local.grafana_dashboard_uid}/${local.grafana_dashboard_name}?orgId=1&from=${time_static.start_time.unix * 1000}&to=now" - icon = "https://grafana.com/static/assets/img/fav32.png" - external = true -} - -resource "coder_app" "prometheus" { - agent_id = coder_agent.main.id - slug = "01-prometheus" - display_name = "Prometheus" - url = "https://grafana.corp.tld:9443" - icon = "https://prometheus.io/assets/favicons/favicon-32x32.png" - external = true -} - -resource "coder_app" "manual_cleanup" { - agent_id = coder_agent.main.id - slug = "02-manual-cleanup" - display_name = "Manual cleanup" - icon = "/emojis/1f9f9.png" - command = "/tmp/scripts/cleanup.sh manual" -} - -resource "kubernetes_persistent_volume_claim" "home" { - depends_on = [null_resource.permission_check] - metadata { - name = "${local.workspace_pod_name}-home" - namespace = data.coder_parameter.namespace.value - labels = { - "app.kubernetes.io/name" = "coder-pvc" - "app.kubernetes.io/instance" = "coder-pvc-${lower(data.coder_workspace.me.owner)}-${lower(data.coder_workspace.me.name)}" - "app.kubernetes.io/part-of" = "coder" - // Coder specific labels. - "com.coder.resource" = "true" - "com.coder.workspace.id" = data.coder_workspace.me.id - "com.coder.workspace.name" = data.coder_workspace.me.name - "com.coder.user.id" = data.coder_workspace.me.owner_id - "com.coder.user.username" = data.coder_workspace.me.owner - } - annotations = { - "com.coder.user.email" = data.coder_workspace.me.owner_email - } - } - wait_until_bound = false - spec { - access_modes = ["ReadWriteOnce"] - resources { - requests = { - storage = "${local.home_disk_size}Gi" - } - } - } -} - -resource "kubernetes_pod" "main" { - depends_on = [null_resource.permission_check] - count = data.coder_workspace.me.start_count - metadata { - name = local.workspace_pod_name - namespace = data.coder_parameter.namespace.value - labels = { - "app.kubernetes.io/name" = "coder-workspace" - "app.kubernetes.io/instance" = local.workspace_pod_instance - "app.kubernetes.io/part-of" = "coder" - // Coder specific labels. - "com.coder.resource" = "true" - "com.coder.workspace.id" = data.coder_workspace.me.id - "com.coder.workspace.name" = data.coder_workspace.me.name - "com.coder.user.id" = data.coder_workspace.me.owner_id - "com.coder.user.username" = data.coder_workspace.me.owner - } - annotations = { - "com.coder.user.email" = data.coder_workspace.me.owner_email - } - } - # Set the pod delete timeout to termination_grace_period_seconds + 1m. - timeouts { - delete = "${(local.workspace_pod_termination_grace_period_seconds + 120)}s" - } - spec { - security_context { - run_as_user = "1000" - fs_group = "1000" - } - - # Allow this pod to perform scale tests. - service_account_name = local.service_account_name - - # Allow the coder agent to perform graceful shutdown and cleanup of - # scaletest resources. 
We add an extra minute so ensure work - # completion is prioritized over timeout. - termination_grace_period_seconds = local.workspace_pod_termination_grace_period_seconds + 60 - - container { - name = "dev" - image = "gcr.io/coder-dev-1/scaletest-runner:latest" - image_pull_policy = "Always" - command = ["sh", "-c", coder_agent.main.init_script] - security_context { - run_as_user = "1000" - } - env { - name = "CODER_AGENT_TOKEN" - value = coder_agent.main.token - } - env { - name = "CODER_AGENT_LOG_DIR" - value = "${local.scaletest_run_dir}/logs" - } - env { - name = "GRAFANA_API_TOKEN" - value_from { - secret_key_ref { - name = data.kubernetes_secret.grafana_editor_api_token.metadata[0].name - key = "token" - } - } - } - env { - name = "SLACK_WEBHOOK_URL" - value_from { - secret_key_ref { - name = data.kubernetes_secret.slack_scaletest_notifications_webhook_url.metadata[0].name - key = "url" - } - } - } - resources { - requests = { - "cpu" = "250m" - "memory" = "512Mi" - } - } - volume_mount { - mount_path = "/home/coder" - name = "home" - read_only = false - } - dynamic "port" { - for_each = data.coder_parameter.load_scenario_run_concurrently.value ? jsondecode(data.coder_parameter.load_scenarios.value) : [""] - iterator = it - content { - container_port = 21112 + it.key - name = "prom-http${it.key}" - protocol = "TCP" - } - } - } - - volume { - name = "home" - persistent_volume_claim { - claim_name = kubernetes_persistent_volume_claim.home.metadata.0.name - read_only = false - } - } - - affinity { - pod_anti_affinity { - // This affinity attempts to spread out all workspace pods evenly across - // nodes. - preferred_during_scheduling_ignored_during_execution { - weight = 1 - pod_affinity_term { - topology_key = "kubernetes.io/hostname" - label_selector { - match_expressions { - key = "app.kubernetes.io/name" - operator = "In" - values = ["coder-workspace"] - } - } - } - } - } - node_affinity { - required_during_scheduling_ignored_during_execution { - node_selector_term { - match_expressions { - key = "cloud.google.com/gke-nodepool" - operator = "In" - values = ["big-workspacetraffic"] # Avoid placing on the same nodes as scaletest workspaces. - } - } - } - } - } - } -} - -data "kubernetes_secret" "grafana_editor_api_token" { - metadata { - name = "grafana-editor-api-token" - namespace = data.coder_parameter.namespace.value - } -} - -data "kubernetes_secret" "slack_scaletest_notifications_webhook_url" { - metadata { - name = "slack-scaletest-notifications-webhook-url" - namespace = data.coder_parameter.namespace.value - } -} - -resource "kubernetes_manifest" "pod_monitor" { - count = data.coder_workspace.me.start_count - manifest = { - apiVersion = "monitoring.coreos.com/v1" - kind = "PodMonitor" - metadata = { - namespace = data.coder_parameter.namespace.value - name = "podmonitor-${local.workspace_pod_name}" - } - spec = { - selector = { - matchLabels = { - "app.kubernetes.io/instance" : local.workspace_pod_instance - } - } - podMetricsEndpoints = [ - # NOTE(mafredri): We could add more information here by including the - # scenario name in the port name (although it's limited to 15 chars so - # it needs to be short). That said, someone looking at the stats can - # assume that there's a 1-to-1 mapping between scenario# and port. - for i, _ in data.coder_parameter.load_scenario_run_concurrently.value ? 
jsondecode(data.coder_parameter.load_scenarios.value) : [""] : { - port = "prom-http${i}" - interval = "15s" - } - ] - } - } -} diff --git a/examples/scaletests/scaletest-runner/metadata_phase.sh b/examples/scaletests/scaletest-runner/metadata_phase.sh deleted file mode 100755 index 755a8ba084db7..0000000000000 --- a/examples/scaletests/scaletest-runner/metadata_phase.sh +++ /dev/null @@ -1,6 +0,0 @@ -#!/bin/bash - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -get_phase diff --git a/examples/scaletests/scaletest-runner/metadata_previous_phase.sh b/examples/scaletests/scaletest-runner/metadata_previous_phase.sh deleted file mode 100755 index c858687b72ad8..0000000000000 --- a/examples/scaletests/scaletest-runner/metadata_previous_phase.sh +++ /dev/null @@ -1,6 +0,0 @@ -#!/bin/bash - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" 2>/dev/null || return - -get_previous_phase diff --git a/examples/scaletests/scaletest-runner/metadata_status.sh b/examples/scaletests/scaletest-runner/metadata_status.sh deleted file mode 100755 index 8ec45f0875c1d..0000000000000 --- a/examples/scaletests/scaletest-runner/metadata_status.sh +++ /dev/null @@ -1,6 +0,0 @@ -#!/bin/bash - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" 2>/dev/null || return - -get_status diff --git a/examples/scaletests/scaletest-runner/scripts/cleanup.sh b/examples/scaletests/scaletest-runner/scripts/cleanup.sh deleted file mode 100755 index c80982497b5e9..0000000000000 --- a/examples/scaletests/scaletest-runner/scripts/cleanup.sh +++ /dev/null @@ -1,62 +0,0 @@ -#!/bin/bash -set -euo pipefail - -[[ $VERBOSE == 1 ]] && set -x - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -event=${1:-} - -if [[ -z $event ]]; then - event=manual -fi - -do_cleanup() { - start_phase "Cleanup (${event})" - coder exp scaletest cleanup \ - --cleanup-job-timeout 2h \ - --cleanup-timeout 5h | - tee "${SCALETEST_RESULTS_DIR}/cleanup-${event}.txt" - end_phase -} - -do_scaledown() { - start_phase "Scale down provisioners (${event})" - maybedryrun "$DRY_RUN" kubectl scale deployment/coder-provisioner --replicas 1 - maybedryrun "$DRY_RUN" kubectl rollout status deployment/coder-provisioner - end_phase -} - -case "${event}" in -manual) - echo -n 'WARNING: This will clean up all scaletest resources, continue? (y/n) ' - read -r -n 1 - if [[ $REPLY != [yY] ]]; then - echo $'\nAborting...' - exit 1 - fi - echo - - do_cleanup - do_scaledown - - echo 'Press any key to continue...' - read -s -r -n 1 - ;; -prepare) - do_cleanup - ;; -on_stop) ;; # Do nothing, handled by "shutdown". -always | on_success | on_error | shutdown) - do_cleanup - do_scaledown - ;; -shutdown_scale_down_only) - do_scaledown - ;; -*) - echo "Unknown event: ${event}" >&2 - exit 1 - ;; -esac diff --git a/examples/scaletests/scaletest-runner/scripts/lib.sh b/examples/scaletests/scaletest-runner/scripts/lib.sh deleted file mode 100644 index 868dd5c078d2e..0000000000000 --- a/examples/scaletests/scaletest-runner/scripts/lib.sh +++ /dev/null @@ -1,313 +0,0 @@ -#!/bin/bash -set -euo pipefail - -# Only source this script once, this env comes from sourcing -# scripts/lib.sh from coder/coder below. -if [[ ${SCRIPTS_LIB_IS_SOURCED:-0} == 1 ]]; then - return 0 -fi - -# Source scripts/lib.sh from coder/coder for common functions. 
-# shellcheck source=scripts/lib.sh -. "${HOME}/coder/scripts/lib.sh" - -# Make shellcheck happy. -DRY_RUN=${DRY_RUN:-0} - -# Environment variables shared between scripts. -SCALETEST_STATE_DIR="${SCALETEST_RUN_DIR}/state" -SCALETEST_PHASE_FILE="${SCALETEST_STATE_DIR}/phase" -# shellcheck disable=SC2034 -SCALETEST_RESULTS_DIR="${SCALETEST_RUN_DIR}/results" -SCALETEST_LOGS_DIR="${SCALETEST_RUN_DIR}/logs" -SCALETEST_PPROF_DIR="${SCALETEST_RUN_DIR}/pprof" -# https://github.com/kubernetes/kubernetes/issues/72501 :-( -SCALETEST_CODER_BINARY="/tmp/coder-full-${SCALETEST_RUN_ID}" - -mkdir -p "${SCALETEST_STATE_DIR}" "${SCALETEST_RESULTS_DIR}" "${SCALETEST_LOGS_DIR}" "${SCALETEST_PPROF_DIR}" - -coder() { - if [[ ! -x "${SCALETEST_CODER_BINARY}" ]]; then - log "Fetching full coder binary..." - fetch_coder_full - fi - maybedryrun "${DRY_RUN}" "${SCALETEST_CODER_BINARY}" "${@}" -} - -show_json() { - maybedryrun "${DRY_RUN}" jq 'del(.. | .logs?)' "${1}" -} - -set_status() { - dry_run= - if [[ ${DRY_RUN} == 1 ]]; then - dry_run=" (dry-run)" - fi - prev_status=$(get_status) - if [[ ${prev_status} != *"Not started"* ]]; then - annotate_grafana_end "status" "Status: ${prev_status}" - fi - echo "$(date -Ins) ${*}${dry_run}" >>"${SCALETEST_STATE_DIR}/status" - - annotate_grafana "status" "Status: ${*}" - - status_lower=$(tr '[:upper:]' '[:lower:]' <<<"${*}") - set_pod_status_annotation "${status_lower}" -} -lock_status() { - chmod 0440 "${SCALETEST_STATE_DIR}/status" -} -get_status() { - # Order of importance (reverse of creation). - if [[ -f "${SCALETEST_STATE_DIR}/status" ]]; then - tail -n1 "${SCALETEST_STATE_DIR}/status" | cut -d' ' -f2- - else - echo "Not started" - fi -} - -phase_num=0 -start_phase() { - # This may be incremented from another script, so we read it every time. 
- if [[ -f "${SCALETEST_PHASE_FILE}" ]]; then - phase_num=$(grep -c START: "${SCALETEST_PHASE_FILE}") - fi - phase_num=$((phase_num + 1)) - log "Start phase ${phase_num}: ${*}" - echo "$(date -Ins) START:${phase_num}: ${*}" >>"${SCALETEST_PHASE_FILE}" - - GRAFANA_EXTRA_TAGS="${PHASE_TYPE:-phase-default}" annotate_grafana "phase" "Phase ${phase_num}: ${*}" -} -end_phase() { - phase=$(tail -n 1 "${SCALETEST_PHASE_FILE}" | grep "START:${phase_num}:" | cut -d' ' -f3-) - if [[ -z ${phase} ]]; then - log "BUG: Could not find start phase ${phase_num} in ${SCALETEST_PHASE_FILE}" - return 1 - fi - log "End phase ${phase_num}: ${phase}" - echo "$(date -Ins) END:${phase_num}: ${phase}" >>"${SCALETEST_PHASE_FILE}" - - GRAFANA_EXTRA_TAGS="${PHASE_TYPE:-phase-default}" GRAFANA_ADD_TAGS="${PHASE_ADD_TAGS:-}" annotate_grafana_end "phase" "Phase ${phase_num}: ${phase}" -} -get_phase() { - if [[ -f "${SCALETEST_PHASE_FILE}" ]]; then - phase_raw=$(tail -n1 "${SCALETEST_PHASE_FILE}") - phase=$(echo "${phase_raw}" | cut -d' ' -f3-) - if [[ ${phase_raw} == *"END:"* ]]; then - phase+=" [done]" - fi - echo "${phase}" - else - echo "None" - fi -} -get_previous_phase() { - if [[ -f "${SCALETEST_PHASE_FILE}" ]] && [[ $(grep -c START: "${SCALETEST_PHASE_FILE}") -gt 1 ]]; then - grep START: "${SCALETEST_PHASE_FILE}" | tail -n2 | head -n1 | cut -d' ' -f3- - else - echo "None" - fi -} - -annotate_grafana() { - local tags=${1} text=${2} start=${3:-$(($(date +%s) * 1000))} - local json resp id - - if [[ -z $tags ]]; then - tags="scaletest,runner" - else - tags="scaletest,runner,${tags}" - fi - if [[ -n ${GRAFANA_EXTRA_TAGS:-} ]]; then - tags="${tags},${GRAFANA_EXTRA_TAGS}" - fi - - log "Annotating Grafana (start=${start}): ${text} [${tags}]" - - json="$( - jq \ - --argjson time "${start}" \ - --arg text "${text}" \ - --arg tags "${tags}" \ - '{time: $time, tags: $tags | split(","), text: $text}' <<<'{}' - )" - if [[ ${DRY_RUN} == 1 ]]; then - echo "FAKEID:${tags}:${text}:${start}" >>"${SCALETEST_STATE_DIR}/grafana-annotations" - log "Would have annotated Grafana, data=${json}" - return 0 - fi - if ! resp="$( - curl -sSL \ - --insecure \ - -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \ - -H "Content-Type: application/json" \ - -d "${json}" \ - "${GRAFANA_URL}/api/annotations" - )"; then - # Don't abort scaletest just because we couldn't annotate Grafana. - log "Failed to annotate Grafana: ${resp}" - return 0 - fi - - if [[ $(jq -r '.message' <<<"${resp}") != "Annotation added" ]]; then - log "Failed to annotate Grafana: ${resp}" - return 0 - fi - - log "Grafana annotation added!" - - id="$(jq -r '.id' <<<"${resp}")" - echo "${id}:${tags}:${text}:${start}" >>"${SCALETEST_STATE_DIR}/grafana-annotations" -} -annotate_grafana_end() { - local tags=${1} text=${2} start=${3:-} end=${4:-$(($(date +%s) * 1000))} - local id json resp - - if [[ -z $tags ]]; then - tags="scaletest,runner" - else - tags="scaletest,runner,${tags}" - fi - if [[ -n ${GRAFANA_EXTRA_TAGS:-} ]]; then - tags="${tags},${GRAFANA_EXTRA_TAGS}" - fi - - if ! id=$(grep ":${tags}:${text}:${start}" "${SCALETEST_STATE_DIR}/grafana-annotations" | sort -n | tail -n1 | cut -d: -f1); then - log "NOTICE: Could not find Grafana annotation to end: '${tags}:${text}:${start}', skipping..." 
- return 0 - fi - - log "Updating Grafana annotation (end=${end}): ${text} [${tags}, add=${GRAFANA_ADD_TAGS:-}]" - - if [[ -n ${GRAFANA_ADD_TAGS:-} ]]; then - json="$( - jq -n \ - --argjson timeEnd "${end}" \ - --arg tags "${tags},${GRAFANA_ADD_TAGS}" \ - '{timeEnd: $timeEnd, tags: $tags | split(",")}' - )" - else - json="$( - jq -n \ - --argjson timeEnd "${end}" \ - '{timeEnd: $timeEnd}' - )" - fi - if [[ ${DRY_RUN} == 1 ]]; then - log "Would have patched Grafana annotation: id=${id}, data=${json}" - return 0 - fi - if ! resp="$( - curl -sSL \ - --insecure \ - -H "Authorization: Bearer ${GRAFANA_API_TOKEN}" \ - -H "Content-Type: application/json" \ - -X PATCH \ - -d "${json}" \ - "${GRAFANA_URL}/api/annotations/${id}" - )"; then - # Don't abort scaletest just because we couldn't annotate Grafana. - log "Failed to annotate Grafana end: ${resp}" - return 0 - fi - - if [[ $(jq -r '.message' <<<"${resp}") != "Annotation patched" ]]; then - log "Failed to annotate Grafana end: ${resp}" - return 0 - fi - - log "Grafana annotation patched!" -} - -wait_baseline() { - s=${1:-2} - PHASE_TYPE="phase-wait" start_phase "Waiting ${s}m to establish baseline" - maybedryrun "$DRY_RUN" sleep $((s * 60)) - PHASE_TYPE="phase-wait" end_phase -} - -get_appearance() { - session_token=$CODER_USER_TOKEN - if [[ -f "${CODER_CONFIG_DIR}/session" ]]; then - session_token="$(<"${CODER_CONFIG_DIR}/session")" - fi - curl -sSL \ - -H "Coder-Session-Token: ${session_token}" \ - "${CODER_URL}/api/v2/appearance" -} -set_appearance() { - local json=$1 color=$2 message=$3 - - session_token=$CODER_USER_TOKEN - if [[ -f "${CODER_CONFIG_DIR}/session" ]]; then - session_token="$(<"${CODER_CONFIG_DIR}/session")" - fi - newjson="$( - jq \ - --arg color "${color}" \ - --arg message "${message}" \ - '. | .service_banner.message |= $message | .service_banner.background_color |= $color' <<<"${json}" - )" - maybedryrun "${DRY_RUN}" curl -sSL \ - -X PUT \ - -H 'Content-Type: application/json' \ - -H "Coder-Session-Token: ${session_token}" \ - --data "${newjson}" \ - "${CODER_URL}/api/v2/appearance" -} - -namespace() { - cat /var/run/secrets/kubernetes.io/serviceaccount/namespace -} -coder_pods() { - kubectl get pods \ - --namespace "$(namespace)" \ - --selector "app.kubernetes.io/name=coder,app.kubernetes.io/part-of=coder" \ - --output jsonpath='{.items[*].metadata.name}' -} - -# fetch_coder_full fetches the full (non-slim) coder binary from one of the coder pods -# running in the same namespace as the current pod. -fetch_coder_full() { - if [[ -x "${SCALETEST_CODER_BINARY}" ]]; then - log "Full Coder binary already exists at ${SCALETEST_CODER_BINARY}" - return 0 - fi - ns=$(namespace) - if [[ -z "${ns}" ]]; then - log "Could not determine namespace!" - return 1 - fi - log "Namespace from serviceaccount token is ${ns}" - pods=$(coder_pods) - if [[ -z ${pods} ]]; then - log "Could not find coder pods!" - return 1 - fi - pod=$(cut -d ' ' -f 1 <<<"${pods}") - if [[ -z ${pod} ]]; then - log "Could not find coder pod!" 
- return 1 - fi - log "Fetching full Coder binary from ${pod}" - # We need --retries due to https://github.com/kubernetes/kubernetes/issues/60140 :( - maybedryrun "${DRY_RUN}" kubectl \ - --namespace "${ns}" \ - cp \ - --container coder \ - --retries 10 \ - "${pod}:/opt/coder" "${SCALETEST_CODER_BINARY}" - maybedryrun "${DRY_RUN}" chmod +x "${SCALETEST_CODER_BINARY}" - log "Full Coder binary downloaded to ${SCALETEST_CODER_BINARY}" -} - -# set_pod_status_annotation annotates the currently running pod with the key -# com.coder.scaletest.status. It will overwrite the previous status. -set_pod_status_annotation() { - if [[ $# -ne 1 ]]; then - log "BUG: Must specify an annotation value" - return 1 - else - maybedryrun "${DRY_RUN}" kubectl --namespace "$(namespace)" annotate pod "$(hostname)" "com.coder.scaletest.status=$1" --overwrite - fi -} diff --git a/examples/scaletests/scaletest-runner/scripts/prepare.sh b/examples/scaletests/scaletest-runner/scripts/prepare.sh deleted file mode 100755 index 90b2dd05f945f..0000000000000 --- a/examples/scaletests/scaletest-runner/scripts/prepare.sh +++ /dev/null @@ -1,67 +0,0 @@ -#!/bin/bash -set -euo pipefail - -[[ $VERBOSE == 1 ]] && set -x - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -mkdir -p "${SCALETEST_STATE_DIR}" -mkdir -p "${SCALETEST_RESULTS_DIR}" - -log "Preparing scaletest workspace environment..." -set_status Preparing - -log "Compressing previous run logs (if applicable)..." -mkdir -p "${HOME}/archive" -for dir in "${HOME}/scaletest-"*; do - if [[ ${dir} = "${SCALETEST_RUN_DIR}" ]]; then - continue - fi - if [[ -d ${dir} ]]; then - name="$(basename "${dir}")" - ( - cd "$(dirname "${dir}")" - ZSTD_CLEVEL=12 maybedryrun "$DRY_RUN" tar --zstd -cf "${HOME}/archive/${name}.tar.zst" "${name}" - ) - maybedryrun "$DRY_RUN" rm -rf "${dir}" - fi -done - -log "Creating coder CLI token (needed for cleanup during shutdown)..." - -mkdir -p "${CODER_CONFIG_DIR}" -echo -n "${CODER_URL}" >"${CODER_CONFIG_DIR}/url" - -set +x # Avoid logging the token. -# Persist configuration for shutdown script too since the -# owner token is invalidated immediately on workspace stop. -export CODER_SESSION_TOKEN=${CODER_USER_TOKEN} -coder tokens delete scaletest_runner >/dev/null 2>&1 || true -# TODO(mafredri): Set TTL? This could interfere with delayed stop though. -token=$(coder tokens create --name scaletest_runner) -if [[ $DRY_RUN == 1 ]]; then - token=${CODER_SESSION_TOKEN} -fi -unset CODER_SESSION_TOKEN -echo -n "${token}" >"${CODER_CONFIG_DIR}/session" -[[ $VERBOSE == 1 ]] && set -x # Restore logging (if enabled). - -if [[ ${SCALETEST_PARAM_CLEANUP_PREPARE} == 1 ]]; then - log "Cleaning up from previous runs (if applicable)..." - "${SCRIPTS_DIR}/cleanup.sh" prepare -fi - -log "Preparation complete!" - -PROVISIONER_REPLICA_COUNT="${SCALETEST_PARAM_CREATE_CONCURRENCY:-0}" -if [[ "${PROVISIONER_REPLICA_COUNT}" -eq 0 ]]; then - # TODO(Cian): what is a good default value here? - echo "Setting PROVISIONER_REPLICA_COUNT to 10 since SCALETEST_PARAM_CREATE_CONCURRENCY is 0" - PROVISIONER_REPLICA_COUNT=10 -fi -log "Scaling up provisioners to ${PROVISIONER_REPLICA_COUNT}..." -maybedryrun "$DRY_RUN" kubectl scale deployment/coder-provisioner \ - --replicas "${PROVISIONER_REPLICA_COUNT}" -log "Waiting for provisioners to scale up..." 
-maybedryrun "$DRY_RUN" kubectl rollout status deployment/coder-provisioner diff --git a/examples/scaletests/scaletest-runner/scripts/report.sh b/examples/scaletests/scaletest-runner/scripts/report.sh deleted file mode 100755 index 0c6a5059ba37d..0000000000000 --- a/examples/scaletests/scaletest-runner/scripts/report.sh +++ /dev/null @@ -1,109 +0,0 @@ -#!/bin/bash -set -euo pipefail - -[[ $VERBOSE == 1 ]] && set -x - -status=$1 -shift - -case "${status}" in -started) ;; -completed) ;; -failed) ;; -*) - echo "Unknown status: ${status}" >&2 - exit 1 - ;; -esac - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -# NOTE(mafredri): API returns HTML if we accidentally use `...//api` vs `.../api`. -# https://github.com/coder/coder/issues/9877 -CODER_URL="${CODER_URL%/}" -buildinfo="$(curl -sSL "${CODER_URL}/api/v2/buildinfo")" -server_version="$(jq -r '.version' <<<"${buildinfo}")" -server_version_commit="$(jq -r '.external_url' <<<"${buildinfo}")" - -# Since `coder show` doesn't support JSON output, we list the workspaces instead. -# Use `command` here to bypass dry run. -workspace_json="$( - command coder list --all --output json | - jq --arg workspace "${CODER_WORKSPACE}" --arg user "${CODER_USER}" 'map(select(.name == $workspace) | select(.owner_name == $user)) | .[0]' -)" -owner_name="$(jq -r '.latest_build.workspace_owner_name' <<<"${workspace_json}")" -workspace_name="$(jq -r '.latest_build.workspace_name' <<<"${workspace_json}")" -initiator_name="$(jq -r '.latest_build.initiator_name' <<<"${workspace_json}")" - -bullet='•' -app_urls_raw="$(jq -r '.latest_build.resources[].agents[]?.apps | map(select(.external == true)) | .[] | .display_name, .url' <<<"${workspace_json}")" -app_urls=() -while read -r app_name; do - read -r app_url - bold= - if [[ ${status} != started ]] && [[ ${app_url} = *to=now* ]]; then - # Update Grafana URL with end stamp and make bold. - app_url="${app_url//to=now/to=$(($(date +%s) * 1000))}" - bold='*' - fi - app_urls+=("${bullet} ${bold}${app_name}${bold}: ${app_url}") -done <<<"${app_urls_raw}" - -params=() -header= - -case "${status}" in -started) - created_at="$(jq -r '.latest_build.created_at' <<<"${workspace_json}")" - params=("${bullet} Options:") - while read -r param; do - params+=(" ${bullet} ${param}") - done <<<"$(jq -r '.latest_build.resources[].agents[]?.environment_variables | to_entries | map(select(.key | startswith("SCALETEST_PARAM_"))) | .[] | "`\(.key)`: `\(.value)`"' <<<"${workspace_json}")" - - header="New scaletest started at \`${created_at}\` by \`${initiator_name}\` on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." - ;; -completed) - completed_at=$(date -Iseconds) - header="Scaletest completed at \`${completed_at}\` (started by \`${initiator_name}\`) on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." - ;; -failed) - failed_at=$(date -Iseconds) - header="Scaletest failed at \`${failed_at}\` (started by \`${initiator_name}\`) on ${CODER_URL} (<${server_version_commit}|\`${server_version}\`>)." 
- ;; -*) - echo "Unknown status: ${status}" >&2 - exit 1 - ;; -esac - -text_arr=( - "${header}" - "" - "${bullet} *Comment:* ${SCALETEST_COMMENT}" - "${bullet} Workspace (runner): ${CODER_URL}/@${owner_name}/${workspace_name}" - "${bullet} Run ID: ${SCALETEST_RUN_ID}" - "${app_urls[@]}" - "${params[@]}" -) - -text= -for field in "${text_arr[@]}"; do - text+="${field}"$'\n' -done - -json=$( - jq -n --arg text "${text}" '{ - blocks: [ - { - "type": "section", - "text": { - "type": "mrkdwn", - "text": $text - } - } - ] - }' -) - -maybedryrun "${DRY_RUN}" curl -X POST -H 'Content-type: application/json' --data "${json}" "${SLACK_WEBHOOK_URL}" diff --git a/examples/scaletests/scaletest-runner/scripts/run.sh b/examples/scaletests/scaletest-runner/scripts/run.sh deleted file mode 100755 index 47a6042a18598..0000000000000 --- a/examples/scaletests/scaletest-runner/scripts/run.sh +++ /dev/null @@ -1,369 +0,0 @@ -#!/bin/bash -set -euo pipefail - -[[ $VERBOSE == 1 ]] && set -x - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -mapfile -t scaletest_load_scenarios < <(jq -r '. | join ("\n")' <<<"${SCALETEST_PARAM_LOAD_SCENARIOS}") -export SCALETEST_PARAM_LOAD_SCENARIOS=("${scaletest_load_scenarios[@]}") - -log "Running scaletest..." -set_status Running - -start_phase "Creating workspaces" -if [[ ${SCALETEST_PARAM_SKIP_CREATE_WORKSPACES} == 0 ]]; then - # Note that we allow up to 5 failures to bring up the workspace, since - # we're creating a lot of workspaces at once and some of them may fail - # due to network issues or other transient errors. - coder exp scaletest create-workspaces \ - --retry 5 \ - --count "${SCALETEST_PARAM_NUM_WORKSPACES}" \ - --template "${SCALETEST_PARAM_TEMPLATE}" \ - --concurrency "${SCALETEST_PARAM_CREATE_CONCURRENCY}" \ - --timeout 5h \ - --job-timeout 5h \ - --no-cleanup \ - --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces.json" - show_json "${SCALETEST_RESULTS_DIR}/create-workspaces.json" -fi -end_phase - -wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" - -non_greedy_agent_traffic_args=() -if [[ ${SCALETEST_PARAM_GREEDY_AGENT} != 1 ]]; then - greedy_agent_traffic() { :; } -else - echo "WARNING: Greedy agent enabled, this may cause the load tests to fail." >&2 - non_greedy_agent_traffic_args=( - # Let the greedy agent traffic command be scraped. - # --scaletest-prometheus-address 0.0.0.0:21113 - # --trace=false - ) - - annotate_grafana greedy_agent "Create greedy agent" - - coder exp scaletest create-workspaces \ - --count 1 \ - --template "${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE}" \ - --concurrency 1 \ - --timeout 5h \ - --job-timeout 5h \ - --no-cleanup \ - --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces-greedy-agent.json" - - wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" - - greedy_agent_traffic() { - local timeout=${1} scenario=${2} - # Run the greedy test for ~1/3 of the timeout. - delay=$((timeout * 60 / 3)) - - local type=web-terminal - args=() - if [[ ${scenario} == "SSH Traffic" ]]; then - type=ssh - args+=(--ssh) - fi - - sleep "${delay}" - annotate_grafana greedy_agent "${scenario}: Greedy agent traffic" - - # Produce load at about 1000MB/s (25MB/40ms). 
- set +e - coder exp scaletest workspace-traffic \ - --template "${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE}" \ - --bytes-per-tick $((1024 * 1024 * 25)) \ - --tick-interval 40ms \ - --timeout "$((delay))s" \ - --job-timeout "$((delay))s" \ - --output json:"${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" \ - --scaletest-prometheus-address 0.0.0.0:21113 \ - --trace=false \ - "${args[@]}" - status=${?} - show_json "${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" - - export GRAFANA_ADD_TAGS= - if [[ ${status} != 0 ]]; then - GRAFANA_ADD_TAGS=error - fi - annotate_grafana_end greedy_agent "${scenario}: Greedy agent traffic" - - return "${status}" - } -fi - -run_scenario_cmd() { - local scenario=${1} - shift - local command=("$@") - - set +e - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then - annotate_grafana scenario "Load scenario: ${scenario}" - fi - "${command[@]}" - status=${?} - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then - export GRAFANA_ADD_TAGS= - if [[ ${status} != 0 ]]; then - GRAFANA_ADD_TAGS=error - fi - annotate_grafana_end scenario "Load scenario: ${scenario}" - fi - exit "${status}" -} - -declare -a pids=() -declare -A pid_to_scenario=() -declare -A failed=() -target_start=0 -target_end=-1 - -if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then - start_phase "Load scenarios: ${SCALETEST_PARAM_LOAD_SCENARIOS[*]}" -fi -for scenario in "${SCALETEST_PARAM_LOAD_SCENARIOS[@]}"; do - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - start_phase "Load scenario: ${scenario}" - fi - - set +e - status=0 - case "${scenario}" in - "SSH Traffic") - greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}" "${scenario}" & - greedy_agent_traffic_pid=$! - - target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') - target_end=$((target_start + target_count)) - if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then - log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." - target_start=0 - target_end=${target_count} - fi - run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ - --template "${SCALETEST_PARAM_TEMPLATE}" \ - --ssh \ - --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_BYTES_PER_TICK}" \ - --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_TICK_INTERVAL}ms" \ - --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}m" \ - --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_SSH_TRAFFIC_DURATION}m30s" \ - --output json:"${SCALETEST_RESULTS_DIR}/traffic-ssh.json" \ - --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ - --target-workspaces "${target_start}:${target_end}" \ - "${non_greedy_agent_traffic_args[@]}" & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - show_json "${SCALETEST_RESULTS_DIR}/traffic-ssh.json" - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - wait "${greedy_agent_traffic_pid}" - status2=$? - if [[ ${status} == 0 ]]; then - status=${status2} - fi - ;; - "Web Terminal Traffic") - greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}" "${scenario}" & - greedy_agent_traffic_pid=$! 
- - target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') - target_end=$((target_start + target_count)) - if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then - log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." - target_start=0 - target_end=${target_count} - fi - run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ - --template "${SCALETEST_PARAM_TEMPLATE}" \ - --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_BYTES_PER_TICK}" \ - --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_TICK_INTERVAL}ms" \ - --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}m" \ - --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_WEB_TERMINAL_TRAFFIC_DURATION}m30s" \ - --output json:"${SCALETEST_RESULTS_DIR}/traffic-web-terminal.json" \ - --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ - --target-workspaces "${target_start}:${target_end}" \ - "${non_greedy_agent_traffic_args[@]}" & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - show_json "${SCALETEST_RESULTS_DIR}/traffic-web-terminal.json" - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - wait "${greedy_agent_traffic_pid}" - status2=$? - if [[ ${status} == 0 ]]; then - status=${status2} - fi - ;; - "App Traffic") - greedy_agent_traffic "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}" "${scenario}" & - greedy_agent_traffic_pid=$! - - target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') - target_end=$((target_start + target_count)) - if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then - log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." - target_start=0 - target_end=${target_count} - fi - run_scenario_cmd "${scenario}" coder exp scaletest workspace-traffic \ - --template "${SCALETEST_PARAM_TEMPLATE}" \ - --bytes-per-tick "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_BYTES_PER_TICK}" \ - --tick-interval "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_TICK_INTERVAL}ms" \ - --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}m" \ - --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_DURATION}m30s" \ - --output json:"${SCALETEST_RESULTS_DIR}/traffic-app.json" \ - --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ - --app "${SCALETEST_PARAM_LOAD_SCENARIO_APP_TRAFFIC_MODE}" \ - --target-workspaces "${target_start}:${target_end}" \ - "${non_greedy_agent_traffic_args[@]}" & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - show_json "${SCALETEST_RESULTS_DIR}/traffic-app.json" - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - wait "${greedy_agent_traffic_pid}" - status2=$? 
- if [[ ${status} == 0 ]]; then - status=${status2} - fi - ;; - "Dashboard Traffic") - target_count=$(jq -n --argjson percentage "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_PERCENTAGE}" --argjson num_workspaces "${SCALETEST_PARAM_NUM_WORKSPACES}" '$percentage / 100 * $num_workspaces | floor') - target_end=$((target_start + target_count)) - if [[ ${target_end} -gt ${SCALETEST_PARAM_NUM_WORKSPACES} ]]; then - log "WARNING: Target count ${target_end} exceeds number of workspaces ${SCALETEST_PARAM_NUM_WORKSPACES}, using ${SCALETEST_PARAM_NUM_WORKSPACES} instead." - target_start=0 - target_end=${target_count} - fi - # TODO: Remove this once the dashboard traffic command is fixed, - # (i.e. once images are no longer dumped into PWD). - mkdir -p dashboard - pushd dashboard - run_scenario_cmd "${scenario}" coder exp scaletest dashboard \ - --timeout "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_DURATION}m" \ - --job-timeout "${SCALETEST_PARAM_LOAD_SCENARIO_DASHBOARD_TRAFFIC_DURATION}m30s" \ - --output json:"${SCALETEST_RESULTS_DIR}/traffic-dashboard.json" \ - --scaletest-prometheus-address "0.0.0.0:${SCALETEST_PROMETHEUS_START_PORT}" \ - --target-users "${target_start}:${target_end}" \ - >"${SCALETEST_RESULTS_DIR}/traffic-dashboard-output.log" & - pids+=($!) - popd - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - show_json "${SCALETEST_RESULTS_DIR}/traffic-dashboard.json" - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - ;; - - # Debug scenarios, for testing the runner. - "debug:greedy_agent_traffic") - greedy_agent_traffic 10 "${scenario}" & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - ;; - "debug:success") - { - maybedryrun "$DRY_RUN" sleep 10 - true - } & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - ;; - "debug:error") - { - maybedryrun "$DRY_RUN" sleep 10 - false - } & - pids+=($!) - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 0 ]]; then - wait "${pids[-1]}" - status=$? - else - SCALETEST_PROMETHEUS_START_PORT=$((SCALETEST_PROMETHEUS_START_PORT + 1)) - fi - ;; - - *) - log "WARNING: Unknown load scenario: ${scenario}, skipping..." - ;; - esac - set -e - - # Allow targeting to be distributed evenly across workspaces when each - # scenario is run concurrently and all percentages add up to 100. - target_start=${target_end} - - if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then - pid_to_scenario+=(["${pids[-1]}"]="${scenario}") - # Stagger the start of each scenario to avoid a burst of load and deted - # problematic scenarios. - sleep $((SCALETEST_PARAM_LOAD_SCENARIO_CONCURRENCY_STAGGER_DELAY_MINS * 60)) - continue - fi - - if ((status > 0)); then - log "Load scenario failed: ${scenario} (exit=${status})" - failed+=(["${scenario}"]="${status}") - PHASE_ADD_TAGS=error end_phase - else - end_phase - fi - - wait_baseline "${SCALETEST_PARAM_LOAD_SCENARIO_BASELINE_DURATION}" -done -if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]]; then - wait "${pids[@]}" - # Wait on all pids will wait until all have exited, but we need to - # check their individual exit codes. 
- for pid in "${pids[@]}"; do - wait "${pid}" - status=${?} - scenario=${pid_to_scenario[${pid}]} - if ((status > 0)); then - log "Load scenario failed: ${scenario} (exit=${status})" - failed+=(["${scenario}"]="${status}") - fi - done - if ((${#failed[@]} > 0)); then - PHASE_ADD_TAGS=error end_phase - else - end_phase - fi -fi - -if ((${#failed[@]} > 0)); then - log "Load scenarios failed: ${!failed[*]}" - for scenario in "${!failed[@]}"; do - log " ${scenario}: exit=${failed[$scenario]}" - done - exit 1 -fi - -log "Scaletest complete!" -set_status Complete diff --git a/examples/scaletests/scaletest-runner/shutdown.sh b/examples/scaletests/scaletest-runner/shutdown.sh deleted file mode 100755 index 9e75864d73120..0000000000000 --- a/examples/scaletests/scaletest-runner/shutdown.sh +++ /dev/null @@ -1,30 +0,0 @@ -#!/bin/bash -set -e - -[[ $VERBOSE == 1 ]] && set -x - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. "${SCRIPTS_DIR}/lib.sh" - -cleanup() { - coder tokens remove scaletest_runner >/dev/null 2>&1 || true - rm -f "${CODER_CONFIG_DIR}/session" -} -trap cleanup EXIT - -annotate_grafana "workspace" "Agent stopping..." - -shutdown_event=shutdown_scale_down_only -if [[ ${SCALETEST_PARAM_CLEANUP_STRATEGY} == on_stop ]]; then - shutdown_event=shutdown -fi -"${SCRIPTS_DIR}/cleanup.sh" "${shutdown_event}" - -annotate_grafana_end "workspace" "Agent running" - -appearance_json="$(get_appearance)" -service_banner_message=$(jq -r '.service_banner.message' <<<"${appearance_json}") -service_banner_message="${service_banner_message/% | */}" -service_banner_color="#4CD473" # Green. - -set_appearance "${appearance_json}" "${service_banner_color}" "${service_banner_message}" diff --git a/examples/scaletests/scaletest-runner/startup.sh b/examples/scaletests/scaletest-runner/startup.sh deleted file mode 100755 index 3e4eb94f41810..0000000000000 --- a/examples/scaletests/scaletest-runner/startup.sh +++ /dev/null @@ -1,181 +0,0 @@ -#!/bin/bash -set -euo pipefail - -[[ $VERBOSE == 1 ]] && set -x - -if [[ ${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE} == "${SCALETEST_PARAM_TEMPLATE}" ]]; then - echo "ERROR: Greedy agent template must be different from the scaletest template." >&2 - exit 1 -fi - -if [[ ${SCALETEST_PARAM_LOAD_SCENARIO_RUN_CONCURRENTLY} == 1 ]] && [[ ${SCALETEST_PARAM_GREEDY_AGENT} == 1 ]]; then - echo "ERROR: Load scenario concurrency and greedy agent test cannot be enabled at the same time." >&2 - exit 1 -fi - -# Unzip scripts and add to path. -# shellcheck disable=SC2153 -echo "Extracting scaletest scripts into ${SCRIPTS_DIR}..." -base64 -d <<<"${SCRIPTS_ZIP}" >/tmp/scripts.zip -rm -rf "${SCRIPTS_DIR}" || true -mkdir -p "${SCRIPTS_DIR}" -unzip -o /tmp/scripts.zip -d "${SCRIPTS_DIR}" -# Chmod to work around https://github.com/coder/coder/issues/10034 -chmod +x "${SCRIPTS_DIR}"/*.sh -rm /tmp/scripts.zip - -echo "Cloning coder/coder repo..." -if [[ ! -d "${HOME}/coder" ]]; then - git clone https://github.com/coder/coder.git "${HOME}/coder" -fi -(cd "${HOME}/coder" && git fetch -a && git checkout "${SCALETEST_PARAM_REPO_BRANCH}" && git pull) - -# Store the input parameters (for debugging). -env | grep "^SCALETEST_" | sort >"${SCALETEST_RUN_DIR}/environ.txt" - -# shellcheck disable=SC2153 source=scaletest/templates/scaletest-runner/scripts/lib.sh -. 
"${SCRIPTS_DIR}/lib.sh" - -appearance_json="$(get_appearance)" -service_banner_message=$(jq -r '.service_banner.message' <<<"${appearance_json}") -service_banner_message="${service_banner_message/% | */}" -service_banner_color="#D65D0F" # Orange. - -annotate_grafana "workspace" "Agent running" # Ended in shutdown.sh. - -{ - pids=() - ports=() - declare -A pods=() - next_port=6061 - for pod in $(kubectl get pods -l app.kubernetes.io/name=coder -o jsonpath='{.items[*].metadata.name}'); do - maybedryrun "${DRY_RUN}" kubectl -n coder-big port-forward "${pod}" "${next_port}:6060" & - pids+=($!) - ports+=("${next_port}") - pods[${next_port}]="${pod}" - next_port=$((next_port + 1)) - done - - trap 'trap - EXIT; kill -INT "${pids[@]}"; exit 1' INT EXIT - - while :; do - # Sleep for short periods of time so that we can exit quickly. - # This adds up to ~300 when accounting for profile and trace. - for ((i = 0; i < 285; i++)); do - sleep 1 - done - log "Grabbing pprof dumps" - start="$(date +%s)" - annotate_grafana "pprof" "Grab pprof dumps (start=${start})" - for type in allocs block heap goroutine mutex 'profile?seconds=10' 'trace?seconds=5'; do - for port in "${ports[@]}"; do - tidy_type="${type//\?/_}" - tidy_type="${tidy_type//=/_}" - maybedryrun "${DRY_RUN}" curl -sSL --output "${SCALETEST_PPROF_DIR}/pprof-${tidy_type}-${pods[${port}]}-${start}.gz" "http://localhost:${port}/debug/pprof/${type}" - done - done - annotate_grafana_end "pprof" "Grab pprof dumps (start=${start})" - done -} & -pprof_pid=$! - -logs_gathered=0 -gather_logs() { - if ((logs_gathered == 1)); then - return - fi - logs_gathered=1 - - # Gather logs from all coderd and provisioner instances, and all workspaces. - annotate_grafana "logs" "Gather logs" - podsraw="$( - kubectl -n coder-big get pods -l app.kubernetes.io/name=coder -o name - kubectl -n coder-big get pods -l app.kubernetes.io/name=coder-provisioner -o name || true - kubectl -n coder-big get pods -l app.kubernetes.io/name=coder-workspace -o name | grep "^pod/scaletest-" || true - )" - mapfile -t pods <<<"${podsraw}" - for pod in "${pods[@]}"; do - pod_name="${pod#pod/}" - kubectl -n coder-big logs "${pod}" --since-time="${SCALETEST_RUN_START_TIME}" >"${SCALETEST_LOGS_DIR}/${pod_name}.txt" - done - annotate_grafana_end "logs" "Gather logs" -} - -set_appearance "${appearance_json}" "${service_banner_color}" "${service_banner_message} | Scaletest running: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE})!" - -# Show failure in the UI if script exits with error. -on_exit() { - code=${?} - trap - ERR EXIT - set +e - - kill -INT "${pprof_pid}" - - message_color="#4CD473" # Green. - message_status=COMPLETE - if ((code > 0)); then - message_color="#D94A5D" # Red. - message_status=FAILED - fi - - # In case the test failed before gathering logs, gather them before - # cleaning up, whilst the workspaces are still present. - gather_logs - - case "${SCALETEST_PARAM_CLEANUP_STRATEGY}" in - on_stop) - # Handled by shutdown script. - ;; - on_success) - if ((code == 0)); then - set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." 
- "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" - fi - ;; - on_error) - if ((code > 0)); then - set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." - "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" - fi - ;; - *) - set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE}), cleaning up..." - "${SCRIPTS_DIR}/cleanup.sh" "${SCALETEST_PARAM_CLEANUP_STRATEGY}" - ;; - esac - - set_appearance "${appearance_json}" "${message_color}" "${service_banner_message} | Scaletest ${message_status}: [${CODER_USER}/${CODER_WORKSPACE}](${CODER_URL}/@${CODER_USER}/${CODER_WORKSPACE})!" - - annotate_grafana_end "" "Start scaletest: ${SCALETEST_COMMENT}" - - wait "${pprof_pid}" - exit "${code}" -} -trap on_exit EXIT - -on_err() { - code=${?} - trap - ERR - set +e - - log "Scaletest failed!" - GRAFANA_EXTRA_TAGS=error set_status "Failed (exit=${code})" - "${SCRIPTS_DIR}/report.sh" failed - lock_status # Ensure we never rewrite the status after a failure. - - exit "${code}" -} -trap on_err ERR - -# Pass session token since `prepare.sh` has not yet run. -CODER_SESSION_TOKEN=$CODER_USER_TOKEN "${SCRIPTS_DIR}/report.sh" started -annotate_grafana "" "Start scaletest: ${SCALETEST_COMMENT}" - -"${SCRIPTS_DIR}/prepare.sh" - -"${SCRIPTS_DIR}/run.sh" - -# Gather logs before ending the test. -gather_logs - -"${SCRIPTS_DIR}/report.sh" completed diff --git a/scaletest/templates/scaletest-runner/main.tf b/scaletest/templates/scaletest-runner/main.tf index 42fa785cc4732..2d17c66435f62 100644 --- a/scaletest/templates/scaletest-runner/main.tf +++ b/scaletest/templates/scaletest-runner/main.tf @@ -44,7 +44,7 @@ locals { scaletest_run_id = "scaletest-${replace(time_static.start_time.rfc3339, ":", "-")}" scaletest_run_dir = "/home/coder/${local.scaletest_run_id}" scaletest_run_start_time = time_static.start_time.rfc3339 - grafana_url = "https://stats.dev.c8s.io" + grafana_url = "https://grafana.corp.tld" grafana_dashboard_uid = "qLVSTR-Vz" grafana_dashboard_name = "coderv2-loadtest-dashboard" } @@ -625,6 +625,8 @@ resource "coder_agent" "main" { vscode = false ssh_helper = false } + startup_script_timeout = 86400 + shutdown_script_timeout = 7200 startup_script_behavior = "blocking" startup_script = file("startup.sh") shutdown_script = file("shutdown.sh") @@ -734,10 +736,9 @@ resource "coder_app" "prometheus" { agent_id = coder_agent.main.id slug = "01-prometheus" display_name = "Prometheus" - // https://stats.dev.c8s.io:9443/classic/graph?g0.range_input=2h&g0.end_input=2023-09-08%2015%3A58&g0.stacked=0&g0.expr=rate(pg_stat_database_xact_commit%7Bcluster%3D%22big%22%2Cdatname%3D%22big-coder%22%7D%5B1m%5D)&g0.tab=0 - url = "https://stats.dev.c8s.io:9443" - icon = "https://prometheus.io/assets/favicons/favicon-32x32.png" - external = true + url = "https://grafana.corp.tld:9443" + icon = "https://prometheus.io/assets/favicons/favicon-32x32.png" + external = true } resource "coder_app" "manual_cleanup" { From 4fc71432f05699ff1d9476640734abdd799aef3d Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Wed, 20 Mar 2024 17:49:23 +0100 Subject: [PATCH 09/21] command --- docs/admin/scale.md | 92 ++++++++++++++++++++++++++++----------------- 1 file changed, 57 insertions(+), 35 deletions(-) diff --git 
a/docs/admin/scale.md b/docs/admin/scale.md index dd018b98562ec..f16efaf2d2086 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -10,7 +10,9 @@ Learn more about [Coder’s architecture](../about/architecture.md) and our ## Recent scale tests > Note: the below information is for reference purposes only, and are not -> intended to be used as guidelines for infrastructure sizing. +> intended to be used as guidelines for infrastructure sizing. Review the +> [Reference Architectures](architectures/index.md) for hardware sizing +> recommendations. | Environment | Coder CPU | Coder RAM | Coder Replicas | Database | Users | Concurrent builds | Concurrent connections (Terminal/SSH) | Coder Version | Last tested | | ---------------- | --------- | --------- | -------------- | ----------------- | ----- | ----------------- | ------------------------------------- | ------------- | ------------ | @@ -29,58 +31,76 @@ Since Coder's performance is highly dependent on the templates and workflows you support, you may wish to use our internal scale testing utility against your own environments. -> Note: This utility is intended for internal use only. It is not subject to any -> compatibility guarantees, and may cause interruptions for your users. To avoid -> potential outages and orphaned resources, we recommend running scale tests on -> a secondary "staging" environment. Run it against a production environment at -> your own risk. +> Note: This utility is experimental. It is not subject to any compatibility +> guarantees, and may cause interruptions for your users. To avoid potential +> outages and orphaned resources, we recommend running scale tests on a +> secondary "staging" environment or a dedicated +> [Kubernetes playground cluster](https://github.com/coder/coder/tree/main/scaletest/templates). +> Run it against a production environment at your own risk. -### Workspace Creation +### Create workspaces -The following command will run our scale test against your own Coder deployment. -You can also specify a template name and any parameter values. +The following command will provision a number of Coder workspaces using the +specified template and extra parameters. ```shell coder exp scaletest create-workspaces \ - --count 1000 \ - --template "kubernetes" \ - --concurrency 0 \ - --cleanup-concurrency 0 \ - --parameter "home_disk_size=10" \ - --run-command "sleep 2 && echo hello" + --retry 5 \ + --count "${SCALETEST_PARAM_NUM_WORKSPACES}" \ + --template "${SCALETEST_PARAM_TEMPLATE}" \ + --concurrency "${SCALETEST_PARAM_CREATE_CONCURRENCY}" \ + --timeout 5h \ + --job-timeout 5h \ + --no-cleanup \ + --output json:"${SCALETEST_RESULTS_DIR}/create-workspaces.json" # Run `coder exp scaletest create-workspaces --help` for all usage ``` -The test does the following: +The command does the following: -1. create `1000` workspaces -1. establish SSH connection to each workspace -1. run `sleep 3 && echo hello` on each workspace via the web terminal -1. close connections, attempt to delete all workspaces -1. return results (e.g. `998 succeeded, 2 failed to connect`) - -Concurrency is configurable. `concurrency 0` means the scaletest test will -attempt to create & connect to all workspaces immediately. - -If you wish to leave the workspaces running for a period of time, you can -specify `--no-cleanup` to skip the cleanup step. You are responsible for -deleting these resources later. +1. 
Create `${SCALETEST_PARAM_NUM_WORKSPACES}` workspaces concurrently + (concurrency level: `${SCALETEST_PARAM_CREATE_CONCURRENCY}`) using the + template `${SCALETEST_PARAM_TEMPLATE}`. +1. Leave workspaces running to use in next steps (`--no-cleanup` option). +1. Store provisioning results in JSON format. +1. If you don't want the creation process to be interrupted by any errors, use + the `--retry 5` flag. ### Traffic Generation Given an existing set of workspaces created previously with `create-workspaces`, -the following command will generate traffic similar to that of Coder's web -terminal against those workspaces. +the following command will generate traffic similar to that of Coder's Web +Terminal against those workspaces. ```shell +# Produce load at about 1000MB/s (25MB/40ms). coder exp scaletest workspace-traffic \ - --byes-per-tick 128 \ - --tick-interval 100ms \ - --concurrency 0 + --template "${SCALETEST_PARAM_GREEDY_AGENT_TEMPLATE}" \ + --bytes-per-tick $((1024 * 1024 * 25)) \ + --tick-interval 40ms \ + --timeout "$((delay))s" \ + --job-timeout "$((delay))s" \ + --scaletest-prometheus-address 0.0.0.0:21113 \ + --target-workspaces "0:100" \ + --trace=false \ + --output json:"${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" ``` -To generate SSH traffic, add the `--ssh` flag. +Traffic generation can be parametrized: + +1. Send `bytes-per-tick` every `tick-interval`. +1. Enable tracing for performance debugging. +1. Target a range of workspaces with `--target-workspaces 0:100`. +1. For dashboard traffic: Target a range of users with `--target-users 0:100`. +1. Store provisioning results in JSON format. + +The `workspace-traffic` supports also other modes - SSH traffic, workspace app: + +1. For SSH traffic: Use `--ssh` flag to generate SSH traffic instead of Web + Terminal. +1. For workspace app traffic: Use `--app [wsdi|wsec|wsra]` flag to select app + behavior. (modes: _WebSocket discard_, _WebSocket echo_, _WebSocket read_). ### Cleanup @@ -88,7 +108,9 @@ The scaletest utility will attempt to clean up all workspaces it creates. If you wish to clean up all workspaces, you can run the following command: ```shell -coder exp scaletest cleanup +coder exp scaletest cleanup \ + --cleanup-job-timeout 2h \ + --cleanup-timeout 15min ``` This will delete all workspaces and users with the prefix `scaletest-`. From feb8f9f5965500207cfe2012730cd8eccb3942fe Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Thu, 21 Mar 2024 12:32:36 +0100 Subject: [PATCH 10/21] WIP --- docs/admin/scale.md | 45 ++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 42 insertions(+), 3 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index f16efaf2d2086..a8c5a00efcf57 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -117,15 +117,54 @@ This will delete all workspaces and users with the prefix `scaletest-`. ## Scale testing template -TODO +Besides the CLI utility, consider using a dedicated +[scaletest-runner](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner) +template for testing large scale Kubernetes clusters. + +The template deploys a main workspace with scripts used to orchestrate Coder to +create workspaces, generate workspace traffic, or load tests workspace apps. 
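+
+For example, pushing the runner template into a deployment generally looks
+like this (the template name, directory layout, and exact flags are
+illustrative and may vary between Coder releases):
+
+```shell
+# Clone the repository that ships the scaletest-runner template.
+git clone https://github.com/coder/coder.git
+cd coder/scaletest/templates/scaletest-runner
+
+# Publish (or update) the template in your Coder deployment.
+coder templates push scaletest-runner --directory .
+```
+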
### Parameters -TODO +The _scaletest-runner_ offers the following configuration options: + +- workspace template selecting Kubernetes cluster size: + minimal/small/medium/large (_default_: minimal) +- number of workspaces +- wait duration between scenarios or staggered approach + +The template exposes parameters to control the traffic dimensions for SSH +connections, workspace apps, and dashboard tests: + +- traffic duration of the load test scenario +- traffic percentage of targeted workspaces +- bytes per tick and tick interval +- _For workspace apps_: modes (echo, read random data, or write and discard) + +Scale testing concurrency can be controlled with the following parameters: + +- enable parallel scenarios - interleave different traffic patterns (SSH, + workspace apps, dashboard traffic, etc.) +- workspace creation concurrency level (_default_: 10) +- job concurrency level - generate workspace traffic using multiple jobs + (_default_: 0) +- cleanup concurrency level ### Kubernetes cluster -TODO +Depending on the traffic projections, operators can deploy different sample +clusters to perform scale tests. It is recommend to learn how to operate the +scaletest-runner before running it against the staging cluster (or production at +your own risk). + +There are a few cluster options available: + +- minimal +- small +- medium +- large + +TODO greedy ### Observability From 01c4297b8fd1595ae7bea33045cf5ab66cb6dedd Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Thu, 21 Mar 2024 13:05:52 +0100 Subject: [PATCH 11/21] Clusters --- docs/admin/scale.md | 53 +++++++++++-------- .../templates/kubernetes-minimal/README.md | 2 +- 2 files changed, 33 insertions(+), 22 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index a8c5a00efcf57..b8ea79dad4c3d 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -117,54 +117,65 @@ This will delete all workspaces and users with the prefix `scaletest-`. ## Scale testing template -Besides the CLI utility, consider using a dedicated +Consider using a dedicated [scaletest-runner](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner) -template for testing large scale Kubernetes clusters. +template alongside the CLI utility for testing large-scale Kubernetes clusters. -The template deploys a main workspace with scripts used to orchestrate Coder to -create workspaces, generate workspace traffic, or load tests workspace apps. +The template deploys a main workspace with scripts used to orchestrate Coder, +creating workspaces, generating workspace traffic, or load-testing workspace +apps. 
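+
+Each option described below is passed to the runner scripts as a
+`SCALETEST_PARAM_*` environment variable. As a rough sketch, a script inside
+the runner workspace can snapshot these values for later debugging:
+
+```shell
+# Record the parameters this scaletest run was started with.
+env | grep "^SCALETEST_" | sort > "${SCALETEST_RUN_DIR}/environ.txt"
+```
+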
### Parameters The _scaletest-runner_ offers the following configuration options: -- workspace template selecting Kubernetes cluster size: +- Workspace template selecting Kubernetes cluster size: minimal/small/medium/large (_default_: minimal) -- number of workspaces -- wait duration between scenarios or staggered approach +- Number of workspaces +- Wait duration between scenarios or staggered approach The template exposes parameters to control the traffic dimensions for SSH connections, workspace apps, and dashboard tests: -- traffic duration of the load test scenario -- traffic percentage of targeted workspaces -- bytes per tick and tick interval +- Traffic duration of the load test scenario +- Traffic percentage of targeted workspaces +- Bytes per tick and tick interval - _For workspace apps_: modes (echo, read random data, or write and discard) Scale testing concurrency can be controlled with the following parameters: -- enable parallel scenarios - interleave different traffic patterns (SSH, +- Enable parallel scenarios - interleave different traffic patterns (SSH, workspace apps, dashboard traffic, etc.) -- workspace creation concurrency level (_default_: 10) -- job concurrency level - generate workspace traffic using multiple jobs +- Workspace creation concurrency level (_default_: 10) +- Job concurrency level - generate workspace traffic using multiple jobs (_default_: 0) -- cleanup concurrency level +- Cleanup concurrency level ### Kubernetes cluster Depending on the traffic projections, operators can deploy different sample -clusters to perform scale tests. It is recommend to learn how to operate the +clusters to perform scale tests. It is recommended to learn how to operate the scaletest-runner before running it against the staging cluster (or production at your own risk). -There are a few cluster options available: +There are a few cluster options +[available](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner): -- minimal -- small -- medium -- large +| Cluster size | vCPU | Memory | Persisted storage | Details | +| ------------ | ---- | ------ | ----------------- | ----------------------------------------------------- | +| minimal | 1 | 2 Gi | None | | +| small | 1 | 1 Gi | None | | +| medium | 2 | 2 Gi | None | Medium-sized cluster offers the greedy agent variant. | +| large | 4 | 4 Gi | None | | -TODO greedy +#### Greedy agent + +The greedy agent variant is a template modification that forces the Coder agent +to transmit large metadata (size: 4K) while emitting stats. The transmission of +large chunks puts extra overhead on coderd instances and agents while processing +and storing the data. + +Use this template variant to verify limits of the cluster performance. ### Observability diff --git a/scaletest/templates/kubernetes-minimal/README.md b/scaletest/templates/kubernetes-minimal/README.md index c56d3d477f821..a4e76f8b24611 100644 --- a/scaletest/templates/kubernetes-minimal/README.md +++ b/scaletest/templates/kubernetes-minimal/README.md @@ -1,5 +1,5 @@ # kubernetes-minimal -Provisions a medium-sized workspace with no persistent storage. Greedy agent variant. +Provisions a minimal-sized workspace with no persistent storage. 
_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` From e9b7803e6d15dee9cba420afe0a76b11a7284c9e Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Thu, 21 Mar 2024 13:07:31 +0100 Subject: [PATCH 12/21] no pod monitor --- .../kubernetes-with-podmonitor/README.md | 98 ----- .../kubernetes-with-podmonitor/main.tf | 362 ------------------ 2 files changed, 460 deletions(-) delete mode 100644 scaletest/templates/kubernetes-with-podmonitor/README.md delete mode 100644 scaletest/templates/kubernetes-with-podmonitor/main.tf diff --git a/scaletest/templates/kubernetes-with-podmonitor/README.md b/scaletest/templates/kubernetes-with-podmonitor/README.md deleted file mode 100644 index 6c04af8ea6a63..0000000000000 --- a/scaletest/templates/kubernetes-with-podmonitor/README.md +++ /dev/null @@ -1,98 +0,0 @@ ---- -name: Develop in Kubernetes -description: Get started with Kubernetes development. -tags: [cloud, kubernetes] -icon: /icon/k8s.png ---- - -# Getting started - -This template creates a pod running the `codercom/enterprise-base:ubuntu` image. - -## Authentication - -This template can authenticate using in-cluster authentication, or using a kubeconfig local to the -Coder host. For additional authentication options, consult the [Kubernetes provider -documentation](https://registry.terraform.io/providers/hashicorp/kubernetes/latest/docs). - -### kubeconfig on Coder host - -If the Coder host has a local `~/.kube/config`, you can use this to authenticate -with Coder. Make sure this is done with same user that's running the `coder` service. - -To use this authentication, set the parameter `use_kubeconfig` to true. - -### In-cluster authentication - -If the Coder host runs in a Pod on the same Kubernetes cluster as you are creating workspaces in, -you can use in-cluster authentication. - -To use this authentication, set the parameter `use_kubeconfig` to false. - -The Terraform provisioner will automatically use the service account associated with the pod to -authenticate to Kubernetes. Be sure to bind a [role with appropriate permission](#rbac) to the -service account. For example, assuming the Coder host runs in the same namespace as you intend -to create workspaces: - -```yaml -apiVersion: v1 -kind: ServiceAccount -metadata: - name: coder - ---- -apiVersion: rbac.authorization.k8s.io/v1 -kind: RoleBinding -metadata: - name: coder -subjects: - - kind: ServiceAccount - name: coder -roleRef: - kind: Role - name: coder - apiGroup: rbac.authorization.k8s.io -``` - -Then start the Coder host with `serviceAccountName: coder` in the pod spec. - -### Authenticate against external clusters - -You may want to deploy workspaces on a cluster outside of the Coder control plane. Refer to the [Coder docs](https://coder.com/docs/v2/latest/platforms/kubernetes/additional-clusters) to learn how to modify your template to authenticate against external clusters. - -## Namespace - -The target namespace in which the pod will be deployed is defined via the `coder_workspace` -variable. The namespace must exist prior to creating workspaces. - -## Persistence - -The `/home/coder` directory in this example is persisted via the attached PersistentVolumeClaim. -Any data saved outside of this directory will be wiped when the workspace stops. - -Since most binary installations and environment configurations live outside of -the `/home` directory, we suggest including these in the `startup_script` argument -of the `coder_agent` resource block, which will run each time the workspace starts up. 
- -For example, when installing the `aws` CLI, the install script will place the -`aws` binary in `/usr/local/bin/aws`. To ensure the `aws` CLI is persisted across -workspace starts/stops, include the following code in the `coder_agent` resource -block of your workspace template: - -```terraform -resource "coder_agent" "main" { - startup_script = <<-EOT - set -e - # install AWS CLI - curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" - unzip awscliv2.zip - sudo ./aws/install - EOT -} -``` - -## code-server - -`code-server` is installed via the `startup_script` argument in the `coder_agent` -resource block. The `coder_app` resource is defined to access `code-server` through -the dashboard UI over `localhost:13337`. diff --git a/scaletest/templates/kubernetes-with-podmonitor/main.tf b/scaletest/templates/kubernetes-with-podmonitor/main.tf deleted file mode 100644 index 722cbe71f7692..0000000000000 --- a/scaletest/templates/kubernetes-with-podmonitor/main.tf +++ /dev/null @@ -1,362 +0,0 @@ -terraform { - required_providers { - coder = { - source = "coder/coder" - version = "~> 0.7.0" - } - kubernetes = { - source = "hashicorp/kubernetes" - version = "~> 2.18" - } - } -} - -provider "coder" { -} - -variable "use_kubeconfig" { - type = bool - description = <<-EOF - Use host kubeconfig? (true/false) - - Set this to false if the Coder host is itself running as a Pod on the same - Kubernetes cluster as you are deploying workspaces to. - - Set this to true if the Coder host is running outside the Kubernetes cluster - for workspaces. A valid "~/.kube/config" must be present on the Coder host. - EOF - default = false -} - -variable "namespace" { - type = string - description = "The Kubernetes namespace to create workspaces in (must exist prior to creating workspaces)" -} - -data "coder_parameter" "cpu" { - name = "cpu" - display_name = "CPU" - description = "The number of CPU cores" - default = "2" - icon = "/icon/memory.svg" - mutable = true - option { - name = "2 Cores" - value = "2" - } - option { - name = "4 Cores" - value = "4" - } - option { - name = "6 Cores" - value = "6" - } - option { - name = "8 Cores" - value = "8" - } -} - -data "coder_parameter" "memory" { - name = "memory" - display_name = "Memory" - description = "The amount of memory in GB" - default = "2" - icon = "/icon/memory.svg" - mutable = true - option { - name = "2 GB" - value = "2" - } - option { - name = "4 GB" - value = "4" - } - option { - name = "6 GB" - value = "6" - } - option { - name = "8 GB" - value = "8" - } - option { - name = "16 GB" - value = "16" - } - option { - name = "24 GB" - value = "24" - } -} - -data "coder_parameter" "home_disk_size" { - name = "home_disk_size" - display_name = "Home disk size" - description = "The size of the home disk in GB" - default = "10" - type = "number" - icon = "/emojis/1f4be.png" - mutable = false - validation { - min = 1 - max = 99999 - } -} - -provider "kubernetes" { - # Authenticate via ~/.kube/config or a Coder-specific ServiceAccount, depending on admin preferences - config_path = var.use_kubeconfig == true ? 
"~/.kube/config" : null -} - -data "coder_workspace" "me" {} - -resource "coder_agent" "main" { - os = "linux" - arch = "amd64" - startup_script_timeout = 180 - startup_script = <<-EOT - set -e - - # install and start code-server - curl -fsSL https://code-server.dev/install.sh | sh -s -- --method=standalone --prefix=/tmp/code-server --version 4.11.0 - /tmp/code-server/bin/code-server --auth none --port 13337 >/tmp/code-server.log 2>&1 & - EOT - - # The following metadata blocks are optional. They are used to display - # information about your workspace in the dashboard. You can remove them - # if you don't want to display any information. - # For basic resources, you can use the `coder stat` command. - # If you need more control, you can write your own script. - metadata { - display_name = "CPU Usage" - key = "0_cpu_usage" - script = "coder stat cpu" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "RAM Usage" - key = "1_ram_usage" - script = "coder stat mem" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Home Disk" - key = "3_home_disk" - script = "coder stat disk --path $${HOME}" - interval = 60 - timeout = 1 - } - - metadata { - display_name = "CPU Usage (Host)" - key = "4_cpu_usage_host" - script = "coder stat cpu --host" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Memory Usage (Host)" - key = "5_mem_usage_host" - script = "coder stat mem --host" - interval = 10 - timeout = 1 - } - - metadata { - display_name = "Load Average (Host)" - key = "6_load_host" - # get load avg scaled by number of cores - script = < Date: Thu, 21 Mar 2024 15:13:38 +0100 Subject: [PATCH 13/21] Mention graphs --- docs/admin/scale.md | 23 +++++++++++++++++++---- 1 file changed, 19 insertions(+), 4 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index b8ea79dad4c3d..b3337afea696f 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -170,16 +170,31 @@ There are a few cluster options #### Greedy agent -The greedy agent variant is a template modification that forces the Coder agent -to transmit large metadata (size: 4K) while emitting stats. The transmission of -large chunks puts extra overhead on coderd instances and agents while processing +The greedy agent variant is a template modification that makes the Coder agent +transmit large metadata (size: 4K) while reporting stats. The transmission of +large chunks puts extra overhead on coderd instances and agents when handling and storing the data. Use this template variant to verify limits of the cluster performance. ### Observability -TODO Grafana and logs +During scale tests, operators can monitor progress using a Grafana dashboard. +Coder offers a comprehensive overview +[dashboard](https://github.com/coder/coder/blob/main/scaletest/scaletest_dashboard.json) +that can seamlessly integrate into the internal Grafana deployment. + +This dashboard provides insights into various aspects, including: + +- Utilization of resources within the Coder control plane (CPU, memory, pods) +- Database performance metrics (CPU, memory, I/O, connections, queries) +- Coderd API performance (requests, latency, error rate) +- Resource consumption within Coder workspaces (CPU, memory, network usage) +- Internal metrics related to provisioner jobs + +It is highly recommended to deploy a solution for centralized log collection and +aggregation. The presence of error logs may indicate an underscaled deployment +of Coder, necessitating action from operators. 
## Autoscaling From 9a53217ab4ed92e4cc294b08be75a84c92469694 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 08:58:00 +0100 Subject: [PATCH 14/21] Cian's comments --- docs/admin/scale.md | 23 +++++++++++++---------- 1 file changed, 13 insertions(+), 10 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index b3337afea696f..942f34209ab00 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -35,7 +35,7 @@ environments. > guarantees, and may cause interruptions for your users. To avoid potential > outages and orphaned resources, we recommend running scale tests on a > secondary "staging" environment or a dedicated -> [Kubernetes playground cluster](https://github.com/coder/coder/tree/main/scaletest/templates). +> [Kubernetes playground cluster](https://github.com/coder/coder/tree/main/scaletest/terraform). > Run it against a production environment at your own risk. ### Create workspaces @@ -82,7 +82,7 @@ coder exp scaletest workspace-traffic \ --timeout "$((delay))s" \ --job-timeout "$((delay))s" \ --scaletest-prometheus-address 0.0.0.0:21113 \ - --target-workspaces "0:100" \ + --target-workspaces "0:100" \ --trace=false \ --output json:"${SCALETEST_RESULTS_DIR}/traffic-${type}-greedy-agent.json" ``` @@ -94,6 +94,8 @@ Traffic generation can be parametrized: 1. Target a range of workspaces with `--target-workspaces 0:100`. 1. For dashboard traffic: Target a range of users with `--target-users 0:100`. 1. Store provisioning results in JSON format. +1. Expose a dedicated Prometheus address (`--scaletest-prometheus-address`) for + scaletest-specific metrics. The `workspace-traffic` supports also other modes - SSH traffic, workspace app: @@ -129,8 +131,9 @@ apps. The _scaletest-runner_ offers the following configuration options: -- Workspace template selecting Kubernetes cluster size: - minimal/small/medium/large (_default_: minimal) +- Workspace size selection: minimal/small/medium/large (_default_: minimal, + which contains just enough resources for a Coder agent to run without + additional workloads) - Number of workspaces - Wait duration between scenarios or staggered approach @@ -161,12 +164,12 @@ your own risk). There are a few cluster options [available](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner): -| Cluster size | vCPU | Memory | Persisted storage | Details | -| ------------ | ---- | ------ | ----------------- | ----------------------------------------------------- | -| minimal | 1 | 2 Gi | None | | -| small | 1 | 1 Gi | None | | -| medium | 2 | 2 Gi | None | Medium-sized cluster offers the greedy agent variant. | -| large | 4 | 4 Gi | None | | +| Workspace size | vCPU | Memory | Persisted storage | Details | +| -------------- | ---- | ------ | ----------------- | ----------------------------------------------------- | +| minimal | 1 | 2 Gi | None | | +| small | 1 | 1 Gi | None | | +| medium | 2 | 2 Gi | None | Medium-sized cluster offers the greedy agent variant. 
| +| large | 4 | 4 Gi | None | | #### Greedy agent From d1b0ddca5ea9fc145add37908c3adafbf997412e Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 09:26:33 +0100 Subject: [PATCH 15/21] WIP --- docs/admin/scale.md | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index 942f34209ab00..25779d60de48e 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -156,13 +156,13 @@ Scale testing concurrency can be controlled with the following parameters: ### Kubernetes cluster -Depending on the traffic projections, operators can deploy different sample -clusters to perform scale tests. It is recommended to learn how to operate the -scaletest-runner before running it against the staging cluster (or production at -your own risk). +It is recommended to learn how to operate the _scaletest-runner_ before running +it against the staging cluster (or production at your own risk). Coder provides +different +[workspace configurations](https://github.com/coder/coder/tree/main/scaletest/templates) +that operators can deploy depending on the traffic projections. -There are a few cluster options -[available](https://github.com/coder/coder/tree/main/scaletest/templates/scaletest-runner): +There are a few cluster options available: | Workspace size | vCPU | Memory | Persisted storage | Details | | -------------- | ---- | ------ | ----------------- | ----------------------------------------------------- | From fe4e743ec8da7f3a11d4364f6d806163eab6fe77 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 09:36:56 +0100 Subject: [PATCH 16/21] Noted --- docs/admin/scale.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index 25779d60de48e..6accd8ce1e5b9 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -195,6 +195,9 @@ This dashboard provides insights into various aspects, including: - Resource consumption within Coder workspaces (CPU, memory, network usage) - Internal metrics related to provisioner jobs +Note: Database metrics are disabled by default and can be enabled by setting the +environment variable `CODER_PROMETHEUS_COLLECT_DB_METRICS` to `true`. + It is highly recommended to deploy a solution for centralized log collection and aggregation. The presence of error logs may indicate an underscaled deployment of Coder, necessitating action from operators. 
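To make the note above concrete, here is a minimal sketch of enabling the Prometheus endpoint together with the off-by-default database metrics before starting `coderd`. The bind address shown is an assumption; adjust it to your deployment, and for Helm-based installs set the equivalent environment variables on the coderd pods instead.

```shell
# Sketch: enable Prometheus metrics, including the off-by-default database metrics.
export CODER_PROMETHEUS_ENABLE=true
export CODER_PROMETHEUS_ADDRESS="0.0.0.0:2112"      # assumed bind address; pick one your scraper can reach
export CODER_PROMETHEUS_COLLECT_DB_METRICS=true     # the flag referenced in the note above
coder server
```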
From e91f036057f0b9a073010353e582b37da277a2b2 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 11:12:55 +0100 Subject: [PATCH 17/21] Fix --- scaletest/templates/kubernetes-minimal/main.tf | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/scaletest/templates/kubernetes-minimal/main.tf b/scaletest/templates/kubernetes-minimal/main.tf index 6d04fb68a33ed..3bd56046f400b 100644 --- a/scaletest/templates/kubernetes-minimal/main.tf +++ b/scaletest/templates/kubernetes-minimal/main.tf @@ -152,7 +152,7 @@ resource "kubernetes_deployment" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces", "big-workspaces2"] + values = ["big-workspaces"] } } } From e9189e3b5ddce92b31db3c5d1974df1b283946b0 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 11:24:45 +0100 Subject: [PATCH 18/21] Use template vars --- scaletest/templates/kubernetes-large/main.tf | 8 +++++++- scaletest/templates/kubernetes-medium-greedy/main.tf | 8 +++++++- scaletest/templates/kubernetes-medium/main.tf | 8 +++++++- scaletest/templates/kubernetes-minimal/main.tf | 8 +++++++- scaletest/templates/kubernetes-small/main.tf | 8 +++++++- 5 files changed, 35 insertions(+), 5 deletions(-) diff --git a/scaletest/templates/kubernetes-large/main.tf b/scaletest/templates/kubernetes-large/main.tf index 352db67bbcf22..161d4448bab64 100644 --- a/scaletest/templates/kubernetes-large/main.tf +++ b/scaletest/templates/kubernetes-large/main.tf @@ -17,6 +17,12 @@ provider "kubernetes" { config_path = null # always use host } +variable "kubernetes_nodepool_workspaces" { + description = "Kubernetes nodepool for Coder workspaces" + type = string + default = "big-workspaces" +} + data "coder_workspace" "me" {} resource "coder_agent" "main" { @@ -72,7 +78,7 @@ resource "kubernetes_pod" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = ["${var.kubernetes_nodepool_workspaces}"] } } } diff --git a/scaletest/templates/kubernetes-medium-greedy/main.tf b/scaletest/templates/kubernetes-medium-greedy/main.tf index a0a5dd8742c56..8a70eced34426 100644 --- a/scaletest/templates/kubernetes-medium-greedy/main.tf +++ b/scaletest/templates/kubernetes-medium-greedy/main.tf @@ -17,6 +17,12 @@ provider "kubernetes" { config_path = null # always use host } +variable "kubernetes_nodepool_workspaces" { + description = "Kubernetes nodepool for Coder workspaces" + type = string + default = "big-workspaces" +} + data "coder_workspace" "me" {} resource "coder_agent" "main" { @@ -186,7 +192,7 @@ resource "kubernetes_pod" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = ["${var.kubernetes_nodepool_workspaces}"] } } } diff --git a/scaletest/templates/kubernetes-medium/main.tf b/scaletest/templates/kubernetes-medium/main.tf index 5dcd9588c1b33..5e3980a0e252e 100644 --- a/scaletest/templates/kubernetes-medium/main.tf +++ b/scaletest/templates/kubernetes-medium/main.tf @@ -17,6 +17,12 @@ provider "kubernetes" { config_path = null # always use host } +variable "kubernetes_nodepool_workspaces" { + description = "Kubernetes nodepool for Coder workspaces" + type = string + default = "big-workspaces" +} + data "coder_workspace" "me" {} resource "coder_agent" "main" { @@ -72,7 +78,7 @@ resource "kubernetes_pod" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = 
["${var.kubernetes_nodepool_workspaces}"] } } } diff --git a/scaletest/templates/kubernetes-minimal/main.tf b/scaletest/templates/kubernetes-minimal/main.tf index 3bd56046f400b..7ad97f7a89e85 100644 --- a/scaletest/templates/kubernetes-minimal/main.tf +++ b/scaletest/templates/kubernetes-minimal/main.tf @@ -17,6 +17,12 @@ provider "kubernetes" { config_path = null # always use host } +variable "kubernetes_nodepool_workspaces" { + description = "Kubernetes nodepool for Coder workspaces" + type = string + default = "big-workspaces" +} + data "coder_workspace" "me" {} resource "coder_agent" "m" { @@ -152,7 +158,7 @@ resource "kubernetes_deployment" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = ["${var.kubernetes_nodepool_workspaces}"] } } } diff --git a/scaletest/templates/kubernetes-small/main.tf b/scaletest/templates/kubernetes-small/main.tf index b59e4989544f5..0c81ba245b1df 100644 --- a/scaletest/templates/kubernetes-small/main.tf +++ b/scaletest/templates/kubernetes-small/main.tf @@ -17,6 +17,12 @@ provider "kubernetes" { config_path = null # always use host } +variable "kubernetes_nodepool_workspaces" { + description = "Kubernetes nodepool for Coder workspaces" + type = string + default = "big-workspaces" +} + data "coder_workspace" "me" {} resource "coder_agent" "main" { @@ -72,7 +78,7 @@ resource "kubernetes_pod" "main" { match_expressions { key = "cloud.google.com/gke-nodepool" operator = "In" - values = ["big-workspaces"] + values = ["${var.kubernetes_nodepool_workspaces}"] } } } From 00eecf5cdf80ad434f46770f4fef92d2cbcfb4b7 Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 11:27:19 +0100 Subject: [PATCH 19/21] Add note --- docs/admin/scale.md | 3 +++ 1 file changed, 3 insertions(+) diff --git a/docs/admin/scale.md b/docs/admin/scale.md index 6accd8ce1e5b9..8f059c0e86c79 100644 --- a/docs/admin/scale.md +++ b/docs/admin/scale.md @@ -171,6 +171,9 @@ There are a few cluster options available: | medium | 2 | 2 Gi | None | Medium-sized cluster offers the greedy agent variant. | | large | 4 | 4 Gi | None | | +Note: Review the selected cluster template and edit the node affinity to match +your setup. + #### Greedy agent The greedy agent variant is a template modification that makes the Coder agent From 6a69f7b65671d70af7b1c56cd7e6fe962b4baabc Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 11:40:58 +0100 Subject: [PATCH 20/21] Fix: nodepool --- scaletest/templates/kubernetes-large/README.md | 4 +++- scaletest/templates/kubernetes-medium-greedy/README.md | 4 +++- scaletest/templates/kubernetes-medium/README.md | 4 +++- scaletest/templates/kubernetes-minimal/README.md | 4 +++- scaletest/templates/kubernetes-small/README.md | 4 +++- 5 files changed, 15 insertions(+), 5 deletions(-) diff --git a/scaletest/templates/kubernetes-large/README.md b/scaletest/templates/kubernetes-large/README.md index 2b0ae5cc296be..5621780243ada 100644 --- a/scaletest/templates/kubernetes-large/README.md +++ b/scaletest/templates/kubernetes-large/README.md @@ -2,4 +2,6 @@ Provisions a large-sized workspace with no persistent storage. -_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` +_Note_: It is assumed you will be running workspaces on a dedicated GKE nodepool. +By default, this template sets a node affinity of `cloud.google.com/gke-nodepool` = `big-workspaces`. +The nodepool affinity can be customized with the variable `kubernetes_nodepool_workspaces`. 
diff --git a/scaletest/templates/kubernetes-medium-greedy/README.md b/scaletest/templates/kubernetes-medium-greedy/README.md index 22e94bb262616..d29c36f10da3a 100644 --- a/scaletest/templates/kubernetes-medium-greedy/README.md +++ b/scaletest/templates/kubernetes-medium-greedy/README.md @@ -2,4 +2,6 @@ Provisions a medium-sized workspace with no persistent storage. Greedy agent variant. -_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` +_Note_: It is assumed you will be running workspaces on a dedicated GKE nodepool. +By default, this template sets a node affinity of `cloud.google.com/gke-nodepool` = `big-workspaces`. +The nodepool affinity can be customized with the variable `kubernetes_nodepool_workspaces`. diff --git a/scaletest/templates/kubernetes-medium/README.md b/scaletest/templates/kubernetes-medium/README.md index e2d5eae983114..6f63bfb62c25a 100644 --- a/scaletest/templates/kubernetes-medium/README.md +++ b/scaletest/templates/kubernetes-medium/README.md @@ -2,4 +2,6 @@ Provisions a medium-sized workspace with no persistent storage. -_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` +_Note_: It is assumed you will be running workspaces on a dedicated GKE nodepool. +By default, this template sets a node affinity of `cloud.google.com/gke-nodepool` = `big-workspaces`. +The nodepool affinity can be customized with the variable `kubernetes_nodepool_workspaces`. diff --git a/scaletest/templates/kubernetes-minimal/README.md b/scaletest/templates/kubernetes-minimal/README.md index a4e76f8b24611..767570337dbf6 100644 --- a/scaletest/templates/kubernetes-minimal/README.md +++ b/scaletest/templates/kubernetes-minimal/README.md @@ -2,4 +2,6 @@ Provisions a minimal-sized workspace with no persistent storage. -_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` +_Note_: It is assumed you will be running workspaces on a dedicated GKE nodepool. +By default, this template sets a node affinity of `cloud.google.com/gke-nodepool` = `big-workspaces`. +The nodepool affinity can be customized with the variable `kubernetes_nodepool_workspaces`. diff --git a/scaletest/templates/kubernetes-small/README.md b/scaletest/templates/kubernetes-small/README.md index 56efbb98c3cb3..df5475bd32d70 100644 --- a/scaletest/templates/kubernetes-small/README.md +++ b/scaletest/templates/kubernetes-small/README.md @@ -2,4 +2,6 @@ Provisions a small-sized workspace with no persistent storage. -_Requires_: `cloud.google.com/gke-nodepool` = `big-workspaces` +_Note_: It is assumed you will be running workspaces on a dedicated GKE nodepool. +By default, this template sets a node affinity of `cloud.google.com/gke-nodepool` = `big-workspaces`. +The nodepool affinity can be customized with the variable `kubernetes_nodepool_workspaces`. From 8d1609075076778197e98093ed475b673b3a121c Mon Sep 17 00:00:00 2001 From: Marcin Tojek Date: Fri, 22 Mar 2024 12:21:18 +0100 Subject: [PATCH 21/21] Try: force make gen --- .github/workflows/ci.yaml | 3 +++ 1 file changed, 3 insertions(+) diff --git a/.github/workflows/ci.yaml b/.github/workflows/ci.yaml index ad21801cbdab4..8db445e798f42 100644 --- a/.github/workflows/ci.yaml +++ b/.github/workflows/ci.yaml @@ -604,6 +604,9 @@ jobs: - name: Setup sqlc uses: ./.github/actions/setup-sqlc + - name: make gen + run: "make --output-sync -j -B gen" + - name: Format run: | cd offlinedocs
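For reference, the CI step added in the final patch can be reproduced locally; the flags are standard GNU make options. This sketch assumes the repository's `gen` target and its toolchain (sqlc and related generators) are installed locally.

```shell
# Sketch: run the same code-generation step locally.
#   --output-sync  group the output of parallel jobs so interleaved logs stay readable
#   -j             run jobs in parallel
#   -B             treat all targets as out of date, forcing regeneration
make --output-sync -j -B gen
```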