Resource alerting (custom notifications) #15177
Labels
customer-requested
Features requested by enterprise customers. Only humans may set this.
observability
Issues related to observability (metrics, dashboards, alerts, opentelemetry)
roadmap
https://coder.com/roadmap. Only humans may set this.
Problem statement
Our customers report that many users experience build failures caused by out-of-disk (OOD) and out-of-memory (OOM) conditions while working in their workspaces. Being kicked out of your workspace mid-flow is potentially the worst experience we can subject developers to.
Users are often unaware of resource consumption in their workspaces. While we expose it through coder metadata and `coder stat`, it's not accessible or actionable enough. Developers are not negatively impacted by overutilization until a failure happens; they're not negatively impacted by underutilization since they're not responsible for cloud costs. It's much easier for developers to adopt a just-in-case mentality and opt for generous resources.
Administrators care greatly about resource utilization but have difficulty communicating the urgency of action to developers while they are working.
Proposal
We extend our notification system to allow administrators to define custom signals such as resource alerts. They can then trigger a custom notification through an API endpoint. For example, they could define a "high disk usage" notification. A module running on the workspace agent would trigger this alert by listening to `coder stat`. When the workspace is approaching overutilization, the developer who owns that workspace receives a notification with a call to action (CTA) to rebuild with new parameters.
In the past we've called this "Custom IDE Notifications" because we should also introduce delivery into the IDEs via our extensions. Developers should be able to take action without leaving flow.
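To make the proposed flow concrete, here is a minimal sketch of the threshold check such an agent-side module might run before calling the notification endpoint. Everything in it is an assumption for illustration: the `Usage` type, the `build_alert` function, the 90% default threshold, and the payload fields are hypothetical, and neither the parsing of `coder stat` output nor the actual API call is shown, since this issue does not specify them.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Usage:
    """One resource reading, e.g. taken from `coder stat` (parsing not shown)."""
    resource: str   # hypothetical label such as "disk" or "memory"
    used: float     # amount in use, in any unit
    total: float    # capacity, in the same unit as `used`


def build_alert(usage: Usage, threshold_percent: float = 90.0) -> Optional[dict]:
    """Return a notification payload when usage crosses the threshold, else None.

    The payload shape is a placeholder; the real endpoint and schema
    are not defined in this proposal.
    """
    if usage.total <= 0:
        # Cannot compute a percentage without a positive capacity.
        return None
    percent = 100.0 * usage.used / usage.total
    if percent < threshold_percent:
        return None
    return {
        "notification": f"high {usage.resource} usage",  # hypothetical name
        "percent_used": round(percent, 1),
        "call_to_action": "rebuild workspace with new parameters",
    }


if __name__ == "__main__":
    # Illustrative readings only, not real measurements.
    print(build_alert(Usage("disk", used=95, total=100)))
    print(build_alert(Usage("memory", used=4, total=16)))
```

The check is kept as a pure function so that the same logic could feed any delivery channel, including the IDE extensions mentioned above.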