docs: add new scaling doc to best practices section #15904
0af3756 to 6451c29
### Scaling

Coder Server can be scaled both vertically for bigger instances and horizontally
@ericpaulsen you had mentioned linking to the replica & resources values in our Helm chart
Some inline comments, but I don't need to review again.
- Capture infrastructure metrics like CPU, memory, open files, and network I/O for all
  Coder Server, external provisioner daemon, workspace proxy, and PostgreSQL instances.

### Capture Coder server metrics with Prometheus
This feels too separated from the content above about infrastructure metrics.
The idea is that you need to capture two kinds of metrics, infrastructure and Coder-specific, and you want both available in the same system for alerting and dashboards.
I wonder if that's because the subheadings under Observability are serving different purposes, with `### Observability key metrics` providing more information about a specific aspect, and `### How to capture Coder server metrics with Prometheus` showing steps.
I rearranged things in f110af3 to see if that helps the read-through, but I think that if we can add an additional method for capturing metrics and logs, we can split that out into its own H2.
Does it make sense to add an example of how to stream coderd or provisionerd logs?
I worry a bit about mixing high-level advice with specific step-by-step instructions, since the step-by-step stuff could change and people might not realize they also have to update this document. I'd much prefer linking so that step-by-step instructions appear in one place for any specific task.
fair - that'd be a lot easier with reusable content (embeds/includes/partials, whatever), but as it is now, let's get this out and add relevant links around as we find them
Co-authored-by: Spike Curtis <[email protected]>