Make the default Postgres log directory outside its data directory #4282
This is a tightening of validation compared to v1beta1. The parameter must have one of a handful of prefixes. A spec-level rule ensures the value refers to a volume declared in the spec. Co-authored-by: Benjamin Blattberg <[email protected]> Co-authored-by: Drew Sessler <[email protected]> Issue: PGO-2558
This is further protection against Postgres data loss due to a misconfigured or maliciously configured log directory. Co-authored-by: Benjamin Blattberg <[email protected]> Co-authored-by: Drew Sessler <[email protected]> Issue: PGO-2558
This has been the recommended practice for ages, but PGO left this parameter at Postgres' packaged default. Being outside the data directory makes the log directory and files:
- easier to create
- easier to consume from the OpenTelemetry Collector
- easier to exclude from backups
- persistent across failures that recover by recreating the data directory

Users that wish to log to the default directory can set "spec.config.parameters.log_directory" to "log" on PostgresCluster.

Co-authored-by: Benjamin Blattberg <[email protected]>
Co-authored-by: Drew Sessler <[email protected]>
Issue: PGO-2558
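The behavior described in these commit messages (a default outside the data directory, "log" still accepted, and absolute values limited to prefixes whose volume is declared) can be sketched in Go roughly as follows. This is only an illustration: the PR enforces the rule with CEL validation on the CRD, and the prefixes, volume names, default path, and the function `resolveLogDirectory` below are all assumptions made for the example, not taken from the PR.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// defaultLogDirectory is a hypothetical default outside the data directory.
const defaultLogDirectory = "/pgdata/logs/postgres"

// volumeForPrefix maps hypothetical absolute path prefixes to the
// (also hypothetical) name of the volume that must be declared in the spec.
var volumeForPrefix = map[string]string{
	"/pgdata/": "data",
	"/pgwal/":  "wal",
}

// resolveLogDirectory returns the directory to log to, or an error when the
// requested value has no allowed prefix or refers to an undeclared volume.
func resolveLogDirectory(value string, declared map[string]bool) (string, error) {
	switch value {
	case "":
		// Unspecified: log outside the data directory.
		return defaultLogDirectory, nil
	case "log":
		// Opt back in to the packaged default inside the data directory.
		return value, nil
	}

	// Normalize so that "/pgdata/../etc" cannot pass the prefix check.
	cleaned := path.Clean(value)
	for prefix, volume := range volumeForPrefix {
		if !strings.HasPrefix(cleaned, prefix) {
			continue
		}
		if !declared[volume] {
			return "", fmt.Errorf("log_directory %q needs the %q volume in the spec", value, volume)
		}
		return cleaned, nil
	}
	return "", fmt.Errorf("log_directory %q does not start with an allowed prefix", value)
}

func main() {
	declared := map[string]bool{"data": true} // only the data volume is declared
	for _, value := range []string{"", "log", "/pgdata/custom", "/pgwal/logs", "/var/log"} {
		dir, err := resolveLogDirectory(value, declared)
		fmt.Printf("%q -> %q, %v\n", value, dir, err)
	}
}
```

Returning an error mirrors the "reject at admission" behavior of a validation rule; a controller that prefers to repair the value instead would substitute the default at that point.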
Rules []PostgresHBARuleSpec `json:"rules,omitempty"`
}

type PostgresConfigSpec struct {
We're only using this struct right now; but are you thinking it's easier to copy over all the v1beta1 structs in this file so that we can keep them aligned (up until we drop v1beta1 or let them become unaligned)?
I wasn't thinking at all! Which seems better to you presently?
At first I was going to say, "bring over everything so we can keep things aligned" and then I realized that the reason to have a v1 is to change things. Here we have a new CEL rule on parameters, and an editing of existing CEL rules (mostly reorganizing messages as the last part, I think) for `PostgresHBARuleSpec` -- but then we're not using the `v1.PostgresHBARuleSpec` anywhere...
So now I lean lightly towards "just bring over the things that have changed."
}
}

// sensitiveAbsolutePath is used by [sanitizeLogDirectory].
Might've helped me reading at first to understand `sensitiveAbsolutePath` and `sensitiveRelativePath` were disallowed paths.
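To make the "disallowed paths" idea concrete, here is a minimal Go sketch of a sanitizer that falls back to a known good directory instead of returning an error. It is not the PR's `sanitizeLogDirectory`: the function name `sanitize`, the data directory, the volume root, and the fallback path are all hypothetical, and the real checks may differ.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

const (
	volumeRoot           = "/pgdata"               // hypothetical mounted volume
	dataDirectory        = "/pgdata/pg17"          // hypothetical Postgres data directory
	fallbackLogDirectory = "/pgdata/logs/postgres" // hypothetical known good value
)

// isWithin reports whether child is parent itself or a path below it.
func isWithin(parent, child string) bool {
	return child == parent || strings.HasPrefix(child, parent+"/")
}

// sanitize returns a log directory the controller can work with, replacing
// values it considers unsafe with fallbackLogDirectory.
func sanitize(value string) string {
	switch value {
	case "":
		return fallbackLogDirectory
	case "log":
		// The one value inside the data directory that is kept as-is.
		return value
	}

	// Postgres resolves relative values against the data directory.
	abs := value
	if !path.IsAbs(abs) {
		abs = path.Join(dataDirectory, abs)
	}
	abs = path.Clean(abs)

	// Disallowed: anything else inside the data directory.
	if isWithin(dataDirectory, abs) {
		return fallbackLogDirectory
	}
	// Disallowed: anything that is not strictly below the mounted volume.
	if !strings.HasPrefix(abs, volumeRoot+"/") {
		return fallbackLogDirectory
	}
	return abs
}

func main() {
	for _, value := range []string{"", "log", "pg_wal", "../../etc", "/pgdata/custom/logs"} {
		fmt.Printf("%q -> %q\n", value, sanitize(value))
	}
}
```

Adjusting the value rather than failing matches the approach discussed in this review; the alternative raised in the thread is to reject the value outright.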
No blockers, I like that we're getting logs out as a default from data.
The only question I have is about sanitizing. We have the CEL rules around `log_directory` for v1, so theoretically any value we get should be safe to use for this cluster. (e.g., if the log_directory is in pgwal, they have a pgwal volume set up.)
So the sanitizing is really for v1beta1, yeah?
But also, is this our first instance of doing some real sanitizing of user-set configs?
I'm conflicted about this, too. I don't have all our validation in my head, but I don't think much of it so far has been about safety. When I skim, most of what I see are riffs on optional, required, and max-items. The label and subdomain validation is an early detection of values rejected (and not acted on) by Kubernetes. I do recall:
Yeah, I guess so. Most of my concern is that (without "sanitize") there are values the controller code cannot handle. There are lots of values for log directory inside PGDATA that won't bootstrap, or won't scrape, etc. The code now tries to adjust to known good values, but another approach could be to error out completely: "donk; invalid!" There's also an almost-zero chance that someone could poke a CRD to remove some validation, then submit a resource our controller enacts, which allows or does something "bad." 🤔 Rather pedantic.
OK, if I remember that a spec is a request, then fixing that request seems reasonable. And I suppose someone who wanted to find the logs shouldn't look at the spec as the source of truth, but could get the location of the logs from PG / the settings configmap. (Maybe we make that suggestion in the docs.) But do you think it's worth emitting an event? Or adding it to the status? (People probably don't look at events unless something goes wrong, but status feels a little too much to me right now.)
What is the current behavior (link to any open issues here)?
When unspecified, Postgres logs to the `log` directory inside its data directory. This is difficult to coordinate with the OpenTelemetry Collector and can lose log files during replication creation and major upgrades.
What is the new behavior (if this is a feature change)?
When unspecified, Postgres logs outside the data directory on one of the attached persistent volumes. The allowed values for the `log_directory` parameter have been reduced to a handful of absolute path prefixes to volumes that are also in the spec.
Users that want to continue to log to the data directory can/must set the `log_directory` parameter to `log`. That value is safe to assign even before this change.
Other Information:
Issue: PGO-2558