Conversation

cbandy
Member

@cbandy cbandy commented Sep 10, 2025

Checklist:

  • Have you added an explanation of what your changes do and why you'd like them to be included?
  • Have you updated or added documentation for the change, as applicable?
  • Have you tested your changes on all related environments with successful results, as applicable?
    • Have you added automated tests?

Type of Changes:

  • New feature

What is the current behavior (link to any open issues here)?

When unspecified, Postgres logs to the log directory inside its data directory. This is difficult to coordinate with the OpenTelemetry Collector, and log files can be lost during replica creation and major upgrades.

What is the new behavior (if this is a feature change)?

  • Breaking change (fix or feature that would cause existing functionality to change)

When unspecified, Postgres logs outside the data directory on one of the attached persistent volumes. The allowed values for the log_directory parameter have been reduced to a handful of absolute path prefixes to volumes that are also in the spec.
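As a rough illustration of the rule described above, here is a minimal Go sketch of the check: the value is either the relative directory log or an absolute path under a prefix whose volume is declared in the spec. The prefixes (/pgdata/, /pgwal/, /pgtmp/), the volume names, and the function name validateLogDirectory are assumptions made for this example; the actual validation lives in the CRD rules and may use a different list.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// volumeForPrefix maps hypothetical absolute path prefixes to the spec volume
// that must be declared before the prefix can be used. These names are
// assumptions for illustration, not the operator's actual list.
var volumeForPrefix = map[string]string{
	"/pgdata/": "data",
	"/pgwal/":  "wal",
	"/pgtmp/":  "temp",
}

// validateLogDirectory returns an error unless value is the relative
// directory "log" or an absolute path below a prefix whose volume is declared.
func validateLogDirectory(value string, declared map[string]bool) error {
	if value == "log" {
		return nil // Postgres' packaged default, relative to the data directory
	}

	// Resolve "." and ".." so a value cannot escape its prefix.
	clean := path.Clean(value)

	for prefix, volume := range volumeForPrefix {
		if strings.HasPrefix(clean, prefix) {
			if !declared[volume] {
				return fmt.Errorf("log_directory %q requires the %s volume", value, volume)
			}
			return nil
		}
	}
	return fmt.Errorf("log_directory %q must be \"log\" or start with a known volume prefix", value)
}

func main() {
	// Only the data volume is declared in this example spec.
	declared := map[string]bool{"data": true}

	for _, value := range []string{"log", "/pgdata/logs/postgres", "/pgwal/logs/postgres", "/var/log"} {
		fmt.Printf("%q: %v\n", value, validateLogDirectory(value, declared))
	}
}
```

The sketch separates the two failure modes (unknown prefix versus known prefix without its volume) because they call for different fixes in the spec.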

Users who want to keep logging inside the data directory must set the log_directory parameter to log. That value is safe to assign even before this change.

Other Information:

Issue: PGO-2558

@cbandy cbandy force-pushed the postgres-log-directory branch from fb42b48 to a3078e0 Compare September 11, 2025 17:30

snyk-io bot commented Sep 11, 2025

🎉 Snyk checks have passed. No issues have been found so far.

security/snyk check is complete. No issues have been found. (View Details)

license/snyk check is complete. No issues have been found. (View Details)

Contributor

@benjaminjb benjaminjb left a comment

No blockers. I like that we're moving logs out of the data directory by default.

The only question I have is about sanitizing. We have the CEL rules around log_directory for v1, so theoretically any value we get should be safe to use for this cluster. (e.g., if the log_directory is in pgwal, they have a pgwal volume set up.)

So the sanitizing is really for v1beta1, yeah?

But also, is this our first instance of doing some real sanitizing of user-set configs?

@cbandy
Member Author

cbandy commented Sep 11, 2025

The only question I have is about sanitizing. We have the CEL rules around log_directory for v1, so theoretically any value we get should be safe to use for this cluster. (e.g., if the log_directory is in pgwal, they have a pgwal volume set up.)

So the sanitizing is really for v1beta1, yeah?

But also, is this our first instance of doing some real sanitizing of user-set configs?

I'm conflicted about this, too. I don't have all our validation in my head, but I don't think much of it so far has been about safety. When I skim, most of what I see are riffs on optional, required, and max-items. The label and subdomain validation is an early detection of values rejected (and not acted on) by Kubernetes.

I do recall:

  • these are about SQL injection, repeated in Go.
  • this about accessing files, again in Go.
  • this about identities, also in Go.

So the sanitizing is really for v1beta1, yeah?

Yeah, I guess so.

Most of my concern is that (without "sanitize") there are values the controller code cannot handle. There are lots of values for log directory inside PGDATA that won't bootstrap, or won't scrape, etc. The code now tries to adjust to known good values, but another approach could be to error out completely: "donk; invalid!"

There's also an almost-zero chance that someone could poke a CRD to remove some validation, then submit a resource our controller enacts, which allows or does something "bad." 🤔 Rather pedantic.
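To make the "adjust to known good values" option discussed above concrete, here is a minimal Go sketch of a sanitizer that keeps a requested directory when it is recognizably safe and otherwise falls back to a default. The fallback path /pgdata/logs/postgres, the mount list, and the function names are hypothetical; this is not the controller's actual code. The alternative ("donk; invalid!") would return an error at the point where this sketch substitutes the fallback.

```go
package main

import (
	"fmt"
	"path"
	"strings"
)

// fallbackLogDirectory is a hypothetical default on the data volume but
// outside the data directory. The operator's real default may differ.
const fallbackLogDirectory = "/pgdata/logs/postgres"

// knownGood reports whether dir is a value this sketch knows how to handle:
// the relative directory "log", or a clean absolute path below a mounted volume.
func knownGood(dir string, mounted []string) bool {
	if dir == "log" {
		return true
	}
	// Reject relative paths and anything containing "." or ".." segments.
	if !path.IsAbs(dir) || dir != path.Clean(dir) {
		return false
	}
	for _, mount := range mounted {
		if strings.HasPrefix(dir, mount+"/") {
			return true
		}
	}
	return false
}

// sanitizeLogDirectory returns the requested directory when it is known good
// and the fallback otherwise. The boolean reports whether it was adjusted.
func sanitizeLogDirectory(requested string, mounted []string) (string, bool) {
	if knownGood(requested, mounted) {
		return requested, false
	}
	return fallbackLogDirectory, true
}

func main() {
	mounted := []string{"/pgdata"} // assumed mount points for this example

	for _, requested := range []string{"log", "/pgdata/custom/logs", "/pgwal/logs", "../../etc"} {
		effective, adjusted := sanitizeLogDirectory(requested, mounted)
		fmt.Printf("requested=%q effective=%q adjusted=%v\n", requested, effective, adjusted)
	}
}
```

Returning the adjusted flag alongside the value leaves room for the caller to log or report the substitution rather than applying it silently.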

@benjaminjb
Contributor

Most of my concern is that (without "sanitize") there are values the controller code cannot handle. There are lots of values for log directory inside PGDATA that won't bootstrap, or won't scrape, etc. The code now tries to adjust to known good values, but another approach could be to error out completely: "donk; invalid!"

OK, if I remember that a spec is a request, then fixing that request seems reasonable. And I suppose someone who wanted to find the logs shouldn't look at the spec as the source of truth, but could get the location of the logs from PG / the settings configmap. (Maybe we make that suggestion in the docs.)

But do you think it's worth emitting an event? Or adding it to the status? (People probably don't look at events unless something goes wrong, but status feels a little too much to me right now.)
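For the event idea raised above, a minimal Go sketch might look like the following. The eventRecorder interface is a hypothetical, simplified stand-in loosely shaped like a Kubernetes event recorder, and the reason string LogDirectoryAdjusted is made up for this example; none of it is taken from the PR.

```go
package main

import "fmt"

// eventRecorder is a hypothetical, simplified stand-in for a Kubernetes
// event recorder. It exists only so this sketch is self-contained.
type eventRecorder interface {
	Eventf(eventType, reason, messageFmt string, args ...any)
}

// memoryRecorder collects formatted events so the sketch runs without a cluster.
type memoryRecorder struct {
	events []string
}

func (r *memoryRecorder) Eventf(eventType, reason, messageFmt string, args ...any) {
	r.events = append(r.events, eventType+" "+reason+": "+fmt.Sprintf(messageFmt, args...))
}

// reportLogDirectory emits one warning when the effective log directory
// differs from the one requested in the spec, and nothing otherwise.
func reportLogDirectory(rec eventRecorder, requested, effective string) {
	if requested != effective {
		rec.Eventf("Warning", "LogDirectoryAdjusted",
			"log_directory %q is not usable; logging to %q instead", requested, effective)
	}
}

func main() {
	rec := &memoryRecorder{}

	reportLogDirectory(rec, "/var/log", "/pgdata/logs/postgres") // adjusted, so one event
	reportLogDirectory(rec, "log", "log")                        // unchanged, so no event

	for _, event := range rec.events {
		fmt.Println(event)
	}
}
```

Emitting only on a mismatch keeps the event stream quiet for valid specs, which fits the observation that people mostly read events when something has gone wrong.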

@cbandy cbandy force-pushed the postgres-log-directory branch from a3078e0 to f280184 Compare September 15, 2025 16:59
cbandy and others added 4 commits September 15, 2025 12:02

This is a tightening of validation compared to v1beta1. The parameter
must have one of a handful of prefixes. A spec-level rule ensures the
value refers to a volume declared in the spec.

Co-authored-by: Benjamin Blattberg <[email protected]>
Co-authored-by: Drew Sessler <[email protected]>
Issue: PGO-2558

This is further protection against Postgres data loss due to a
misconfigured or maliciously configured log directory.

Co-authored-by: Benjamin Blattberg <[email protected]>
Co-authored-by: Drew Sessler <[email protected]>
Issue: PGO-2558

This has been the recommended practice for ages, but PGO left this
parameter at Postgres' packaged default. Being outside the data
directory makes the log directory and files:

 - easier to create
 - easier to consume from the OpenTelemetry Collector
 - easier to exclude from backups
 - persistent across failures that recover by recreating the data directory

Users that wish to log to the default directory can
set "spec.config.parameters.log_directory" to "log" on PostgresCluster.

Co-authored-by: Benjamin Blattberg <[email protected]>
Co-authored-by: Drew Sessler <[email protected]>
Issue: PGO-2558

@cbandy cbandy force-pushed the postgres-log-directory branch from f280184 to 5976e8f Compare September 15, 2025 17:03
@cbandy cbandy merged commit d777eed into CrunchyData:main Sep 15, 2025
19 checks passed
@cbandy cbandy deleted the postgres-log-directory branch September 15, 2025 18:10