pg_dump: Reduce memory usage of dumps with statistics.
Right now, pg_dump stores all generated commands for statistics in
memory. These commands can be quite large and therefore can
significantly increase pg_dump's memory footprint. To fix, wait
until we are about to write out the commands before generating
them, and be sure to free the commands after writing. This is
implemented via a new defnDumper callback that works much like the
dataDumper one but is specially designed for TOC entries.
Custom dumps that include data might write the TOC twice (to update
data offset information), which would ordinarily cause pg_dump to
run the attribute statistics queries twice. However, as a hack, we
save the length of the written-out entry in the first pass, and we
skip over it in the second. While there is no known technical
problem with executing the queries multiple times and rewriting the
results, it's expensive and feels risky, so it seems prudent to
avoid it.
As an exception, we _do_ execute the queries twice for the tar
format. This format does a second pass through the TOC to generate
the restore.sql file, which isn't used by pg_restore, so different
results won't corrupt the output (it'll just be different). We
could alternatively save the definition in memory the first time it
is generated, but that defeats the purpose of this change. In any
case, past discussion indicates that the tar format might be a
candidate for deprecation, so it doesn't seem worth trying too much
harder.
Author: Corey Huinker <[email protected]>
Co-authored-by: Nathan Bossart <[email protected]>
Reviewed-by: Jeff Davis <[email protected]>
Discussion: https://postgr.es/m/CADkLM%3Dc%2Br05srPy9w%2B-%2BnbmLEo15dKXYQ03Q_xyK%2BriJerigLQ%40mail.gmail.com