1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-19 13:42:17 +03:00
Commit Graph

1962 Commits

Author SHA1 Message Date
Fujii Masao
b53b88109f Improve error message when standby does accept connections.
Even after reaching the minimum recovery point, if there are long-lived
write transactions with 64 subtransactions on the primary, the recovery
snapshot may not yet be ready for hot standby, delaying read-only
connections on the standby. Previously, when read-only connections were
not accepted due to this condition, the following error message was logged:

    FATAL:  the database system is not yet accepting connections
    DETAIL:  Consistent recovery state has not been yet reached.

This DETAIL message was misleading because the following message was
already logged in this case:

    LOG:  consistent recovery state reached

This contradiction, i.e., indicating that the recovery state was consistent
while also stating it wasn’t, caused confusion.

This commit improves the error message to better reflect the actual state:

    FATAL: the database system is not yet accepting connections
    DETAIL: Recovery snapshot is not yet ready for hot standby.
    HINT: To enable hot standby, close write transactions with more than 64 subtransactions on the primary server.

To implement this, the commit introduces a new postmaster signal,
PMSIGNAL_RECOVERY_CONSISTENT. When the startup process reaches
a consistent recovery state, it sends this signal to the postmaster,
allowing it to correctly recognize that state.

Since this is not a clear bug, the change is applied only to the master
branch and is not back-patched.

Author: Atsushi Torikoshi <torikoshia@oss.nttdata.com>
Co-authored-by: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp>
Discussion: https://postgr.es/m/02db8cd8e1f527a8b999b94a4bee3165@oss.nttdata.com
2025-04-02 15:13:01 +09:00
Andres Freund
08ccd56ac7 aio, bufmgr: Comment fixes/improvements
Some of these comments have been wrong for a while (12f3867f55), some I
recently introduced (da7226993f, 55b454d0e1). This includes an update to a
comment in FlushBuffer(), which will be copied in a future commit.

These changes seem big enough to be worth doing in separate commits.

Suggested-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/20250319212530.80.nmisch@google.com
2025-03-29 14:45:42 -04:00
Álvaro Herrera
9fbd53dea5 Remove the query_id_squash_values GUC
Commit 62d712ecfd introduced the capability to calculate the same
queryId for queries with different lengths of constants in a list for an
IN clause.  This behavior was originally enabled with a GUC
query_id_squash_values.  After a discussion about the value of such a
GUC, it was decided to back out of the use of a GUC and make the
squashing behavior the only available option.

Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/Z-LZyygkkNyA8-kR@msg.df7cb.de
Discussion: https://postgr.es/m/CA+q6zcVTK-3C-8NWV1oY2NZrvtnMCDqnyYYyk1T7WMUG65MeOQ@mail.gmail.com
2025-03-27 13:33:37 +01:00
Álvaro Herrera
62d712ecfd Introduce squashing of constant lists in query jumbling
pg_stat_statements produces multiple entries for queries like
    SELECT something FROM table WHERE col IN (1, 2, 3, ...)

depending on the number of parameters, because every element of
ArrayExpr is individually jumbled.  Most of the time that's undesirable,
especially if the list becomes too large.

Fix this by introducing a new GUC query_id_squash_values which modifies
the node jumbling code to only consider the first and last element of a
list of constants, rather than each list element individually.  This
affects both the query_id generated by query jumbling, as well as
pg_stat_statements query normalization so that it suppresses printing of
the individual elements of such a list.

The default value is off, meaning the previous behavior is maintained.

Author: Dmitry Dolgov <9erthalion6@gmail.com>
Reviewed-by: Sergey Dudoladov (mysterious, off-list)
Reviewed-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Sutou Kouhei <kou@clear-code.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Marcos Pegoraro <marcos@f10.com.br>
Reviewed-by: Julien Rouhaud <rjuju123@gmail.com>
Reviewed-by: Zhihong Yu <zyu@yugabyte.com>
Tested-by: Yasuo Honda <yasuo.honda@gmail.com>
Tested-by: Sergei Kornilov <sk@zsrv.org>
Tested-by: Maciek Sakrejda <m.sakrejda@gmail.com>
Tested-by: Chengxi Sun <sunchengxi@highgo.com>
Tested-by: Jakub Wartak <jakub.wartak@enterprisedb.com>
Discussion: https://postgr.es/m/CA+q6zcWtUbT_Sxj0V6HY6EZ89uv5wuG5aefpe_9n0Jr3VwntFg@mail.gmail.com
2025-03-18 18:56:11 +01:00
Andres Freund
55b454d0e1 aio: Infrastructure for io_method=worker
This commit contains the basic, system-wide, infrastructure for
io_method=worker. It does not yet actually execute IO, this commit just
provides the infrastructure for running IO workers, kept separate for easier
review.

The number of IO workers can be adjusted with a PGC_SIGHUP GUC. Eventually
we'd like to make the number of workers dynamically scale up/down based on the
current "IO load".

To allow the number of IO workers to be increased without a restart, we need
to reserve PGPROC entries for the workers unconditionally. This has been
judged to be worth the cost. If it turns out to be problematic, we can
introduce a PGC_POSTMASTER GUC to control the maximum number.

As io workers might be needed during shutdown, e.g. for AIO during the
shutdown checkpoint, a new PMState phase is added. IO workers are shut down
after the shutdown checkpoint has been performed and walsender/archiver have
shut down, but before the checkpointer itself shuts down. See also
87a6690cc6.

Updates PGSTAT_FILE_FORMAT_ID due to the addition of a new BackendType.

Reviewed-by: Noah Misch <noah@leadboat.com>
Co-authored-by: Thomas Munro <thomas.munro@gmail.com>
Co-authored-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
Discussion: https://postgr.es/m/20210223100344.llw5an2aklengrmn@alap3.anarazel.de
Discussion: https://postgr.es/m/stj36ea6yyhoxtqkhpieia2z4krnam7qyetc57rfezgk4zgapf@gcnactj4z56m
2025-03-18 11:54:01 -04:00
Andres Freund
02844012b3 aio: Basic subsystem initialization
This commit just does the minimal wiring up of the AIO subsystem, added in the
next commit, to the rest of the system. The next commit contains more details
about motivation and architecture.

This commit is kept separate to make it easier to review, separating the
changes across the tree, from the implementation of the new subsystem.

We discussed squashing this commit with the main commit before merging AIO,
but there has been a mild preference for keeping it separate.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/uvrtrknj4kdytuboidbhwclo4gxhswwcpgadptsjvjqcluzmah%40brqs62irg4dt
2025-03-17 18:51:33 -04:00
Peter Eisentraut
3691edfab9 pg_noreturn to replace pg_attribute_noreturn()
We want to support a "noreturn" decoration on more compilers besides
just GCC-compatible ones, but for that we need to move the decoration
in front of the function declaration instead of either behind it or
wherever, which is the current style afforded by GCC-style attributes.
Also rename the macro to "pg_noreturn" to be similar to the C11
standard "noreturn".

pg_noreturn is now supported on all compilers that support C11 (using
_Noreturn), as well as GCC-compatible ones (using __attribute__, as
before), as well as MSVC (using __declspec).  (When PostgreSQL
requires C11, the latter two variants can be dropped.)

Now, all supported compilers effectively support pg_noreturn, so the
extra code for !HAVE_PG_ATTRIBUTE_NORETURN can be dropped.

This also fixes a possible problem if third-party code includes
stdnoreturn.h, because then the current definition of

    #define pg_attribute_noreturn() __attribute__((noreturn))

would cause an error.

Note that the C standard does not support a noreturn attribute on
function pointer types.  So we have to drop these here.  There are
only two instances at this time, so it's not a big loss.  In one case,
we can make up for it by adding the pg_noreturn to a wrapper function
and adding a pg_unreachable(), in the other case, the latter was
already done before.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/pxr5b3z7jmkpenssra5zroxi7qzzp6eswuggokw64axmdixpnk@zbwxuq7gbbcw
2025-03-13 12:37:26 +01:00
Melanie Plageman
18cd15e706 Add connection establishment duration logging
Add log_connections option 'setup_durations' which logs durations of
several key parts of connection establishment and backend setup.

For an incoming connection, starting from when the postmaster gets a
socket from accept() and ending when the forked child backend is first
ready for query, there are multiple steps that could each take longer
than expected due to external factors. This logging provides visibility
into authentication and fork duration as well as the end-to-end
connection establishment and backend initialization time.

To make this portable, the timings captured in the postmaster (socket
creation time, fork initiation time) are passed through the
BackendStartupData.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Jelte Fennema-Nio <postgres@jeltef.nl>
Reviewed-by: Guillaume Lelarge <guillaume.lelarge@dalibo.com>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:27 -04:00
Melanie Plageman
9219093cab Modularize log_connections output
Convert the boolean log_connections GUC into a list GUC comprised of the
connection aspects to log.

This gives users more control over the volume and kind of connection
logging.

The current log_connections options are 'receipt', 'authentication', and
'authorization'. The empty string disables all connection logging. 'all'
enables all available connection logging.

For backwards compatibility, the most common values for the
log_connections boolean are still supported (on, off, 1, 0, true, false,
yes, no). Note that previously supported substrings of on, off, true,
false, yes, and no are no longer supported.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Discussion: https://postgr.es/m/flat/CAAKRu_b_smAHK0ZjrnL5GRxnAVWujEXQWpLXYzGbmpcZd3nLYw%40mail.gmail.com
2025-03-12 11:35:21 -04:00
Heikki Linnakangas
393e0d2314 Split WaitEventSet functions to separate source file
latch.c now only contains the Latch related functions, which build on
the WaitEventSet abstraction. Most of the platform-dependent stuff is
now in waiteventset.c.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/8a507fb6-df28-49d3-81a5-ede180d7f0fb@iki.fi
2025-03-06 01:26:16 +02:00
Heikki Linnakangas
635f580120 Rename some signal and interrupt handling functions for consistency
The usual pattern for handling a signal is that the signal handler
sets a flag and calls SetLatch(MyLatch), and CHECK_FOR_INTERRUPTS() or
other code that is part of a wait loop calls another function to deal
with it. The naming of the functions involved was a bit inconsistent,
however. CHECK_FOR_INTERRUPTS() calls ProcessInterrupts() to do the
heavy-lifting, but the analogous functions in aux processes were
called HandleMainLoopInterrupts(), HandleStartupProcInterrupts(),
etc. Similarly, most subroutines of ProcessInterrupts() were called
Process*(), but some were called Handle*().

To make things less confusing, rename all the functions that are part
of the overall signal/interrupt handling system but are not executed
in a signal handler to e.g. ProcessSomething(), rather than
HandleSomething(). The "Process" prefix is now consistently used in
the non-signal-handler functions, and the "Handle" prefix in functions
that are part of signal handlers, except for some completely unrelated
functions that clearly have nothing to do with signal or interrupt
handling.

Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Discussion: https://www.postgresql.org/message-id/8a384b26-1499-41f6-be33-64b801fb98b8@iki.fi
2025-03-05 16:22:26 +02:00
Michael Paquier
f4694e0f35 Fix some gaps in pg_stat_io with WAL receiver and WAL summarizer
The WAL receiver and WAL summarizer processes gain each one a call to
pgstat_report_wal(), to make sure that they report their WAL statistics
to pgstats, gathering data for pg_stat_io.

In the WAL receiver, the stats reports are timed with status updates sent
to the primary, that depend on wal_receiver_status_interval and
wal_receiver_timeout.  This is a conservative choice, but perhaps we
could be more aggressive with the frequency of the stats reports.  An
interesting historical fact is that the WAL receiver does writes and
syncs of WAL, but it has never reported its statistics to pgstats in
pg_stat_wal.

In the WAL summarizer, the stats reports are done each time the process
waits for WAL.

While on it, pg_stat_io is adjusted so as these two processes do not
report any rows when IOObject is not WAL, making the view easier to use
with less rows.

Two tests are added in TAP, checking statistics for the WAL summarizer
and the WAL receiver.  Status updates in the WAL receiver are currently
possible in the recovery test 001_stream_rep.pl.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z8UKZyVSHUUQJHNb@paquier.xyz
2025-03-05 10:17:39 +09:00
Michael Paquier
c76db55c90 Split pgstat_bestart() into three different routines
pgstat_bestart(), used post-authentication to set up a backend entry
in the PgBackendStatus array, so as its data becomes visible in
pg_stat_activity and related catalogs, has its logic divided into three
routines with this commit, called in order at different steps of the
backend initialization:
* pgstat_bestart_initial() sets up the backend entry with a minimal
amount of information, reporting it with a new BackendState called
STATE_STARTING while waiting for backend initialization and client
authentication to complete.  The main benefit that this offers is
observability, so as it is possible to monitor the backend activity
during authentication.  This step happens earlier than in the logic
prior to this commit.  pgstat_beinit() happens earlier as well, before
authentication.
* pgstat_bestart_security() reports the SSL/GSS status of the
connection, once authentication completes.  Auxiliary processes, for
example, do not need to call this step, hence it is optional.  This
step is called after performing authentication, same as previously.
* pgstat_bestart_final() reports the user and database IDs, takes the
entry out of STATE_STARTING, and reports its application_name.  This is
called as the last step of the three, once authentication completes.

An injection point is added, with a test checking that the "starting"
phase of a backend entry is visible in pg_stat_activity.  Some follow-up
patches are planned to take advantage of this refactoring with more
information provided in backend entries during authentication (LDAP
hanging was a problem for the author, initially).

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAOYmi+=60deN20WDyCoHCiecgivJxr=98s7s7-C8SkXwrCfHXg@mail.gmail.com
2025-03-04 14:09:44 +09:00
Melanie Plageman
06eae9e621 Trigger more frequent autovacuums with relallfrozen
Calculate the insert threshold for triggering an autovacuum of a
relation based on the number of unfrozen pages.

By only considering the unfrozen portion of the table when calculating
how many tuples to add to the insert threshold, we can trigger more
frequent vacuums of insert-heavy tables. This increases the chances of
vacuuming those pages when they still reside in shared buffers

This also increases the number of autovacuums triggered by tuples
inserted and not by wraparound risk. We prefer to freeze these pages
during insert-triggered autovacuums, as anti-wraparound vacuums are not
automatically canceled by conflicting lock requests.

We calculate the unfrozen percentage of the table using the recently
added (99f8f3fbbc) relallfrozen column of pg_class.

Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_aj-P7YyBz_cPNwztz6ohP%2BvWis%3Diz3YcomkB3NpYA--w%40mail.gmail.com
2025-03-03 14:42:00 -05:00
Peter Eisentraut
7202d72787 backend launchers void * arguments for binary data
Change backend launcher functions to take void * for binary data
instead of char *.  This removes the need for numerous casts.

Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Discussion: https://www.postgresql.org/message-id/flat/fd1fcedb-3492-4fc8-9e3e-74b97f2db6c7%40eisentraut.org
2025-02-21 08:03:33 +01:00
Melanie Plageman
052026c9b9 Eagerly scan all-visible pages to amortize aggressive vacuum
Aggressive vacuums must scan every unfrozen tuple in order to advance
the relfrozenxid/relminmxid. Because data is often vacuumed before it is
old enough to require freezing, relations may build up a large backlog
of pages that are set all-visible but not all-frozen in the visibility
map. When an aggressive vacuum is triggered, all of these pages must be
scanned. These pages have often been evicted from shared buffers and
even from the kernel buffer cache. Thus, aggressive vacuums often incur
large amounts of extra I/O at the expense of foreground workloads.

To amortize the cost of aggressive vacuums, eagerly scan some
all-visible but not all-frozen pages during normal vacuums.

All-visible pages that are eagerly scanned and set all-frozen in the
visibility map are counted as successful eager freezes and those not
frozen are counted as failed eager freezes.

If too many eager scans fail in a row, eager scanning is temporarily
suspended until a later portion of the relation. The number of failures
tolerated is configurable globally and per table.

To effectively amortize aggressive vacuums, we cap the number of
successes as well. Capping eager freeze successes also limits the amount
of potentially wasted work if these pages are modified again before the
next aggressive vacuum. Once we reach the maximum number of blocks
successfully eager frozen, eager scanning is disabled for the remainder
of the vacuum of the relation.

Original design idea from Robert Haas, with enhancements from
Andres Freund, Tomas Vondra, and me

Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/flat/CAAKRu_ZF_KCzZuOrPrOqjGVe8iRVWEAJSpzMgRQs%3D5-v84cXUg%40mail.gmail.com
2025-02-11 13:53:48 -05:00
Nathan Bossart
306dc520b9 Introduce autovacuum_vacuum_max_threshold.
One way autovacuum chooses tables to vacuum is by comparing the
number of updated or deleted tuples with a value calculated using
autovacuum_vacuum_threshold and autovacuum_vacuum_scale_factor.
The threshold specifies the base value for comparison, and the
scale factor specifies the fraction of the table size to add to it.
This strategy ensures that smaller tables are vacuumed after fewer
updates/deletes than larger tables, which is reasonable in many
cases but can result in infrequent vacuums on very large tables.
This is undesirable for a couple of reasons, such as very large
tables incurring a huge amount of bloat between vacuums.

This new parameter provides a way to set a limit on the value
calculated with autovacuum_vacuum_threshold and
autovacuum_vacuum_scale_factor so that very large tables are
vacuumed more frequently.  By default, it is set to 100,000,000
tuples, but it can be disabled by setting it to -1.  It can also be
adjusted for individual tables by changing storage parameters.

Author: Nathan Bossart <nathandbossart@gmail.com>
Co-authored-by: Frédéric Yhuel <frederic.yhuel@dalibo.com>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Michael Banck <mbanck@gmx.net>
Reviewed-by: Joe Conway <mail@joeconway.com>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: David Rowley <dgrowleyml@gmail.com>
Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com>
Reviewed-by: Vinícius Abrahão <vinnix.bsd@gmail.com>
Reviewed-by: Robert Treat <rob@xzilla.net>
Reviewed-by: Alena Rybakina <a.rybakina@postgrespro.ru>
Discussion: https://postgr.es/m/956435f8-3b2f-47a6-8756-8c54ded61802%40dalibo.com
2025-02-05 15:48:18 -06:00
Tom Lane
d4c3a6b8ad Remove obsolete restriction on the range of log_rotation_size.
When syslogger.c was first written, we didn't want to assume that
all platforms have 64-bit ftello.  But we've been assuming that
since v13 (cf commit 799d22461), so let's use that in syslogger.c
and allow log_rotation_size to range up to INT_MAX kilobytes.

The old code effectively limited log_rotation_size to 2GB regardless
of platform.  While nobody's complained, that doesn't seem too far
away from what might be thought reasonable these days.

I noticed this while searching for instances of "1024L" in connection
with commit 041e8b95b.  These were the last such instances.
(We still have instances of L-suffixed literals, but most of them
are associated with wait intervals for pg_usleep or similar functions.
I don't see any urgent reason to change that.)
2025-01-31 14:36:56 -05:00
Andres Freund
87a6690cc6 Change shutdown sequence to terminate checkpointer last
The main motivation for this change is to have a process that can serialize
stats after all other processes have terminated. Serializing stats already
happens in checkpointer, even though walsenders can be active longer.

The only reason the current shutdown sequence does not actively cause problems
is that walsender currently does not generate any stats. However, there is an
upcoming patch changing that.

Another need for this change originates in the AIO patchset, where IO
workers (which, in some edge cases, can emit stats of their own) need to run
while the shutdown checkpoint is being written.

This commit changes the shutdown sequence so checkpointer is signalled (via
SIGINT) to trigger writing the shutdown checkpoint without also causing
checkpointer to exit.  Once checkpointer wrote the shutdown checkpoint it
notifies postmaster via PMSIGNAL_XLOG_IS_SHUTDOWN and waits for the
termination signal (SIGUSR2, as before).  Checkpointer now is terminated after
all children, other than dead-end children and logger, have been terminated,
tracked using the new PM_WAIT_CHECKPOINTER PMState.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-25 11:37:13 -05:00
Andres Freund
f15538cd27 postmaster: Adjust which processes we expect to have exited
Comments and code stated that we expect checkpointer to have been signalled in
case of immediate shutdown / fatal errors, but didn't treat archiver and
walsenders the same. That doesn't seem right.

I had started digging through the history to see where this oddity was
introduced, but it's not the fault of a single commit.

Instead treat archiver, checkpointer, and walsenders the same.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:08:33 -05:00
Andres Freund
463a2ebd9f postmaster: Commonalize FatalError paths
This includes some behavioral changes:

- Previously PM_WAIT_XLOG_ARCHIVAL wasn't handled in HandleFatalError(), that
  doesn't seem quite right.

- Previously a fatal error in PM_WAIT_XLOG_SHUTDOWN lead to jumping back to
  PM_WAIT_BACKENDS, no we go to PM_WAIT_DEAD_END. Jumping backwards doesn't
  seem quite right and we didn't do so when checkpointer failed to fork during
  a shutdown.

- Previously a checkpointer fork failure didn't call SetQuitSignalReason(),
  which would lead to quickdie() reporting
  "terminating connection because of unexpected SIGQUIT signal"
  which seems even worse than the PMQUIT_FOR_CRASH message. If I saw that in
  the log I'd suspect somebody outside of postgres sent SIGQUITs

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:08:31 -05:00
Andres Freund
8edd8c77c8 postmaster: Move code to switch into FatalError state into function
There are two places switching to FatalError mode, behaving somewhat
differently. An upcoming commit will introduce a third. That doesn't seem seem
like a good idea.

This commit just moves the FatalError related code from HandleChildCrash()
into its own function, a subsequent commit will evolve the state machine
change to be suitable for other callers.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
f0b7ab7251 postmaster: Don't repeatedly transition to crashing state
Previously HandleChildCrash() skipped logging and signalling child exits if
already in an immediate shutdown or in FatalError state, but still
transitioned server state in response to a crash. That's redundant.

In the other place we transition to FatalError, we do take care to not do so
when already in FatalError state.

To make it easier to combine different paths for entering FatalError state,
only do so once in HandleChildCrash().

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
d239c1a8e5 postmaster: Don't open-code TerminateChildren() in HandleChildCrash()
After removing the duplication no user of sigquit_child() remains, therefore
remove it.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Andres Freund
4d271e3ec2 checkpointer: Request checkpoint via latch instead of signal
The motivation for this change is that a future commit will use SIGINT for
another purpose (postmaster requesting WAL access to be shut down) and that
there no other signals that we could readily use (see code comment for the
reason why SIGTERM shouldn't be used). But it's also a tad nicer / more
efficient to use SetLatch(), as it avoids sending signals when checkpointer
already is busy.

Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-24 17:00:10 -05:00
Tom Lane
4f15759bdc Don't ask for bug reports about pthread_is_threaded_np() != 0.
We thought that this condition was unreachable in ExitPostmaster,
but actually it's possible if you have both a misconfigured locale
setting and some other mistake that causes PostmasterMain to bail
out before reaching its own check of pthread_is_threaded_np().

Given the lack of other reports, let's not ask for bug reports if
this occurs; instead just give the same hint as in PostmasterMain.

Bug: #18783
Reported-by: anani191181515@gmail.com
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/18783-d1873b95a59b9103@postgresql.org
Discussion: https://postgr.es/m/206317.1737656533@sss.pgh.pa.us
Backpatch-through: 13
2025-01-23 14:23:04 -05:00
Andres Freund
28e7a9968e postmaster: Rename some shutdown related PMState phase names
The previous names weren't particularly clear. Future patches will add more
shutdown phases, making it even more important to have understandable shutdown
phases.

Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/d2cd8fd3-396a-4390-8f0b-74be65e72899@iki.fi
2025-01-10 11:43:00 -05:00
Andres Freund
e84712c738 postmaster: Make btmask_add() variadic
Suggested-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/d2cd8fd3-396a-4390-8f0b-74be65e72899@iki.fi
2025-01-10 11:43:00 -05:00
Andres Freund
7e957cbb50 postmaster: Introduce variadic btmask_all_except()
Upcoming patches would otherwise need btmask_all_except3().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/w3z6w3g4aovivs735nk4pzjhmegntecesm3kktpebchegm5o53@aonnq2kn27xi
2025-01-10 11:43:00 -05:00
Andres Freund
40d4031abd postmaster: Improve logging of signals sent by postmaster
Previously many, in some cases important, signals we never logged. In other
cases the signal name was only included numerically.

As part of this, change the debug log level the signal is logged at to DEBUG3,
previously some where DEBUG2, some DEBUG4.

Also move from direct use of kill() to signal the av launcher to
signal_child(). There doesn't seem to be a reason for directly using kill().

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-10 11:43:00 -05:00
Andres Freund
7148cbbdc6 postmaster: Update pmState via a wrapper function
This makes logging of state changes easier - state transitions are now logged
at DEBUG1. Without that logging it was surprisingly hard to understand the
current state of the system while debugging.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Reviewed-by: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/kgng5nrvnlv335evmsuvpnh354rw7qyazl73kdysev2cr2v5zu@m3cfzxicm5kp
2025-01-10 11:42:56 -05:00
Nathan Bossart
c758119e5b Allow changing autovacuum_max_workers without restarting.
This commit introduces a new parameter named
autovacuum_worker_slots that controls how many autovacuum worker
slots to reserve during server startup.  Modifying this new
parameter's value does require a server restart, but it should
typically be set to the upper bound of what you might realistically
need to set autovacuum_max_workers.  With that new parameter in
place, autovacuum_max_workers can now be changed with a SIGHUP
(e.g., pg_ctl reload).

If autovacuum_max_workers is set higher than
autovacuum_worker_slots, a WARNING is emitted, and the server will
only start up to autovacuum_worker_slots workers at a given time.
If autovacuum_max_workers is set to a value less than the number of
currently-running autovacuum workers, the existing workers will
continue running, but no new workers will be started until the
number of running autovacuum workers drops below
autovacuum_max_workers.

Reviewed-by: Sami Imseih, Justin Pryzby, Robert Haas, Andres Freund, Yogesh Sharma
Discussion: https://postgr.es/m/20240410212344.GA1824549%40nathanxps13
2025-01-06 15:01:22 -06:00
David Rowley
11012c5037 Fix an assortment of spelling mistakes and typos
Author: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/5812a0b9-b0cf-4151-9a14-d9f00e4f2858@gmail.com
2025-01-02 12:42:01 +13:00
Bruce Momjian
50e6eb731d Update copyright for 2025
Backpatch-through: 13
2025-01-01 11:21:55 -05:00
Tom Lane
34486b6092 Exclude parallel workers from connection privilege/limit checks.
Cause parallel workers to not check datallowconn, rolcanlogin, and
ACL_CONNECT privileges.  The leader already checked these things
(except for rolcanlogin which might have been checked for a different
role).  Re-checking can accomplish little except to induce unexpected
failures in applications that might not even be aware that their query
has been parallelized.  We already had the principle that parallel
workers rely on their leader to pass a valid set of authorization
information, so this change just extends that a bit further.

Also, modify the ReservedConnections, datconnlimit and rolconnlimit
logic so that these limits are only enforced against regular backends,
and only regular backends are counted while checking if the limits
were already reached.  Previously, background processes that had an
assigned database or role were subject to these limits (with rather
random exclusions for autovac workers and walsenders), and the set of
existing processes that counted against each limit was quite haphazard
as well.  The point of these limits, AFAICS, is to ensure the
availability of PGPROC slots for regular backends.  Since all other
types of processes have their own separate pools of PGPROC slots, it
makes no sense either to enforce these limits against them or to count
them while enforcing the limit.

While edge-case failures of these sorts have been possible for a
long time, the problem got a good deal worse with commit 5a2fed911
(CVE-2024-10978), which caused parallel workers to make some of these
checks using the leader's current role where before we had used its
AuthenticatedUserId, thus allowing parallel queries to fail after
SET ROLE.  The previous behavior was fairly accidental and I have
no desire to return to it.

This patch includes reverting 73c9f91a1, which was an emergency hack
to suppress these same checks in some cases.  It wasn't complete,
as shown by a recent bug report from Laurenz Albe.  We can also revert
fd4d93d26 and 492217301, which hacked around the same problems in one
regression test.

In passing, remove the special case for autovac workers in
CheckMyDatabase; it seems cleaner to have AutoVacWorkerMain pass
the INIT_PG_OVERRIDE_ALLOW_CONNS flag, now that that does what's
needed.

Like 5a2fed911, back-patch to supported branches (which sadly no
longer includes v12).

Discussion: https://postgr.es/m/1808397.1735156190@sss.pgh.pa.us
2024-12-28 16:08:50 -05:00
Heikki Linnakangas
952365cded Remove unnecessary GetTransactionSnapshot() calls
In get_database_list() and get_subscription_list(), the
GetTransactionSnapshot() call is not required because the catalog
table scans use the catalog snapshot, which is held until the end of
the scan. See table_beginscan_catalog(), which calls
RegisterSnapshot(GetCatalogSnapshot(relid)).

In InitPostgres, it's a little less obvious that it's not required,
but still true I believe. All the catalog lookups in InitPostgres()
also use the catalog snapshot, and the looked up values are copied
while still holding the snapshot.

Furthermore, as the removed FIXME comments said, calling
GetTransactionSnapshot() didn't really prevent MyProc->xmin from being
reset anyway.

Discussion: https://www.postgresql.org/message-id/7c56f180-b9e1-481e-8c1d-efa63de3ecbb@iki.fi
2024-12-23 12:42:39 +02:00
Tom Lane
c91963da13 Set the stack_base_ptr in main(), not in random other places.
Previously we did this in PostmasterMain() and InitPostmasterChild(),
which meant that stack depth checking was disabled in non-postmaster
server processes, for instance in single-user mode.  That seems like
a fairly bad idea, since there's no a-priori restriction on the
complexity of queries we will run in single-user mode.  Moreover, this
led to not having quite the same stack depth limit in all processes,
which likely has no real-world effect but it offends my inner neatnik.
Setting the depth in main() guarantees that check_stack_depth() is
armed and has a consistent interpretation of stack depth in all forms
of server processes.

While at it, move the code associated with checking the stack depth
out of tcop/postgres.c (which was never a great home for it) into
a new file src/backend/utils/misc/stack_depth.c.

Discussion: https://postgr.es/m/2081982.1734393311@sss.pgh.pa.us
2024-12-17 12:08:42 -05:00
Thomas Munro
7bc9a8bdd2 Fix warnings about declaration of environ on MinGW.
POSIX says that the global variable environ shouldn't be declared in a
header, and that you have to declare it yourself.  MinGW declares it in
<stdlib.h> with some macrology that messes up our declarations.  Visual
Studio doesn't warn (there are clues that it may also declare it, but if
so, apparently compatibly).  Suppress our declarations, on MinGW only.

This clears the last warnings on CI's optional MinGW task, and hopefully
on build farm animal fairywren too.

Like 1319997d, no back-patch for now as it's not known to be breaking
anything, and my humble goal is just to keep the MinGW build clean going
forward.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJLMh%2B6W5E4M_jSFb43gnrA_-Q6-%2BBf3HkBXyGfRFcBsQ%40mail.gmail.com
2024-12-15 00:41:27 +13:00
Noah Misch
8b9cbf4922 Fix elog(FATAL) before PostmasterMain() or just after fork().
Since commit 97550c0711, these failed with
"PANIC:  proc_exit() called in child process" due to uninitialized or
stale MyProcPid.  That was reachable if close() failed in
ClosePostmasterPorts() or setlocale(category, "C") failed, both
unlikely.  Back-patch to v13 (all supported versions).

Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
2024-12-10 13:51:59 -08:00
Nathan Bossart
76fd342496 Provide a better error message for misplaced dispatch options.
Before this patch, misplacing a special must-be-first option for
dispatching to a subprogram (e.g., postgres -D . --single) would
fail with an error like

	FATAL:  --single requires a value

This patch adjusts this error to more accurately complain that the
special option wasn't listed first.  The aforementioned error
message now looks like

	FATAL:  --single must be first argument

The dispatch option parsing code has been refactored for use
wherever ParseLongOption() is called.  Beyond the obvious advantage
of avoiding code duplication, this should prevent similar problems
when new dispatch options are added.  Note that we assume that none
of the dispatch option names match another valid command-line
argument, such as the name of a configuration parameter.

Ideally, we'd remove this must-be-first requirement for these
options, but after some investigation, we decided that wasn't worth
the added complexity and behavior changes.

Author: Nathan Bossart, Greg Sabino Mullane
Reviewed-by: Greg Sabino Mullane, Peter Eisentraut, Álvaro Herrera, Tom Lane
Discussion: https://postgr.es/m/CAKAnmmJkZtZAiSryho%3DgYpbvC7H-HNjEDAh16F3SoC9LPu8rqQ%40mail.gmail.com
2024-12-04 15:04:15 -06:00
Andres Freund
6a5bcf7f7d postmaster: Reduce verbosity of environment dump debug message
Emitting each variable separately is unnecessarily verbose / hard to skim
over. Emit the whole thing in one ereport() to address that.

Also remove program name and function reference from the message. The former
doesn't seem particularly helpful and the latter is provided by the elog.c
infrastructure these days.

Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/leouteo5ozcrux3fepuhtbp6c56tbfd4naxeokidbx7m75cabz@hhw6g4urlowt
2024-11-27 11:17:23 -05:00
Heikki Linnakangas
5b00786857 Pass MyPMChildSlot as an explicit argument to child process
All the other global variables passed from postmaster to child have
the same value in all the processes, while MyPMChildSlot is more like
a parameter to each child process.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:32 +02:00
Heikki Linnakangas
a78af04270 Assign a child slot to every postmaster child process
Previously, only backends, autovacuum workers, and background workers
had an entry in the PMChildFlags array. With this commit, all
postmaster child processes, including all the aux processes, have an
entry. Dead-end backends still don't get an entry, though, and other
processes that don't touch shared memory will never mark their
PMChildFlags entry as active.

We now maintain separate freelists for different kinds of child
processes. That ensures that there are always slots available for
autovacuum and background workers. Previously, pre-authentication
backends could prevent autovacuum or background workers from starting
up, by using up all the slots.

The code to manage the slots in the postmaster process is in a new
pmchild.c source file. Because postmaster.c is just so large.
Assigning pmsignal slot numbers is now pmchild.c's responsibility.
This replaces the PMChildInUse array in pmsignal.c.

Some of the comments in postmaster.c still talked about the "stats
process", but that was removed in commit 5891c7a8ed. Fix those while
we're at it.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:28 +02:00
Heikki Linnakangas
bb861414fe Kill dead-end children when there's nothing else left
Previously, the postmaster would never try to kill dead-end child
processes, even if there were no other processes left. A dead-end
backend will eventually exit, when authentication_timeout expires, but
if a dead-end backend is the only thing that's preventing the server
from shutting down, it seems better to kill it immediately. It's
particularly important, if there was a bug in the early startup code
that prevented a dead-end child from timing out and exiting normally.

Includes a test for that case where a dead-end backend previously
prevented the server from shutting down.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:12:04 +02:00
Heikki Linnakangas
18d67a8d7d Replace postmaster.c's own backend type codes with BackendType
Introduce a separate BackendType for dead-end children, so that we
don't need a separate dead_end flag.

Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/a102f15f-eac4-4ff2-af02-f9ff209ec66f@iki.fi
2024-11-14 16:06:16 +02:00
Heikki Linnakangas
368d8270c8 Rename two functions that wake up other processes
Instead of talking about setting latches, which is a pretty low-level
mechanism, emphasize that they wake up other processes.

This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:24 +02:00
Heikki Linnakangas
a9c546a5a3 Use ProcNumbers instead of direct Latch pointers to address other procs
This is in preparation for replacing Latches with a new abstraction.
That's still work in progress, but this seems a little tidier anyway,
so let's get this refactoring out of the way already.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c%40iki.fi
2024-11-01 13:47:20 +02:00
Daniel Gustafsson
fb7e27abfb Remove duplicate words in comments
A few comments contained duplicate "the" in sentences, fix by removing
one occurrence.

Author: Vignesh C <vignesh21@gmail.com>
Discussion: https://postgr.es/m/CALDaNm2aEEiPwGJmPdzBxROVvs8n75yCjKz4K1f1B2TdWpzxTA@mail.gmail.com
2024-10-31 11:38:03 +01:00
Peter Eisentraut
e18512c000 Remove unused #include's from backend .c files
as determined by IWYU

These are mostly issues that are new since commit dbbca2cf29.

Discussion: https://www.postgresql.org/message-id/flat/0df1d5b1-8ca8-4f84-93be-121081bde049%40eisentraut.org
2024-10-27 08:26:50 +01:00
Heikki Linnakangas
f9ecb57a50 Clean up WaitLatch calls that passed latch without WL_LATCH_SET
The 'latch' argument is ignored if WL_LATCH_SET is not given. Clarify
these calls by not pointlessly passing MyLatch.

Discussion: https://www.postgresql.org/message-id/391abe21-413e-4d91-a650-b663af49500c@iki.fi
2024-10-05 15:31:06 +03:00