Previously the instrumentation logic always converted to seconds, only for
many of the callers to do unnecessary division to get to milliseconds. As an
upcoming refactoring will split the Instrumentation struct, utilize instrtime
always to keep things simpler. It's also a bit faster to not have to first
convert to a double in functions like InstrEndLoop(), InstrAggNode().
Author: Lukas Fittl <lukas@fittl.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/CAP53PkzZ3UotnRrrnXWAv=F4avRq9MQ8zU+bxoN9tpovEu6fGQ@mail.gmail.com
This formerly said "unique constraint must ...", which was accurate
enough when it only applied to UNIQUE and PRIMARY KEY constraints.
However, now we use it for exclusion constraints too, and in that
case it's a tad confusing. Do what we already did in the errdetail
message: print the constraint_type, so that it looks like "UNIQUE
constraint ...", "EXCLUDE constraint ...", etc.
Author: jian he <jian.universality@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CACJufxH6VhAf65Vghg4T2q315gY=Rt4BUfMyunkfRj0n2S9n-g@mail.gmail.com
Since commit bd15b7db48, pg_dump uses pg_get_sequence_data() (née
pg_sequence_read_tuple()) to gather all sequence data in a single
query as opposed to a query per sequence. Two related bugs have
been identified:
* If the user lacks appropriate privileges on the sequence, pg_dump
generates a setval() command with garbage values instead of
failing as expected.
* pg_dump can fail due to a concurrently dropped sequence, even if
the dropped sequence's data isn't part of the dump.
This commit fixes the above issues by 1) teaching
pg_get_sequence_data() to return nulls instead of erroring for a
missing sequence and 2) teaching pg_dump to fail if it tries to
dump the data of a sequence for which pg_get_sequence_data()
returned nulls. Note that pg_dump may still fail due to a
concurrently dropped sequence, but it should now only do so when
the sequence data is part of the dump. This matches the behavior
before commit bd15b7db48.
Bug: #19365
Reported-by: Paveł Tyślacki <pavel.tyslacki@gmail.com>
Suggested-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/19365-6245240d8b926327%40postgresql.org
Discussion: https://postgr.es/m/2885944.1767029161%40sss.pgh.pa.us
Backpatch-through: 18
This is important for Postgres extensions that are written in C++,
such as pg_duckdb, which uses PGXS as the build system currently. In
the autotools build, C++ is not coupled to LLVM. If the autotools
build is configured without --with-llvm, the C++ compiler and the
various flags get persisted into the Makefile.global.
Author: Tristan Partin <tristan@partin.io>
Author: Jelte Fennema-Nio <postgres@jeltef.nl>
Discussion: https://www.postgresql.org/message-id/flat/D98JHQF7H2A8.VSE3I4CJBTAB%40partin.io
When creating a partition for a RANGE partitioned table, the reporting
of errors relating to converting the specified range values into
constant values for the partition key's type could display the name of a
previous partition key column when an earlier range was specified as
MINVALUE or MAXVALUE.
This was caused by the code not correctly incrementing the index that
tracks which partition key the foreach loop was working on after
processing MINVALUE/MAXVALUE ranges.
Fix by using foreach_current_index() to ensure the index variable is
always set to the List element being worked on.
Author: myzhen <zhenmingyang@yeah.net>
Reviewed-by: zhibin wang <killerwzb@gmail.com>
Discussion: https://postgr.es/m/273cab52.978.19b96fc75e7.Coremail.zhenmingyang@yeah.net
Backpatch-through: 14
This patch completes the transition to making inet_ops be default for
inet/cidr columns, rather than btree_gist's opclasses. Once we do
that, though, pg_upgrade has a big problem. A dump from an older
version will see btree_gist's opclasses as being default, so it will
not mention the opclass explicitly in CREATE INDEX commands, which
would cause the restore to create the indexes using inet_ops. Since
that's not compatible with what's actually in the files, havoc would
ensue.
This isn't readily fixable, because the CREATE INDEX command strings
are built by the older server's pg_get_indexdef() function; pg_dump
hasn't nearly enough knowledge to modify those strings successfully.
Even if we cared to put in the work to make that happen in pg_dump,
it would be counterproductive because the end goal here is to get
people off of these opclasses. Allowing such indexes to persist
through pg_upgrade wouldn't advance that goal.
Therefore, this patch just adds code to pg_upgrade to detect indexes
that would be problematic and refuse to upgrade.
There's another issue too: even without any indexes to worry about,
pg_dump in binary-upgrade mode will reproduce the "CREATE OPERATOR
CLASS ... DEFAULT" commands for btree_gist's opclasses, and those
will fail because now we have a built-in opclass that provides a
conflicting default. We could ask users to drop the btree_gist
extension altogether before upgrading, but that would carry very
severe penalties. It would affect perfectly-valid indexes for other
data types, and it would drop operators that might be relied on in
views or other database objects. Instead, put a hack in DefineOpClass
to ignore the DEFAULT clauses for these opclasses when in
binary-upgrade mode. This will result in installing a version of
btree_gist that isn't quite the version it claims to be, but that can
be fixed by issuing ALTER EXTENSION UPDATE afterwards.
Since we don't apply that hack when not in binary-upgrade mode,
it is now impossible to install any version of btree_gist
less than 1.9 via CREATE EXTENSION.
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Matthias van de Meent <boekewurm+postgres@gmail.com>
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/2483812.1754072263@sss.pgh.pa.us
The only user-visible change is the fix in the "malformed
pg_dependencies" error detail. That one is new in commit e1405aa5e3,
so no backpatching required.
When repurposing a standby to a logical replica, pg_createsubscriber
uses for the new replica a set of configuration parameters saved into
postgresql.auto.conf, to force recovery patterns when the physical
replica is promoted.
While not wrong in practice, this approach can cause issues when forcing
again recovery on a logical replica or its base backup as the recovery
parameters are not reset on the target server once pg_createsubscriber
is done with the node.
This commit aims at improving the situation, by changing the way
recovery parameters are saved on the target node. Instead of writing
all the configuration to postgresql.auto.conf, this file now uses an
include_if_exists, that points to a pg_createsubscriber.conf. This new
file contains all the recovery configuration, and is renamed to
pg_createsubscriber.conf.disabled when pg_createsubscriber exits. This
approach resets the recovery parameters, and offers the benefit to keep
a trace of the setup used when the target node got promoted, for
debugging purposes. If pg_createsubscriber.conf cannot be renamed
(unlikely scenario), a warning is issued to inform users that a manual
intervention may be required to reset this configuration.
This commit includes a test case to demonstrate the problematic case: a
standby node created from a base backup of what was the target node of
pg_createsubscriber does not get confused when started. If removing
this new logic, the test fails with the standby not able to start due
to an incorrect recovery target setup, where the startup process fails
quickly with a FATAL.
I have provided the design idea for the patch, that Alyona has written
(with some code adjustments from me). This could be considered as a
bug, but after discussion this is put into the bucket for improvements.
Redesigning pg_createsubscriber would not be acceptable in the stable
branches anyway.
Author: Alyona Vinter <dlaaren8@gmail.com>
Reviewed-by: Ilyasov Ian <ianilyasov@outlook.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Andrey Rudometov <unlimitedhikari@gmail.com>
Discussion: https://postgr.es/m/CAGWv16K6L6Pzm99i1KiXLjFWx2bUS3DVsR6yV87-YR9QO7xb3A@mail.gmail.com
During catcache searches, the most-recently searched entries are kept at
the head of the list to speed up subsequent searches, keeping the
"freshest" entries at its beginning. A rehash of the catcache was doing
the opposite: fresh entries were moved to the tail of the newly-created
buckets, causing a rehash to slow down a bit.
When a rehash is done, this commit switches the code to use
dlist_push_tail() instead of dlist_push_head(), so as fresh entries are
kept at the head of the lists, not their tail.
Author: ChangAo Chen <cca5507@qq.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/tencent_9EA10D8512B5FE29E7323F780A0749768708@qq.com
This new identifier type provides support for 64-bit unsigned values,
to be used in catalogs, like OIDs. An advantage of a new data type is
that it becomes easier to grep for it in the code when assigning this
type to a catalog attribute, linking it to dedicated APIs and internal
structures.
The following operators are added in this commit, with dedicated tests:
- Casts with integer types and OID.
- btree and hash operators
- min/max functions.
- C type with related macros and defines, named around "Oid8".
This has been mentioned as useful on its own on the thread to add
support for 64-bit TOAST values, so as it becomes possible to attach
this data type to the TOAST code and catalog definitions. However, as
this concept can apply to many more areas, it is implemented as its own
independent change. This is based on a discussion with Andres Freund
and Tom Lane.
Bump catalog version.
Author: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Greg Burd <greg@burd.me>
Reviewed-by: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com>
Discussion: https://postgr.es/m/1891064.1754681536@sss.pgh.pa.us
In a7f107df2 I changed subplan param evaluation to happen within the
containing expression. As part of that, ExecInitSubPlanExpr() was changed to
evaluate parameters via a new EEOP_PARAM_SET expression step. These parameters
were temporarily stored into ExprState->resvalue/resnull, with some reasoning
why that would be fine. Unfortunately, that analysis was wrong -
ExecInitSubscriptionRef() evaluates the input array into "resv"/"resnull",
which will often point to ExprState->resvalue/resnull. This means that the
EEOP_PARAM_SET, if inside an array subscript, would overwrite the input array
to array subscript.
The fix is fairly simple - instead of evaluating into
ExprState->resvalue/resnull, store the temporary result of the subplan in the
subplan's return value.
Bug: #19370
Reported-by: Zepeng Zhang <redraiment@gmail.com>
Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us>
Diagnosed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/19370-7fb7a5854b7618f1@postgresql.org
Backpatch-through: 18
The new test 002_worker_terminate relies on the generation of a LOG
entry to check that a worker has been started, but missed the fact that
a node set with log_error_verbosity = verbose would add an error code.
The regexp used for the matching check did not take this case into
account, making the test fail on a timeout. The regexp is now fixed to
handle the verbose case correctly.
Per buildfarm member prion, that uses log_error_verbosity = verbose.
The error was reproducible by setting this GUC the same way in the test.
Oversight in f1e251be80.
Background workers gain a new flag, called BGWORKER_INTERRUPTIBLE, that
offers the possibility to terminate the workers when these are connected
to a database that is involved in one of the following commands:
ALTER DATABASE RENAME TO
ALTER DATABASE SET TABLESPACE
CREATE DATABASE
DROP DATABASE
This is useful to give background workers the same behavior as backends
and autovacuum workers, which are stopped when these commands are
executed. The default behavior, that exists since 9.3, is still to
never terminate bgworkers connected to the database involved in any of
these commands. The new flag has to be set to terminate the workers.
A couple of tests are added to worker_spi to track the commands that
impact the termination of the workers. There is a test case for a
non-interruptible worker, additionally, that relies on an injection
point to make the wait time in CountOtherDBBackends() reduced from 5s to
0.3s for faster test runs. The tests rely on the contents of the server
logs to check if a worker has been started or terminated:
- LOG generated by worker_spi_main() at startup, once connection to
database is done.
- FATAL in bgworker_die() when terminated.
A couple of tests run in the CI have showed that this method is stable
enough. The safe_psql() calls that scan pg_stat_activity could be
replaced with some poll_query_until() for more stability, if the current
method proves to be an issue in the buildfarm.
Author: Aya Iwata <iwata.aya@fujitsu.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Peter Smith <smithpb2250@gmail.com>
Discussion: https://postgr.es/m/OS7PR01MB11964335F36BE41021B62EAE8EAE4A@OS7PR01MB11964.jpnprd01.prod.outlook.com
When processing the "publish" options of an ALTER PUBLICATION command,
we call SplitIdentifierString() to split the options into a List of
strings. Since SplitIdentifierString() modifies the delimiter
character and puts NULs in their place, this would overwrite the memory
of the AlterPublicationStmt. Later in AlterPublicationOptions(), the
modified AlterPublicationStmt is copied for event triggers, which would
result in the event trigger only seeing the first "publish" option
rather than all options that were specified in the command.
To fix this, make a copy of the string before passing to
SplitIdentifierString().
Here we also adjust a similar case in the pgoutput plugin. There's no
known issues caused by SplitIdentifierString() here, so this is being
done out of paranoia.
Thanks to Henson Choi for putting together an example case showing the
ALTER PUBLICATION issue.
Author: sunil s <sunilfeb26@gmail.com>
Reviewed-by: Henson Choi <assam258@gmail.com>
Reviewed-by: zengman <zengman@halodbtech.com>
Backpatch-through: 14
Prior to v15, GUC settings supplied in the CONNECTION clause of
CREATE SUBSCRIPTION were correctly passed through to
the publisher's walsender. For example:
CREATE SUBSCRIPTION mysub
CONNECTION 'options=''-c wal_sender_timeout=1000'''
PUBLICATION ...
would cause wal_sender_timeout to take effect on the publisher's walsender.
However, commit f3d4019da5 changed the way logical replication
connections are established, forcing the publisher's relevant
GUC settings (datestyle, intervalstyle, extra_float_digits) to
override those provided in the CONNECTION string. As a result,
from v15 through v18, GUC settings in the CONNECTION string were
always ignored.
This regression prevented per-connection tuning of logical replication.
For example, using a shorter timeout for walsender connecting
to a nearby subscriber and a longer one for walsender connecting
to a remote subscriber.
This commit restores the intended behavior by ensuring that
GUC settings in the CONNECTION string are again passed through
and applied by the walsender, allowing per-connection configuration.
Backpatch to v15, where the regression was introduced.
Author: Fujii Masao <masao.fujii@gmail.com>
Reviewed-by: Chao Li <lic@highgo.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Japin Li <japinli@hotmail.com>
Discussion: https://postgr.es/m/CAHGQGwGYV+-abbKwdrM2UHUe-JYOFWmsrs6=QicyJO-j+-Widw@mail.gmail.com
Backpatch-through: 15
It's been almost a year since we last did this, and upstream has
been busy. They've added stemmers for Polish and Esperanto,
and also deprecated their old Dutch stemmer in favor of the
Kraaij-Pohlmann algorithm. (The "dutch" stemmer is now the
latter, and "dutch_porter" is the old algorithm.)
Upstream also decided to rename their internal header "header.h"
to something less generic: "snowball_runtime.h". Seems like a good
thing, but it complicates this patch a bit because we were relying on
interposing our own version of "header.h" to control system header
inclusion order. (We're partially failing at that now, because now the
generated stemmer files include <stddef.h> before snowball_runtime.h.
I think that'll be okay, but if the buildfarm complains then we'll
have to do more-extensive editing of the generated files.)
I realized that we weren't documenting the available stemmers in
any user-visible place, except indirectly through sample \dFd output.
That's incomplete because we only provide built-in dictionaries for
the recommended stemmers for each language, not alternative stemmers
such as dutch_porter. So I added a list to the documentation.
I did not do anything with the stopword lists. If those are still
available from snowballstem.org, they are mighty well hidden.
Discussion: https://postgr.es/m/1185975.1767569534@sss.pgh.pa.us
When the standby is passed as a PostgreSQL::Test::Cluster instance,
use the WAIT FOR LSN command on the standby server to implement
wait_for_catchup() for replay, write, and flush modes. This is more
efficient than polling pg_stat_replication on the upstream, as the
WAIT FOR LSN command uses a latch-based wakeup mechanism.
The optimization applies when:
- The standby is passed as a Cluster object (not just a name string)
- The mode is 'replay', 'write', or 'flush' (not 'sent')
- The standby is in recovery
For 'sent' mode, when the standby is passed as a string (e.g., a
subscription name for logical replication), or when the standby has
been promoted, the function falls back to the original polling-based
approach using pg_stat_replication on the upstream.
Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Update psql tab completion to support the optional MODE option in the
WAIT FOR LSN command. After specifying an LSN value, completion now offers
both MODE and WITH keywords. The MODE option specifies which LSN type to wait
for. In particular, it controls whether the wait is evaluated from the
standby or primary perspective.
When MODE is specified, the completion suggests the valid mode values:
standby_replay, standby_write, standby_flush, and primary_flush.
Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
This commit extends the WAIT FOR LSN command with an optional MODE option in
the WITH clause that specifies which LSN type to wait for:
WAIT FOR LSN '<lsn>' [WITH (MODE '<mode>', ...)]
where mode can be:
- 'standby_replay' (default): Wait for WAL to be replayed to the specified
LSN,
- 'standby_write': Wait for WAL to be written (received) to the specified
LSN,
- 'standby_flush': Wait for WAL to be flushed to disk at the specified LSN,
- 'primary_flush': Wait for WAL to be flushed to disk on the primary server.
The default mode is 'standby_replay', matching the original behavior when MODE
is not specified. This follows the pattern used by COPY and EXPLAIN
commands, where options are specified as string values in the WITH clause.
Modes are explicitly named to distinguish between primary and standby
operations:
- Standby modes ('standby_replay', 'standby_write', 'standby_flush') can only
be used during recovery (on a standby server),
- Primary mode ('primary_flush') can only be used on a primary server.
The 'standby_write' and 'standby_flush' modes are useful for scenarios where
applications need to ensure WAL has been received or persisted on the standby
without necessarily waiting for replay to complete. The 'primary_flush' mode
allows waiting for WAL to be flushed on the primary server.
This commit also includes includes:
- Documentation updates for the new syntax and mode descriptions,
- Test coverage for all four modes, including error cases and concurrent
waiters,
- Wakeup logic in walreceiver for standby write/flush waiters,
- Wakeup logic in WAL writer for primary flush waiters.
Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>
Add support for waiting on WAL write and flush LSNs in addition to the
existing replay LSN wait type. This provides the foundation for
extending the WAIT FOR command with MODE parameter.
Key changes are following.
- Add WAIT_LSN_TYPE_STANDBY_WRITE and WAIT_LSN_TYPE_STANDBY_FLUSH to
WaitLSNType.
- Add GetCurrentLSNForWaitType() to retrieve the current LSN for each wait
type.
- Add new wait events WAIT_EVENT_WAIT_FOR_WAL_WRITE and
WAIT_EVENT_WAIT_FOR_WAL_FLUSH for pg_stat_activity visibility.
- Update WaitForLSN() to use GetCurrentLSNForWaitType() internally.
Discussion: https://postgr.es/m/CABPTF7UiArgW-sXj9CNwRzUhYOQrevLzkYcgBydmX5oDes1sjg%40mail.gmail.com
Author: Xuneng Zhou <xunengzhou@gmail.com>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Chao Li <li.evan.chao@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@kurilemu.de>