The callback functions ReplaceVarsFromTargetList_callback and
pullup_replace_vars_callback are both used to replace Vars in an
expression tree that reference a particular RTE with items from a
targetlist, and they both need to expand whole-tuple references and
deal with OLD/NEW RETURNING list Vars. As a result, currently there
is significant code duplication between these two functions.
This patch introduces a new function, ReplaceVarFromTargetList, to
perform the replacement and calls it from both callback functions,
thereby eliminating code duplication.
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWhr=FM4X5kCPvVs-g2XEk+ceLsNtBK_zZMkqFn9vUjsw@mail.gmail.com
Commit 83ea6c540 added support for virtual generated columns that are
computed on read. All Var nodes in the query that reference virtual
generated columns must be replaced with the corresponding generation
expressions. Currently, this replacement occurs in the rewriter.
However, this approach has several issues. If a Var referencing a
virtual generated column has any varnullingrels, those varnullingrels
need to be propagated into the generation expression. Failing to do
so can lead to "wrong varnullingrels" errors and improper outer-join
removal.
Additionally, if such a Var comes from the nullable side of an outer
join, we may need to wrap the generation expression in a
PlaceHolderVar to ensure that it is evaluated at the right place and
hence is forced to null when the outer join should do so. In certain
cases, such as when the query uses grouping sets, we also need a
PlaceHolderVar for anything that is not a simple Var to isolate
subexpressions. Failure to do so can result in incorrect results.
To fix these issues, this patch expands the virtual generated columns
in the planner rather than in the rewriter, and leverages the
pullup_replace_vars architecture to avoid code duplication. The
generation expressions will be correctly marked with nullingrel bits
and wrapped in PlaceHolderVars when needed by the pullup_replace_vars
callback function. This requires handling the OLD/NEW RETURNING list
Vars in pullup_replace_vars_callback, as it may now deal with Vars
referencing the result relation instead of a subquery.
The "wrong varnullingrels" error was reported by Alexander Lakhin.
The incorrect result issue and the improper outer-join removal issue
were reported by Richard Guo.
Author: Richard Guo <guofenglinux@gmail.com>
Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Discussion: https://postgr.es/m/75eb1a6f-d59f-42e6-8a78-124ee808cda7@gmail.com
This commit adds %P to psql prompts, able to report the status of a
pipeline depending on PQpipelineStatus(): on, off or abort.
The following variables are added to report the state of an ongoing
pipeline:
- PIPELINE_SYNC_COUNT: reports the number of piped syncs.
- PIPELINE_COMMAND_COUNT: reports the number of piped commands, a
command being either \bind, \bind_named, \close or \parse.
- PIPELINE_RESULT_COUNT: reports the results available to read with
\getresults.
These variables can be used with \echo or in a prompt, using "%:name:"
in PROMPT1, PROMPT2 or PROMPT3. Some basic regression tests are added
for these. The suggestion to use variables to show the details about
the status counters comes from me. The original patch proposed was less
extensible, hardcoding the output in the prompt.
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
A non-leaf partition with a subplan that is an Append node was
omitted from PlannedStmt.unprunableRelids because it was mistakenly
included in PlannerGlobal.prunableRelids due to the way
PartitionedRelPruneInfo.leafpart_rti_map[] is constructed. This
happened when a non-leaf partition used an unflattened Append or
MergeAppend. As a result, ExecGetRangeTableRelation() reported an
error when called from CreatePartitionPruneState() to process the
partition's own PartitionPruneInfo, since it was treated as prunable
when it should not have been.
Reported-by: Alexander Lakhin <exclusion@gmail.com> (via sqlsmith)
Diagnosed-by: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/74839af6-aadc-4f60-ae77-ae65f94bf607@gmail.com
When a standby replays an XLOG_PARAMETER_CHANGE record that lowers
wal_level below logical, we invalidate all logical slots in hot
standby mode. However, if this record was replayed while not in hot
standby mode, logical slots could remain valid even after promotion,
potentially causing an assertion failure during WAL record decoding.
To fix this issue, this commit adds a check for hot_standby status
when restoring a logical replication slot on standbys. This check
ensures that logical slots are invalidated when they become
incompatible due to insufficient wal_level during recovery.
Backpatch to v16 where logical decoding on standby was introduced.
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/CAD21AoABoFwGY_Rh2aeE6tEq3HkJxf0c6UeOXn4VV9v6BAQPSw%40mail.gmail.com
Backpatch-through: 16
Our json parsing defined the macros OPTIONAL and REQUIRED to decorate the
structs with for increased readability. This however collides with macros
in the <windef.h> header on Windows.
../src/interfaces/libpq/fe-auth-oauth-curl.c:398:9: warning: "OPTIONAL" redefined
398 | #define OPTIONAL false
| ^~~~~~~~
In file included from D:/a/_temp/msys64/ucrt64/include/windef.h:9,
from D:/a/_temp/msys64/ucrt64/include/windows.h:69,
from D:/a/_temp/msys64/ucrt64/include/winsock2.h:23,
from ../src/include/port/win32_port.h:60,
from ../src/include/port.h:24,
from ../src/include/c.h:1331,
from ../src/include/postgres_fe.h:28,
from ../src/interfaces/libpq/fe-auth-oauth-curl.c:16:
include/minwindef.h:65:9: note: this is the location of the previous definition
65 | #define OPTIONAL
| ^~~~~~~~
Rename to avoid compilation errors in anticipation of implementing
support for Windows.
Reported-by: Dave Cramer (on PostgreSQL Hacking Discord)
Pages from the bitmap created by the TIDBitmap API can be exact or
lossy. The TIDBitmap API extracts the tuple offsets from exact pages
into an array for the convenience of the caller.
This was done in tbm_private|shared_iterate() right after advancing the
iterator. However, as long as tbm_private|shared_iterate() set a
reference to the PagetableEntry in the TBMIterateResult, the offset
extraction can be done later.
Waiting to extract the tuple offsets has a few benefits. For the shared
iterator case, it allows us to extract the offsets after dropping the
shared iterator state lock, reducing time spent holding a contended
lock.
Separating the iteration step and extracting the offsets later also
allows us to avoid extracting the offsets for prefetched blocks. Those
offsets were never used, so the overhead of extracting and storing them
was wasted.
The real motivation for this change, however, is that future commits
will make bitmap heap scan use the read stream API. This requires a
TBMIterateResult per issued block. By removing the array of tuple
offsets from the TBMIterateResult and only extracting the offsets when
they are used, we reduce the memory required for per buffer data
substantially.
Suggested-by: Thomas Munro <thomas.munro@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/CA%2BhUKGLHbKP3jwJ6_%2BhnGi37Pw3BD5j2amjV3oSk7j-KyCnY7Q%40mail.gmail.com
This comment mentions that we check that the configured number of
backends does not exceed MAX_BACKENDS in RegisterBackgroundWorker()
and relevant GUC check hooks, neither of which has those checks
anymore. To fix, adjust this comment to say that we do the check
in InitializeMaxBackends().
Oversights in commits 6bc8ef0b7f and 0b1fe1413e.
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/Z7zOEzz8lNjaU9yf%40nathan
Although they're exposed as int4 in pg_class, relpages and
relallvisible are really of type BlockNumber, that is uint32.
Correct type puns in relation_statistics_update() and remove
inappropriate range-checks. The type puns are only cosmetic
issues, but the range checks would cause failures with huge
relations.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Author: Corey Huinker <corey.huinker@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/614341.1740269035@sss.pgh.pa.us
Jacob reported that comments for LW_SHARED_MASK referenced a MAX_BACKENDS
limit of 2^23-1, but that MAX_BACKENDS is actually limited to 2^18-1. The
limit was lowered in 48354581a4, but the comment in lwlock.c wasn't updated.
Instead of just fixing the comment, it seems better to directly base the
lwlock defines on MAX_BACKENDS and add static assertions to ensure that there
is enough space. That way there's no comment that can go out of sync in the
future.
As part of that change I noticed that for some reason the high bit wasn't used
for flags, which seems somewhat odd. Redefine the flag values to start at the
highest bit.
Reported-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Reviewed-by: Jacob Brazeal <jacob.brazeal@gmail.com>
Discussion: https://postgr.es/m/CA+COZaBO_s3LfALq=b+HcBHFSOEGiApVjrRacCe4VP9m7CJsNQ@mail.gmail.com
MAX_BACKENDS influences many things besides postmaster. I e.g. noticed that we
don't have static assertions ensuring BUF_REFCOUNT_MASK is big enough for
MAX_BACKENDS, adding them would require including postmaster.h in
buf_internals.h which doesn't seem right.
While at that, add MAX_BACKENDS_BITS, as that's useful in various places for
static assertions (to be added in subsequent commits).
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/wptizm4qt6yikgm2pt52xzyv6ycmqiutloyvypvmagn7xvqkce@d4xuv3mylpg4
Due to misunderstanding on my part, commit 235328ee4 did not go far
enough to silence older versions of Valgrind. For those, it was the bit
scan that was problematic, not the subsequent bit-masking operation. To
fix, use the unaligned path for the trailing bytes. Since we don't have
a bit scan here anymore, also remove some comments and endian-specific
coding around that.
Reported-by: Anton A. Melnikov <a.melnikov@postgrespro.ru>
Discussion: https://postgr.es/m/f3aa2d45-3b28-41c5-9499-a1bc30e0f8ec@postgrespro.ru
Backpatch-through: 17
The four following attributes are removed from pg_stat_wal:
* wal_write
* wal_sync
* wal_write_time
* wal_sync_time
a051e71e28 has added an equivalent of this information in pg_stat_io
with more granularity as this now spreads across the backend types, IO
context and IO objects. So, keeping the same information in pg_stat_wal
has little benefits.
Another benefit of this commit is the removal of PendingWalStats,
simplifying an upcoming patch to add per-backend WAL statistics, which
already support IO statistics and which have access to the write/sync
stats data of WAL.
The GUC track_wal_io_timing, that was used to enable or disable the
aggregation of the write and sync timings for WAL, is also removed.
pgstat_prepare_io_time() is simplified.
Bump catalog version.
Bump PGSTAT_FILE_FORMAT_ID, due to the update of PgStat_WalStats.
Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com>
Discussion: https://postgr.es/m/Z7RkQ0EfYaqqjgz/@ip-10-97-1-34.eu-west-3.compute.internal
Our cross-version upgrade tests have been failing for some pre-v10
source versions since commit 1fd1bd871. This turns out to be
because relallvisible may change for tables that have hash indexes,
because the upgrade process forcibly reindexes such indexes to
deal with the changes made in v10.
Fortunately, the set of tables that have such indexes is small
and won't change anymore in those branches. So just hack up
AdjustUpgrade.pm to not compare the relallvisible values of
those specific tables.
While here, also tighten the regex that suppresses comparison
of version fields.
Discussion: https://postgr.es/m/812817.1740277228@sss.pgh.pa.us
This commit checks interactions with pipelines and implicit transaction
blocks for the following commands that have their own behaviors when
used in pipelines depending on their order in a pipeline and sync
requests:
- SET LOCAL
- REINDEX CONCURRENTLY
- VACUUM
- Subtransactions (SAVEPOINT, ROLLBACK TO SAVEPOINT)
These scenarios could be tested only with pgbench previously. The
meta-commands of psql controlling pipelines make these easier to
implement, debug, and they can be run in a SQL script.
Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com>
Discussion: https://postgr.es/m/CAO6_XqroE7JuMEm1sWz55rp9fAYX2JwmcP_3m_v51vnOFdsLiQ@mail.gmail.com
The 'cached-plan-inval' test suite, introduced in 525392d57 under
src/test/modules/delay_execution, aimed to verify that cached plan
invalidation triggers replanning after deferred locks are taken.
However, its ExecutorStart_hook-based approach relies on lock timing
assumptions that, in retrospect, are fragile. This instability was
exposed by failures on BF animal trilobite, which builds with
CLOBBER_CACHE_ALWAYS.
One option was to dynamically disable the cache behavior that causes
the test suite to fail by setting "debug_discard_caches = 0", but it
seems better to remove the suite. The risk of future failures due to
other cache flush hazards outweighs the benefit of catching real
breakage in the backend behavior it tests.
Reported-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2990641.1740117879@sss.pgh.pa.us
To implement AIO writes, the backend initiating writes needs to transfer the
lock ownership to the AIO subsystem, so the lock held during the write can be
released in another backend.
Other backends need to be able to "complete" an asynchronously started IO to
avoid deadlocks (consider e.g. one backend starting IO for a buffer and then
waiting for a heavyweight lock held by another relation followed by the
current holder of the heavyweight lock waiting for the IO to complete).
To that end, this commit adds LWLockDisown() and LWLockReleaseDisowned(). If
code uses LWLockDisown() it's the code's responsibility to ensure that the
lock is released in case of errors.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Discussion: https://postgr.es/m/1f6b50a7-38ef-4d87-8246-786d39f46ab9@iki.fi
Per the buildfarm, these tests appear to be unstable in the wake of
commit ddb17e387a. I'm not sure that
just hiding this output is the right way forward, because I think
there may be other test cases that will fail with lower probability
even after this fix. However, it's hard to tell right now, because
this is failing on a number of buildfarm animals. So let's try this
for now to either get a clearer picture of what else is broken, or
as a stopgap until we decide what the permanent fix should be, or
perhaps this will be the permanent fix after all.
Concurrently dropping either the granted role or the grantee
does not stop GRANT from completing, instead resulting in a
dangling role reference in pg_auth_members. That's relatively
harmless in the short run, but inconsistent catalog entries
are not a good thing.
This patch solves the problem by adding the granted and grantee
roles as explicit shared dependencies of the pg_auth_members entry.
That's a bit indirect, but it works because the pg_shdepend code
applies the necessary locking and rechecking.
Commit 6566133c5 previously established similar handling for
the grantor column of pg_auth_members; it's not clear why it
didn't cover the other two role OID columns.
A side-effect of this approach is that DROP OWNED BY will now drop
pg_auth_members entries that mention the target role as either the
granted or grantee role. That's clearly appropriate for the
grantee, since we'll drop its other privileges too. It doesn't
seem too far out of line for the granted role, since we're
presumably about to drop it and besides we're removing all reasons
why it'd matter to be a member of it. (One could argue that this
makes DropRole's code to auto-drop pg_auth_members entries
unnecessary, but I chose to leave it in place since perhaps some
people's workflows expect that to work without a DROP OWNED BY.)
Note to patch readers: CreateRole's first CommandCounterIncrement
call is now unconditional, because this change creates another
case in which it's needed, and it seemed to be more trouble than
it's worth to preserve that micro-optimization.
Arguably this is a bug fix, but the fact that it changes the
expected contents of pg_shdepend seems like not a great thing
to do in the stable branches, and perhaps we don't want the
change in DROP OWNED BY semantics there either. On the other
hand, I opted not to force a catversion bump in HEAD, because
the presence or absence of these entries doesn't matter for
most purposes.
Reported-by: Virender Singla <virender.cse@gmail.com>
Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
There is a race condition between "GRANT role" and "DROP ROLE",
which allows GRANT to install pg_auth_members entries that refer to
dropped roles. (Commit 6566133c5 prevented that for the grantor
field, but not for the granted or grantee roles.) We'll soon fix
that, at least in HEAD, but pg_dumpall needs to cope with the
situation in case of pre-existing inconsistency. As pg_dumpall
stands, it will emit invalid commands like 'GRANT foo TO ""',
which causes pg_upgrade to fail. Fix it to emit warnings and skip
those GRANTs, instead.
There was some discussion of removing the problem by changing
dumpRoleMembership's query to use JOIN not LEFT JOIN, but that
would result in silently ignoring such entries. It seems better
to produce a warning.
Pre-v16 branches already coped with dangling grantor OIDs by simply
omitting the GRANTED BY clause. I left that behavior as-is, although
it's somewhat inconsistent with the behavior of later branches.
Reported-by: Virender Singla <virender.cse@gmail.com>
Discussion: https://postgr.es/m/CAM6Zo8woa62ZFHtMKox6a4jb8qQ=w87R2L0K8347iE-juQL2EA@mail.gmail.com
Backpatch-through: 13
GIN and GiST indexes utilizing pg_trgm's opclasses store sorted
trigrams within index tuples. When comparing and sorting each trigram,
pg_trgm treats each character as a 'char[3]' type in C. However, the
char type in C can be interpreted as either signed char or unsigned
char, depending on the platform, if the signedness is not explicitly
specified. Consequently, during replication between different CPU
architectures, there was an issue where index scans on standby servers
could not locate matching index tuples due to the differing treatment
of character signedness.
This change introduces comparison functions for trgm that explicitly
handle signed char and unsigned char. The appropriate comparison
function will be dynamically selected based on the character
signedness stored in the control file. Therefore, upgraded clusters
can utilize the indexes without rebuilding, provided the cluster
upgrade occurs on platforms with the same character signedness as the
original cluster initialization.
The default char signedness information was introduced in 44fe30fdab,
so no backpatch.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
This change adds a new option --set-char-signedness to pg_upgrade. It
enables user to set arbitrary signedness during pg_upgrade. This helps
cases where user who knew they copied the v17 source cluster from
x86 (signedness=true) to ARM (signedness=false) can pg_upgrade
properly without the prerequisite of acquiring an x86 VM.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
Commit 44fe30fdab introduced the 'default_char_signedness' field in
controlfile. Newly created database clusters always set this field to
'signed'.
This change ensures that pg_upgrade updates the
'default_char_signedness' to 'unsigned' if the source database cluster
has signedness=false. For source clusters from v17 or earlier, which
lack the 'default_char_signedness' information, pg_upgrade assumes the
source cluster was initialized on the same platform where pg_upgrade
is running. It then sets the 'default_char_signedness' value according
to the current platform's default character signedness.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
With the newly added option --char-signedness, pg_resetwal updates the
default char signedness flag in the controlfile. This option is
primarily intended for an upcoming patch that pg_upgrade supports
preserving the default char signedness during upgrades, and is not
meant for manual operation.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
The signedness of the 'char' type in C is
implementation-dependent. For instance, 'signed char' is used by
default on x86 CPUs, while 'unsigned char' is used on aarch
CPUs. Previously, we accidentally let C implementation signedness
affect persistent data. This led to inconsistent results when
comparing char data across different platforms.
This commit introduces a new 'default_char_signedness' field in
ControlFileData to store the signedness of the 'char' type. While this
change does not encourage the use of 'char' without explicitly
specifying its signedness, this field can be used as a hint to ensure
consistent behavior for pre-v18 data files that store data sorted by
the 'char' type on disk (e.g., GIN and GiST indexes), especially in
cross-platform replication scenarios.
Newly created database clusters unconditionally set the default char
signedness to true. pg_upgrade (with an upcoming commit) changes this
flag for clusters if the source database cluster has
signedness=false. As a result, signedness=false setting will become
rare over time. If we had known about the problem during the last
development cycle that forced initdb (v8.3), we would have made all
clusters signed or all clusters unsigned. Making pg_upgrade the only
source of signedness=false will cause the population of database
clusters to converge toward that retrospective ideal.
Bump catalog version (for the catalog changes) and PG_CONTROL_VERSION
(for the additions in ControlFileData).
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://postgr.es/m/CB11ADBC-0C3F-4FE0-A678-666EE80CBB07%40amazon.com
Previously the portlock logic, added in 9b4eafcaf4, didn't actually work
properly when the tests were run via meson. 9b4eafcaf4 used the
MESON_BUILD_ROOT environment variable to determine the directory for the port
lock directory, but that's never set for running the tests. That meant that
each test used its own portlock dir, unless the PG_TEST_PORT_DIR environment
variable was set.
Fix the problem by setting top_builddir for the environment. That's also used
for the autoconf/make build.
Backpatch back to 16, where meson support was added.
Reported-by: Zharkov Roman <r.zharkov@postgrespro.ru>
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Backpatch-through: 16
Dumps from versions older than v16 do not know about NO INDENT in a
XMLSERIALIZE() clause. This commit adjusts AdjustUpgrade.pm so as NO
INDENT is discarded in the contents of the new dump adjusted for
comparison when the old version is v15 or older. This should be enough
to make the cross-version upgrade tests pass.
Per report from buildfarm member crake. Oversight in 984410b923.
Reviewed-by: Andrew Dunstan <andrew@dunslane.net>
Discussion: https://postgr.es/m/88b183f1-ebf9-4f51-9144-3704380ccae7@dunslane.net
Backpatch-through: 16
This allows using text position search functions with nondeterministic
collations. These functions are
- position, strpos
- replace
- split_part
- string_to_array
- string_to_table
which all use common internal infrastructure.
There was previously no internal implementation of this, so it was met
with a not-supported error. This adds the internal implementation and
removes the error.
Unlike with deterministic collations, the search cannot use any
byte-by-byte optimized techniques but has to go substring by
substring. We also need to consider that the found match could have a
different length than the needle and that there could be substrings of
different length matching at a position. In most cases, we need to
find the longest such substring (greedy semantics), but this can be
configured by each caller.
Reviewed-by: Euler Taveira <euler@eulerto.com>
Discussion: https://www.postgresql.org/message-id/flat/582b2613-0900-48ca-8b0d-340c06f4d400@eisentraut.org