1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-22 12:22:45 +03:00
Commit Graph

26279 Commits

Author SHA1 Message Date
Tom Lane
7a80e381d1 Skip useless calculation of join RTE column names during EXPLAIN.
There's no need for set_simple_column_names() to compute unique
column names for join RTEs, because a finished plan tree will
not contain any join alias Vars that we could need names for.
Its other, internal callers will not pass it any join RTEs
anyway, so the upshot is we can just skip join RTEs here.

Aside from getting rid of a klugy against-its-documentation use of
set_relation_column_names, this can speed up EXPLAIN substantially
when considering many-join queries, because the upper join RTEs
tend to have a lot of columns.

Sami Imseih, with cosmetic changes by me

Discussion: https://postgr.es/m/CAA5RZ0th3q-0p1pri58z9grG8r8azmEBa8o1rtkwhLmJg_cH+g@mail.gmail.com
2024-12-17 15:52:12 -05:00
Melanie Plageman
dc6acfd910 Count pages set all-visible and all-frozen in VM during vacuum
Heap vacuum already counts and logs pages with newly frozen tuples. Now
count and log the number of pages newly set all-visible and all-frozen
in the visibility map.

Pages that are all-visible but not all-frozen are debt for future
aggressive vacuums. The counts of newly all-visible and all-frozen pages
give us insight into the rate at which this debt is being accrued and
paid down.

Author: Melanie Plageman
Reviewed-by: Masahiko Sawada, Alastair Turner, Nitin Jadhav, Andres Freund, Bilal Yavuz, Tomas Vondra
Discussion: https://postgr.es/m/flat/CAAKRu_ZQe26xdvAqo4weHLR%3DivQ8J4xrSfDDD8uXnh-O-6P6Lg%40mail.gmail.com#6d8d2b4219394f774889509bf3bdc13d,
https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:19:13 -05:00
Melanie Plageman
4b565a198b Make visibilitymap_set() return previous state of vmbits
It can be useful to know the state of a relation page's VM bits before
visibilitymap_set(). visibilitymap_set() has the old value on hand, so
returning it is simple. This commit does not use visibilitymap_set()'s
new return value.

Author: Melanie Plageman
Reviewed-by: Masahiko Sawada, Andres Freund, Nitin Jadhav, Bilal Yavuz
Discussion: https://postgr.es/m/flat/CAAKRu_ZQe26xdvAqo4weHLR%3DivQ8J4xrSfDDD8uXnh-O-6P6Lg%40mail.gmail.com#6d8d2b4219394f774889509bf3bdc13d,
https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:19:03 -05:00
Melanie Plageman
f020baa066 Rename LVRelState->frozen_pages
Rename frozen_pages to new_frozen_tuple_pages in LVRelState, the struct
used for tracking state during vacuuming of a heap relation.
frozen_pages sounds like it tracks pages set all-frozen. That is a
misnomer. It only includes pages with at least one newly frozen tuple.
It also includes pages that are not all-frozen.

Author: Melanie Plageman
Reviewed-by: Andres Freund, Masahiko Sawada, Nitin Jadhav, Bilal Yavuz

Discussion: https://postgr.es/m/ctdjzroezaxmiyah3gwbwm67defsrwj2b5fpfs4ku6msfpxeia%40mwjyqlhwr2wu
2024-12-17 14:18:59 -05:00
Tom Lane
21fb39cb07 Set max_safe_fds whenever we create shared memory and semaphores.
Formerly we skipped this in bootstrap/check mode and in single-user
mode.  That's bad in check mode because it may allow accepting a
value of max_connections that doesn't actually work: on platforms
where semaphores consume file descriptors, there may not be enough
free FDs left over to satisfy fd.c, causing postmaster start to
fail.  It's also not great in single-user mode, because fd.c will
operate with just the minimum allowable value of max_safe_fds,
resulting in excess file open/close overhead if anything moderately
complicated is done in single-user mode.  (There may be some penalty
for bootstrap mode too, though probably not much.)

Discussion: https://postgr.es/m/2081982.1734393311@sss.pgh.pa.us
2024-12-17 12:23:26 -05:00
Tom Lane
c91963da13 Set the stack_base_ptr in main(), not in random other places.
Previously we did this in PostmasterMain() and InitPostmasterChild(),
which meant that stack depth checking was disabled in non-postmaster
server processes, for instance in single-user mode.  That seems like
a fairly bad idea, since there's no a-priori restriction on the
complexity of queries we will run in single-user mode.  Moreover, this
led to not having quite the same stack depth limit in all processes,
which likely has no real-world effect but it offends my inner neatnik.
Setting the depth in main() guarantees that check_stack_depth() is
armed and has a consistent interpretation of stack depth in all forms
of server processes.

While at it, move the code associated with checking the stack depth
out of tcop/postgres.c (which was never a great home for it) into
a new file src/backend/utils/misc/stack_depth.c.

Discussion: https://postgr.es/m/2081982.1734393311@sss.pgh.pa.us
2024-12-17 12:08:42 -05:00
Tomas Vondra
8cd44db42a Update comments about index parallel builds
Commit b437571714 allowed parallel builds for BRIN, but left behind
two comments claiming only btree indexes support parallel builds.

Reported by Egor Rogov, along with similar issues in SGML docs.
Backpatch to 17, where parallel builds for BRIN were introduced.

Reported-by: Egor Rogov
Backpatch-through: 17
Discussion: https://postgr.es/m/114e2d5d-125e-07d8-94aa-5ad175fb7443@postgrespro.ru
2024-12-17 15:40:07 +01:00
Peter Eisentraut
fb1a18810f Remove ts_locale.c's lowerstr()
lowerstr() and lowerstr_with_len() in ts_locale.c do the same thing as
str_tolower() that the rest of the system uses, except that the former
don't use the common locale provider framework but instead use the
global libc locale settings.

This patch replaces uses of lowerstr*() with str_tolower(...,
DEFAULT_COLLATION_OID).  For instances that use a libc locale
globally, this will result in exactly the same behavior.  For
instances that use other locale providers, you now get consistent
behavior and are no longer dependent on the libc locale settings (for
this case; there are others).

Most uses of these functions are for processing dictionary and
configuration files.  In those cases, using the default collation
seems appropriate.  At least we don't have a more specific collation
available.  But the code in contrib/pg_trgm should really depend on
the collation of the columns being processed.  This is not done here,
this can be done in a separate patch.

(You can probably construct some edge cases where this change would
create some locale-related upgrade incompatibility, for example if
before you used a combination of ICU and a differently-behaving libc
locale.  We can document this in the release notes, but I don't think
there is anything more we can do about this.)

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-17 14:04:55 +01:00
Peter Eisentraut
d3aad4ac57 Remove ts_locale.c's t_isdigit(), t_isspace(), t_isprint()
These do the same thing as the standard isdigit(), isspace(), and
isprint() but with multibyte and encoding support.  But all the
callers are only interested in analyzing single-byte ASCII characters.
So this extra layer is overkill and we can replace the uses with the
standard functions.

All the t_is*() functions in ts_locale.c are under scrutiny because
they don't use the common locale provider framework but instead use
the global libc locale settings.  For the functions being touched by
this patch, we don't need all that anyway, as mentioned above, so the
simplest solution is to just remove them.  The few remaining t_is*()
functions will need a different treatment in a separate patch.

pg_trgm has some compile-time options with macros such as
KEEPONLYALNUM.  These are not documented, and the non-default variant
is not supported by any test cases.  As part of this undertaking, I'm
removing the non-default variant, as it is in the way of cleanup.  So
in this case, the not-KEEPONLYALNUM code path is gone.

Reviewed-by: Jeff Davis <pgsql@j-davis.com>
Discussion: https://www.postgresql.org/message-id/flat/653f3b84-fc87-45a7-9a0c-bfb4fcab3e7d%40eisentraut.org
2024-12-17 12:52:29 +01:00
Richard Guo
60be3f9f0a Avoid unnecessary wrapping for more complex expressions
When pulling up a subquery that is under an outer join, if the
subquery's target list contains a strict expression that uses a
subquery variable, it's okay to pull up the expression without
wrapping it in a PlaceHolderVar: if the subquery variable is forced to
NULL by the outer join, the expression result will come out as NULL
too.

If the strict expression does not contain any subquery variables, the
current code always wraps it in a PlaceHolderVar.  While this is not
incorrect, the analysis could be tighter: if the strict expression
contains any variables of rels that are under the same lowest nulling
outer join as the subquery, we can also avoid wrapping it.  This is
safe because if the subquery variable is forced to NULL by the outer
join, the variables of rels that are under the same lowest nulling
outer join will also be forced to NULL, resulting in the expression
evaluating to NULL as well.  Therefore, it's not necessary to force
the expression to be evaluated below the outer join.  It could be
beneficial to get rid of such PHVs because they could imply lateral
dependencies, which force us to resort to nestloop joins.

This patch checks if the lateral references in the strict expression
contain any variables of rels under the same lowest nulling outer join
as the subquery, and avoids wrapping the expression in that case.

This is fundamentally a generalization of the optimizations for bare
Vars and PHVs introduced in commit f64ec81a8.

No backpatch as this could result in plan changes.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4_ENtfRdLaM_bXAxiKRYO7DmwDBDG4_2=VTDi0mJP-jAw@mail.gmail.com
2024-12-17 19:53:01 +09:00
Michael Paquier
fee2b3ea2e Tweak some comments related to variable-numbered stats in pgstat.c
These comments referred to database objects, but depending on the stats
kind dealt with this may not be true.

Issues found while reviewing a different patch in this area.

Discussion: https://postgr.es/m/ZtXR+CtkEVVE/LHF@ip-10-97-1-34.eu-west-3.compute.internal
2024-12-17 14:32:35 +09:00
Michael Paquier
0f23dedc91 Print out error position for some more DDLs
The following commands gain some information about the error position in
the query, should they fail when looking at the type used:
- CREATE TYPE (LIKE)
- CREATE TABLE OF

Both are related to typenameType() where the type name lookup is done.
These calls gain the ParseState that already exists in these paths.

Author: Kirill Reshke, Jian He
Reviewed-by: Álvaro Herrera, Michael Paquier
Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
2024-12-17 09:44:06 +09:00
Jeff Davis
86a5d6006a Refactor string case conversion into provider-specific files.
Create API entry points pg_strlower(), etc., that work with any
provider and give the caller control over the destination
buffer. Then, move provider-specific logic into pg_locale_builtin.c,
pg_locale_icu.c, and pg_locale_libc.c as appropriate.

Discussion: https://postgr.es/m/7aa46d77b377428058403723440862d12a8a129a.camel@j-davis.com
2024-12-16 09:35:18 -08:00
Michael Paquier
39240bcad5 Print out error position for CREATE DOMAIN
This is simply done by pushing down the ParseState available in
ProcessUtility() to DefineDomain(), giving more information about the
position of an error when running a CREATE DOMAIN query.

Most of the queries impacted by this change have been added previously
in 0172b4c944.

Author: Kirill Reshke, Jian He
Reviewed-by: Álvaro Herrera, Tom Lane, Michael Paquier
Discussion: https://postgr.es/m/CALdSSPhqfvKbDwqJaY=yEePi_aq61GmMpW88i6ZH7CMG_2Z4Cg@mail.gmail.com
2024-12-16 14:52:11 +09:00
Tom Lane
bf9165bb0c Declare a couple of variables inside not outside a PG_TRY block.
I went through the buildfarm's reports of "warning: variable 'foo'
might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]".  As usual,
none of them are live problems according to my understanding of the
effects of setjmp/longjmp, to wit that local variables might revert
to their values as of PG_TRY entry, due to being kept in registers.
But I did happen to notice that XmlTableGetValue's "cstr" variable
doesn't need to be declared outside the PG_TRY block at all (thus
giving further proof that the -Wclobbered warning has little
connection to real problems).  We might as well move it inside,
and "cur" too, in hopes of eliminating one of the bogus warnings.
2024-12-15 15:50:07 -05:00
Álvaro Herrera
62b7a9a778 Refactor some SQL/JSON error messages
Turn type names into "%s" specifiers to 1) avoid getting them translated
and 2) reduce the total number of messages.
2024-12-14 12:55:00 +01:00
Thomas Munro
7bc9a8bdd2 Fix warnings about declaration of environ on MinGW.
POSIX says that the global variable environ shouldn't be declared in a
header, and that you have to declare it yourself.  MinGW declares it in
<stdlib.h> with some macrology that messes up our declarations.  Visual
Studio doesn't warn (there are clues that it may also declare it, but if
so, apparently compatibly).  Suppress our declarations, on MinGW only.

This clears the last warnings on CI's optional MinGW task, and hopefully
on build farm animal fairywren too.

Like 1319997d, no back-patch for now as it's not known to be breaking
anything, and my humble goal is just to keep the MinGW build clean going
forward.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> (earlier version)
Discussion: https://postgr.es/m/CA%2BhUKGJLMh%2B6W5E4M_jSFb43gnrA_-Q6-%2BBf3HkBXyGfRFcBsQ%40mail.gmail.com
2024-12-15 00:41:27 +13:00
Thomas Munro
48c142f78d Remove EXTENSION_DONT_CHECK_SIZE from md.c.
Commits 7bb3102c and 3eb77eba removed the only user of the
EXTENSION_DONT_CHECK_SIZE flag, which had previously been required to
checkpoint truncated relations.  Since 7bb3102c, segments have been
opened directly for synchronization without calling _mdfd_getseg(), so
it doesn't need a mode that tolerates non-final short segments.  Remove
the redundant flag and associated comments.

Reported-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/nyj4k7yur5t27rtygvx2i2lrlp6rqfvvhoiiwx4fznynksf2et%404hj2sp42alpe
2024-12-14 21:52:10 +13:00
John Naylor
c72ca3ddec Fix typo
Ryo Kanbayashi

Discussion: https://postgr.es/m/CANOn0ExEQiPVrzkjULkENVac_n4Lknm6dxsU69MSncQap0kJVA%40mail.gmail.com
2024-12-14 09:52:08 +07:00
Álvaro Herrera
3191eccd8a Rewrite maybe_reread_subscription() comment
One sentence was gramatically wrong, but also too terse.  Expand on it.
2024-12-13 07:41:36 +01:00
Nathan Bossart
a0ff56e2d3 Revert "Don't truncate database and user names in startup packets."
This reverts commit 562bee0fc1.

We received a report from the field about this change in behavior,
so it seems best to revert this commit and to add proper
multibyte-aware truncation as a follow-up exercise.

Fixes bug #18711.

Reported-by: Adam Rauch
Reviewed-by: Tom Lane, Bertrand Drouvot, Bruce Momjian, Thomas Munro
Discussion: https://postgr.es/m/18711-7503ee3e449d2c47%40postgresql.org
Backpatch-through: 17
2024-12-12 15:52:04 -06:00
David Rowley
bd10ec5297 Detect redundant GROUP BY columns using UNIQUE indexes
d4c3a156c added support that when the GROUP BY contained all of the
columns belonging to a relation's PRIMARY KEY, all other columns
belonging to that relation would be removed from the GROUP BY clause.
That's possible because all other columns are functionally dependent on
the PRIMARY KEY and those columns alone ensure the groups are distinct.

Here we expand on that optimization and allow it to work for any unique
indexes on the table rather than just the PRIMARY KEY index.  This
normally requires that all columns in the index are defined with NOT NULL,
however, we can relax that requirement when the index is defined with
NULLS NOT DISTINCT.

When there are multiple suitable indexes to allow columns to be removed,
we prefer the index with the least number of columns as this allows us
to remove the highest number of GROUP BY columns.  One day, we may want to
revisit that decision as it may make more sense to use the narrower set of
columns in terms of the width of the data types and stored/queried data.

This also adjusts the code to make use of RelOptInfo.indexlist rather
than looking up the catalog tables.

In passing, add another short-circuit path to allow bailing out earlier
in cases where it's certainly not possible to remove redundant GROUP BY
columns.  This early exit is now cheaper to do than when this code was
originally written as 00b41463c made it cheaper to check for empty
Bitmapsets.

Patch originally by Zhang Mingli and later worked on by jian he, but after
I (David) worked on it, there was very little of the original left.

Author: Zhang Mingli, jian he, David Rowley
Reviewed-by: jian he, Andrei Lepikhov
Discussion: https://postgr.es/m/327990c8-b9b2-4b0c-bffb-462249f82de0%40Spark
2024-12-12 15:28:38 +13:00
David Rowley
430a5952de Defer remove_useless_groupby_columns() work until query_planner()
Traditionally, remove_useless_groupby_columns() was called during
grouping_planner() directly after the call to preprocess_groupclause().
While in many ways, it made sense to populate the field and remove the
functionally dependent columns from processed_groupClause at the same
time, it's just that doing so had the disadvantage that
remove_useless_groupby_columns() was being called before the RelOptInfos
were populated for the relations mentioned in the query.  Not having
RelOptInfos available meant we needed to manually query the catalog tables
to get the required details about the primary key constraint for the
table.

Here we move the remove_useless_groupby_columns() call to
query_planner() and put it directly after the RelOptInfos are populated.
This is fine to do as processed_groupClause still isn't final at this
point as it can still be modified inside standard_qp_callback() by
make_pathkeys_for_sortclauses_extended().

This commit is just a refactor and simply moves
remove_useless_groupby_columns() into initsplan.c.  A planned follow-up
commit will adjust that function so it uses RelOptInfo instead of doing
catalog lookups and also teach it how to use unique indexes as proofs to
expand the cases where we can remove functionally dependent columns from
the GROUP BY.

Reviewed-by: Andrei Lepikhov, jian he
Discussion: https://postgr.es/m/CAApHDvqLezKwoEBBQd0dp4Y9MDkFBDbny0f3SzEeqOFoU7Z5+A@mail.gmail.com
2024-12-12 14:22:15 +13:00
Masahiko Sawada
78c5e141e9 Add UUID version 7 generation function.
This commit introduces the uuidv7() SQL function, which generates UUID
version 7 as specified in RFC 9652. UUIDv7 combines a Unix timestamp
in milliseconds and random bits, offering both uniqueness and
sortability.

In our implementation, the 12-bit sub-millisecond timestamp fraction
is stored immediately after the timestamp, in the space referred to as
"rand_a" in the RFC. This ensures additional monotonicity within a
millisecond. The rand_a bits also function as a counter. We select a
sub-millisecond timestamp so that it monotonically increases for
generated UUIDs within the same backend, even when the system clock
goes backward or when generating UUIDs at very high
frequency. Therefore, the monotonicity of generated UUIDs is ensured
within the same backend.

This commit also expands the uuid_extract_timestamp() function to
support UUID version 7.

Additionally, an alias uuidv4() is added for the existing
gen_random_uuid() SQL function to maintain consistency.

Bump catalog version.

Author: Andrey Borodin
Reviewed-by: Sergey Prokhorenko, Przemysław Sztoch, Nikolay Samokhvalov
Reviewed-by: Peter Eisentraut, Jelte Fennema-Nio, Aleksander Alekseev
Reviewed-by: Masahiko Sawada, Lukas Fittl, Michael Paquier, Japin Li
Reviewed-by: Marcos Pegoraro, Junwang Zhao, Stepan Neretin
Reviewed-by: Daniel Vérité
Discussion: https://postgr.es/m/CAAhFRxitJv%3DyoGnXUgeLB_O%2BM7J2BJAmb5jqAT9gZ3bij3uLDA%40mail.gmail.com
2024-12-11 15:54:41 -08:00
Nathan Bossart
e8d5929428 Use pg_memory_is_all_zeros() in pgstatfuncs.c.
There are a few places in this file that use memset() and memcmp()
to determine whether a section of memory is all zeros.  This commit
modifies them to use pg_memory_is_all_zeros() instead.  These
aren't expected to be hot code paths, but this may optimize them a
bit.  Plus, this allows us to remove some variables that were only
needed for the memset() and memcmp().

Author: Bertrand Drouvot
Reviewed-by: Michael Paquier
Discussion: https://postgr.es/m/Z1hNubHfvMxlW6eu%40ip-10-97-1-34.eu-west-3.compute.internal
2024-12-11 14:19:14 -06:00
David Rowley
c2a4078eba Enable BUFFERS with EXPLAIN ANALYZE by default
The topic of turning EXPLAIN's BUFFERS option on with the ANALYZE option
has come up a few times over the past few years.  In many ways, doing this
seems like a good idea as it may be more obvious to users why a given
query is running more slowly than they might expect.  Also, from my own
(David's) personal experience, I've seen users posting to the mailing
lists with two identical plans, one slow and one fast asking why their
query is sometimes slow.  In many cases, this is due to additional reads.
Having BUFFERS on by default may help reduce some of these questions, and
if not, make it more obvious to the user before they post, or save a
round-trip to the mailing list when additional I/O effort is the cause of
the slowness.

The general consensus is that we want BUFFERS on by default with
ANALYZE.  However, there were more than zero concerns raised with doing
so.  The primary reason against is the additional verbosity, making it
harder to read large plans.  Another concern was that buffer information
isn't always useful so may not make sense to have it on by default.

It's currently December, so let's commit this to see if anyone comes
forward with a strong objection against making this change.  We have over
half a year remaining in the v18 cycle where we could still easily consider
reverting this if someone were to come forward with a convincing enough
reason as to why doing this is a bad idea.

There were two patches independently submitted to achieve this goal, one
by me and the other by Guillaume.  This commit is a mix of both of these
patches with some additional work done by me to adjust various
additional places in the documentation which include EXPLAIN ANALYZE
output.

Author: Guillaume Lelarge, David Rowley
Reviewed-by: Robert Haas, Greg Sabino Mullane, Michael Christofides
Discussion: https://postgr.es/m/CANNMO++W7MM8T0KyXN3ZheXXt-uLVM3aEtZd+WNfZ=obxffUiA@mail.gmail.com
2024-12-11 22:35:11 +13:00
David Rowley
0f5738202b Use ExprStates for hashing in GROUP BY and SubPlans
This speeds up obtaining hash values for GROUP BY and hashed SubPlans by
using the ExprState support for hashing, thus allowing JIT compilation for
obtaining hash values for these operations.

This, even without JIT compilation, has been shown to improve Hash
Aggregate performance in some cases by around 15% and hashed NOT IN
queries in one case by over 30%, however, real-world cases are likely to
see smaller gains as the test cases used were purposefully designed to
have high hashing overheads by keeping the hash table small to prevent
additional memory overheads that would be a factor when working with large
hash tables.

In passing, fix a hypothetical bug in ExecBuildHash32Expr() so that the
initial value is stored directly in the ExprState's result field if
there are no expressions to hash.  None of the current users of this
function use an initial value, so the bug is only hypothetical.

Reviewed-by: Andrei Lepikhov <lepihov@gmail.com>
Discussion: https://postgr.es/m/CAApHDvpYSO3kc9UryMevWqthTBrxgfd9djiAjKHMPUSQeX9vdQ@mail.gmail.com
2024-12-11 13:47:16 +13:00
Jeff Davis
a43567483c Use in-place updates for pg_restore_relation_stats().
This matches the behavior of vac_update_relstats(), which is important
to avoid bloating pg_class.

Author: Corey Huinker
Discussion: https://postgr.es/m/CADkLM=fc3je+ufv3gsHqjjSSf+t8674RXpuXW62EL55MUEQd-g@mail.gmail.com
2024-12-10 16:30:37 -08:00
David Rowley
50416cc484 Speedup Hash Joins with dedicated functions for ExprState hashing
Hashing of a single Var is a very common operation for ExprState to
perform.  Here we add dedicated ExecJust* functions which helps speed up
Hash Joins by removing the interpretation overhead in ExecInterpExpr().

This change currently only affects Hash Joins on a single column.  Hash
Joins with multiple join keys or an expression still run through
ExecInterpExpr().

Some testing has shown up to 10% query performance increases on recent AMD
hardware and nearly 7% increase on an Apple M2 for a query performing a
hash join with a large number of lookups on a small hash table.

This change was extracted from a larger patch which adjusts GROUP BY /
hashed subplans / hashed set operations to use ExprState hashing.

Discussion: https://postgr.es/m/CAApHDvr8Zc0ZgzVoCZLdHGOFNhiJeQ6vrUcS9V7N23zMWQb-eA@mail.gmail.com
2024-12-11 11:32:15 +13:00
Tom Lane
9828905303 Doc: add some commentary about ExecutorRun's NoMovement special case.
Robert Haas expressed concern about whether commit 3eea7a0c9 exposed
the parallel-execution machinery to a case it isn't tested for, namely
a second non-parallel execution of a plan after a parallel execution.
Investigation shows that that can't happen because of pquery.c's
manipulation of the scan direction, but it sure wasn't obvious to
start with.  Add some commentary about that.

Discussion: https://postgr.es/m/CA+TgmoagyKQy=HFw+wLo0AKTYWHui+iKswZ8Jnqqd-cFby-WVg@mail.gmail.com
2024-12-10 17:17:28 -05:00
Noah Misch
8b9cbf4922 Fix elog(FATAL) before PostmasterMain() or just after fork().
Since commit 97550c0711, these failed with
"PANIC:  proc_exit() called in child process" due to uninitialized or
stale MyProcPid.  That was reachable if close() failed in
ClosePostmasterPorts() or setlocale(category, "C") failed, both
unlikely.  Back-patch to v13 (all supported versions).

Discussion: https://postgr.es/m/20241208034614.45.nmisch@google.com
2024-12-10 13:51:59 -08:00
Peter Eisentraut
74edabce7a Support for GiST in get_equal_strategy_number()
A WITHOUT OVERLAPS primary key or unique constraint is accepted as a
REPLICA IDENTITY, since it guarantees uniqueness.  But subscribers
applying logical decoding messages would fail because there was not
support for looking up the equals operator for a gist index.  This
fixes that: For GiST indexes we can use the stratnum GiST support
function.

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 13:26:09 +01:00
Peter Eisentraut
13544e790e Make the conditions in IsIndexUsableForReplicaIdentityFull() more explicit
IsIndexUsableForReplicaIdentityFull() described a number of conditions
that a suitable index has to fulfill.  But not all of these were
actually checked in the code.  Instead, it appeared to rely on
get_equal_strategy_number() to filter out any indexes that are not
btree or hash.  As we look to generalize index AM capabilities, this
would possibly break if we added additional support in
get_equal_strategy_number().  Instead, write out code to check for the
required capabilities explicitly.  This shouldn't change any behaviors
at the moment.

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 13:11:34 +01:00
Peter Eisentraut
a2a475b011 Replace get_equal_strategy_number_for_am() by get_equal_strategy_number()
get_equal_strategy_number_for_am() gets the equal strategy number for
an AM.  This currently only supports btree and hash.  In the more
general case, this also depends on the operator class (see for example
GistTranslateStratnum()).  To support that, replace this function with
get_equal_strategy_number() that takes an opclass and derives it from
there.  (This function already existed before as a static function, so
the signature is kept for simplicity.)

This patch is only a refactoring, it doesn't add support for other
index AMs such as gist.  This will be done separately.

Reviewed-by: Paul Jungwirth <pj@illuminatedcomputing.com>
Reviewed-by: vignesh C <vignesh21@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 12:53:27 +01:00
Peter Eisentraut
321c287351 Improve internal logical replication error for missing equality strategy
This "shouldn't happen", except right now it can with a temporal gist
index (to be fixed soon), because of missing gist support in
get_equal_strategy_number().  But right now, the error is not caught
right away, but instead you get the subsequent error about a "missing
operator 0".  This makes the error more accurate.

Author: Paul Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-12-10 12:30:42 +01:00
Michael Paquier
d37e856410 Fix comments of GUC hooks for timezone_abbreviations
The GUC assign and check hooks used "assign_timezone_abbreviations",
which was incorrect.

Issue noticed while browsing this area of the code, introduced in
0a20ff54f5.

Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/Z1eV6Y8yk77GZhZI@paquier.xyz
Backpatch-through: 16
2024-12-10 13:02:21 +09:00
Daniel Gustafsson
73a392d236 Fix small memory leaks in GUC checks
Follow-up commit to a9d58bfe8a.  Backpatch down to v16 where
this was added in order to keep the code consistent for future
backpatches.

Author: Tofig Aliev <t.aliev@postgrespro.ru>
Reviewed-by: Daniel Gustafsson <daniel@yesql.se>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/bba4313fdde9db46db279f96f3b748b1@postgrespro.ru
Backpatch-through: 16
2024-12-09 20:58:23 +01:00
Nathan Bossart
0a27c3d0f7 Fix various overflow hazards in date and timestamp functions.
This commit makes use of the overflow-aware routines in int.h to
fix a variety of reported overflow bugs in the date and timestamp
code.  It seems unlikely that this fixes all such bugs in this
area, but since the problems seem limited to cases that are far
beyond any realistic usage, I'm not going to worry too much.  Note
that for one bug, I've chosen to simply add a comment about the
overflow hazard because fixing it would require quite a bit of code
restructuring that doesn't seem worth the risk.

Since this is a bug fix, it could be back-patched, but given the
risk of conflicts with the new routines in int.h and the overall
risk/reward ratio of this patch, I've opted not to do so for now.

Fixes bug #18585 (except for the one case that's just commented).

Reported-by: Alexander Lakhin
Author: Matthew Kim, Nathan Bossart
Reviewed-by: Joseph Koshakow, Jian He
Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com
Discussion: https://postgr.es/m/18585-db646741dd649abd%40postgresql.org
2024-12-09 13:47:23 -06:00
Tom Lane
3eea7a0c97 Simplify executor's determination of whether to use parallelism.
Our parallel-mode code only works when we are executing a query
in full, so ExecutePlan must disable parallel mode when it is
asked to do partial execution.  The previous logic for this
involved passing down a flag (variously named execute_once or
run_once) from callers of ExecutorRun or PortalRun.  This is
overcomplicated, and unsurprisingly some of the callers didn't
get it right, since it requires keeping state that not all of
them have handy; not to mention that the requirements for it were
undocumented.  That led to assertion failures in some corner
cases.  The only state we really need for this is the existing
QueryDesc.already_executed flag, so let's just put all the
responsibility in ExecutePlan.  (It could have been done in
ExecutorRun too, leading to a slightly shorter patch -- but if
there's ever more than one caller of ExecutePlan, it seems better
to have this logic in the subroutine than the callers.)

This makes those ExecutorRun/PortalRun parameters unnecessary.
In master it seems okay to just remove them, returning the
API for those functions to what it was before parallelism.
Such an API break is clearly not okay in stable branches,
but for them we can just leave the parameters in place after
documenting that they do nothing.

Per report from Yugo Nagata, who also reviewed and tested
this patch.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/20241206062549.710dc01cf91224809dd6c0e1@sraoss.co.jp
2024-12-09 14:38:19 -05:00
Heikki Linnakangas
4d8275046c Remove remants of "snapshot too old"
Remove the 'whenTaken' and 'lsn' fields from SnapshotData. After the
removal of the "snapshot too old" feature, they were never set to a
non-zero value.

This largely reverts commit 3e2f3c2e42, which added the
OldestActiveSnapshot tracking, and the init_toast_snapshot()
function. That was only required for setting the 'whenTaken' and 'lsn'
fields. SnapshotToast is now a constant again, like SnapshotSelf and
SnapshotAny. I kept a thin get_toast_snapshot() wrapper around
SnapshotToast though, to check that you have a registered or active
snapshot. That's still a useful sanity check.

Reviewed-by: Nathan Bossart, Andres Freund, Tom Lane
Discussion: https://www.postgresql.org/message-id/cd4b4f8c-e63a-41c0-95f6-6e6cd9b83f6d@iki.fi
2024-12-09 18:13:03 +02:00
Richard Guo
f64ec81a81 Avoid unnecessary wrapping for Vars and PHVs
When pulling up a lateral subquery that is under an outer join, the
current code always wraps a Var or PHV in the subquery's targetlist
into a new PlaceHolderVar if it is a lateral reference to something
outside the subquery.  This is necessary when the Var/PHV references
the non-nullable side of the outer join from the nullable side: we
need to ensure that it is evaluated at the right place and hence is
forced to null when the outer join should do so.  However, if the
referenced rel is under the same lowest nulling outer join, we can
actually omit the wrapping.  That's safe because if the subquery
variable is forced to NULL by the outer join, the lateral reference
variable will come out as NULL too.  It could be beneficial to get rid
of such PHVs because they imply lateral dependencies, which force us
to resort to nestloop joins.

This patch leverages the newly introduced nullingrel_info to check if
the nullingrels of the subquery RTE are a subset of those of the
laterally referenced rel, in order to determine if the referenced rel
is under the same lowest nulling outer join.

No backpatch as this could result in plan changes.

Author: Richard Guo
Reviewed-by: James Coleman, Dmitry Dolgov, Andrei Lepikhov
Discussion: https://postgr.es/m/CAMbWs48uk6C7Z9m_FNT8_21CMCk68hrgAsz=z6zpP1PNZMkeoQ@mail.gmail.com
2024-12-09 20:38:22 +09:00
Richard Guo
5668a857de Fix right-semi-joins in HashJoin rescans
When resetting a HashJoin node for rescans, if it is a single-batch
join and there are no parameter changes for the inner subnode, we can
just reuse the existing hash table without rebuilding it.  However,
for join types that depend on the inner-tuple match flags in the hash
table, we need to reset these match flags to avoid incorrect results.
This applies to right, right-anti, right-semi, and full joins.

When I introduced "Right Semi Join" plan shapes in aa86129e1, I failed
to reset the match flags in the hash table for right-semi joins in
rescans.  This oversight has been shown to produce incorrect results.
This patch fixes it.

Author: Richard Guo
Discussion: https://postgr.es/m/CAMbWs4-nQF9io2WL2SkD0eXvfPdyBc9Q=hRwfQHCGV2usa0jyA@mail.gmail.com
2024-12-09 20:36:23 +09:00
Michael Paquier
f0c569d715 Fix memory leak in pgoutput with publication list cache
The pgoutput module caches publication names in a list and frees it upon
invalidation.  However, the code forgot to free the actual publication
names within the list elements, as publication names are pstrdup()'d in
GetPublication().  This would cause memory to leak in
CacheMemoryContext, bloating it over time as this context is not
cleaned.

This is a problem for WAL senders running for a long time, as an
accumulation of invalidation requests would bloat its cache memory
usage.  A second case, where this leak is easier to see, involves a
backend calling SQL functions like pg_logical_slot_{get,peek}_changes()
which create a new decoding context with each execution.  More
publications create more bloat.

To address this, this commit adds a new memory context within the
logical decoding context and resets it each time the publication names
cache is invalidated, based on a suggestion from Amit Kapila.  This
ensures that the lifespan of the publication names aligns with that of
the logical decoding context.

This solution changes PGOutputData, which is fine for HEAD but it could
cause an ABI breakage in stable branches as the structure size would
change, so these are left out for now.

Analyzed-by: Michael Paquier, Jeff Davis
Author: Zhijie Hou
Reviewed-by: Michael Paquier, Masahiko Sawada, Euler Taveira
Discussion: https://postgr.es/m/Z0khf9EVMVLOc_YY@paquier.xyz
2024-12-09 16:41:46 +09:00
Michael Paquier
001a537b83 Improve comment about dropped entries in pgstat.c
pgstat_write_statsfile() discards any entries marked as dropped from
being written to the stats file at shutdown, and also included an
assertion based on the same condition.

The intention of the assertion is to track that no pgstats entries
should be left around as terminating backends should drop any entries
they still hold references on before the stats file is written by the
checkpointer, and it not worth taking down the server in this case if
there is a bug making that possible.

Let's improve the comment of this area to document clearly what's
intended.

Based on a discussion with Bertrand Drouvot and Anton A. Melnikov.

Author: Bertrand Drouvot
Discussion: https://postgr.es/m/a13e8cdf-b97a-4ecb-8f42-aaa367974e29@postgrespro.ru
Backpatch-through: 15
2024-12-09 14:35:39 +09:00
Amit Kapila
2d0152d614 Improve the error message introduced in commit 87ce27de69.
The error detail message "Replica identity consists of an unpublished
generated column." implies that the entire replica identity is made up of
an unpublished generated column which may not be the case.

Reported-by: Peter Smith
Author: Shlok Kyal
Reviewed-by: Peter Smith, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PuwMhKx0PhOA4APhJTLoBGNykbeCQpr_CuwGT-SkswG5w@mail.gmail.com
2024-12-09 09:11:45 +05:30
Michael Paquier
da99fedf8c Fix invalidation of local pgstats references for entry reinitialization
818119afcc has introduced the "generation" concept in pgstats entries,
incremented a counter when a pgstats entry is reinitialized, but it did
not count on the fact that backends still holding local references to
such entries need to be refreshed if the cache age is outdated.  The
previous logic only updated local references when an entry was dropped,
but it needs also to consider entries that are reinitialized.

This matters for replication slot stats (as well as custom pgstats kinds
in 18~), where concurrent drops and creates of a slot could cause
incorrect stats to be locally referenced.  This would lead to an
assertion failure at shutdown when writing out the stats file, as the
backend holding an outdated local reference would not be able to drop
during its shutdown sequence the stats entry that should be dropped, as
the last process holding a reference to the stats entry.  The
checkpointer was then complaining about such an entry late in the
shutdown sequence, after the shutdown checkpoint is finished with the
control file updated, causing the stats file to not be generated.  In
non-assert builds, the entry would just be skipped with the stats file
written.

Note that only logical replication slots use statistics.

A test case based on TAP is added to test_decoding, where a persistent
connection peeking at a slot's data is kept with concurrent drops and
creates of the same slot.  This is based on the isolation test case that
Anton has sent.  As it requires a node shutdown with a check to make
sure that the stats file is written with this specific sequence of
events, TAP is used instead.

Reported-by: Anton A. Melnikov
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/56bf8ff9-dd8c-47b2-872a-748ede82af99@postgrespro.ru
Backpatch-through: 15
2024-12-09 10:45:28 +09:00
David Rowley
1fe5a347e3 Fix possible crash during WindowAgg evaluation
When short-circuiting WindowAgg node evaluation on the top-level
WindowAgg node using quals on monotonic window functions, because the
WindowAgg run condition can mean there's no need to evaluate subsequent
window function results in the same partition once the run condition
becomes false, it was possible that the executor would use stale results
from the previous invocation of the window function in some cases.

A fix for this was partially done by a5832722, but that commit only
fixed the issue for non-top-level WindowAgg nodes.  I mistakenly thought
that the top-level WindowAgg didn't have this issue, but Jayesh's example
case clearly shows that's incorrect.  At the time, I also thought that
this only affected 32-bit systems as all window functions which then
supported run conditions returned BIGINT, however, that's wrong as
ExecProject is still called and that could cause evaluation of any other
window function belonging to the same WindowAgg node, one of which may
return a byref type.

The only queries affected by this are WindowAggs with a "Run Condition"
which contains at least one window function with a byref result type,
such as lead() or lag() on a byref column.  The window clause must also
contain a PARTITION BY clause (without a PARTITION BY, execution of the
WindowAgg stops immediately when the run condition becomes false and
there's no risk of using the stale results).

Reported-by: Jayesh Dehankar
Discussion: https://postgr.es/m/193261e2c4d.3dd3cd7c1842.871636075166132237@zohocorp.com
Backpatch-through: 15, where WindowAgg run conditions were added
2024-12-09 14:23:21 +13:00
Tom Lane
3f9b962176 Ensure that pg_amop/amproc entries depend on their lefttype/righttype.
Usually an entry in pg_amop or pg_amproc does not need a dependency on
its amoplefttype/amoprighttype/amproclefttype/amprocrighttype types,
because there is an indirect dependency via the argument types of its
referenced operator or procedure, or via the opclass it belongs to.
However, for some support procedures in some index AMs, the argument
types of the support procedure might not mention the column data type
at all.  Also, the amop/amproc entry might be treated as "loose" in
the opfamily, in which case it lacks a dependency on any particular
opclass; or it might be a cross-type entry having a reference to a
datatype that is not its opclass' opcintype.

The upshot of all this is that there are cases where a datatype can
be dropped while leaving behind amop/amproc entries that mention it,
because there is no path in pg_depend showing that those entries
depend on that type.  Such entries are harmless in normal activity,
because they won't get used, but they cause problems for maintenance
actions such as dropping the operator family.  They also cause pg_dump
to produce bogus output.  The previous commit put a band-aid on the
DROP OPERATOR FAMILY failure, but a real fix is needed.

To fix, add pg_depend entries showing that a pg_amop/pg_amproc entry
depends on its lefttype/righttype.  To avoid bloating pg_depend too
much, skip this if the referenced operator or function has that type
as an input type.  (I did not bother with considering the possible
indirect dependency via the opclass' opcintype; at least in the
reported case, that wouldn't help anyway.)

Probably, the reason this has escaped notice for so long is that
add-on datatypes and relevant opclasses/opfamilies are usually
packaged as extensions nowadays, so that there's no way to drop
a type without dropping the referencing opclasses/opfamilies too.
Still, in the absence of pg_depend entries there's nothing that
constrains DROP EXTENSION to drop the opfamily entries before the
datatype, so it seems possible for a DROP failure to occur anyway.

The specific case that was reported doesn't fail in v13, because
v13 prefers to attach the support procedure to the opclass not the
opfamily.  But it's surely possible to construct other edge cases
that do fail in v13, so patch that too.

Per report from Yoran Heling.  Back-patch to all supported branches.

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 15:56:28 -05:00
Tom Lane
c82003760d Make getObjectDescription robust against dangling amproc type links.
Yoran Heling reported a case where a data type could be dropped
while references to its OID remain behind in pg_amproc.  This
causes getObjectDescription to fail, which blocks dropping the
operator family (since our DROP code likes to construct descriptions
of everything it's dropping).  The proper fix for this requires
adding more pg_depend entries.  But to allow DROP to go through with
already-corrupt catalogs, tweak getObjectDescription to print "???"
for the type instead of failing when it processes such an entry.

I changed the logic for pg_amop similarly, for consistency,
although it is not known that the problem can manifest in pg_amop.

Per report from Yoran Heling.  Back-patch to all supported
branches (although the problem may be unreachable in v13).

Discussion: https://postgr.es/m/Z1MVCOh1hprjK5Sf@gmai021
2024-12-07 14:28:16 -05:00
Tom Lane
3220ceaf77 Fix is_digit labeling of to_timestamp's FFn format codes.
These format codes produce or consume strings of digits, so they
should be labeled with is_digit = true, but they were not.
This has effect in only one place, where is_next_separator()
is checked to see if the preceding format code should slurp up
all the available digits.  Thus, with a format such as '...SSFF3'
with remaining input '12345', the 'SS' code would consume all
five digits (and then complain about seconds being out of range)
when it should eat only two digits.

Per report from Nick Davies.  This bug goes back to d589f9446
where the FFn codes were introduced, so back-patch to v13.

Discussion: https://postgr.es/m/AM8PR08MB6356AC979252CFEA78B56678B6312@AM8PR08MB6356.eurprd08.prod.outlook.com
2024-12-07 13:12:32 -05:00