1
0
mirror of https://github.com/postgres/postgres.git synced 2025-12-21 05:21:08 +03:00
Commit Graph

8630 Commits

Author SHA1 Message Date
Michael Paquier
cc5ef90edd Refactor initial hash lookup in dynahash.c
The same pattern is used three times in dynahash.c to retrieve a bucket
number and a hash bucket from a hash value.  This has popped up while
discussing improvements for the type cache, where this piece of
refactoring would become useful.

Note that hash_search_with_hash_value() does not need the bucket number,
just the hash bucket.

Author: Teodor Sigaev
Reviewed-by: Aleksander Alekseev, Michael Paquier
Discussion: https://postgr.es/m/5812a6e5-68ae-4d84-9d85-b443176966a1@sigaev.ru
2024-03-15 07:57:17 +09:00
Nathan Bossart
d1162cfda8 Add pg_column_toast_chunk_id().
This function returns the chunk_id of an on-disk TOASTed value.  If
the value is un-TOASTed or not on-disk, it returns NULL.  This is
useful for identifying which values are actually TOASTed and for
investigating "unexpected chunk number" errors.

Bumps catversion.

Author: Yugo Nagata
Reviewed-by: Jian He
Discussion: https://postgr.es/m/20230329105507.d764497456eeac1ca491b5bd%40sraoss.co.jp
2024-03-14 10:58:00 -05:00
Jeff Davis
2d819a08a1 Introduce "builtin" collation provider.
New provider for collations, like "libc" or "icu", but without any
external dependency.

Initially, the only locale supported by the builtin provider is "C",
which is identical to the libc provider's "C" locale. The libc
provider's "C" locale has always been treated as a special case that
uses an internal implementation, without using libc at all -- so the
new builtin provider uses the same implementation.

The builtin provider's locale is independent of the server environment
variables LC_COLLATE and LC_CTYPE. Using the builtin provider, the
database collation locale can be "C" while LC_COLLATE and LC_CTYPE are
set to "en_US", which is impossible with the libc provider.

By offering a new builtin provider, it clarifies that the semantics of
a collation using this provider will never depend on libc, and makes
it easier to document the behavior.

Discussion: https://postgr.es/m/ab925f69-5f9d-f85e-b87c-bd2a44798659@joeconway.com
Discussion: https://postgr.es/m/dd9261f4-7a98-4565-93ec-336c1c110d90@manitou-mail.org
Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com
Reviewed-by: Daniel Vérité, Peter Eisentraut, Jeremy Schneider
2024-03-13 23:33:44 -07:00
Nathan Bossart
ecb0fd3372 Reintroduce MAINTAIN privilege and pg_maintain predefined role.
Roles with MAINTAIN on a relation may run VACUUM, ANALYZE, REINDEX,
REFRESH MATERIALIZE VIEW, CLUSTER, and LOCK TABLE on the relation.
Roles with privileges of pg_maintain may run those same commands on
all relations.

This was previously committed for v16, but it was reverted in
commit 151c22deee due to concerns about search_path tricks that
could be used to escalate privileges to the table owner.  Commits
2af07e2f74, 59825d1639, and c7ea3f4229 resolved these concerns by
restricting search_path when running maintenance commands.

Bumps catversion.

Reviewed-by: Jeff Davis
Discussion: https://postgr.es/m/20240305161235.GA3478007%40nathanxps13
2024-03-13 14:49:26 -05:00
Peter Eisentraut
97d85be365 Make the order of the header file includes consistent
Similar to commit 7e735035f2.

Author: Richard Guo <guofenglinux@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAMbWs4-WhpCFMbXCjtJ%2BFzmjfPrp7Hw1pk4p%2BZpU95Kh3ofZ1A%40mail.gmail.com
2024-03-13 15:07:00 +01:00
Michael Paquier
77cf6a78de Add some asserts based on LWLockHeldByMe() for replication slot statistics
Two assertions checking that ReplicationSlotAllocationLock is acquired
are added to pgstat_create_replslot() and pgstat_drop_replslot(),
corresponding to the routines in charge of the creation and the drop of
replication slot statistics.  The code previously relied on this
assumption and documented it in comments, but did not enforce this
policy at runtime.

Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Ze_p-hmD_yFeVYXg@paquier.xyz
2024-03-13 07:45:11 +09:00
Michael Paquier
2c8118ee5d Use printf's %m format instead of strerror(errno) in more places
Most callers of strerror() are removed from the backend code.  The
remaining callers require special handling with a saved errno from a
previous system call.  The frontend code still needs strerror() where
error states need to be handled outside of fprintf.

Note that pg_regress is not changed to use %m as the TAP output may
clobber errno, since those functions call fprintf() and friends before
evaluating the format string.

Support for %m in src/port/snprintf.c has been added in d6c55de1f9,
hence all the stable branches currently supported include it.

Author: Dagfinn Ilmari Mannsåker
Discussion: https://postgr.es/m/87sf13jhuw.fsf@wibble.ilmari.org
2024-03-12 10:02:54 +09:00
Heikki Linnakangas
af0e7deb4a Don't destroy SMgrRelations at relcache invalidation
With commit 21d9c3ee4e, SMgrRelations remain valid until end of
transaction (or longer if they're "pinned"). Relcache invalidation can
happen in the middle of a transaction, so we must not destroy them at
relcache invalidation anymore.

This was revealed by failures in the 'constraints' test in buildfarm
animals using -DCLOBBER_CACHE_ALWAYS. That started failing with commit
8af2565248, which was the first commit that started to rely on an
SMgrRelation living until end of transaction.

Diagnosed-by: Tomas Vondra, Thomas Munro
Reviewed-by: Thomas Munro
Discussion: https://www.postgresql.org/message-id/CA%2BhUKGK%2B5DOmLaBp3Z7C4S-Yv6yoROvr1UncjH2S1ZbPT8D%2BZg%40mail.gmail.com
2024-03-11 09:08:02 +02:00
Michael Paquier
b36fbd9f8d Improve consistency of replication slot statistics
The replication slot stats stored in shared memory rely on an internal
index number.  Both pgstat_reset_replslot() and pgstat_fetch_replslot()
lacked some LWLock protections with ReplicationSlotControlLock while
operating on these index numbers.  This issue could cause these two
functions to potentially operate on incorrect slots when taken in
isolation in the event of slots dropped and/or re-created concurrently.

Note that pg_stat_get_replication_slot() is called once per slot when
querying pg_stat_replication_slots, meaning that the stats are retrieved
across multiple ReplicationSlotControlLock acquisitions.  So, while this
commit improves more consistency, it may still be possible that
statistics are not completely consistent for a single scan of
pg_stat_replication_slots under concurrent replication slot drop or
creation activity.

The issue should unlikely be a problem in practice, causing the report
of inconsistent stats or or the stats reset of an incorrect slot, so no
backpatch is done.

Author: Bertrand Drouvot
Reviewed-by: Heikki Linnakangas, Shveta Malik, Michael Paquier
Discussion: https://postgr.es/m/ZeGq1HDWFfLkjh4o@ip-10-97-1-34.eu-west-3.compute.internal
2024-03-11 10:25:01 +09:00
Jeff Davis
f696c0cd5f Catalog changes preparing for builtin collation provider.
Rename pg_collation.colliculocale to colllocale, and
pg_database.daticulocale to datlocale. These names reflects that the
fields will be useful for the upcoming builtin provider as well, not
just for ICU.

This is purely a rename; no changes to the meaning of the fields.

Discussion: https://postgr.es/m/ff4c2f2f9c8fc7ca27c1c24ae37ecaeaeaff6b53.camel%40j-davis.com
Reviewed-by: Peter Eisentraut
2024-03-09 14:48:18 -08:00
Alvaro Herrera
270af6f0df Admit deferrable PKs into rd_pkindex, but flag them as such
... and in particular don't return them as replica identity.

The motivation for this change is letting the primary keys be seen by
code that derives NOT NULL constraints from them, when creating
inheritance children; before this change, if you had a deferrable PK,
pg_dump would not recreate the attnotnull marking properly, because the
column would not be considered as having anything to back said marking
after dropping the throwaway NOT NULL constraint.

The reason we don't want these PKs as replica identities is that
replication can corrupt data, if the uniqueness constraint is
transiently broken.

Reported-by: Amul Sul <sulamul@gmail.com>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/CAAJ_b94QonkgsbDXofakHDnORQNgafd1y3Oa5QXfpQNJyXyQ7A@mail.gmail.com
2024-03-08 16:32:29 +01:00
Alexander Korotkov
4c1973fcae Avoid recursion in MemoryContext functions
You might run out of stack space with recursion, which is not nice in
functions that might be used e.g. at cleanup after transaction
abort. MemoryContext contains pointer to parent and siblings, so we
can traverse a tree of contexts iteratively, without using
stack. Refactor the functions to do that.

MemoryContextStats() still recurses, but it now has a limit to how
deep it recurses. Once the limit is reached, it prints just a summary
of the rest of the hierarchy, similar to how it summarizes contexts
with lots of children. That seems good anyway, because a context dump
with hundreds of nested contexts isn't very readable.

Report by Egor Chindyaskin and Alexander Lakhin.

Discussion: https://postgr.es/m/1672760457.940462079%40f306.i.mail.ru
Author: Heikki Linnakangas
Reviewed-by: Robert Haas, Andres Freund, Alexander Korotkov, Tom Lane
2024-03-08 13:18:30 +02:00
Amit Kapila
bf279ddd1c Introduce a new GUC 'standby_slot_names'.
This patch provides a way to ensure that physical standbys that are
potential failover candidates have received and flushed changes before
the primary server making them visible to subscribers. Doing so guarantees
that the promoted standby server is not lagging behind the subscribers
when a failover is necessary.

The logical walsender now guarantees that all local changes are sent and
flushed to the standby servers corresponding to the replication slots
specified in 'standby_slot_names' before sending those changes to the
subscriber.

Additionally, the SQL functions pg_logical_slot_get_changes,
pg_logical_slot_peek_changes and pg_replication_slot_advance are modified
to ensure that they process changes for failover slots only after physical
slots specified in 'standby_slot_names' have confirmed WAL receipt for those.

Author: Hou Zhijie and Shveta Malik
Reviewed-by: Masahiko Sawada, Peter Smith, Bertrand Drouvot, Ajin Cherian, Nisha Moond, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-03-08 08:10:45 +05:30
John Naylor
ee1b30f128 Add template for adaptive radix tree
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main
technique that makes it adaptive is using several different node types,
each with a different capacity of elements, and a different algorithm
for accessing them. The nodes start small and grow/shrink as needed.

The main advantage over hash tables is efficient sorted iteration and
better memory locality when successive keys are lexicographically
close together. The implementation currently assumes 64-bit integer
keys, and traversing the tree is in general slower than a linear
probing hash table, so this is not a general-purpose associative array.

The paper describes two other techniques not implemented here,
namely "path compression" and "lazy expansion". These can further
reduce memory usage and speed up traversal, but the former would add
significant complexity and the latter requires storing the full key
with the value. We do trivially compress the path when leading bytes
of the key are zeros, however.

For value storage, we use "combined pointer/value slots", as
recommended in the paper. Values of size equal or smaller than the the
platform's pointer type are stored in the array of child pointers in
the last level node, while larger values are each stored in a separate
allocation. This is for now fixed at compile time, but it would be
fairly trivial to allow determining at runtime how variable-length
values are stored.

One innovation in our implementation compared to the ART paper is
decoupling the notion of node "size class" from "kind". The size
classes within a given node kind have the same underlying type, but
a variable capacity for children, so we can introduce additional node
sizes with little additional code.

To enable different use cases to specialize for different value types
and for shared/local memory, we use macro-templatized code generation
in the same manner as simplehash.h and sort_template.h.

Future commits will use this infrastructure for storing TIDs.

Patch by Masahiko Sawada and John Naylor, but a substantial amount of
credit is due to Andres Freund, whose proof-of-concept was a valuable
source of coding idioms and awareness of performance pitfalls, and
who reviewed earlier versions.

Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com
2024-03-07 12:40:11 +07:00
Michael Paquier
099ca50bd4 Revert "Fix parallel-safety check of expressions and predicate for index builds"
This reverts commit eae7be600b, following a discussion with Tom Lane,
due to concerns that this impacts the decisions made by the planner for
the number of workers spawned based on the inlining and const-folding of
index expressions and predicate for cases that would have worked until
this commit.

Discussion: https://postgr.es/m/162802.1709746091@sss.pgh.pa.us
Backpatch-through: 12
2024-03-07 08:30:35 +09:00
Michael Paquier
eae7be600b Fix parallel-safety check of expressions and predicate for index builds
As coded, the planner logic that calculates the number of parallel
workers to use for a parallel index build uses expressions and
predicates from the relcache, which are flattened for the planner by
eval_const_expressions().

As reported in the bug, an immutable parallel-unsafe function flattened
in the relcache would become a Const, which would be considered as
parallel-safe, even if the predicate or the expressions including the
function are not safe in parallel workers.  Depending on the expressions
or predicate used, this could cause the parallel build to fail.

Tests are included that check parallel index builds with parallel-unsafe
predicate and expressions.  Two routines are added to lsyscache.h to be
able to retrieve expressions and predicate of an index from its pg_index
data.

Reported-by: Alexander Lakhin
Author: Tender Wang
Reviewed-by: Jian He, Michael Paquier
Discussion: https://postgr.es/m/CAHewXN=UaAaNn9ruHDH3Os8kxLVmtWqbssnf=dZN_s9=evHUFA@mail.gmail.com
Backpatch-through: 12
2024-03-06 17:23:56 +09:00
Jeff Davis
0984a3b851 Run pgindent again on the same file.
Apparently, pgindent got confused by the double space. The first time
I ran it, it moved the function name to the next line. The second time
I ran it, it moved the function name back, but without the double
space.

Now the results appear stable.
2024-03-05 11:16:23 -08:00
Jeff Davis
b406af1806 Run pgindent for commit ef4cfdce0e. 2024-03-05 10:58:24 -08:00
Heikki Linnakangas
ef4cfdce0e Fix references to renamed function in comments
I renamed the function in commit 024c521117, but missed these
comments.

Reported-by: Richard Guo
Discussion: https://www.postgresql.org/message-id/CAMbWs4-jR6qc7JRMKwz-zXQy_AYLUZ3PHjGep4B91of321cqWw@mail.gmail.com
2024-03-05 18:23:58 +02:00
Peter Eisentraut
030e10ff1a Rename pg_constraint.conwithoutoverlaps to conperiod
pg_constraint.conwithoutoverlaps was recently added to support primary
keys and unique constraints with the WITHOUT OVERLAPS clause.  An
upcoming patch provides the foreign-key side of this functionality,
but the syntax there is different and uses the keyword PERIOD.  It
would make sense to use the same pg_constraint field for both of
these, but then we should pick a more general name that conveys "this
constraint has a temporal/period-related feature".  conperiod works
for that and is nicely compact.  Changing this now avoids possibly
having to introduce versioning into clients.  Note there are still
some "without overlaps" variables left, which deal specifically with
the parsing of the primary key/unique constraint feature.

Author: Paul A. Jungwirth <pj@illuminatedcomputing.com>
Discussion: https://www.postgresql.org/message-id/flat/CA+renyUApHgSZF9-nd-a0+OPGharLQLO=mDHcY4_qQ0+noCUVg@mail.gmail.com
2024-03-05 11:24:17 +01:00
Heikki Linnakangas
55cdba2647 Fix a leftover reference to backend_id in comment
Commit 024c521117 replaced backend_id with proc_number.

Reported-by: Alexander Lakhin
2024-03-05 09:15:02 +02:00
Jeff Davis
59825d1639 Fix buildfarm failures from 2af07e2f74.
Use GUC_ACTION_SAVE rather than GUC_ACTION_SET, necessary for working
with parallel query.

Now that the call requires more arguments, wrap the call in a new
function to avoid code duplication and offer a place for a comment.

Discussion: https://postgr.es/m/E1rhJpO-0027Wf-9L@gemulon.postgresql.org
2024-03-04 19:42:16 -08:00
Daniel Gustafsson
cc09e6549f Remove the adminpack contrib extension
The adminpack extension was only used to support pgAdmin III,  which
in turn was declared EOL many years ago. Removing the extension also
allows us to remove functions from core as well which were only used
to support old version of adminpack.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Nathan Bossart <nathandbossart@gmail.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/CALj2ACUmL5TraYBUBqDZBi1C+Re8_=SekqGYqYprj_W8wygQ8w@mail.gmail.com
2024-03-04 12:39:22 +01:00
Peter Eisentraut
dbbca2cf29 Remove unused #include's from backend .c files
as determined by include-what-you-use (IWYU)

While IWYU also suggests to *add* a bunch of #include's (which is its
main purpose), this patch does not do that.  In some cases, a more
specific #include replaces another less specific one.

Some manual adjustments of the automatic result:

- IWYU currently doesn't know about includes that provide global
  variable declarations (like -Wmissing-variable-declarations), so
  those includes are being kept manually.

- All includes for port(ability) headers are being kept for now, to
  play it safe.

- No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the
  patch from exploding in size.

Note that this patch touches just *.c files, so nothing declared in
header files changes in hidden ways.

As a small example, in src/backend/access/transam/rmgr.c, some IWYU
pragma annotations are added to handle a special case there.

Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org
2024-03-04 12:02:20 +01:00
Heikki Linnakangas
393b5599e5 Use MyBackendType in more places to check what process this is
Remove IsBackgroundWorker, IsAutoVacuumLauncherProcess(),
IsAutoVacuumWorkerProcess(), and IsLogicalSlotSyncWorker() in favor of
new Am*Process() macros that use MyBackendType. For consistency with
the existing Am*Process() macros.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/f3ecd4cb-85ee-4e54-8278-5fabfb3a4ed0@iki.fi
2024-03-04 10:25:12 +02:00
David Rowley
a0cd954480 Optimize GenerationAlloc() and SlabAlloc()
In a similar effort to 413c18401, separate out the hot and cold paths in
GenerationAlloc() and SlabAlloc() to avoid having to setup the stack frame
for the hot path.

This additionally adjusts how we use the GenerationContext's freeblock.
Freeblock, when set, is now always empty and we only switch to using it
when the current allocation request finds the current block does not have
enough space and the freeblock is large enough to accomodate the
allocation.

This commit also adjusts GenerationFree() so that if we pfree the final
allocation in the current generation block, we now mark that block as
empty and keep it as the current block.  Previously we free'd that block
and set the current block to NULL.  Doing that meant we needed a special
case in GenerationAlloc to check if GenerationContext.block was NULL.
So this both reduces free/malloc calls and reduces the work done in
GenerationAlloc().

In passing, improve some comments in aset.c

Discussion: https://postgr.es/m/CAApHDvpHVSJqqb4B4OZLixr=CotKq-eKkbwZqvZVo_biYvUvQA@mail.gmail.com
2024-03-04 17:42:10 +13:00
Heikki Linnakangas
024c521117 Replace BackendIds with 0-based ProcNumbers
Now that BackendId was just another index into the proc array, it was
redundant with the 0-based proc numbers used in other places. Replace
all usage of backend IDs with proc numbers.

The only place where the term "backend id" remains is in a few pgstat
functions that expose backend IDs at the SQL level. Those IDs are now
in fact 0-based ProcNumbers too, but the documentation still calls
them "backend ids". That term still seems appropriate to describe what
the numbers are, so I let it be.

One user-visible effect is that pg_temp_0 is now a valid temp schema
name, for backend with ProcNumber 0.

Reviewed-by: Andres Freund
Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
2024-03-03 19:38:22 +02:00
Heikki Linnakangas
ab355e3a88 Redefine backend ID to be an index into the proc array
Previously, backend ID was an index into the ProcState array, in the
shared cache invalidation manager (sinvaladt.c). The entry in the
ProcState array was reserved at backend startup by scanning the array
for a free entry, and that was also when the backend got its backend
ID. Things become slightly simpler if we redefine backend ID to be the
index into the PGPROC array, and directly use it also as an index to
the ProcState array. This uses a little more memory, as we reserve a
few extra slots in the ProcState array for aux processes that don't
need them, but the simplicity is worth it.

Aux processes now also have a backend ID. This simplifies the
reservation of BackendStatusArray and ProcSignal slots.

You can now convert a backend ID into an index into the PGPROC array
simply by subtracting 1. We still use 0-based "pgprocnos" in various
places, for indexes into the PGPROC array, but the only difference now
is that backend IDs start at 1 while pgprocnos start at 0. (The next
commmit will get rid of the term "backend ID" altogether and make
everything 0-based.)

There is still a 'backendId' field in PGPROC, now part of 'vxid' which
encapsulates the backend ID and local transaction ID together. It's
needed for prepared xacts. For regular backends, the backendId is
always equal to pgprocno + 1, but for prepared xact PGPROC entries,
it's the ID of the original backend that processed the transaction.

Reviewed-by: Andres Freund, Reid Thompson
Discussion: https://www.postgresql.org/message-id/8171f1aa-496f-46a6-afc3-c46fe7a9b407@iki.fi
2024-03-03 19:37:28 +02:00
Alvaro Herrera
30b8d6e4ce GUC table: Add description to computed variables
Per suggestion from Kyotaro Horiguchi
Discussion: https://postgr.es/m/20240229.130404.1411153273308142188.horikyota.ntt@gmail.com
2024-03-03 14:53:47 +01:00
Michael Paquier
655dc31046 Simplify pg_enc2gettext_tbl[] with C99-designated initializer syntax
This commit switches pg_enc2gettext_tbl[] in encnames.c to use a
C99-designated initializer syntax.

pg_bind_textdomain_codeset() is simplified so as it is possible to do
a direct lookup at the gettext() array with a value of the enum pg_enc
rather than doing a loop through all its elements, as long as the
encoding value provided by GetDatabaseEncoding() is in the correct range
of supported encoding values.  Note that PG_MULE_INTERNAL gains a value
in the array, pointing to NULL.

Author: Jelte Fennema-Nio
Discussion: https://postgr.es/m/CAGECzQT3caUbcCcszNewCCmMbCuyP7XNAm60J3ybd6PN5kH2Dw@mail.gmail.com
2024-03-01 18:03:48 +09:00
Daniel Gustafsson
6fd144e3a9 Fix integer underflow in shared memory debugging
dsa_dump would print a large negative number instead of zero for
segment bin 0.  Fix by explicitly checking for underflow and add
special case for bin 0. Backpatch to all supported versions.

Author: Ian Ilyasov <ianilyasov@outlook.com>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/GV1P251MB1004E0D09D117D3CECF9256ECD502@GV1P251MB1004.EURP251.PROD.OUTLOOK.COM
Backpatch-through: v12
2024-02-29 12:19:52 +01:00
Tom Lane
d163fdbfea Fix mis-rounding and overflow hazards in date_bin().
In the case where the target timestamp is before the origin timestamp
and their difference is already an exact multiple of the stride, the
code incorrectly subtracted the stride anyway.

Also detect several integer-overflow cases that previously produced
bogus results.  (The submitted patch tried to avoid overflow, but
I'm not convinced it's right, and problematic cases are so far out of
the plausibly-useful range that they don't seem worth sweating over.
Let's just use overflow-detecting arithmetic and throw errors.)

timestamp_bin() and timestamptz_bin() are basically identical and
so had identical bugs.  Fix both.

Report and patch by Moaaz Assali, adjusted some by me.  Back-patch
to v14 where date_bin() was introduced.

Discussion: https://postgr.es/m/CALkF+nvtuas-2kydG-WfofbRSJpyODAJWun==W-yO5j2R4meqA@mail.gmail.com
2024-02-28 14:00:30 -05:00
Alvaro Herrera
53c2a97a92 Improve performance of subsystems on top of SLRU
More precisely, what we do here is make the SLRU cache sizes
configurable with new GUCs, so that sites with high concurrency and big
ranges of transactions in flight (resp. multixacts/subtransactions) can
benefit from bigger caches.  In order for this to work with good
performance, two additional changes are made:

1. the cache is divided in "banks" (to borrow terminology from CPU
   caches), and algorithms such as eviction buffer search only affect
   one specific bank.  This forestalls the problem that linear searching
   for a specific buffer across the whole cache takes too long: we only
   have to search the specific bank, whose size is small.  This work is
   authored by Andrey Borodin.

2. Change the locking regime for the SLRU banks, so that each bank uses
   a separate LWLock.  This allows for increased scalability.  This work
   is authored by Dilip Kumar.  (A part of this was previously committed as
   d172b717c6f4.)

Special care is taken so that the algorithms that can potentially
traverse more than one bank release one bank's lock before acquiring the
next.  This should happen rarely, but particularly clog.c's group commit
feature needed code adjustment to cope with this.  I (Álvaro) also added
lots of comments to make sure the design is sound.

The new GUCs match the names introduced by bcdfa5f2e2 in the
pg_stat_slru view.

The default values for these parameters are similar to the previous
sizes of each SLRU.  commit_ts, clog and subtrans accept value 0, which
means to adjust by dividing shared_buffers by 512 (so 2MB for every 1GB
of shared_buffers), with a cap of 8MB.  (A new slru.c function
SimpleLruAutotuneBuffers() was added to support this.)  The cap was
previously 1MB for clog, so for sites with more than 512MB of shared
memory the total memory used increases, which is likely a good tradeoff.
However, other SLRUs (notably multixact ones) retain smaller sizes and
don't support a configured value of 0.  These values based on
shared_buffers may need to be revisited, but that's an easy change.

There was some resistance to adding these new GUCs: it would be better
to adjust to memory pressure automatically somehow, for example by
stealing memory from shared_buffers (where the caches can grow and
shrink naturally).  However, doing that seems to be a much larger
project and one which has made virtually no progress in several years,
and because this is such a pain point for so many users, here we take
the pragmatic approach.

Author: Andrey Borodin <x4mmm@yandex-team.ru>
Author: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Amul Sul, Gilles Darold, Anastasia Lubennikova,
	Ivan Lazarev, Robert Haas, Thomas Munro, Tomas Vondra,
	Yura Sokolov, Васильев Дмитрий (Dmitry Vasiliev).
Discussion: https://postgr.es/m/2BEC2B3F-9B61-4C1D-9FB5-5FAB0F05EF86@yandex-team.ru
Discussion: https://postgr.es/m/CAFiTN-vzDvNz=ExGXz6gdyjtzGixKSqs0mKHMmaQ8sOSEFZ33A@mail.gmail.com
2024-02-28 17:05:31 +01:00
Heikki Linnakangas
0b16bb8776 Remove AIX support
There isn't a lot of user demand for AIX support, we have a bunch of
hacks to work around AIX-specific compiler bugs and idiosyncrasies,
and no one has stepped up to the plate to properly maintain it.
Remove support for AIX to get rid of that maintenance overhead. It's
still supported for stable versions.

The acute issue that triggered this decision was that after commit
8af2565248, the AIX buildfarm members have been hitting this
assertion:

    TRAP: failed Assert("(uintptr_t) buffer == TYPEALIGN(PG_IO_ALIGN_SIZE, buffer)"), File: "md.c", Line: 472, PID: 2949728

Apperently the "pg_attribute_aligned(a)" attribute doesn't work on AIX
for values larger than PG_IO_ALIGN_SIZE, for a static const variable.
That could be worked around, but we decided to just drop the AIX support
instead.

Discussion: https://www.postgresql.org/message-id/20240224172345.32@rfd.leadboat.com
Reviewed-by: Andres Freund, Noah Misch, Thomas Munro
2024-02-28 15:17:23 +04:00
Michael Paquier
48920476b4 Remove last NULL element in config_group_names[]
This has not been needed since 9d77708d83 where there was a loop to
print all the possible GUC groups, relying on the last element to be
NULL.

Author: Japin Li
Reviewed-By: Jelte Fennema-Nio
Discussion: https://postgr.es/m/CAGECzQT3caUbcCcszNewCCmMbCuyP7XNAm60J3ybd6PN5kH2Dw@mail.gmail.com
2024-02-28 12:51:35 +09:00
David Rowley
413c18401d Refactor AllocSetAlloc(), separating hot and cold paths
Allocating from a free list or from a block which contains enough space
already, we deem to be common code paths and want to optimize for those.
Having to allocate a new block, either a normal block or a dedicated one
for a large allocation, we deem to be less common, therefore we class
that as "cold".  Both cold paths require a malloc so are going to be
slower as a result of that regardless.

The main motivation here is to remove the calls to malloc() in the hot
path and because of this, the compiler is now free to not bother setting
up the stack frame in AllocSetAlloc(), thus making the hot path much
cheaper.

Author: Andres Freund
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20210719195950.gavgs6ujzmjfaiig@alap3.anarazel.de
2024-02-28 14:20:43 +13:00
Michael Paquier
afd8ef3909 Use C99-designated initializer syntax for more arrays
This is in the same spirit as ef5e2e9085, updating this time some
arrays in parser.c, relpath.c, guc_tables.c and pg_dump_sort.c so as the
order of their elements has no need to match the enum structures they
are based on anymore.

Author: Jelte Fennema-Nio
Reviewed-by: Jian He, Japin Li
Discussion: https://postgr.es/m/CAGECzQT3caUbcCcszNewCCmMbCuyP7XNAm60J3ybd6PN5kH2Dw@mail.gmail.com
2024-02-28 08:42:36 +09:00
Andrew Dunstan
92d2ab7554 Rationalize and improve error messages for some jsonpath items
This is a followup to commit 66ea94e8e6.

Error mssages concerning incorrect formats for date-time types are
unified and parameterized, instead of using a fully separate error
message for each type.

Similarly, error messages regarding numeric and string arguments to
certain items are standardized, and instead of saying that the argument
is out of range simply say that it is invalid. The actual invalid
arguments to these itesm are now shown in the error message.

Error messages relating to numeric inputs of Nan or Infinity are
made more informative.

Jeevan Chalke and Kyotaro Horiguchi, with some input from Tom Lane.

Discussion: https://postgr.es/m/20240129.121200.235012930453045390.horikyota.ntt@gmail.com
2024-02-27 02:07:22 -05:00
David Rowley
743112a2e9 Adjust memory allocation functions to allow sibling calls
Many modern compilers are able to optimize function calls to functions
where the parameters of the called function match a leading subset of
the calling function's parameters.  If there are no instructions in the
calling function after the function is called, then the compiler is free
to avoid any stack frame setup and implement the function call as a
"jmp" rather than a "call".  This is called sibling call optimization.

Here we adjust the memory allocation functions in mcxt.c to allow this
optimization.  This requires moving some responsibility into the memory
context implementations themselves.  It's now the responsibility of the
MemoryContext to check for malloc failures.  This is good as it both
allows the sibling call optimization, but also because most small and
medium allocations won't call malloc and just allocate memory to an
existing block.  That can't fail, so checking for NULLs in that case
isn't required.

Also, traditionally it's been the responsibility of palloc and the other
allocation functions in mcxt.c to check for invalid allocation size
requests.  Here we also move the responsibility of checking that into the
MemoryContext.  This isn't to allow the sibling call optimization, but
more because most of our allocators handle large allocations separately
and we can just add the size check when doing large allocations.  We no
longer check this for non-large allocations at all.

To make checking the allocation request sizes and ERROR handling easier,
add some helper functions to mcxt.c for the allocators to use.

Author: Andres Freund
Reviewed-by: David Rowley
Discussion: https://postgr.es/m/20210719195950.gavgs6ujzmjfaiig@alap3.anarazel.de
2024-02-27 16:39:42 +13:00
Nathan Bossart
42a1de3013 Add helper functions for dshash tables with string keys.
Presently, string keys are not well-supported for dshash tables.
The dshash code always copies key_size bytes into new entries'
keys, and dshash.h only provides compare and hash functions that
forward to memcmp() and tag_hash(), both of which do not stop at
the first NUL.  This means that callers must pad string keys so
that the data beyond the first NUL does not adversely affect the
results of copying, comparing, and hashing the keys.

To better support string keys in dshash tables, this commit does
a couple things:

* A new copy_function field is added to the dshash_parameters
  struct.  This function pointer specifies how the key should be
  copied into new table entries.  For example, we only want to copy
  up to the first NUL byte for string keys.  A dshash_memcpy()
  helper function is provided and used for all existing in-tree
  dshash tables without string keys.

* A set of helper functions for string keys are provided.  These
  helper functions forward to strcmp(), strcpy(), and
  string_hash(), all of which ignore data beyond the first NUL.

This commit also adjusts the DSM registry's dshash table to use the
new helper functions for string keys.

Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/20240119215941.GA1322079%40nathanxps13
2024-02-26 15:47:13 -06:00
Nathan Bossart
5fe08c006c Use NULL instead of 0 for 'arg' argument in dshash_create() calls.
A couple of dshash_create() callers provide 0 for the 'void *arg'
argument, which might give readers the incorrect impression that
this is some sort of "flags" parameter.

Reviewed-by: Andy Fan
Discussion: https://postgr.es/m/20240119215941.GA1322079%40nathanxps13
2024-02-26 15:46:01 -06:00
Alexander Korotkov
28e858c0f9 Improve documentation and GUC description for transaction_timeout
Reported-by: Alexander Lakhin
2024-02-25 20:30:17 +02:00
Amit Kapila
93db6cbda0 Add a new slot sync worker to synchronize logical slots.
By enabling slot synchronization, all the failover logical replication
slots on the primary (assuming configurations are appropriate) are
automatically created on the physical standbys and are synced
periodically. The slot sync worker on the standby server pings the primary
server at regular intervals to get the necessary failover logical slots
information and create/update the slots locally. The slots that no longer
require synchronization are automatically dropped by the worker.

The nap time of the worker is tuned according to the activity on the
primary. The slot sync worker waits for some time before the next
synchronization, with the duration varying based on whether any slots were
updated during the last cycle.

A new parameter sync_replication_slots enables or disables this new
process.

On promotion, the slot sync worker is shut down by the startup process to
drop any temporary slots acquired by the slot sync worker and to prevent
the worker from trying to fetch the failover slots.

A functionality to allow logical walsenders to wait for the physical will
be done in a subsequent commit.

Author: Shveta Malik, Hou Zhijie based on design inputs by Masahiko Sawada and Amit Kapila
Reviewed-by: Masahiko Sawada, Bertrand Drouvot, Peter Smith, Dilip Kumar, Ajin Cherian, Nisha Moond, Kuroda Hayato, Amit Kapila
Discussion: https://postgr.es/m/514f6f2f-6833-4539-39f1-96cd1e011f23@enterprisedb.com
2024-02-22 15:25:15 +05:30
Michael Paquier
011d60c435 Speed up uuid_out() by not relying on a StringInfo
Since the size of the string representation of an uuid is fixed, there
is no benefit in using a StringInfo.  This commit simplifies uuid_oud()
to not rely on a StringInfo, where avoiding the overhead of the string
manipulation makes the function substantially faster.

A COPY TO on a relation with one UUID attribute can show up to a 40%
speedup when the bottleneck is the COPY computation with uuid_out()
showing up at the top of the profiles (numbered measure here, Laurenz
has mentioned something closer to 20% faster runtimes), for example when
the data is fully in shared buffers or the OS cache.

Author: Laurenz Albe
Reviewed-by: Andres Freund, Michael Paquier
Description: https://postgr.es/m/679d5455cbbb0af667ccb753da51a475bae1eaed.camel@cybertec.at
2024-02-22 10:02:55 +09:00
Nathan Bossart
3b42bdb471 Use new overflow-safe integer comparison functions.
Commit 6b80394781 introduced integer comparison functions designed
to be as efficient as possible while avoiding overflow.  This
commit makes use of these functions in many of the in-tree qsort()
comparators to help ensure transitivity.  Many of these comparator
functions should also see a small performance boost.

Author: Mats Kindahl
Reviewed-by: Andres Freund, Fabrízio de Royes Mello
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 14:05:36 -06:00
Nathan Bossart
5497daf3aa Replace calls to pg_qsort() with the qsort() macro.
Calls to this function might give the impression that pg_qsort()
is somehow different than qsort(), when in fact there is a qsort()
macro in port.h that expands all in-tree uses to pg_qsort().

Reviewed-by: Mats Kindahl
Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com
2024-02-16 11:37:50 -06:00
Alexander Korotkov
d57b7cc333 Add missing check_stack_depth() to some recursive functions
Reported-by: Egor Chindyaskin, Alexander Lakhin
Discussion: https://postgr.es/m/1672760457.940462079%40f306.i.mail.ru
2024-02-16 16:02:00 +02:00
Alexander Korotkov
51efe38cb9 Introduce transaction_timeout
This commit adds timeout that is expected to be used as a prevention
of long-running queries. Any session within the transaction will be
terminated after spanning longer than this timeout.

However, this timeout is not applied to prepared transactions.
Only transactions with user connections are affected.

Discussion: https://postgr.es/m/CAAhFRxiQsRs2Eq5kCo9nXE3HTugsAAJdSQSmxncivebAxdmBjQ%40mail.gmail.com
Author: Andrey Borodin <amborodin@acm.org>
Author: Japin Li <japinli@hotmail.com>
Author: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Nikolay Samokhvalov <samokhvalov@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Reviewed-by: bt23nguyent <bt23nguyent@oss.nttdata.com>
Reviewed-by: Yuhang Qiu <iamqyh@gmail.com>
2024-02-15 23:56:12 +02:00
Tom Lane
5c9f2f9398 Doc: improve a couple of comments in postgresql.conf.sample.
Clarify comments associated with max_parallel_workers and
related settings.

Per bug #18343 from Christopher Kline.

Discussion: https://postgr.es/m/18343-3a5e903d1d3692ab@postgresql.org
2024-02-15 16:45:03 -05:00
Nathan Bossart
8fd0498de2 Remove obsolete check in SIGTERM handler for the startup process.
Thanks to commit 3b00fdba9f, this check in the SIGTERM handler for
the startup process is now obsolete and can be removed.  Instead of
leaving around the dead function write_stderr_signal_safe(), I've
opted to just remove it for now.

This partially reverts commit 97550c0711.

Reviewed-by: Andres Freund, Noah Misch
Discussion: https://postgr.es/m/20231121212008.GA3742740%40nathanxps13
2024-02-14 17:09:31 -06:00