postgres

mirror of https://github.com/postgres/postgres.git synced 2025-10-24 01:29:19 +03:00

Author	SHA1	Message	Date
Michael Paquier	1dbe6f7667	Refactor non-supported compression error message in toast_compression.c This code used a NO_LZ4_SUPPORT() macro to issue an error in the code paths where LZ4 [de]compression is attempted but the build does not support it. This commit refactors the code to use a more flexible error message so as it can be used for other compression methods, where the method is given in input of macro. Extracted from a larger patch by the same author. Author: Nikhil Kumar Veldanda <veldanda.nikhilkumar17@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAFAfj_HX84EK4hyRYw50AOHOcdVi-+FFwAAPo7JHx4aShCvunQ@mail.gmail.com	2025-07-16 11:59:22 +09:00
Fujii Masao	b8341ae856	pgoutput: Initialize missing default for "origin" parameter. The pgoutput plugin initializes optional parameters like "binary" with default values at the start of processing. However, the "origin" parameter was previously missed and left without explicit initialization. Although the PGOutputData struct, which holds these settings, is zero-initialized at allocation (resulting in publish_no_origin field for "origin" parameter being false by default), this default was not set explicitly, unlike other parameters. This commit adds explicit initialization of the "origin" parameter to ensure consistency and clarity in how defaults are handled. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/d2790f10-238d-4cb5-a743-d9d2a9dd900f@oss.nttdata.com	2025-07-16 10:31:51 +09:00
Tom Lane	5fe55a0fe4	Doc: clarify description of regexp fields in pg_ident.conf. The grammar was a little shaky and confusing here, so word-smith it a bit. Also, adjust the comments in pg_ident.conf.sample to use the same terminology as the SGML docs, in particular "DATABASE-USERNAME" not "PG-USERNAME". Back-patch appropriate subsets. I did not risk changing pg_ident.conf.sample in released branches, but it still seems OK to change it in v18. Reported-by: Alexey Shishkin <alexey.shishkin@enterprisedb.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Discussion: https://postgr.es/m/175206279327.3157504.12519088928605422253@wrigleys.postgresql.org Backpatch-through: 13	2025-07-15 18:53:00 -04:00
Tom Lane	2a3a396432	Clarify the ra != rb case in compareJsonbContainers(). It's impossible to reach this case with either ra or rb being WJB_DONE, because our earlier checks that the structure and length of the inputs match should guarantee that we reach their ends simultaneously. However, the comment completely fails to explain this, and the Asserts don't cover it either. The comment is pretty obscure anyway, so rewrite it, and extend the Asserts to reject WJB_DONE. This is only cosmetic, so no need for back-patch. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru	2025-07-15 18:21:12 -04:00
Tom Lane	aad1617b76	Silence uninitialized-value warnings in compareJsonbContainers(). Because not every path through JsonbIteratorNext() sets val->type, some compilers complain that compareJsonbContainers() is comparing possibly-uninitialized values. The paths that don't set it return WJB_DONE, WJB_END_ARRAY, or WJB_END_OBJECT, so it's clear by manual inspection that the "(ra == rb)" code path is safe, and indeed we aren't seeing warnings about that. But the (ra != rb) case is much less obviously safe. In Assert-enabled builds it seems that the asserts rejecting WJB_END_ARRAY and WJB_END_OBJECT persuade gcc 15.x not to warn, which makes little sense because it's impossible to believe that the compiler can prove of its own accord that ra/rb aren't WJB_DONE here. (In fact they never will be, so the code isn't wrong, but why is there no warning?) Without Asserts, the appearance of warnings is quite unsurprising. We discussed fixing this by converting those two Asserts into pg_assume, but that seems not very satisfactory when it's so unclear why the compiler is or isn't warning: the warning could easily reappear with some other compiler version. Let's fix it in a less magical, more future-proof way by changing JsonbIteratorNext() so that it always does set val->type. The cost of that should be pretty negligible, and it makes the function's API spec less squishy. Reported-by: Erik Rijkers <er@xs4all.nl> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/988bf1bc-3f1f-99f3-bf98-222f1cd9dc5e@xs4all.nl Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru Backpatch-through: 13	2025-07-15 18:11:18 -04:00
Michael Paquier	006fc975a2	Fix comments in index.c This comment paragraph referred to text_eq(), but the name of the function in charge of "text" comparisons is called texteq(). Author: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxHL--XNcCCO1LgKsygzYGiVHZMfTcAxOSG8+ezxWtjddw@mail.gmail.com	2025-07-15 16:05:59 +09:00
Tom Lane	3c4e26a62c	In username-map substitution, cope with more than one \1. If the system-name field of a pg_ident.conf line is a regex containing capturing parentheses, you can write \1 in the user-name field to represent the captured part of the system name. But what happens if you write \1 more than once? The only reasonable expectation IMO is that each \1 gets replaced, but presently our code replaces only the first. Fix that. Also, improve the tests for this feature to exercise cases where a non-empty string needs to be substituted for \1. The previous testing didn't inspire much faith that it was verifying correct operation of the substitution code. Given the lack of field complaints about this, I don't feel a need to back-patch. Reported-by: David G. Johnston <david.g.johnston@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAKFQuwZu6kZ8ZPvJ3pWXig+6UX4nTVK-hdL_ZS3fSdps=RJQQQ@mail.gmail.com	2025-07-13 13:52:32 -04:00
Nathan Bossart	8893c3ab36	Remove XLogCtl->ckptFullXid. A few code paths set this variable, but its value is never used. Oversight in commit `2fc7af5e96`. Reviewed-by: Aleksander Alekseev <aleksander@tigerdata.com> Discussion: https://postgr.es/m/aHFyE1bs9YR93dQ1%40nathan	2025-07-12 14:34:57 -05:00
Tom Lane	84ce258707	Replace float8 with int in date2isoweek() and date2isoyear(). The values of the "result" variables in these functions are always integers; using a float8 variable accomplishes nothing except to incur useless conversions to and from float. While that wastes a few nanoseconds, these functions aren't all that time-critical. But it seems worth fixing to remove possible reader confusion. Also, in the case of date2isoyear(), "result" is a very poorly chosen variable name because it is not the function's result. Rename it to "week", and do the same in date2isoweek() for consistency. Since this is mostly cosmetic, there seems little need for back-patch. Author: Sergey Fukanchik <s.fukanchik@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/6323a-68726500-1-7def9d00@137821581	2025-07-12 11:50:37 -04:00
Andres Freund	f2c87ac04e	Remove long-unused TransactionIdIsActive() TransactionIdIsActive() has not been used since `bb38fb0d43`, in 2014. There are no known uses in extensions either and it's hard to see valid uses for it. Therefore remove TransactionIdIsActive(). Discussion: https://postgr.es/m/odgftbtwp5oq7cxjgf4kjkmyq7ypoftmqy7eqa7w3awnouzot6@hrwnl5tdqrgu	2025-07-12 11:00:44 -04:00
Thomas Munro	b8e1f2d96b	aio: Fix configuration reload in IO workers. method_worker.c installed SignalHandlerForConfigReload, but it failed to actually process reload requests. That hasn't yet produced any concrete problem reports in terms of GUC changes it should have cared about in v18, but it was inconsistent. It did cause problems for a couple of patches in development that need IO workers to react to ALTER SYSTEM + pg_reload_conf(). Fix extracted from one of those patches. Back-patch to 18. Reported-by: Dmitry Dolgov <9erthalion6@gmail.com> Discussion: https://postgr.es/m/sh5uqe4a4aqo5zkkpfy5fobe2rg2zzouctdjz7kou4t74c66ql%40yzpkxb7pgoxf	2025-07-12 16:33:02 +12:00
Thomas Munro	177c1f0593	aio: Remove obsolete IO worker ID references. In an ancient ancestor of this code, the postmaster assigned IDs to IO workers. Now it tracks them in an unordered array and doesn't know their IDs, so it might be confusing to readers that it still referred to their indexes as IDs. No change in behavior, just variable name and error message cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com	2025-07-12 14:44:22 +12:00
Thomas Munro	01d618bcd7	aio: Regularize IO worker internal naming. Adopt PgAioXXX convention for pgaio module type names. Rename a function that didn't use a pgaio_worker_ submodule prefix. Rename the internal submit function's arguments to match the indirectly relevant function pointer declaration and nearby examples. Rename the array of handle IDs in PgAioSubmissionQueue to sqes, a term of art seen in the systems it emulates, also clarifying that they're not IO handle pointers as the old name might imply. No change in behavior, just type, variable and function name cleanup. Back-patch to 18. Discussion: https://postgr.es/m/CA%2BhUKG%2BwbaZZ9Nwc_bTopm4f-7vDmCwLk80uKDHj9mq%2BUp0E%2Bg%40mail.gmail.com	2025-07-12 14:44:09 +12:00
Thomas Munro	40e105042a	Fix stale idle flag when IO workers exit. Otherwise we could choose a worker that has exited and crash while trying to wake it up. Back-patch to 18. Reported-by: Tomas Vondra <tomas@vondra.me> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/t5aqjhkj6xdkido535pds7fk5z4finoxra4zypefjqnlieevbg%40357aaf6u525j	2025-07-12 13:11:47 +12:00
Tom Lane	64840e4624	Fix inconsistent quoting of role names in ACLs. getid() and putid(), which parse and deparse role names within ACL input/output, applied isalnum() to see if a character within a role name requires quoting. They did this even for non-ASCII characters, which is problematic because the results would depend on encoding, locale, and perhaps even platform. So it's possible that putid() could elect not to quote some string that, later in some other environment, getid() will decide is not a valid identifier, causing dump/reload or similar failures. To fix this in a way that won't risk interoperability problems with unpatched versions, make getid() treat any non-ASCII as a legitimate identifier character (hence not requiring quotes), while making putid() treat any non-ASCII as requiring quoting. We could remove the resulting excess quoting once we feel that no unpatched servers remain in the wild, but that'll be years. A lesser problem is that getid() did the wrong thing with an input consisting of just two double quotes (""). That has to represent an empty string, but getid() read it as a single double quote instead. The case cannot arise in the normal course of events, since we don't allow empty-string role names. But let's fix it while we're here. Although we've not heard field reports of problems with non-ASCII role names, there's clearly a hazard there, so back-patch to all supported versions. Reported-by: Peter Eisentraut <peter@eisentraut.org> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3792884.1751492172@sss.pgh.pa.us Backpatch-through: 13	2025-07-11 18:50:13 -04:00
Nathan Bossart	8d33fbacba	Add FLUSH_UNLOGGED option to CHECKPOINT command. This option, which is disabled by default, can be used to request the checkpoint also flush dirty buffers of unlogged relations. As with the MODE option, the server may consolidate the options for concurrently requested checkpoints. For example, if one session uses (FLUSH_UNLOGGED FALSE) and another uses (FLUSH_UNLOGGED TRUE), the server may perform one checkpoint with FLUSH_UNLOGGED enabled. Author: Christoph Berg <myon@debian.org> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	2f698d7f4b	Add MODE option to CHECKPOINT command. This option may be set to FAST (the default) to request the checkpoint be completed as fast as possible, or SPREAD to request the checkpoint be spread over a longer interval (based on the checkpoint-related configuration parameters). Note that the server may consolidate the options for concurrently requested checkpoints. For example, if one session requests a "fast" checkpoint and another requests a "spread" checkpoint, the server may perform one "fast" checkpoint. Author: Christoph Berg <myon@debian.org> Reviewed-by: Andres Freund <andres@anarazel.de> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	a4f126516e	Add option list to CHECKPOINT command. This commit adds the boilerplate code for supporting a list of options in CHECKPOINT commands. No actual options are supported yet, but follow-up commits will add support for MODE and FLUSH_UNLOGGED. While at it, this commit refactors the code for executing CHECKPOINT commands to its own function since it's about to become significantly larger. Author: Christoph Berg <myon@debian.org> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	bb938e2c3c	Rename CHECKPOINT_IMMEDIATE to CHECKPOINT_FAST. The new name more accurately reflects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "fast" instead of "immediate", too. Likewise, references to "immediate" checkpoints in the documentation have been updated to say "fast". This is preparatory work for a follow-up commit that will add a MODE option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Nathan Bossart	cd8324cc89	Rename CHECKPOINT_FLUSH_ALL to CHECKPOINT_FLUSH_UNLOGGED. The new name more accurately relects the effects of this flag on a requested checkpoint. Checkpoint-related log messages (i.e., those controlled by the log_checkpoints configuration parameter) will now say "flush-unlogged" instead of "flush-all", too. This is preparatory work for a follow-up commit that will add a FLUSH_UNLOGGED option to the CHECKPOINT command. Author: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aDnaKTEf-0dLiEfz%40msg.df7cb.de	2025-07-11 11:51:25 -05:00
Amit Kapila	72e6c08fea	Fix the handling of two GUCs during upgrade. Previously, the check_hook functions for max_slot_wal_keep_size and idle_replication_slot_timeout would incorrectly raise an ERROR for values set in postgresql.conf during upgrade, even though those values were not actively used in the upgrade process. To prevent logical slot invalidation during upgrade, we used to set special values for these GUCs. Now, instead of relying on those values, we directly prevent WAL removal and logical slot invalidation caused by max_slot_wal_keep_size and idle_replication_slot_timeout. Note: PostgreSQL 17 does not include the idle_replication_slot_timeout GUC, so related changes were not backported. BUG #18979 Reported-by: jorsol <jorsol@gmail.com> Author: Dilip Kumar <dilipbalaut@gmail.com> Reviewed by: vignesh C <vignesh21@gmail.com> Reviewed by: Alvaro Herrera <alvherre@alvh.no-ip.org> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/219561.1751826409@sss.pgh.pa.us Discussion: https://postgr.es/m/18979-a1b7fdbb7cd181c6@postgresql.org	2025-07-11 10:46:43 +05:30
Fujii Masao	110e6dcaa6	doc: Clarify meaning of "idle" in idle_replication_slot_timeout. This commit updates the documentation to clarify that "idle" in idle_replication_slot_timeout means the replication slot is inactive, that is, not currently used by any replication connection. Without this clarification, "idle" could be misinterpreted to mean that the slot is not advancing or that no data is being streamed, even if a connection exists. Back-patch to v18 where idle_replication_slot_timeout was added. Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Gunnar Morling <gunnar.morling@googlemail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com Backpatch-through: 18	2025-07-11 08:44:32 +09:00
Fujii Masao	05dedf43d3	Change unit of idle_replication_slot_timeout to seconds. Previously, the idle_replication_slot_timeout parameter used minutes as its unit, based on the assumption that values would typically exceed one minute in production environments. However, this caused unexpected behavior: specifying a value below 30 seconds would round down to 0, effectively disabling the timeout. This could be surprising to users. To allow finer-grained control and avoid such confusion, this commit changes the unit of idle_replication_slot_timeout to seconds. Larger values can still be specified easily using standard time suffixes, for example, '24h' for 24 hours. Back-patch to v18 where idle_replication_slot_timeout was added. Reported-by: Gunnar Morling <gunnar.morling@googlemail.com> Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CADGJaX_0+FTguWpNSpgVWYQP_7MhoO0D8=cp4XozSQgaZ40Odw@mail.gmail.com Backpatch-through: 18	2025-07-11 08:39:24 +09:00
Jeff Davis	53cd0b71ee	Change wchar2char() and char2wchar() to accept a locale_t. These are libc-specific functions, so should require a locale_t rather than a pg_locale_t (which could use another provider). Discussion: https://postgr.es/m/a8666c391dfcabe79868d95f7160eac533ace718.camel%40j-davis.com	2025-07-09 08:45:34 -07:00
Nathan Bossart	167ed8082f	Introduce pg_dsm_registry_allocations view. This commit adds a new system view that provides information about entries in the dynamic shared memory (DSM) registry. Specifically, it returns the name, type, and size of each entry. Note that since we cannot discover the size of dynamic shared memory areas (DSAs) and hash tables backed by DSAs (dshashes) without first attaching to them, the size column is left as NULL for those. Bumps catversion. Author: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Sungwoo Chang <swchangdev@gmail.com> Discussion: https://postgr.es/m/4D445D3E-81C5-4135-95BB-D414204A0AB4%40gmail.com	2025-07-09 09:17:56 -05:00
Tom Lane	e03c952877	Fix low-probability memory leak in XMLSERIALIZE(... INDENT). xmltotext_with_options() did not consider the possibility that pg_xml_init() could fail --- most likely due to OOM. If that happened, the already-parsed xmlDoc structure would be leaked. Oversight in commit `483bdb2af`. Bug: #18981 Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/18981-9bc3c80f107ae925@postgresql.org Backpatch-through: 16	2025-07-08 12:50:33 -04:00
Andres Freund	f54af9f267	aio: Combine io_uring memory mappings, if supported By default io_uring creates a shared memory mapping for each io_uring instance, leading to a large number of memory mappings. Unfortunately a large number of memory mappings slows things down, backend exit is particularly affected. To address that, newer kernels (6.5) support using user-provided memory for the memory. By putting the relevant memory into shared memory we don't need any additional mappings. On a system with a new enough kernel and liburing, there is no discernible overhead when doing a pgbench -S -C anymore. Reported-by: MARK CALLAGHAN <mdcallag@gmail.com> Reviewed-by: "Burd, Greg" <greg@burd.me> Reviewed-by: Jim Nasby <jnasby@upgrade.com> Discussion: https://postgr.es/m/CAFbpF8OA44_UG+RYJcWH9WjF7E3GA6gka3gvH6nsrSnEe9H0NA@mail.gmail.com Backpatch-through: 18	2025-07-07 22:57:07 -04:00
Richard Guo	55a780e947	Consider explicit incremental sort for Append and MergeAppend For an ordered Append or MergeAppend, we need to inject an explicit sort into any subpath that is not already well enough ordered. Currently, only explicit full sorts are considered; incremental sorts are not yet taken into account. In this patch, for subpaths of an ordered Append or MergeAppend, we choose to use explicit incremental sort if it is enabled and there are presorted keys. The rationale is based on the assumption that incremental sort is always faster than full sort when there are presorted keys, a premise that has been applied in various parts of the code. In addition, the current cost model tends to favor incremental sort as being cheaper than full sort in the presence of presorted keys, making it reasonable not to consider full sort in such cases. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CAMbWs4_V7a2enTR+T3pOY_YZ-FU8ZsFYym2swOz4jNMqmSgyuw@mail.gmail.com	2025-07-08 10:21:44 +09:00
Álvaro Herrera	c616785516	Refactor some repetitive SLRU code Functions to bootstrap and zero pages in various SLRU callers were fairly duplicative. We can slash almost two hundred lines with a couple of simple helpers: - SimpleLruZeroAndWritePage: Does the equivalent of SimpleLruZeroPage followed by flushing the page to disk - XLogSimpleInsertInt64: Does a XLogBeginInsert followed by XLogInsert of a trivial record whose data is just an int64. Author: Evgeny Voropaev <evgeny.voropaev@tantorlabs.com> Reviewed by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed by: Andrey Borodin <x4mmm@yandex-team.ru> Reviewed by: Aleksander Alekseev <aleksander@timescale.com> Discussion: https://www.postgresql.org/message-id/flat/97820ce8-a1cd-407f-a02b-47368fadb14b%40tantorlabs.com	2025-07-07 16:49:19 +02:00
Álvaro Herrera	2633dae2e4	Standardize LSN formatting by zero padding This commit standardizes the output format for LSNs to ensure consistent representation across various tools and messages. Previously, LSNs were inconsistently printed as `%X/%X` in some contexts, while others used zero-padding. This often led to confusion when comparing. To address this, the LSN format is now uniformly set to `%X/%08X`, ensuring the lower 32-bit part is always zero-padded to eight hexadecimal digits. Author: Japin Li <japinli@hotmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM	2025-07-07 13:57:43 +02:00
Michael Paquier	62a17a9283	Integrate FullTransactionIds deeper into two-phase code This refactoring is a follow-up of the work done in `5a1dfde833`, that has switched 2PC file names to use FullTransactionIds when written on disk. This will help with the integration of a follow-up solution related to the handling of two-phase files during recovery, to address older defects while reading these from disk after a crash. This change is useful in itself as it reduces the need to build the file names from epoch numbers and TransactionIds, because we can use directly FullTransactionIds from which the 2PC file names are guessed. So this avoids a lot of back-and-forth between the FullTransactionIds retrieved from the file names and how these are passed around in the internal 2PC logic. Note that the core of the change is the use of a FullTransactionId instead of a TransactionId in GlobalTransactionData, that tracks 2PC file information in shared memory. The change in TwoPhaseCallback makes this commit unfit for stable branches. Noah has contributed a good chunk of this patch. I have spent some time on it as well while working on the issues with two-phase state files and recovery. Author: Noah Misch <noah@leadboat.com> Co-Authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/Z5sd5O9JO7NYNK-C@paquier.xyz Discussion: https://postgr.es/m/20250116205254.65.nmisch@google.com	2025-07-07 12:50:40 +09:00
Michael Paquier	5a6c39b6df	Disable commit timestamps during bootstrap Attempting to use commit timestamps during bootstrapping leads to an assertion failure, that can be reached for example with an initdb -c that enables track_commit_timestamp. It makes little sense to register a commit timestamp for a BootstrapTransactionId, so let's disable the activation of the module in this case. This problem has been independently reported once by each author of this commit. Each author has proposed basically the same patch, relying on IsBootstrapProcessingMode() to skip the use of commit_ts during bootstrap. The test addition is a suggestion by me, and is applied down to v16. Author: Hayato Kuroda <kuroda.hayato@fujitsu.com> Author: Andy Fan <zhihuifan1213@163.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/OSCPR01MB14966FF9E4C4145F37B937E52F5102@OSCPR01MB14966.jpnprd01.prod.outlook.com Discussion: https://postgr.es/m/87plejmnpy.fsf@163.com Backpatch-through: 13	2025-07-04 15:09:24 +09:00
Fujii Masao	78ebda66bf	Speed up truncation of temporary relations. Previously, truncating a temporary relation required scanning the entire local buffer pool once per relation fork to invalidate buffers. This could be slow, especially with a large local buffers, as the scan was repeated multiple times. A similar issue with regular tables (shared buffers) was addressed in commit `6d05086c0a` by scanning the buffer pool only once for all forks. This commit applies the same optimization to temporary relations, improving truncation performance. Author: Daniil Davydov <3danissimo@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Discussion: https://postgr.es/m/CAJDiXggNqsJOH7C5co4jA8nDk8vw-=sokyh5s1_TENWnC6Ofcg@mail.gmail.com	2025-07-04 09:03:58 +09:00
Tom Lane	931766aaec	Simplify COALESCE() with one surviving argument. If, after removal of useless null-constant arguments, a CoalesceExpr has exactly one remaining argument, we can just take that argument as the result, without bothering to wrap a new CoalesceExpr around it. This isn't likely to produce any great improvement in runtime per se, but it can lead to better plans since the planner no longer has to treat the expression as non-strict. However, there were a few regression test cases that intentionally wrote COALESCE(x) as a shorthand way of creating a non-strict subexpression. To avoid ruining the intent of those tests, write COALESCE(x,x) instead. (If anyone ever proposes de-duplicating COALESCE arguments, we'll need another iteration of this arms race. But it seems pretty unlikely that such an optimization would be worthwhile.) Author: Maksim Milyutin <maksim.milyutin@tantorlabs.ru> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/8e8573c3-1411-448d-877e-53258b7b2be0@tantorlabs.ru	2025-07-03 17:39:53 -04:00
Tom Lane	0059bbe1ec	Break out xxx2yyy_opt_overflow APIs for more datetime conversions. Previous commits invented timestamp2timestamptz_opt_overflow, date2timestamp_opt_overflow, and date2timestamptz_opt_overflow functions to perform non-error-throwing conversions between datetime types. This patch completes the set by adding timestamp2date_opt_overflow, timestamptz2date_opt_overflow, and timestamptz2timestamp_opt_overflow. In addition, adjust timestamp2timestamptz_opt_overflow so that it doesn't throw error if timestamp2tm fails, but treats that as an overflow case. The situation probably can't arise except with an invalid timestamp value, and I can't think of a way that that would happen except data corruption. However, it's pretty silly to have a function whose entire reason for existence is to not throw errors for out-of-range inputs nonetheless throw an error for out-of-range input. The new APIs are not used in this patch, but will be needed in upcoming btree_gin changes. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com> Discussion: https://postgr.es/m/262624.1738460652@sss.pgh.pa.us	2025-07-03 16:17:08 -04:00
Tom Lane	a10f21e6ce	Obtain required table lock during cross-table updates, redux. Commits `8319e5cb5` et al missed the fact that ATPostAlterTypeCleanup contains three calls to ATPostAlterTypeParse, and the other two also need protection against passing a relid that we don't yet have lock on. Add similar logic to those code paths, and add some test cases demonstrating the need for it. In v18 and master, the test cases demonstrate that there's a behavioral discrepancy between stored generated columns and virtual generated columns: we disallow changing the expression of a stored column if it's used in any composite-type columns, but not that of a virtual column. Since the expression isn't actually relevant to either sort of composite-type usage, this prohibition seems unnecessary; but changing it is a matter for separate discussion. For now we are just documenting the existing behavior. Reported-by: jian he <jian.universality@gmail.com> Author: jian he <jian.universality@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: CACJufxGKJtGNRRSXfwMW9SqVOPEMdP17BJ7DsBf=tNsv9pWU9g@mail.gmail.com Backpatch-through: 13	2025-07-03 13:46:07 -04:00
Álvaro Herrera	647cffd2f3	Prevent creation of duplicate not-null constraints for domains This was previously harmless, but now that we create pg_constraint rows for those, duplicates are not welcome anymore. Backpatch to 18. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CACJufxFSC0mcQ82bSk58sO-WJY4P-o4N6RD2M0D=DD_u_6EzdQ@mail.gmail.com	2025-07-03 11:46:12 +02:00
Álvaro Herrera	87251e1149	Fix bogus grammar for a CREATE CONSTRAINT TRIGGER error If certain constraint characteristic clauses (NO INHERIT, NOT VALID, NOT ENFORCED) are given to CREATE CONSTRAINT TRIGGER, the resulting error message is ERROR: TRIGGER constraints cannot be marked NO INHERIT which is a bit silly, because these aren't "constraints of type TRIGGER". Hardcode a better error message to prevent it. This is a cosmetic fix for quite a fringe problem with no known complaints from users, so no backpatch. While at it, silently accept ENFORCED if given. Author: Amul Sul <sulamul@gmail.com> Reviewed-by: jian he <jian.universality@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CAAJ_b97hd-jMTS7AjgU6TDBCzDx_KyuKxG+K-DtYmOieg+giyQ@mail.gmail.com Discussion: https://postgr.es/m/CACJufxHSp2puxP=q8ZtUGL1F+heapnzqFBZy5ZNGUjUgwjBqTQ@mail.gmail.com	2025-07-03 11:25:39 +02:00
Michael Paquier	8ec04c8577	Refactor subtype field of AlterDomainStmt AlterDomainStmt.subtype used characters for its subtypes of commands, SET\|DROP DEFAULT\|NOT NULL and ADD\|DROP\|VALIDATE CONSTRAINT, which were hardcoded in a couple of places of the code. The code is improved by using an enum instead, with the same character values as the original code. Note that the field was documented in parsenodes.h and that it forgot to mention 'V' (VALIDATE CONSTRAINT). Author: Quan Zongliang <quanzongliang@yeah.net> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Tender Wang <tndrwang@gmail.com> Discussion: https://postgr.es/m/41ff310b-16bd-44b9-a3ef-97e20f14b709@yeah.net	2025-07-03 16:34:28 +09:00
Fujii Masao	bc2f348e87	Support multi-line headers in COPY FROM command. The COPY FROM command now accepts a non-negative integer for the HEADER option, allowing multiple header lines to be skipped. This is useful when the input contains multi-line headers that should be ignored during data import. Author: Shinya Kato <shinya11.kato@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Yugo Nagata <nagata@sraoss.co.jp> Discussion: https://postgr.es/m/CAOzEurRPxfzbxqeOPF_AGnAUOYf=Wk0we+1LQomPNUNtyZGBZw@mail.gmail.com	2025-07-03 15:27:26 +09:00
Michael Paquier	fd7d7b7191	Improve checks for GUC recovery_target_timeline Currently check_recovery_target_timeline() converts any value that is not "current", "latest", or a valid integer to 0. So, for example, the following configuration added to postgresql.conf followed by a startup: recovery_target_timeline = 'bogus' recovery_target_timeline = '9999999999' ... results in the following error patterns: FATAL: 22023: recovery target timeline 0 does not exist FATAL: 22023: recovery target timeline 1410065407 does not exist This is confusing, because the server does not reflect the intention of the user, and just reports incorrect data unrelated to the GUC. The origin of the problem is that we do not perform a range check in the GUC value passed-in for recovery_target_timeline. This commit improves the situation by using strtou64() and by providing stricter range checks. Some test cases are added for the cases of an incorrect, an upper-bound and a lower-bound timeline value, checking the sanity of the reports based on the contents of the server logs. Author: David Steele <david@pgmasters.net> Discussion: https://postgr.es/m/e5d472c7-e9be-4710-8dc4-ebe721b62cea@pgbackrest.org	2025-07-03 11:14:20 +09:00
Richard Guo	0da29e4cb1	Enable use of Memoize for ANTI joins Currently, we do not support Memoize for SEMI and ANTI joins because nested loop SEMI/ANTI joins do not scan the inner relation to completion, which prevents Memoize from marking the cache entry as complete. One might argue that we could mark the cache entry as complete after fetching the first inner tuple, but that would not be safe: if the first inner tuple and the current outer tuple do not satisfy the join clauses, a second inner tuple matching the parameters would find the cache entry already marked as complete. However, if the inner side is provably unique, this issue doesn't arise, since there would be no second matching tuple. That said, this doesn't help in the case of SEMI joins, because a SEMI join with a provably unique inner side would already have been reduced to an inner join by reduce_unique_semijoins. Therefore, in this patch, we check whether the inner relation is provably unique for ANTI joins and enable the use of Memoize in such cases. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Reviewed-by: Andrei Lepikhov <lepihov@gmail.com> Discussion: https://postgr.es/m/CAMbWs48FdLiMNrmJL-g6mDvoQVt0yNyJAqMkv4e2Pk-5GKCZLA@mail.gmail.com	2025-07-03 10:57:26 +09:00
Michael Paquier	7b2eb72b1b	Add InjectionPointList() to retrieve list of injection points This routine has come as a useful piece to be able to know the list of injection points currently attached in a system. One area would be to use it in a set-returning function, or just let out-of-core code play with it. This hides the internals of the shared memory array lookup holding the information about the injection points (point name, library and function name), allocating the result in a palloc'd List consumable by the caller. Reviewed-by: Jeff Davis <pgsql@j-davis.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/Z_xYkA21KyLEHvWR@paquier.xyz Discussion: https://postgr.es/m/aBG2rPwl3GE7m1-Q@paquier.xyz	2025-07-03 08:41:25 +09:00
Nathan Bossart	bb109382ef	Make more use of RELATION_IS_OTHER_TEMP(). A few places were open-coding it instead of using this handy macro. Author: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Discussion: https://postgr.es/m/CAEG8a3LjTGJcOcxQx-SUOGoxstG4XuCWLH0ATJKKt_aBTE5K8w%40mail.gmail.com	2025-07-02 12:32:19 -05:00
Nathan Bossart	fe07100e82	Add GetNamedDSA() and GetNamedDSHash(). Presently, the dynamic shared memory (DSM) registry only provides GetNamedDSMSegment(), which allocates a fixed-size segment. To use the DSM registry for more sophisticated things like dynamic shared memory areas (DSAs) or a hash table backed by a DSA (dshash), users need to create a DSM segment that stores various handles and LWLock tranche IDs and to write fairly complicated initialization code. Furthermore, there is likely little variation in this initialization code between libraries. This commit introduces functions that simplify allocating a DSA or dshash within the DSM registry. These functions are very similar to GetNamedDSMSegment(). Notable differences include the lack of an initialization callback parameter and the prohibition of calling the functions more than once for a given entry in each backend (which should be trivially avoidable in most circumstances). While at it, this commit bumps the maximum DSM registry entry name length from 63 bytes to 127 bytes. Also note that even though one could presumably detach/destroy the DSAs and dshashes created in the registry, such use-cases are not yet well-supported, if for no other reason than the associated DSM registry entries cannot be removed. Adding such support is left as a future exercise. The test_dsm_registry test module contains tests for the new functions and also serves as a complete usage example. Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Florents Tselai <florents.tselai@gmail.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Discussion: https://postgr.es/m/aEC8HGy2tRQjZg_8%40nathan	2025-07-02 11:50:52 -05:00
Peter Geoghegan	9ca30a0b04	Update obsolete row compare preprocessing comments. Restore nbtree preprocessing comments describing how we mark nbtree row compare members required to how they were prior to 2016 bugfix commit `a298a1e0`. Oversight in commit `bd3f59fd`, which made nbtree preprocessing revert to the original 2006 rules, but neglected to revert these comments. Backpatch-through: 18	2025-07-02 12:36:35 -04:00
Tom Lane	7374b3a536	Allow width_bucket()'s "operand" input to be NaN. The array-based variant of width_bucket() has always accepted NaN inputs, treating them as equal but larger than any non-NaN, as we do in ordinary comparisons. But up to now, the four-argument variants threw errors for a NaN operand. This is inconsistent and unnecessary, since we can perfectly well regard NaN as falling after the last bucket. We do still throw error for NaN or infinity histogram-bound inputs, since there's no way to compute sensible bucket boundaries. Arguably this is a bug fix, but given the lack of field complaints I'm content to fix it in master. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/2822872.1750540911@sss.pgh.pa.us	2025-07-02 11:34:40 -04:00
Álvaro Herrera	c989affb52	Fix error message for ALTER CONSTRAINT ... NOT VALID Trying to alter a constraint so that it becomes NOT VALID results in an error that assumes the constraint is a foreign key. This is potentially wrong, so give a more generic error message. While at it, give CREATE CONSTRAINT TRIGGER a better error message as well. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Fujii Masao <masao.fujii@oss.nttdata.com> Co-authored-by: Álvaro Herrera <alvherre@kurilemu.de> Co-authored-by: Amul Sul <sulamul@gmail.com> Discussion: https://postgr.es/m/CACJufxHSp2puxP=q8ZtUGL1F+heapnzqFBZy5ZNGUjUgwjBqTQ@mail.gmail.com	2025-07-02 17:02:27 +02:00
Peter Geoghegan	bd3f59fdb7	Make row compares robust during nbtree array scans. Recent nbtree bugfix commit `5f4d98d4` added a special case to the code that sets up a page-level prefix of keys that are definitely satisfied by every tuple on the page: whenever _bt_set_startikey reached a row compare key, we'd refuse to apply the pstate.forcenonrequired behavior in scans where that usually happens (scans with a higher-order array key). That hack made the scan avoid essentially the same infinite cycling behavior that also affected nbtree scans with redundant keys (keys that preprocessing could not eliminate) prior to commit `f09816a0`. There are now serious doubts about this row compare workaround. Testing has shown that a scan with a row compare key and an array key could still read the same leaf page twice (without the scan's direction changing), which isn't supposed to be possible following the SAOP enhancements added by Postgres 17 commit `5bf748b8`. Also, we still allowed a required row compare key to be used with forcenonrequired mode when its header key happened to be beyond the pstate.ikey set by _bt_set_startikey, which was complicated and brittle. The underlying problem was that row compares had inconsistent rules around how scans start (which keys can be used for initial positioning purposes) and how scans end (which keys can set continuescan=false). Quals with redundant keys that could not be eliminated by preprocessing also had that same quality to them prior to today's bugfix `f09816a0`. It now seems prudent to bring row compare keys in line with the new charter for required keys, by making the start and end rules symmetric. This commit fixes two points of disagreement between _bt_first and _bt_check_rowcompare. Firstly, _bt_check_rowcompare was capable of ending the scan at the point where it needed to compare an ISNULL-marked row compare member that came immediately after a required row compare member. _bt_first now has symmetric handling for NULL row compares. Secondly, _bt_first had its own ideas about which keys were safe to use for initial positioning purposes. It could use fewer or more keys than _bt_check_rowcompare. _bt_first now uses the same requiredness markings as _bt_check_rowcompare for this. Now that _bt_first and _bt_check_rowcompare agree on how to start and end scans, we can get rid of the forcenonrequired special case, without any risk of infinite cycling. This approach also makes row compare keys behave more like regular scalar keys, particularly within _bt_first. Fixing these inconsistencies necessitates dealing with a related issue with the way that row compares were marked required by preprocessing: we didn't mark any lower-order row members required following 2016 bugfix commit `a298a1e0`. That approach was over broad. The bug in question was actually an oversight in how _bt_check_rowcompare dealt with tuple NULL values that failed to satisfy a scan key marked required in the opposite scan direction (it was a bug in 2011 commits `6980f817` and `882368e8`, not a bug in 2006 commit `3a0a16cb`). Go back to marking row compare members as required using the original 2006 rules, and fix the 2016 bug in a more principled way: by limiting use of the "set continuescan=false with a key required in the opposite scan direction upon encountering a NULL tuple value" optimization to the first/most significant row member key. While it isn't safe to use an implied IS NOT NULL qualifier to end the scan when it comes from a required lower-order row compare member key, it _is_ generally safe for such a required member key to end the scan -- provided the key is marked required in the _current_ scan direction. This fixes what was arguably an oversight in either commit `5f4d98d4` or commit `8a510275`. It is a direct follow-up to today's commit `f09816a0`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/CAH2-Wz=pcijHL_mA0_TJ5LiTB28QpQ0cGtT-ccFV=KzuunNDDQ@mail.gmail.com Backpatch-through: 18	2025-07-02 09:48:15 -04:00
Peter Geoghegan	f09816a0a7	Make handling of redundant nbtree keys more robust. nbtree preprocessing's handling of redundant (and contradictory) keys created problems for scans with = arrays. It was just about possible for a scan with an = array key and one or more redundant keys (keys that preprocessing could not eliminate due an incomplete opfamily and a cross-type key) to get stuck. Testing has shown that infinite cycling where the scan never manages to make forward progress was possible. This could happen when the scan's arrays were reset in _bt_readpage's forcenonrequired=true path (added by bugfix commit `5f4d98d4`) when the arrays weren't at least advanced up to the same point that they were in at the start of the _bt_readpage call. Earlier redundant keys prevented the finaltup call to _bt_advance_array_keys from reaching lower-order keys that needed to be used to sufficiently advance the scan's arrays. To fix, make preprocessing leave the scan's keys in a state that is as close as possible to how it'll usually leave them (in the common case where there's no redundant keys that preprocessing failed to eliminate). Now nbtree preprocessing _reliably_ leaves behind at most one required >/>= key per index column, and at most one required </<= key per index column. Columns that have one or more = keys that are eligible to be marked required (based on the traditional rules) prioritize the = keys over redundant inequality keys; they'll _reliably_ be left with only one of the = keys as the index column's only required key. Keys that are not marked required (whether due to the new preprocessing step running or for some other reason) are relocated to the end of the so->keyData[] array as needed. That way they'll always be evaluated after the scan's required keys, and so cannot prevent code in places like _bt_advance_array_keys and _bt_first from reaching a required key. Also teach _bt_first to decide which initial positioning keys to use based on the same requiredness markings that have long been used by _bt_checkkeys/_bt_advance_array_keys. This is a necessary condition for reliably avoiding infinite cycling. _bt_advance_array_keys expects to be able to reason about what'll happen in the next _bt_first call should it start another primitive index scan, by evaluating inequality keys that were marked required in the opposite-to-scan scan direction only. Now everybody (_bt_first, _bt_checkkeys, and _bt_advance_array_keys) will always agree on which exact key will be used on each index column to start and/or end the scan (except when row compare keys are involved, which have similar problems not addressed by this commit). An upcoming commit will finish off the work started by this commit by harmonizing how _bt_first, _bt_checkkeys, and _bt_advance_array_keys apply row compare keys to start and end scans. This fixes what was arguably an oversight in either commit `5f4d98d4` or commit `8a510275`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Heikki Linnakangas <heikki.linnakangas@iki.fi> Discussion: https://postgr.es/m/CAH2-Wz=ds4M+3NXMgwxYxqU8MULaLf696_v5g=9WNmWL2=Uo2A@mail.gmail.com Backpatch-through: 18	2025-07-02 09:40:49 -04:00

1 2 3 4 5 ...

27115 Commits