postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-09 06:21:09 +03:00

Author	SHA1	Message	Date
Andres Freund	dae00f333b	aio: Improve assertions related to io_method First, the assertions in assign_io_method() were the wrong way round. Second, the lengthof() assertion checked the length of io_method_options, which is the wrong array to check and is always longer than pgaio_method_ops_table. While add it, add a static assert to ensure pgaio_method_ops_table and io_method_options stay in sync. Per coverity and Tom Lane. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 18	2025-11-04 20:03:53 -05:00
Andres Freund	2d83d729d5	jit: Fix accidentally-harmless type confusion In `2a0faed9d7`, which added JIT compilation support for expressions, I accidentally used sizeof(LLVMBasicBlockRef *) instead of sizeof(LLVMBasicBlockRef) as part of computing the size of an allocation. That turns out to have no real negative consequences due to LLVMBasicBlockRef being a pointer itself (and thus having the same size). It still is wrong and confusing, so fix it. Reported by coverity. Backpatch-through: 13	2025-11-04 20:03:53 -05:00
Jeff Davis	d115de9d89	Special case C_COLLATION_OID in pg_newlocale_from_collation(). Allow pg_newlocale_from_collation(C_COLLATION_OID) to work even if there's no catalog access, which some extensions expect. Not known to be a bug without extensions involved, but backport to 18. Also corrects an issue in master with dummy_c_locale (introduced in commit `5a38104b36`) where deterministic was not set. That wasn't a bug, but could have been if that structure was used more widely. Reported-by: Alexander Kukushkin <cyberdemn@gmail.com> Reviewed-by: Alexander Kukushkin <cyberdemn@gmail.com> Discussion: https://postgr.es/m/CAFh8B=nj966ECv5vi_u3RYij12v0j-7NPZCXLYzNwOQp9AcPWQ@mail.gmail.com Backpatch-through: 18	2025-11-04 16:48:16 -08:00
Masahiko Sawada	8ae0f6a0c3	Add CHECK_FOR_INTERRUPTS in Evict{Rel,All}UnpinnedBuffers. This commit adds CHECK_FOR_INTERRUPTS to the shared buffer iteration loops in EvictRelUnpinnedBuffers and EvictAllUnpinnedBuffers. These functions, used by pg_buffercache's pg_buffercache_evict_relation and pg_buffercache_evict_all, can now be interrupted during long-running operations. Backpatch to version 18, where these functions and their corresponding pg_buffercache functions were introduced. Author: Yuhang Qiu <iamqyh@gmail.com> Discussion: https://postgr.es/m/8DC280D4-94A2-4E7B-BAB9-C345891D0B78%40gmail.com Backpatch-through: 18	2025-11-04 15:47:25 -08:00
David Rowley	fdda78e361	Fix possible usage of incorrect UPPERREL_SETOP RelOptInfo `03d40e4b5` allowed dummy UNION [ALL] children to be removed from the plan by checking for is_dummy_rel(). That commit neglected to still account for the relids from the dummy rel so that the correct UPPERREL_SETOP RelOptInfo could be found and used for adding the Paths to. Not doing this could result in processing of subsequent UNIONs using the same RelOptInfo as a previously processed UNION, which could result in add_path() freeing old Paths that are needed by the previous UNION. The same fix was independently submitted (2 mins later) by Richard Guo. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/bee34aec-659c-46f1-9ab7-7bbae0b7616c@gmail.com	2025-11-05 11:48:09 +13:00
Álvaro Herrera	0a3d27bfe0	Fix snapshot handling bug in recent BRIN fix Commit `a95e3d84c0` added ActiveSnapshot push+pop when processing work-items (BRIN autosummarization), but forgot to handle the case of a transaction failing during the run, which drops the snapshot untimely. Fix by making the pop conditional on an element being actually there. Author: Álvaro Herrera <alvherre@kurilemu.de> Backpatch-through: 13 Discussion: https://postgr.es/m/202511041648.nofajnuddmwk@alvherre.pgsql	2025-11-04 20:31:43 +01:00
Tomas Vondra	1213cb4753	Trim TIDs during parallel GIN builds more eagerly The parallel GIN builds perform "freezing" of TID lists when merging chunks built earlier. This means determining what part of the list can no longer change, depending on the last received chunk. The frozen part can be evicted from memory and written out. The code attempted to freeze items right before merging the old and new TID list, after already attempting to trim the current buffer. That means part of the data may get frozen based on the new TID list, but will be trimmed later (on next loop). This increases memory usage. This inverts the order, so that we freeze data first (before trimming). The benefits are likely relatively small, but it's also virtually free with no other downsides. Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com	2025-11-04 20:06:01 +01:00
Tomas Vondra	c98dffcb7c	Limit the size of TID lists during parallel GIN build When building intermediate TID lists during parallel GIN builds, split the sorted lists into smaller chunks, to limit the amount of memory needed when merging the chunks later. The leader may need to keep in memory up to one chunk per worker, and possibly one extra chunk (before evicting some of the data). The code processing item pointers uses regular palloc/repalloc calls, which means it's subject to the MaxAllocSize (1GB) limit. We could fix this by allowing huge allocations, but that'd require changes in many places without much benefit. Larger chunks do not actually improve performance, so the memory usage would be wasted. Fixed by limiting the chunk size to not hit MaxAllocSize. Each worker gets a fair share. This requires remembering the number of participating workers, in a place that can be accessed from the callback. Luckily, the bs_worker_id field in GinBuildState was unused, so repurpose that. Report by Greg Smith, investigation and fix by me. Batchpatched to 18, where parallel GIN builds were introduced. Reported-by: Gregory Smith <gregsmithpgsql@gmail.com> Discussion: https://postgr.es/m/CAHLJuCWDwn-PE2BMZE4Kux7x5wWt_6RoWtA0mUQffEDLeZ6sfA@mail.gmail.com Backpatch-through: 18	2025-11-04 18:51:17 +01:00
Jeff Davis	4bfaea11d2	Remove redundant memset() introduced by `a0942f4`. Reported-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAEoWx2kAkNaDa01O0nKsQmkfEmxsDvm09SU=f1T0CV8ew3qJEA@mail.gmail.com	2025-11-04 09:46:00 -08:00
Tom Lane	ff4597acd4	Allow "SET list_guc TO NULL" to specify setting the GUC to empty. We have never had a SET syntax that allows setting a GUC_LIST_INPUT parameter to be an empty list. A locution such as SET search_path = ''; doesn't mean that; it means setting the GUC to contain a single item that is an empty string. (For search_path the net effect is much the same, because search_path ignores invalid schema names and '' must be invalid.) This is confusing, not least because configuration-file entries and the set_config() function can easily produce empty-list values. We considered making the empty-string syntax do this, but that would foreclose ever allowing empty-string items to be valid in list GUCs. While there isn't any obvious use-case for that today, it feels like the kind of restriction that might hurt someday. Instead, let's accept the forbidden-up-to-now value NULL and treat that as meaning an empty list. (An objection to this could be "what if we someday want to allow NULL as a GUC value?". That seems unlikely though, and even if we did allow it for scalar GUCs, we could continue to treat it as meaning an empty list for list GUCs.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Andrei Klychkov <andrew.a.klychkov@gmail.com> Reviewed-by: Jim Jones <jim.jones@uni-muenster.de> Discussion: https://postgr.es/m/CA+mfrmwsBmYsJayWjc8bJmicxc3phZcHHY=yW5aYe=P-1d_4bg@mail.gmail.com	2025-11-04 12:37:40 -05:00
Peter Eisentraut	040cc5f3c7	Tighten check for generated column in partition key expression A generated column may end up being part of the partition key expression, if it's specified as an expression e.g. "(<generated column name>)" or if the partition key expression contains a whole-row reference, even though we do not allow a generated column to be part of partition key expression. Fix this hole. Co-authored-by: jian he <jian.universality@gmail.com> Co-authored-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@oss.nttdata.com> Discussion: https://www.postgresql.org/message-id/flat/CACJufxF%3DWDGthXSAQr9thYUsfx_1_t9E6N8tE3B8EqXcVoVfQw%40mail.gmail.com	2025-11-04 14:46:58 +01:00
Álvaro Herrera	a95e3d84c0	BRIN autosummarization may need a snapshot It's possible to define BRIN indexes on functions that require a snapshot to run, but the autosummarization feature introduced by commit `7526e10224` fails to provide one. This causes autovacuum to leave a BRIN placeholder tuple behind after a failed work-item execution, making such indexes less efficient. Repair by obtaining a snapshot prior to running the task, and add a test to verify this behavior. Author: Álvaro Herrera <alvherre@kurilemu.de> Reported-by: Giovanni Fabris <giovanni.fabris@icon.it> Reported-by: Arthur Nascimento <tureba@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/202511031106.h4fwyuyui6fz@alvherre.pgsql	2025-11-04 13:23:26 +01:00
Peter Eisentraut	c09a06918d	Error message stylistic correction Fixup for commit `ef5e60a9d3`: The inconsistent use of articles was a bit awkward.	2025-11-04 12:25:04 +01:00
Michael Paquier	65f4976189	Add assertion check for WAL receiver state during stream-archive transition When the startup process switches from streaming to archive as WAL source, we avoid calling ShutdownWalRcv() if the WAL receiver is not streaming, based on WalRcvStreaming(). WALRCV_STOPPING is a state set by ShutdownWalRcv(), called only by the startup process, meaning that it should not be possible to reach this state while in WaitForWALToBecomeAvailable(). This commit adds an assertion to make sure that a WAL receiver is never in a WALRCV_STOPPING state should the startup process attempt to reset InstallXLogFileSegmentActive. Idea suggested by Noah Misch. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org	2025-11-04 13:14:46 +09:00
Michael Paquier	e0ca61e7c4	Add WalRcvGetState() to retrieve the state of a WAL receiver This has come up as useful as an alternative of WalRcvStreaming(), to be able to do sanity checks based on the state of a WAL receiver. This will be used in a follow-up commit. Author: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org	2025-11-04 12:57:36 +09:00
Michael Paquier	17b2d5ec75	Fix unconditional WAL receiver shutdown during stream-archive transition Commit `b4f584f9d2` (affecting v15~, later backpatched down to 13 as of `3635a0a35a`) introduced an unconditional WAL receiver shutdown when switching from streaming to archive WAL sources. This causes problems during a timeline switch, when a WAL receiver enters WALRCV_WAITING state but remains alive, waiting for instructions. The unconditional shutdown can break some monitoring scenarios as the WAL receiver gets repeatedly terminated and re-spawned, causing pg_stat_wal_receiver.status to show a "streaming" instead of "waiting" status, masking the fact that the WAL receiver is waiting for a new TLI and a new LSN to be able to continue streaming. This commit changes the WAL receiver behavior so as the shutdown becomes conditional, with InstallXLogFileSegmentActive being always reset to prevent the regression fixed by `b4f584f9d2`: only terminate the WAL receiver when it is actively streaming (WALRCV_STREAMING, WALRCV_STARTING, or WALRCV_RESTARTING). When in WALRCV_WAITING state, just reset InstallXLogFileSegmentActive flag to allow archive restoration without killing the process. WALRCV_STOPPED and WALRCV_STOPPING are not reachable states in this code path. For the latter, the startup process is the one in charge of setting WALRCV_STOPPING via ShutdownWalRcv(), waiting for the WAL receiver to reach a WALRCV_STOPPED state after switching walRcvState, so WaitForWALToBecomeAvailable() cannot be reached while a WAL receiver is in a WALRCV_STOPPING state. A regression test is added to check that a WAL receiver is not stopped on timeline jump, that fails when the fix of this commit is reverted. Reported-by: Ryan Bird <ryanzxg@gmail.com> Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/19093-c4fff49a608f82a0@postgresql.org Backpatch-through: 13	2025-11-04 10:47:38 +09:00
Noah Misch	8b18ed6dfb	Doc: cover index CONCURRENTLY causing errors in INSERT ... ON CONFLICT. Author: Mikhail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Noah Misch <noah@leadboat.com> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com Backpatch-through: 13	2025-11-03 12:57:09 -08:00
Masahiko Sawada	e7ccb247b3	Fix outdated comment of COPY in gram.y. Author: ChangAo Chen <cca5507@qq.com> Discussion: https://postgr.es/m/tencent_392C0E92EC52432D0A336B9D52E66426F009@qq.com	2025-11-03 10:34:49 -08:00
Álvaro Herrera	cf8be02253	Prevent setting a column as identity if its not-null constraint is invalid We don't allow null values to appear in identity-generated columns in other ways, so we shouldn't let unvalidated not-null constraints do it either. Oversight in commit `a379061a22`. Author: jian he <jian.universality@gmail.com> Backpatch-through: 18 Discussion: https://postgr.es/m/CACJufxGQM_+vZoYJMaRoZfNyV=L2jxosjv_0TLAScbuLJXWRfQ@mail.gmail.com	2025-11-03 15:58:19 +01:00
Michael Paquier	ad25744f43	Add wal_fpi_bytes to VACUUM and ANALYZE logs The new wal_fpi_bytes counter calculates the total amount of full page images inserted in WAL records, in bytes. This commit adds this information to VACUUM and ANALYZE logs alongside the existing counters, building upon `f9a09aa295`. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/aQMMSSlFXy4Evxn3@paquier.xyz	2025-11-03 19:42:03 +09:00
Peter Eisentraut	fce7c73fba	Sort guc_parameters.dat alphabetically by name The order in this list was previously pretty random and had grown organically over time. This made it unnecessarily cumbersome to maintain these lists, as there was no clear guidelines about where to put new entries. Also, after the merger of the type-specific GUC structs, the list still reflected the previous type-specific super-order. By using alphabetical order, the place for new entries becomes clear, and often related entries will be listed close together. This patch reorders the existing entries in guc_parameters.dat, and it also augments the generation script to error if an entry is found at the wrong place. Note: The order is actually checked after lower-casing, to handle the likes of "DateStyle". Reviewed-by: John Naylor <johncnaylorls@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-11-03 10:04:14 +01:00
Tom Lane	8f29467c57	Change "long" numGroups fields to be Cardinality (i.e., double). We've been nibbling away at removing uses of "long" for a long time, since its width is platform-dependent. Here's one more: change the remaining "long" fields in Plan nodes to Cardinality, since the three surviving examples all represent group-count estimates. The upstream planner code was converted to Cardinality some time ago; for example the corresponding fields in Path nodes are type Cardinality, as are the arguments of the make_foo_path functions. Downstream in the executor, it turns out that these all feed to the table-size argument of BuildTupleHashTable. Change that to "double" as well, and fix it so that it safely clamps out-of-range values to the uint32 limit of simplehash.h, as was not being done before. Essentially, this is removing all the artificial datatype-dependent limitations on these values from upstream processing, and applying just one clamp at the moment where we're forced to do so by the datatype choices of simplehash.h. Also, remove BuildTupleHashTable's misguided attempt to enforce work_mem/hash_mem_limit. It doesn't have enough information (particularly not the expected tuple width) to do that accurately, and it has no real business second-guessing the caller's choice. For all these plan types, it's really the planner's responsibility to not choose a hashed implementation if the hashtable is expected to exceed hash_mem_limit. The previous patch improved the accuracy of those estimates, and even if BuildTupleHashTable had more information it should arrive at the same conclusions. Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com	2025-11-02 16:57:43 -05:00
Tom Lane	1ea5bdb00b	Improve planner's estimates of tuple hash table sizes. For several types of plan nodes that use TupleHashTables, the planner estimated the expected size of the table as basically numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader)). This is pretty far off, especially for small data widths, because it doesn't account for the overhead of the simplehash.h hash table nor for any per-tuple "additional space" the plan node may request. Jeff Janes noted a case where the estimate was off by about a factor of three, even though the obvious hazards such as inaccurate estimates of numEntries or dataWidth didn't apply. To improve matters, create functions provided by the relevant executor modules that can estimate the required sizes with reasonable accuracy. (We're still not accounting for effects like allocator padding, but this at least gets the first-order effects correct.) I added functions that can estimate the tuple table sizes for nodeSetOp and nodeSubplan; these rely on an estimator for TupleHashTables in general, and that in turn relies on one for simplehash.h hash tables. That feels like kind of a lot of mechanism, but if we take any short-cuts we're violating modularity boundaries. The other places that use TupleHashTables are nodeAgg, which took pains to get its numbers right already, and nodeRecursiveunion. I did not try to improve the situation for nodeRecursiveunion because there's nothing to improve: we are not making an estimate of the hash table size, and it wouldn't help us to do so because we have no non-hashed alternative implementation. On top of that, our estimate of the number of entries to be hashed in that module is so suspect that we'd likely often choose the wrong implementation if we did have two ways to do it. Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com	2025-11-02 16:57:26 -05:00
Peter Geoghegan	b8f1c62807	Document nbtree row comparison design. Add comments explaining when and where it is safe for nbtree to treat row compare keys as if they were simple scalar inequality keys on the row's most significant column. This is particularly important within _bt_advance_array_keys, which deals with required inequality keys in a general and uniform way, without any special handling for row compares. Also spell out the implications of _bt_check_rowcompare's approach of _conditionally_ evaluating lower-order row compare subkeys, particularly when one of its lower-order subkeys might see NULL index tuple values (these may or may not affect whether the qual as a whole is satisfied). The behavior in this area isn't particularly intuitive, so these issues seem worth going into. In passing, add a few more defensive/documenting row comparison related assertions to _bt_first and _bt_check_rowcompare. Follow-up to commits `bd3f59fd` and `ec986020`. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: Victor Yegorov <vyegorov@gmail.com> Reviewed-By: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/CAH2-Wznwkak_K7pcAdv9uH8ZfNo8QO7+tHXOaCUddMeTfaCCFw@mail.gmail.com Backpatch-through: 18	2025-11-02 15:27:05 -05:00
Peter Geoghegan	4f08586c7a	Remove obsolete nbtree equality key comments. _bt_first reliably uses the same equality key (on each index column) for initial positioning purposes as the one that _bt_checkkeys can use to end the scan following commit `f09816a0`. _bt_first no longer applies its own independent rules to determine which initial positioning key to use on each column (for equality and inequality keys alike). Preprocessing is now fully in control of determining which keys start and end each scan, ensuring that _bt_first and _bt_checkkeys have symmetric behavior. Remove obsolete comments that described why _bt_first was expected to use at least one of the available required equality keys for initial positioning purposes. The rules in this area are now maximally strict and uniform, so there's no reason to draw attention to equality keys. Any column with a required equality key cannot have a redundant required inequality key (nor can it have a redundant required equality key). Oversight in commit `f09816a0`, which removed similar comments from _bt_first, but missed these comments. Author: Peter Geoghegan <pg@bowt.ie> Backpatch-through: 18	2025-11-02 13:34:18 -05:00
Peter Eisentraut	8a27d418f8	Mark function arguments of type "Datum " as "const Datum " where possible Several functions in the codebase accept "Datum " parameters but do not modify the pointed-to data. These have been updated to take "const Datum " instead, improving type safety and making the interfaces clearer about their intent. This change helps the compiler catch accidental modifications and better documents immutability of arguments. Most of "Datum " parameters have a pairing "bool isnull" parameter, they are constified as well. No functional behavior is changed by this patch. Author: Chao Li <lic@highgo.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2msfT0knvzUa72ZBwu9LR_RLY4on85w2a9YpE-o2By5HQ@mail.gmail.com	2025-10-31 10:47:25 +01:00
Peter Eisentraut	aa4535307e	formatting.c cleanup: Change fill_str() return type to void The return value is not used anywhere. In passing, add a comment explaining the function's arguments. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 09:55:12 +01:00
Peter Eisentraut	da2052ab9a	formatting.c cleanup: Rename DCH_S_* to DCH_SUFFIX_* For clarity. Also rename several related macros and turn them into inline functions. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Peter Eisentraut	378212c68a	formatting.c cleanup: Change several int fields to enums This makes their purpose more self-documenting. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Peter Eisentraut	ce5f6817e4	formatting.c cleanup: Change TmFromChar.clock field to bool This makes the purpose clearer and avoids having two extra symbols, one of which (CLOCK_24_HOUR) was unused. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-31 08:06:46 +01:00
Tom Lane	c106ef0807	Use BumpContext contexts in TupleHashTables, and do some code cleanup. For all extant uses of TupleHashTables, execGrouping.c itself does nothing with the "tablecxt" except to allocate new hash entries in it, and the callers do nothing with it except to reset the whole context. So this is an ideal use-case for a BumpContext, and the hash tables are frequently big enough for the savings to be significant. (Commit `cc721c459` already taught nodeAgg.c this idea, but neglected the other callers of BuildTupleHashTable.) While at it, let's clean up some ill-advised leftovers from rebasing TupleHashTables on simplehash.h: * Many comments and variable names were based on the idea that the tablecxt holds the whole TupleHashTable, whereas now it only holds the hashed tuples (plus any caller-defined "additional storage"). Rename to names like tuplescxt and tuplesContext, and adjust the comments. Also adjust the memory context names to be like "<Foo> hashed tuples". * Make ResetTupleHashTable() reset the tuplescxt rather than relying on the caller to do so; that was fairly bizarre and seems like a recipe for leaks. This is less efficient in the case where nodeAgg.c uses the same tuplescxt for several different hashtables, but only microscopically so because mcxt.c will short-circuit the extra resets via its isReset flag. I judge the extra safety and intellectual cleanliness well worth those few cycles. * Remove the long-obsolete "allow_jit" check added by ac88807f9; instead, just Assert that metacxt and tuplescxt are different. We need that anyway for this definition of ResetTupleHashTable() to be safe. There is a side issue of the extent to which this change invalidates the planner's estimates of hashtable memory consumption. However, those estimates are already pretty bad, so improving them seems like it can be a separate project. This change is useful to do first to establish consistent executor behavior that the planner can expect. A loose end not addressed here is that the "entrysize" calculation in BuildTupleHashTable seems wrong: "sizeof(TupleHashEntryData) + additionalsize" corresponds neither to the size of the simplehash entries nor to the total space needed per tuple. It's questionable why BuildTupleHashTable is second-guessing its caller's nbuckets choice at all, since the original source of the number should have had more information. But that all seems wrapped up with the planner's estimation logic, so let's leave it for the planned followup patch. Reported-by: Jeff Janes <jeff.janes@gmail.com> Reported-by: David Rowley <dgrowleyml@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com Discussion: https://postgr.es/m/2268409.1761512111@sss.pgh.pa.us	2025-10-30 11:21:22 -04:00
Peter Eisentraut	e1ac846f3d	Mark ItemPointer arguments as const throughout This is a follow up `991295f`. I searched over src/ and made all ItemPointer arguments as const as much as possible. Note: We cut out from the original patch the pieces that would have created incompatibilities in the index or table AM APIs. Those could be considered separately. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw%40mail.gmail.com	2025-10-30 14:12:06 +01:00
Álvaro Herrera	a27c40bfe8	Simplify coding in ProcessQuery The original is pretty baroque for no apparent reason; arguably, commit `2f9661311b` should have done this. Noted while reviewing related code for bug #18984. This is cosmetic (though I'm surprised that my compiler generates shorter assembly this way), so no backpatch. Discussion: https://postgr.es/m/18984-0f4778a6599ac3ae@postgresql.org	2025-10-30 11:26:35 +01:00
Peter Eisentraut	8ce795fcb7	Fix some confusing uses of const There are a few places where we have typedef struct FooData { ... } FooData; typedef FooData Foo; and then function declarations with bar(const Foo x) which isn't incorrect but probably meant bar(const FooData x) meaning that the thing x points to is immutable, not x itself. This patch makes those changes where appropriate. In one case (execGrouping.c), the thing being pointed to was not immutable, so in that case remove the const altogether, to avoid further confusion. Co-authored-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CAEoWx2m2E0xE8Kvbkv31ULh_E%2B5zph-WA_bEdv3UR9CLhw%2B3vg%40mail.gmail.com Discussion: https://www.postgresql.org/message-id/CAEoWx2kTDz%3Db6T2xHX78vy_B_osDeCC5dcTCi9eG0vXHp5QpdQ%40mail.gmail.com	2025-10-30 11:20:04 +01:00
Peter Eisentraut	3479a0f823	const-qualify ItemPointer comparison functions Add const qualifiers to ItemPointerEquals() and ItemPointerCompare(). This will allow further changes up the stack. It also complements commit `aeb767ca0b`, as we now have all of itemptr.h appropriately const-qualified. Author: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/CAEoWx2nBaypg16Z5ciHuKw66pk850RFWw9ACS2DqqJ_AkKeRsw@mail.gmail.com	2025-10-30 10:13:47 +01:00
Peter Eisentraut	e2cf524e4a	formatting.c cleanup: Improve formatting of some struct declarations This makes future editing easier. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:33 +01:00
Peter Eisentraut	9a1a5dfee8	formatting.c cleanup: Remove unnecessary zeroize macros Replace with initializer or memset(). Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:28 +01:00
Peter Eisentraut	38506f55fd	formatting.c cleanup: Remove unnecessary extra line breaks in error message literals Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-30 08:35:18 +01:00
Michael Paquier	5ab0b6a248	Expose wal_fpi_bytes in EXPLAIN (WAL) The new wal_fpi_bytes counter calculates the total amount of full page images inserted in WAL records, in bytes. This commit exposes this information in EXPLAIN (ANALYZE, WAL) alongside the existing counters, for both the text and JSON/YAML outputs, building upon `f9a09aa295`. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discusssion: https://postgr.es/m/CAOzEurQtZEAfg6P0kU3Wa-f9BWQOi0RzJEMPN56wNTOmJLmfaQ@mail.gmail.com	2025-10-30 15:34:01 +09:00
Michael Paquier	d432094689	Fix regression with slot invalidation checks This commit reverts `818fefd8fd`, that has been introduced to address a an instability in some of the TAP tests due to the presence of random standby snapshot WAL records, when slots are invalidated by InvalidatePossiblyObsoleteSlot(). Anyway, this commit had also the consequence of introducing a behavior regression. After `818fefd8fd`, the code may determine that a slot needs to be invalidated while it may not require one: the slot may have moved from a conflicting state to a non-conflicting state between the moment when the mutex is released and the moment when we recheck the slot, in InvalidatePossiblyObsoleteSlot(). Hence, the invalidations may be more aggressive than they actually have to. `105b2cb336` has tackled the test instability in a way that should be hopefully sufficient for the buildfarm, even for slow members: - In v18, the test relies on an injection point that bypasses the creation of the random records generated for standby snapshots, eliminating the random factor that impacted the test. This option was not available when `818fefd8fd` was discussed. - In v16 and v17, the problem was bypassed by disallowing a slot to become active in some of the scenarios tested. While on it, this commit adds a comment to document that it is fine for a recheck to use xmin and LSN values stored in the slot, without storing and reusing them across multiple checks. Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/f492465f-657e-49af-8317-987460cb68b0.mengjuan.cmj@alibaba-inc.com Backpatch-through: 16	2025-10-30 13:13:28 +09:00
Richard Guo	257ee78341	Disable parallel plans for RIGHT_SEMI joins RIGHT_SEMI joins rely on the HEAP_TUPLE_HAS_MATCH flag to guarantee that only the first match for each inner tuple is considered. However, in a parallel hash join, the inner relation is stored in a shared global hash table that can be probed by multiple workers concurrently. This allows different workers to inspect and set the match flags of the same inner tuples at the same time. If two workers probe the same inner tuple concurrently, both may see the match flag as unset and emit the same tuple, leading to duplicate output rows and violating RIGHT_SEMI join semantics. For now, we disable parallel plans for RIGHT_SEMI joins. In the long term, it may be possible to support parallel execution by performing atomic operations on the match flag, for example using a CAS or similar mechanism. Backpatch to v18, where RIGHT_SEMI join was introduced. Bug: #19094 Reported-by: Lori Corbani <Lori.Corbani@jax.org> Diagnosed-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19094-6ed410eb5b256abd@postgresql.org Backpatch-through: 18	2025-10-30 11:58:45 +09:00
David Rowley	50eb4e1181	Fix bogus use of "long" in AllocSetCheck() Because long is 32-bit on 64-bit Windows, it isn't a good datatype to store the difference between 2 pointers. The under-sized type could overflow and lead to scary warnings in MEMORY_CONTEXT_CHECKING builds, such as: WARNING: problem in alloc set ExecutorState: bad single-chunk %p in block %p However, the problem lies only in the code running the check, not from an actual memory accounting bug. Fix by using "Size" instead of "long". This means using an unsigned type rather than the previous signed type. If the block's freeptr was corrupted, we'd still catch that if the unsigned type wrapped. Unsigned allows us to avoid further needless complexities around comparing signed and unsigned types. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvo-RmiT4s33J=aC9C_-wPZjOXQ232V-EZFgKftSsNRi4w@mail.gmail.com	2025-10-30 14:48:10 +13:00
Jeff Davis	3853a6956c	Use C11 char16_t and char32_t for Unicode code points. Reviewed-by: Tatsuo Ishii <ishii@postgresql.org> Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://postgr.es/m/bedcc93d06203dfd89815b10f815ca2de8626e85.camel%40j-davis.com	2025-10-29 14:17:13 -07:00
Álvaro Herrera	94f95d91b0	CheckNNConstraintFetch: Fill all of ConstrCheck in a single pass Previously, we'd fill all fields except ccbin, and only later obtain and detoast ccbin, with hypothetical failures being possible. If ccbin is null (rare catalog corruption I have never witnessed) or its a corrupted toast entry, we leak a tiny bit of memory in CacheMemoryContext from having strdup'd the constraint name. Repair these by only attempting to fill the struct once ccbin has been detoasted. Author: Ranier Vilela <ranier.vf@gmail.com> Discussion: https://postgr.es/m/CAEudQAr=i3_Z4GvmediX900+sSySTeMkvuytYShhQqEwoGyvhA@mail.gmail.com	2025-10-29 11:41:39 +01:00
Peter Eisentraut	a13833c35f	Reorganize GUC structs Instead of having five separate GUC structs, one for each type, with the generic part contained in each of them, flip it around and have one common struct, with the type-specific part has a subfield. The very original GUC design had type-specific structs and type-specific lists, and the membership in one of the lists defined the type. But now the structs themselves know the type (from the .vartype field), and they are all loaded into a common hash table at run time, and so this original separation no longer makes sense. It creates a bunch of inconsistencies in the code about whether the type-specific or the generic struct is the primary struct, and a lot of casting in between, which makes certain assumptions about the struct layouts. After the change, all these casts are gone and all the data is accessed via normal field references. Also, various code is simplified because only one kind of struct needs to be processed. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Discussion: https://www.postgresql.org/message-id/flat/8fdfb91e-60fb-44fa-8df6-f5dea47353c9@eisentraut.org	2025-10-29 09:52:29 +01:00
Peter Eisentraut	2724830929	formatting.c cleanup: Remove unnecessary extra parentheses Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:29:00 +01:00
Peter Eisentraut	6271d9922e	formatting.c cleanup: Use array syntax instead of pointer arithmetic for easier readability Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:55 +01:00
Peter Eisentraut	b9def57a3c	formatting.c cleanup: Add some const pointer qualifiers Co-authored-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:50 +01:00
Peter Eisentraut	d98b3cdbaf	formatting.c cleanup: Use size_t for string length variables and arguments Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/6dd9d208-a3ed-49b5-b03d-8617261da973%40eisentraut.org	2025-10-29 09:28:43 +01:00
Michael Paquier	d3111cb753	Fix correctness issue with computation of FPI size in WAL stats XLogRecordAssemble() may be called multiple times before inserting a record in XLogInsertRecord(), and the amount of FPIs generated inside a record whose insertion is attempted multiple times may vary. The logic added in `f9a09aa295` touched directly pgWalUsage in XLogRecordAssemble(), meaning that it could be possible for pgWalUsage to be incremented multiple times for a single record. This commit changes the code to use the same logic as the number of FPIs added to a record, where XLogRecordAssemble() returns this information and feeds it to XLogInsertRecord(), updating pgWalUsage only when a record is inserted. Reported-by: Shinya Kato <shinya11.kato@gmail.com> Discussion: https://postgr.es/m/CAOzEurSiSr+rusd0GzVy8Bt30QwLTK=ugVMnF6=5WhsSrukvvw@mail.gmail.com	2025-10-29 09:13:31 +09:00

1 2 3 4 5 ...

27545 Commits