postgres

mirror of https://github.com/postgres/postgres.git synced 2026-01-27 21:43:08 +03:00

Author	SHA1	Message	Date
Heikki Linnakangas	ad853bb877	Fix misc typos, mostly in comments The only user-visible change is the fix in the "malformed pg_dependencies" error detail. That one is new in commit `e1405aa5e3`, so no backpatching required.	2026-01-08 18:10:08 +02:00
Bruce Momjian	451c43974f	Update copyright for 2026 Backpatch-through: 14	2026-01-01 13:24:10 -05:00
Michael Paquier	0c3c5c3b06	Use palloc_object() and palloc_array() in more areas of the tree The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). The following paths are included in this batch, treating all the areas proposed by the author for the most trivial changes, except src/backend (by far the largest batch): src/bin/ src/common/ src/fe_utils/ src/include/ src/pl/ src/test/ src/tutorial/ Similar work has been done in `31d3847a37`. The code compiles the same before and after this commit, with the following exceptions due to changes in line numbers because some of the new allocation formulas are shorter: blkreftable.c pgfnames.c pl_exec.c Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-09 14:53:17 +09:00
Alexander Korotkov	8af3ae0d4b	Add pairingheap_initialize() for shared memory usage The existing pairingheap_allocate() uses palloc(), which allocates from process-local memory. For shared memory use cases, the pairingheap structure must be allocated via ShmemAlloc() or embedded in a shared memory struct. Add pairingheap_initialize() to initialize an already- allocated pairingheap structure in-place, enabling shared memory usage. Discussion: https://www.postgresql.org/message-id/flat/CAPpHfdsjtZLVzxjGT8rJHCYbM0D5dwkO+BBjcirozJ6nYbOW8Q@mail.gmail.com Discussion: https://www.postgresql.org/message-id/flat/CABPTF7UNft368x-RgOXkfj475OwEbp%2BVVO-wEXz7StgjD_%3D6sw%40mail.gmail.com Author: Kartyshov Ivan <i.kartyshov@postgrespro.ru> Author: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com>	2025-11-05 11:44:13 +02:00
Tom Lane	1ea5bdb00b	Improve planner's estimates of tuple hash table sizes. For several types of plan nodes that use TupleHashTables, the planner estimated the expected size of the table as basically numEntries * (MAXALIGN(dataWidth) + MAXALIGN(SizeofHeapTupleHeader)). This is pretty far off, especially for small data widths, because it doesn't account for the overhead of the simplehash.h hash table nor for any per-tuple "additional space" the plan node may request. Jeff Janes noted a case where the estimate was off by about a factor of three, even though the obvious hazards such as inaccurate estimates of numEntries or dataWidth didn't apply. To improve matters, create functions provided by the relevant executor modules that can estimate the required sizes with reasonable accuracy. (We're still not accounting for effects like allocator padding, but this at least gets the first-order effects correct.) I added functions that can estimate the tuple table sizes for nodeSetOp and nodeSubplan; these rely on an estimator for TupleHashTables in general, and that in turn relies on one for simplehash.h hash tables. That feels like kind of a lot of mechanism, but if we take any short-cuts we're violating modularity boundaries. The other places that use TupleHashTables are nodeAgg, which took pains to get its numbers right already, and nodeRecursiveunion. I did not try to improve the situation for nodeRecursiveunion because there's nothing to improve: we are not making an estimate of the hash table size, and it wouldn't help us to do so because we have no non-hashed alternative implementation. On top of that, our estimate of the number of entries to be hashed in that module is so suspect that we'd likely often choose the wrong implementation if we did have two ways to do it. Reported-by: Jeff Janes <jeff.janes@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMkU=1zia0JfW_QR8L5xA2vpa0oqVuiapm78h=WpNsHH13_9uw@mail.gmail.com	2025-11-02 16:57:26 -05:00
David Rowley	5c0a20003b	Fix reset of incorrect hash iterator in GROUPING SETS queries This fixes an unlikely issue when fetching GROUPING SET results from their internally stored hash tables. It was possible in rare cases that the hash iterator would be set up incorrectly which could result in a crash. This was introduced in `4d143509c`, so backpatch to v18. Many thanks to Yuri Zamyatin for reporting and helping to debug this issue. Bug: #19078 Reported-by: Yuri Zamyatin <yuri@yrz.am> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/19078-dfd62f840a2c0766@postgresql.org Backpatch-through: 18	2025-10-18 16:07:04 +13:00
Peter Eisentraut	a5b35fcedb	Remove PointerIsValid() This doesn't provide any value over the standard style of checking the pointer directly or comparing against NULL. Also remove related: - AllocPointerIsValid() [unused] - IndexScanIsValid() [had one user] - HeapScanIsValid() [unused] - InvalidRelation [unused] Leaving HeapTupleIsValid(), ItemIdIsValid(), PortalIsValid(), RelationIsValid for now, to reduce code churn. Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/ad50ab6b-6f74-4603-b099-1cd6382fb13d%40eisentraut.org Discussion: https://www.postgresql.org/message-id/CA+hUKG+NFKnr=K4oybwDvT35dW=VAjAAfiuLxp+5JeZSOV3nBg@mail.gmail.com Discussion: https://www.postgresql.org/message-id/bccf2803-5252-47c2-9ff0-340502d5bd1c@iki.fi	2025-09-24 15:17:20 +02:00
Peter Eisentraut	a0ed19e0a9	Use PRI?64 instead of "ll?" in format strings (continued). Continuation of work started in commit `15a79c73`, after initial trial. Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/b936d2fb-590d-49c3-a615-92c3a88c6c19%40eisentraut.org	2025-03-29 10:43:57 +01:00
Tatsuo Ishii	a9dcbb4d5c	Add new StringInfo APIs to allow callers to specify the buffer size. Previously StringInfo APIs allocated buffers with fixed initial allocation size of 1024 bytes. This may be too large and inappropriate for some callers that can do with smaller memory buffers. To fix this, introduce new APIs that allow callers to specify initial buffer size. extern StringInfo makeStringInfoExt(int initsize); extern void initStringInfoExt(StringInfo str, int initsize); Existing APIs (makeStringInfo() and initStringInfo()) are changed to call makeStringInfoExt and initStringInfoExt respectively (via inline helper functions makeStringInfoInternal and initStringInfoInternal), with the default buffer size of 1024. Reviewed-by: Nathan Bossart, David Rowley, Michael Paquier, Gurjeet Singh Discussion: https://postgr.es/m/20241225.123704.1194662271286702010.ishii%40postgresql.org	2025-01-11 08:23:46 +09:00
John Naylor	3e70da2781	Always use the caller-provided context for radix tree leaves Previously, it would not have worked for a caller to pass a slab context, since it would have been used for other things which likely had incompatible size. In an attempt to be helpful and avoid possible space wastage due to aset's power-of-two rounding, RT_CREATE would create an additional slab context if the value type was fixed-length and larger than pointer size. The problem was, we have since added the bump context type, and the generation context was a possibility as well, so silently overriding the caller's choice may actually be worse. Commit `e8a6f1f908` arranged so that the caller-provided context is used only for leaves, so it's safe for the caller to use slab here if they wish. As demonstration, use slab in one of the radix tree regression tests. Reviewed by Masahiko Sawada Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com	2025-01-06 13:26:02 +07:00
John Naylor	e8a6f1f908	Get rid of radix tree's general purpose memory context Previously, this was notionally used only for the entry point of the tree and as a convenient parent for other contexts. For shared memory, the creator previously allocated the entry point in this context, but attaching backends didn't have access to that, so they just used the caller's context. For the sake of consistency, allocate every instance of an entry point in the caller's context. For local memory, allocate the control object in the caller's context as well. This commit also makes the "leaf context" the notional parent of the child contexts used for nodes, so it's a bit of a misnomer, but a future commit will make the node contexts independent rather than children, so leave it this way for now to avoid code churn. The memory context parameter for RT_CREATE is now unused in the case of shared memory, so remove it and adjust callers to match. In passing, remove unused "context" member from struct TidStore, which seems to have been an oversight. Reviewed by Masahiko Sawada Discussion: https://postgr.es/m/CANWCAZZDCo4k5oURg_pPxM6+WZ1oiG=sqgjmQiELuyP0Vtrwig@mail.gmail.com	2025-01-06 11:21:21 +07:00
John Naylor	960013f2a1	Use caller's memory context for radix tree iteration state Typically only one iterator is present at any time, so it's overkill to devote an entire context for this. Get rid of it and use the caller's context. This is tidy-up work, so no backpatch in this form. However, a hypothetical extension to v17 that tried to start iteration from an attaching backend would result in a crash, so that'll be fixed separately in a way that doesn't change behavior in core. Patch by me, reported and reviewed by Masahiko Sawada Discussion: https://postgr.es/m/CAD21AoBB2U47V=F+wQRB1bERov_of5=BOZGaybjaV8FLQyqG3Q@mail.gmail.com	2025-01-06 09:01:58 +07:00
Bruce Momjian	50e6eb731d	Update copyright for 2025 Backpatch-through: 13	2025-01-01 11:21:55 -05:00
Masahiko Sawada	724890ffb7	Include necessary header files in radixtree.h. When #include'ing radixtree.h with RT_SHMEM, it could happen to raise compiler errors due to missing some declarations of types and functions. This commit also removes the inclusion of postgres.h since it's against our usual convention. Backpatch to v17, where radixtree.h was introduced. Reviewed-by: Heikki Linnakangas, Álvaro Herrera Discussion: https://postgr.es/m/CAD21AoCU9YH%2Bb9Rr8YRw7UjmB%3D1zh8GKQkWNiuN9mVhMvkyrRg%40mail.gmail.com Backpatch-through: 17	2024-12-09 13:07:06 -08:00
Alexander Korotkov	3a7ae6b3d9	Revert pg_wal_replay_wait() stored procedure This commit reverts `3c5db1d6b0`, and subsequent improvements and fixes including `8036d73ae3`, `867d396ccd`, `3ac3ec580c`, `0868d7ae70`, `85b98b8d5a`, `2520226c95`, `014f9f34d2`, `e658038772`, `e1555645d7`, `5035172e4a`, `6cfebfe88b`, `73da6b8d1b`, and `e546989a26`. The reason for reverting is a set of remaining issues. Most notably, the stored procedure appears to need more effort than the utility statement to turn the backend into a "snapshot-less" state. This makes an approach to use stored procedures questionable. Catversion is bumped. Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz	2024-11-04 22:47:57 +02:00
Alexander Korotkov	3c5db1d6b0	Implement pg_wal_replay_wait() stored procedure pg_wal_replay_wait() is to be used on standby and specifies waiting for the specific WAL location to be replayed. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes are on standby. The queue of waiters is stored in the shared memory as an LSN-ordered pairing heap, where the waiter with the nearest LSN stays on the top. During the replay of WAL, waiters whose LSNs have already been replayed are deleted from the shared memory pairing heap and woken up by setting their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working without an active snapshot, not a function. Catversion is bumped. Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru Author: Kartyshov Ivan, Alexander Korotkov Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi	2024-08-02 21:16:56 +03:00
John Naylor	fd49e8f323	Prevent access of uninitialized memory in radix tree nodes RT_NODE_16_SEARCH_EQ() performs comparisions using vector registers on x64-64 and aarch64. We apply a mask to the resulting bitfield to eliminate irrelevant bits that may be set. This ensures correct behavior, but Valgrind complains of the partially-uninitialised values. So far the warnings have only occurred on aarch64, which explains why this hasn't been seen earlier. To fix this warning, initialize the whole fixed-sized part of the nodes upon allocation, rather than just do the minimum initialization to function correctly. The initialization for node48 is a bit different in that the 256-byte slot index array must be populated with "invalid index" rather than zero. Experimentation has shown that compilers tend to emit code that uselessly memsets that array twice. To avoid pessimizing this path, swap the order of the slot_idxs[] and isset[] arrays so we can initialize with two non-overlapping memset calls. Reported by Tomas Vondra Analysis and patch by Tom Lane, reviewed by Masahiko Sawada. I investigated the behavior of memset calls to overlapping regions, leading to the above tweaks to node48 as discussed in the thread. Discussion: https://postgr.es/m/120c63ad-3d12-415f-a7bf-3da451c31bf6%40enterprisedb.com	2024-06-21 17:29:39 +07:00
John Naylor	ed52df3b19	Small cosmetic fixes in radix tree template - Bring memory context names in line with other naming - Fix typos, reported off-list by Alexander Lakhin - Remove copy-paste errors from comments - Remove duplicate #undef	2024-04-27 14:42:01 +07:00
Masahiko Sawada	bb7f195ff7	radixtree: Fix SIGSEGV at update of embeddable value to non-embeddable. Also, fix a memory leak when updating from non-embeddable to embeddable. Both were unreachable without adding C code. Reported-by: Noah Misch Author: Noah Misch Reviewed-by: Masahiko Sawada, John Naylor Discussion: https://postgr.es/m/20240424210319.4c.nmisch%40google.com	2024-04-25 21:48:52 +09:00
Daniel Gustafsson	950d4a2cb1	Fix typos and duplicate words This fixes various typos, duplicated words, and tiny bits of whitespace mainly in code comments but also in docs. Author: Daniel Gustafsson <daniel@yesql.se> Author: Heikki Linnakangas <hlinnaka@iki.fi> Author: Alexander Lakhin <exclusion@gmail.com> Author: David Rowley <dgrowleyml@gmail.com> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se	2024-04-18 21:28:07 +02:00
Alexander Korotkov	772faafca1	Revert: Implement pg_wal_replay_wait() stored procedure This commit reverts `06c418e163`, `e37662f221`, `bf1e650806`, `25f42429e2`, `ee79928441`, and `74eaf66f98` per review by Heikki Linnakangas. Discussion: https://postgr.es/m/b155606b-e744-4218-bda5-29379779da1a%40iki.fi	2024-04-11 17:28:15 +03:00
Masahiko Sawada	810f64a015	Revert indexed and enlargable binary heap implementation. This reverts commit `b840508644` and `bcb14f4abc`. These commits were made for commit `5bec1d6bc5` (Improve eviction algorithm in ReorderBuffer using max-heap for many subtransactions). However, per discussion, commit `efb8acc0d0` replaced binary heap + index with pairing heap, and made these commits unnecessary. Reported-by: Jeff Davis Discussion: https://postgr.es/m/12747c15811d94efcc5cda72d6b35c80d7bf3443.camel%40j-davis.com	2024-04-11 17:18:05 +09:00
John Naylor	0fe5f64367	Teach radix tree to embed values at runtime Previously, the decision to store values in leaves or within the child pointer was made at compile time, with variable length values using leaves by necessity. This commit allows introspecting the length of variable length values at runtime for that decision. This requires the ability to tell whether the last-level child pointer is actually a value, so we use a pointer tag in the lowest level bit. Use this in TID store. This entails adding a byte to the header to reserve space for the tag. Commit `f35bd9bf3` stores up to three offsets within the header with no bitmap, and now the header can be embedded as above. This reduces worst-case memory usage when TIDs are sparse. Reviewed (in an earlier version) by Masahiko Sawada Discussion: https://postgr.es/m/CANWCAZYw+_KAaUNruhJfE=h6WgtBKeDG32St8vBJBEY82bGVRQ@mail.gmail.com Discussion: https://postgr.es/m/CAD21AoBci3Hujzijubomo1tdwH3XtQ9F89cTNQ4bsQijOmqnEw@mail.gmail.com	2024-04-08 18:54:35 +07:00
John Naylor	8a1b31e6e5	Use bump context for TID bitmaps stored by vacuum Vacuum does not pfree individual entries, and only frees the entire storage space when finished with it. This allows using a bump context, eliminating the chunk header in each leaf allocation. Most leaf allocations will be 16 to 32 bytes, so that's a significant savings. TidStoreCreateLocal gets a boolean parameter to indicate that the created store is insert-only. This requires a separate tree context for iteration, since we free the iteration state after iteration completes. Discussion: https://postgr.es/m/CANWCAZac%3DpBePg3rhX8nXkUuaLoiAJJLtmnCfZsPEAS4EtJ%3Dkg%40mail.gmail.com Discussion: https://postgr.es/m/CANWCAZZQFfxvzO8yZHFWtQV+Z2gAMv1ku16Vu7KWmb5kZQyd1w@mail.gmail.com	2024-04-08 14:39:49 +07:00
Andres Freund	af7e90a277	simplehash: Free collisions array in SH_STAT While SH_STAT() is only used for debugging, the allocated array can be large, and therefore should be freed. It's unclear why coverity started warning now. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Reported-by: Coverity Discussion: https://postgr.es/m/3005248.1712538233@sss.pgh.pa.us Backpatch: 12-	2024-04-07 19:08:41 -07:00
Alexander Korotkov	bf1e650806	Use the pairing heap instead of a flat array for LSN replay waiters `06c418e163` introduced pg_wal_replay_wait() procedure allowing to wait for the particular LSN to be replayed on standby. The waiters were stored in the flat array. Even though scanning small arrays is fast, that might be a problem at scale (a lot of waiting processes). This commit replaces the flat shared memory array with the pairing heap, which holds the waiter with the least LSN at the top. This gives us O(log N) complexity for both inserting and removing waiters. Reported-by: Alvaro Herrera Discussion: https://postgr.es/m/202404030658.hhj3vfxeyhft%40alvherre.pgsql	2024-04-03 18:15:41 +03:00
Masahiko Sawada	b840508644	Add functions to binaryheap for efficient key removal and update. Previously, binaryheap didn't support updating a key and removing a node in an efficient way. For example, in order to remove a node from the binaryheap, the caller had to pass the node's position within the array that the binaryheap internally has. Removing a node from the binaryheap is done in O(log n) but searching for the key's position is done in O(n). This commit adds a hash table to binaryheap in order to track the position of each nodes in the binaryheap. That way, by using newly added functions such as binaryheap_update_up() etc., both updating a key and removing a node can be done in O(1) on an average and O(log n) in worst case. This is known as the indexed binary heap. The caller can specify to use the indexed binaryheap by passing indexed = true. The current code does not use the new indexing logic, but it will be used by an upcoming patch. Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian, Tomas Vondra, Shubham Khanna Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com	2024-04-03 10:44:21 +09:00
Masahiko Sawada	bcb14f4abc	Make binaryheap enlargeable. The node array space of the binaryheap is doubled when there is no available space. Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian, Tomas Vondra, Shubham Khanna Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com	2024-04-03 10:27:43 +09:00
Daniel Gustafsson	a96a8b15fa	Remove superfluous trailing semicolons Two semicolons were accidentally added to rows which were already terminated semicolons. While harmless, fix by removing these. Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/CAMbWs4_fnJ0+yOgFioswzLE7t6R8P6cqbuacFVeZqbESFAjs1A@mail.gmail.com	2024-03-29 23:51:43 +01:00
Masahiko Sawada	80d5d4937c	Fix potential integer handling issue in radixtree.h. Coverity complained about the integer handling issue; if we start with an arbitrary non-negative shift value, the loop may decrement it down to something less than zero before exiting. This commit adds an assertion to make sure the 'shift' is always 0 after the loop, and uses 0 as the shift to get the key chunk in the following operation. Introduced by `ee1b30f12`. Reported-by: Tom Lane as per coverity Reviewed-by: Tom Lane Discussion: https://postgr.es/m/2089517.1711299216%40sss.pgh.pa.us	2024-03-25 12:06:41 +09:00
Daniel Gustafsson	b783186515	Add destroyStringInfo function for cleaning up StringInfos destroyStringInfo() is a counterpart to makeStringInfo(), freeing a palloc'd StringInfo and its data. This is a convenience function to align the StringInfo API with the PQExpBuffer API. Originally added in the OAuth patchset, it was extracted and committed separately in order to aid upcoming JSON work. Author: Daniel Gustafsson <daniel@yesql.se> Author: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOYmi+mWdTd6ujtyF7MsvXvk7ToLRVG_tYAcaGbQLvf=N4KrQw@mail.gmail.com	2024-03-16 23:18:28 +01:00
John Naylor	e444ebcb85	Fix incorrect format specifier for int64 Follow-up to `ee1b30f12`, per buildfarm member mamba. Discussion: https://postgr.es/m/CANWCAZYwyRMU%2BOTVOjK%3Dno1hm-W3ZQ5vrSFM1MFAaLtLydvwzA%40mail.gmail.com	2024-03-07 14:26:52 +07:00
John Naylor	ac234e6377	Fix redefinition of typedefs Per buildfarm members sifaka and longfin, clang with -Wtypedef-redefinition warns of duplicate typedefs unless building with C11. Follow-up to `ee1b30f12`. Masahiko Sawada Discussion: https://postgr.es/m/CANWCAZauSg%3DLUbBbXhpeQtBuPifmzQNTYS6O8NsoAPz1zL-Txg%40mail.gmail.com	2024-03-07 14:11:49 +07:00
John Naylor	ee1b30f128	Add template for adaptive radix tree This implements a radix tree data structure based on the design in "The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases" by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main technique that makes it adaptive is using several different node types, each with a different capacity of elements, and a different algorithm for accessing them. The nodes start small and grow/shrink as needed. The main advantage over hash tables is efficient sorted iteration and better memory locality when successive keys are lexicographically close together. The implementation currently assumes 64-bit integer keys, and traversing the tree is in general slower than a linear probing hash table, so this is not a general-purpose associative array. The paper describes two other techniques not implemented here, namely "path compression" and "lazy expansion". These can further reduce memory usage and speed up traversal, but the former would add significant complexity and the latter requires storing the full key with the value. We do trivially compress the path when leading bytes of the key are zeros, however. For value storage, we use "combined pointer/value slots", as recommended in the paper. Values of size equal or smaller than the the platform's pointer type are stored in the array of child pointers in the last level node, while larger values are each stored in a separate allocation. This is for now fixed at compile time, but it would be fairly trivial to allow determining at runtime how variable-length values are stored. One innovation in our implementation compared to the ART paper is decoupling the notion of node "size class" from "kind". The size classes within a given node kind have the same underlying type, but a variable capacity for children, so we can introduce additional node sizes with little additional code. To enable different use cases to specialize for different value types and for shared/local memory, we use macro-templatized code generation in the same manner as simplehash.h and sort_template.h. Future commits will use this infrastructure for storing TIDs. Patch by Masahiko Sawada and John Naylor, but a substantial amount of credit is due to Andres Freund, whose proof-of-concept was a valuable source of coding idioms and awareness of performance pitfalls, and who reviewed earlier versions. Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com	2024-03-07 12:40:11 +07:00
Nathan Bossart	e1724af42c	Fix comments for the dshash_parameters struct. A recent commit added a copy_function member to the dshash_parameters struct, but it missed updating a couple of comments that refer to the function pointer members of this struct. One of those comments also refers to a tranche_name member and non- arg variants of the function pointer members, all of which were either removed during development or removed shortly after dshash table support was committed. Oversights in commits `8c0d7bafad`, `d7694fc148`, and `42a1de3013`. Discussion: https://postgr.es/m/20240227045213.GA2329190%40nathanxps13	2024-02-27 09:44:59 -06:00
Nathan Bossart	42a1de3013	Add helper functions for dshash tables with string keys. Presently, string keys are not well-supported for dshash tables. The dshash code always copies key_size bytes into new entries' keys, and dshash.h only provides compare and hash functions that forward to memcmp() and tag_hash(), both of which do not stop at the first NUL. This means that callers must pad string keys so that the data beyond the first NUL does not adversely affect the results of copying, comparing, and hashing the keys. To better support string keys in dshash tables, this commit does a couple things: * A new copy_function field is added to the dshash_parameters struct. This function pointer specifies how the key should be copied into new table entries. For example, we only want to copy up to the first NUL byte for string keys. A dshash_memcpy() helper function is provided and used for all existing in-tree dshash tables without string keys. * A set of helper functions for string keys are provided. These helper functions forward to strcmp(), strcpy(), and string_hash(), all of which ignore data beyond the first NUL. This commit also adjusts the DSM registry's dshash table to use the new helper functions for string keys. Reviewed-by: Andy Fan Discussion: https://postgr.es/m/20240119215941.GA1322079%40nathanxps13	2024-02-26 15:47:13 -06:00
Nathan Bossart	6b80394781	Introduce overflow-safe integer comparison functions. This commit adds integer comparison functions that are designed to be as efficient as possible while avoiding overflow. A follow-up commit will make use of these functions in many of the in-tree qsort() comparators. The new functions are not better in all cases (e.g., when the comparator function is inlined), so it is important to consider the context before using them. Author: Mats Kindahl Reviewed-by: Tom Lane, Heikki Linnakangas, Andres Freund, Thomas Munro, Andrey Borodin, Fabrízio de Royes Mello Discussion: https://postgr.es/m/CA%2B14426g2Wa9QuUpmakwPxXFWG_1FaY0AsApkvcTBy-YfS6uaw%40mail.gmail.com	2024-02-16 13:37:02 -06:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Peter Eisentraut	141752bbb0	Fix typos in simplehash.h Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/18252-d46d27900a277d87@postgresql.org	2024-01-02 10:18:47 +01:00
Jeff Davis	b282fa88df	simplehash: preserve consistency in case of OOM. Compute size first, then allocate, then update the structure. Previously, an out-of-memory when growing could leave the hashtable in an inconsistent state. Discussion: https://postgr.es/m/20231117201334.eyb542qr5yk4gilq@awork3.anarazel.de Reviewed-by: Andres Freund Reviewed-by: Gurjeet Singh	2023-11-17 13:58:16 -08:00
David Rowley	f0efa5aec1	Introduce the concept of read-only StringInfos There were various places in our codebase which conjured up a StringInfo by manually assigning the StringInfo fields and setting the data field to point to some existing buffer. There wasn't much consistency here as to what fields like maxlen got set to and in one location we didn't correctly ensure that the buffer was correctly NUL terminated at len bytes, as per what was documented as required in stringinfo.h Here we introduce 2 new functions to initialize StringInfos. One allows callers to initialize a StringInfo passing along a buffer that is already allocated by palloc. Here the StringInfo code uses this buffer directly rather than doing any memcpying into a new allocation. Having this as a function allows us to verify the buffer is correctly NUL terminated. StringInfos initialized this way can be appended to and reset just like any other normal StringInfo. The other new initialization function also accepts an existing buffer, but the given buffer does not need to be a pointer to a palloc'd chunk. This buffer could be a pointer pointing partway into some palloc'd chunk or may not even be palloc'd at all. StringInfos initialized this way are deemed as "read-only". This means that it's not possible to append to them or reset them. For the latter of the two new initialization functions mentioned above, we relax the requirement that the data buffer must be NUL terminated. Relaxing this requirement is convenient in a few places as it can save us from having to allocate an entire new buffer just to add the NUL terminator or save us from having to temporarily add a NUL only to have to put the original char back again later. Incompatibility note: Here we also forego adding the NUL in a few places where it does not seem to be required. These locations are passing the given StringInfo into a type's receive function. It does not seem like any of our built-in receive functions require this, but perhaps there's some UDT out there in the wild which does require this. It is likely worthy of a mention in the release notes that a UDT's receive function mustn't rely on the input StringInfo being NUL terminated. Author: David Rowley Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvorfO3iBZ%3DxpiZvp3uHtJVLyFaPBSvcAhAq2HPLnaNSwQ%40mail.gmail.com	2023-10-26 16:31:48 +13:00
Nathan Bossart	c103d07381	Add function for removing arbitrary nodes in binaryheap. This commit introduces binaryheap_remove_node(), which can be used to remove any node from a binary heap. The implementation is straightforward. The target node is replaced with the last node in the heap, and then we sift as needed to preserve the heap property. This new function is intended for use in a follow-up commit that will improve the performance of pg_restore. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/3612876.1689443232%40sss.pgh.pa.us	2023-09-18 14:06:08 -07:00
Nathan Bossart	5af0263afd	Make binaryheap available to frontend code. There are a couple of places in frontend code that could make use of this simple binary heap implementation. This commit makes binaryheap usable in frontend code, much like commit `26aaf97b68` did for StringInfo. Like StringInfo, the header file is left in lib/ to reduce the likelihood of unnecessary breakage. The frontend version of binaryheap exposes a void -based API since frontend code does not have access to the Datum definitions. This seemed like a better approach than switching all existing uses to void or making the Datum definitions available to frontend code. Reviewed-by: Tom Lane, Alvaro Herrera Discussion: https://postgr.es/m/3612876.1689443232%40sss.pgh.pa.us	2023-09-18 12:18:33 -07:00
Andres Freund	f0a94d81e4	Fix type of iterator variable in SH_START_ITERATE Also add comment to make the reasoning behind the Assert() more explicit (per Tom). Reported-by: Ranier Vilela Discussion: https://postgr.es/m/CAEudQAocXNJ6s1VLz+hMamLAQAiewRoW17OJ6-+9GACKfj6iPQ@mail.gmail.com Backpatch: 11-	2023-07-06 09:57:28 -07:00
Michael Paquier	ef7002dbe0	Fix various typos in code and tests Most of these are recent, and the documentation portions are new as of v16 so there is no need for a backpatch. Author: Justin Pryzby Discussion: https://postgr.es/m/20230208155644.GM1653@telsasoft.com	2023-02-09 14:43:53 +09:00
Tom Lane	3b4ac33254	Avoid type cheats for invalid dsa_handles and dshash_table_handles. Invent separate macros for "invalid" values of these types, so that we needn't embed knowledge of their representations into calling code. These are all zeroes anyway ATM, so this is not fixing any live bug, but it makes the code cleaner and more future-proof. I (tgl) also chose to move DSM_HANDLE_INVALID into dsm_impl.h, since it seems like it should live beside the typedef for dsm_handle. Hou Zhijie, Nathan Bossart, Kyotaro Horiguchi, Tom Lane Discussion: https://postgr.es/m/OS0PR01MB5716860B1454C34E5B179B6694C99@OS0PR01MB5716.jpnprd01.prod.outlook.com	2023-01-25 11:48:38 -05:00
Andres Freund	51384cc40c	Add detached node functions to ilist These allow to test whether an element is in a list by checking whether prev/next are NULL. Needed to replace SHMQueueIsDetached() when converting from SHM_QUEUE to ilist.h style lists. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de	2023-01-18 11:41:14 -08:00
Peter Eisentraut	c8ad4d8166	Constify the arguments of ilist.c/h functions Const qualifiers ensure that we don't do something stupid in the function implementation. Additionally they clarify the interface. As an example: void slist_delete(slist_head head, const slist_node node) Here one can instantly tell that node->next is not going to be set to NULL. Finally, const qualifiers potentially allow the compiler to do more optimizations. This being said, no benchmarking was done for this patch. The functions that return non-const pointers like slist_next_node(), dclist_next_node() etc. are not affected by the patch intentionally. Author: Aleksander Alekseev Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAJ7c6TM2%3D08mNKD9aJg8vEY9hd%2BG4L7%2BNvh30UiNT3kShgRgNg%40mail.gmail.com	2023-01-12 08:00:51 +01:00
Michael Paquier	33ab0a2a52	Fix typos in comments, code and documentation While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since `a7e5457`. Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11	2023-01-03 16:26:14 +09:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00

1 2 3 4 5

213 Commits