postgres

mirror of https://github.com/postgres/postgres.git synced 2025-07-12 21:01:52 +03:00

Author	SHA1	Message	Date
Jeff Davis	cc721c459d	HashAgg: use Bump allocator for hash TupleHashTable entries. The entries aren't freed until the entire hash table is destroyed, so use the Bump allocator to improve allocation speed, avoid wasting space on the chunk header, and avoid wasting space due to the power-of-two allocations. Discussion: https://postgr.es/m/CAApHDvqv1aNB4cM36FzRwivXrEvBO_LsG_eQ3nqDXTjECaatOQ@mail.gmail.com Reviewed-by: David Rowley	2025-03-24 22:05:33 -07:00
Daniel Gustafsson	8dd7c7cd0a	Replace EEOP_DONE with special steps for return/no return Knowing when the side-effects of an expression is the intended result of the execution, rather than the returnvalue, is important for being able generate more efficient JITed code. This replaces EEOP_DONE with two new steps: EEOP_DONE_RETURN and EEOP_DONE_NO_RETURN. Expressions which return a value should use the former step; expressions used for their side-effects which don't return value should use the latter. Author: Andres Freund <andres@anarazel.de> Co-authored-by: Daniel Gustafsson <daniel@yesql.se> Reviewed-by: Andreas Karlsson <andreas@proxel.se> Discussion: https://postgr.es/m/415721CE-7D2E-4B74-B5D9-1950083BA03E@yesql.se Discussion: https://postgr.es/m/20191023163849.sosqbfs5yenocez3@alap3.anarazel.de	2025-03-11 12:02:38 +01:00
Jeff Davis	a1f7f80bfe	Update outdated comments in nodeAgg.c. Author: Zhang Mingli Reviewed-by: Richard Guo Discussion: https://postgr.es/m/198a8d1e-0792-4e7f-828e-902aa342f36e@Spark	2025-02-18 10:37:50 -08:00
Jeff Davis	38172d1856	Injection points for hash aggregation. Requires adding a guard against shift-by-32. Previously, that was impossible because the number of partitions was always greater than 1, but a new injection point can force the number of partitions to 1. Discussion: https://postgr.es/m/ff4e59305e5d689e03cd256a736348d3e7958f8f.camel@j-davis.com	2025-02-11 11:26:25 -08:00
Jeff Davis	b4a07f532b	Revert "TupleHashTable: store additional data along with tuple." This reverts commit `e0ece2a981` due to performance regressions. Reported-by: David Rowley	2025-01-13 14:14:33 -08:00
Jeff Davis	e0ece2a981	TupleHashTable: store additional data along with tuple. Previously, the caller needed to allocate the memory and the TupleHashTable would store a pointer to it. That wastes space for the palloc overhead as well as the size of the pointer itself. Now, the TupleHashTable relies on the caller to correctly specify the additionalsize, and allocates that amount of space. The caller can then request a pointer into that space. Discussion: https://postgr.es/m/b9cbf0219a9859dc8d240311643ff4362fd9602c.camel@j-davis.com Reviewed-by: Heikki Linnakangas	2025-01-10 17:14:37 -08:00
Jeff Davis	3f482940db	ExecInitAgg: update aggstate->numaggs and ->numtrans earlier. Functions hash_agg_entry_size() and build_hash_tables() make use of those values for memory size estimates. Because this change only affects memory estimates, don't backpatch. Discussion: https://postgr.es/m/7530bd8783b1a78d53a3c70383e38d8da0a5ffe5.camel%40j-davis.com	2025-01-07 15:13:50 -08:00
David Rowley	d93bb8163c	Fix outdated CHUNKHDRSZ value in nodeAgg.c CHUNKHDRSZ was defined as 16 bytes, which was true when that code went in, but since `c6e0fe1f2`, 8 is a more accurate value. Here we adjust it to use sizeof(MemoryChunk), which is normally 8, or 16 for cassert builds. `c6e0fe1f2` first appeared in v16, so this is technically wrong in v16 up to master, but let's apply this only to master as adjusting this does influence the estimated number of batches in the aggregate costing code and we don't want to cause plan instability in released versions. Reviewed-by: Tom Lane Discussion: https://postgr.es/m/CAApHDvpMpRQvsTqZo3FinXkgytwxwF8sCyZm83xDj-1s_hLe+w@mail.gmail.com	2025-01-02 22:04:09 +13:00
Bruce Momjian	50e6eb731d	Update copyright for 2025 Backpatch-through: 13	2025-01-01 11:21:55 -05:00
Tom Lane	e0a2721f7c	Get rid of old version of BuildTupleHashTable(). It was reasonable to preserve the old API of BuildTupleHashTable() in the back branches, but in HEAD we should actively discourage use of that version. There are no remaining callers in core, so just get rid of it. Then rename BuildTupleHashTableExt() back to BuildTupleHashTable(). While at it, fix up the miserably-poorly-maintained header comment for BuildTupleHashTable[Ext]. It looks like more than one patch in this area has had the opinion that updating comments is beneath them. Discussion: https://postgr.es/m/538343.1734646986@sss.pgh.pa.us	2024-12-19 18:07:00 -05:00
David Rowley	d96d1d5152	Fix incorrect slot type in BuildTupleHashTableExt `0f5738202` adjusted the execGrouping.c code so it made use of ExprStates to generate hash values. That commit made a wrong assumption that the slot type to pass to ExecBuildHash32FromAttrs() is always &TTSOpsMinimalTuple. That's not the case as the slot type depends on the slot type passed to LookupTupleHashEntry(), which for nodeRecursiveunion.c, could be any of the current slot types. Here we fix this by adding a new parameter to BuildTupleHashTableExt() to allow the slot type to be passed in. In the case of nodeSubplan.c and nodeAgg.c the slot type is always &TTSOpsVirtual, so for both of those cases, it's beneficial to pass the known slot type as that allows ExecBuildHash32FromAttrs() to skip adding the tuple deform step to the resulting ExprState. Another possible fix would have been to have ExecBuildHash32FromAttrs() set "fetch.kind" to NULL so that ExecComputeSlotInfo() always determines the EEOP_INNER_FETCHSOME is required, however, that option isn't favorable as slows down aggregation and hashed subplan evaluation due to the extra (needless) deform step. Thanks to Nathan Bossart for bisecting to find the offending commit based on Paul's report. Reported-by: Paul Ramsey <pramsey@cleverelephant.ca> Discussion: https://postgr.es/m/99F064C1-B3EB-4BE7-97D2-D2A0AA487A71@cleverelephant.ca	2024-12-18 12:05:55 +13:00
Peter Eisentraut	7f798aca1d	Remove useless casts to (void *) Many of them just seem to have been copied around for no real reason. Their presence causes (small) risks of hiding actual type mismatches or silently discarding qualifiers Discussion: https://www.postgresql.org/message-id/flat/461ea37c-8b58-43b4-9736-52884e862820@eisentraut.org	2024-11-28 08:27:20 +01:00
Peter Eisentraut	dbbca2cf29	Remove unused #include's from backend .c files as determined by include-what-you-use (IWYU) While IWYU also suggests to add a bunch of #include's (which is its main purpose), this patch does not do that. In some cases, a more specific #include replaces another less specific one. Some manual adjustments of the automatic result: - IWYU currently doesn't know about includes that provide global variable declarations (like -Wmissing-variable-declarations), so those includes are being kept manually. - All includes for port(ability) headers are being kept for now, to play it safe. - No changes of catalog/pg_foo.h to catalog/pg_foo_d.h, to keep the patch from exploding in size. Note that this patch touches just *.c files, so nothing declared in header files changes in hidden ways. As a small example, in src/backend/access/transam/rmgr.c, some IWYU pragma annotations are added to handle a special case there. Discussion: https://www.postgresql.org/message-id/flat/af837490-6b2f-46df-ba05-37ea6a6653fc%40eisentraut.org	2024-03-04 12:02:20 +01:00
Bruce Momjian	29275b1d17	Update copyright for 2024 Reported-by: Michael Paquier Discussion: https://postgr.es/m/ZZKTDPxBBMt3C0J9@paquier.xyz Backpatch-through: 12	2024-01-03 20:49:05 -05:00
Amit Langote	d060e921ea	Remove obsolete executor cleanup code This commit removes unnecessary ExecExprFreeContext() calls in ExecEnd* routines because the actual cleanup is managed by FreeExecutorState(). With no callers remaining for ExecExprFreeContext(), this commit also removes the function. This commit also drops redundant ExecClearTuple() calls, because ExecResetTupleTable() in ExecEndPlan() already takes care of resetting and dropping all TupleTableSlots initialized with ExecInitScanTupleSlot() and ExecInitExtraTupleSlot(). After these modifications, the ExecEnd*() routines for ValuesScan, NamedTuplestoreScan, and WorkTableScan became redundant. So, this commit removes them. Reviewed-by: Robert Haas Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com	2023-09-28 09:44:39 +09:00
Tom Lane	0245f8db36	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql	2023-05-19 17:24:48 -04:00
Alvaro Herrera	c44b59fad4	Mark internal messages as no longer translatable The problem that these messages protect against can only occur because a corrupted hash spill file was written, i.e., a Postgres bug. There's no reason to have them as translatable. Backpatch to 15, where these messages were changed by commit `c4649cce39`. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/20230510175407.dwa5v477pw62ikyx@alvherre.pgsql	2023-05-16 11:47:25 +02:00
Michael Paquier	8961cb9a03	Fix typos in comments The changes done in this commit impact comments with no direct user-visible changes, with fixes for incorrect function, variable or structure names. Author: Alexander Lakhin Discussion: https://postgr.es/m/e8c38840-596a-83d6-bd8d-cebc51111572@gmail.com	2023-05-02 12:23:08 +09:00
Tom Lane	fce3b26e97	Rename ExecAggTransReparent, and improve its documentation. The name of this function suggests that it ought to reparent R/W expanded objects to be children of the persistent aggcontext, instead of copying them. In fact it does no such thing, and if you try to make it do so you will see multiple regression failures. Rename it to the less-misleading ExecAggCopyTransValue, and add commentary about why that attractive-sounding optimization won't work. Also adjust comments at call sites, some of which were describing logic that has since been moved into ExecAggCopyTransValue. Discussion: https://postgr.es/m/3004282.1681930251@sss.pgh.pa.us	2023-04-24 13:01:33 -04:00
David Rowley	3f58a4e296	Fix various typos and incorrect/outdated name references Author: Alexander Lakhin Discussion: https://postgr.es/m/699beab4-a6ca-92c9-f152-f559caf6dc25@gmail.com	2023-04-19 13:50:33 +12:00
Tom Lane	78d5952dd0	Ensure result of an aggregate's finalfunc is made read-only. The finalfunc might return a read-write expanded object. If we de-duplicate multiple call sites for the aggregate, any function(s) receiving the aggregate result earlier could alter or destroy the value that reaches the ones called later. This is a brown-paper-bag bug in commit `42b746d4c`, because we actually considered the need for read-only-ness but failed to realize that it applied to the case with a finalfunc as well as the case without. Per report from Justin Pryzby. New error in HEAD, no need for back-patch. Discussion: https://postgr.es/m/ZDm5TuKsh3tzoEjz@telsasoft.com	2023-04-16 14:16:40 -04:00
Tom Lane	141225b251	Mop up some undue familiarity with the innards of Bitmapsets. nodeAppend.c used non-nullness of appendstate->as_valid_subplans as a state flag to indicate whether it'd done ExecFindMatchingSubPlans (or some sufficient approximation to that). This was pretty questionable even in the beginning, since it wouldn't really work right if there are no valid subplans. It got more questionable after commit `27e1f1456` added logic that could reduce as_valid_subplans to an empty set: at that point we were depending on unspecified behavior of bms_del_members, namely that it'd not return an empty set as NULL. It's about to start doing that, which breaks this logic entirely. Hence, add a separate boolean flag to signal whether as_valid_subplans has been computed. Also fix a previously-cosmetic bug in nodeAgg.c, wherein it ignored the return value of bms_del_member instead of updating its pointer. Patch by me; thanks to Nathan Bossart and Richard Guo for review. Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us	2023-03-02 11:37:37 -05:00
Tom Lane	462bb7f128	Remove bms_first_member(). This function has been semi-deprecated ever since we invented bms_next_member(). Its habit of scribbling on the input bitmapset isn't great, plus for sufficiently large bitmapsets it would take O(N^2) time to complete a loop. Now we have the additional problem that reducing the input to empty while leaving it still accessible would violate a planned invariant. So let's just get rid of it, after updating the few extant callers to use bms_next_member(). Patch by me; thanks to Nathan Bossart and Richard Guo for review. Discussion: https://postgr.es/m/1159933.1677621588@sss.pgh.pa.us	2023-03-02 11:34:29 -05:00
Tom Lane	8d83a5d0a2	Remove redundant grouping and DISTINCT columns. Avoid explicitly grouping by columns that we know are redundant for sorting, for example we need group by only one of x and y in SELECT ... WHERE x = y GROUP BY x, y This comes up more often than you might think, as shown by the changes in the regression tests. It's nearly free to detect too, since we are just piggybacking on the existing logic that detects redundant pathkeys. (In some of the existing plans that change, it's visible that a sort step preceding the grouping step already didn't bother to sort by the redundant column, making the old plan a bit silly-looking.) To do this, build processed_groupClause and processed_distinctClause lists that omit any provably-redundant sort items, and consult those not the originals where relevant. This means that within the planner, one should usually consult root->processed_groupClause or root->processed_distinctClause if one wants to know which columns are to be grouped on; but to check whether grouping or distinct-ing is happening at all, check non-NIL-ness of parse->groupClause or parse->distinctClause. This is comparable to longstanding rules about handling the HAVING clause, so I don't think it'll be a huge maintenance problem. nodeAgg.c also needs minor mods, because it's now possible to generate AGG_PLAIN and AGG_SORTED Agg nodes with zero grouping columns. Patch by me; thanks to Richard Guo and David Rowley for review. Discussion: https://postgr.es/m/185315.1672179489@sss.pgh.pa.us	2023-01-18 12:37:57 -05:00
Tom Lane	92957ed98c	Avoid reference to nonexistent array element in ExecInitAgg(). When considering an empty grouping set, we fetched phasedata->eqfunctions[-1]. Because the eqfunctions array is palloc'd, that would always be an aset pointer in released versions, and thus the code accidentally failed to malfunction (since it would do nothing unless it found a null pointer). Nonetheless this seems like trouble waiting to happen, so add a check for length == 0. It's depressing that our valgrind testing did not catch this. Maybe we should reconsider the choice to not mark that word NOACCESS? Richard Guo Discussion: https://postgr.es/m/CAMbWs4-vZuuPOZsKOYnSAaPYGKhmacxhki+vpOKk0O7rymccXQ@mail.gmail.com	2023-01-02 16:17:00 -05:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
Peter Eisentraut	5f2f99c9c6	Remove unnecessary casts Some code carefully cast all data buffer arguments for data write and read function calls to void , even though the respective arguments are already void . Remove this unnecessary clutter. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com	2022-12-30 10:12:24 +01:00
Peter Eisentraut	c727f511bd	Refactor aclcheck functions Instead of dozens of mostly-duplicate pg_foo_aclcheck() functions, write one common function object_aclcheck() that can handle almost all of them. We already have all the information we need, such as which system catalog corresponds to which catalog table and which column is the ACL column. There are a few pg_foo_aclcheck() that don't work via the generic function and have special APIs, so those stay as is. I also changed most pg_foo_aclmask() functions to static functions, since they are not used outside of aclchk.c. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Reviewed-by: Antonin Houska <ah@cybertec.at> Discussion: https://www.postgresql.org/message-id/flat/95c30f96-4060-2f48-98b5-a4392d3b6066@enterprisedb.com	2022-11-13 09:02:41 +01:00
David Rowley	d37aa3d358	Allow nodeSort to perform Datum sorts for byref types Here we add a new 'copy' parameter to tuplesort_getdatum so that we can instruct the function not to datumCopy() byref Datums before returning. Similar to `91e9e89dc`, this can provide significant performance improvements in nodeSort when sorting by a single byref column and the sort's targetlist contains only that column. This allows us to re-enable Datum sorts for byref types which was disabled in `3a5817695` due to a reported memory leak. Additionally, here we slightly optimize DISTINCT aggregates so that we no longer perform any datumCopy() when we find the current value not to be distinct from the previous value. Previously the code would always take a copy of the most recent Datum and pfree the previous value, even when the values were the same. Testing shows a small but noticeable performance increase when aggregate transitions are skipped due to the current transition value being the same as the prior one. Author: David Rowley Discussion: https://postgr.es/m/CAApHDvqS6wC5U==k9Hd26E4EQXH3QR67-T4=Q1rQ36NGvjfVSg@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvqHonfe9G1cVaKeHbDx70R_zCrM3qP2AGXpGrieSKGnhA@mail.gmail.com	2022-10-28 09:25:12 +13:00
Tom Lane	42b746d4c9	Remove uses of MemoryContextContains in nodeAgg.c and nodeWindowAgg.c. MemoryContextContains is no longer reliable in the wake of `c6e0fe1f2`, so we need to get rid of these uses. It appears that there's no really good reason to force the result of an aggregate's finalfn or serialfn to be allocated in the per-tuple context. The only other plausible case is that the result points to or into the aggregate's transition value, and that's fine because it will last as long as we need it to. (This conclusion depends on the assumption that finalfns are not allowed to scribble on the transition value, but we've long required that.) So we can just drop the MemoryContextContains plus datumCopy business, although we do need to take care to not return a read-write pointer when the transition value is an expanded datum. Likewise, we don't really need to force the result of a window function to be in the output context. In this case, the plausible alternative is that it's pointing into the temporary tuple slot used by WinGetFuncArgInPartition or WinGetFuncArgInFrame (since those functions could return such a pointer, which might become the window function's result). That will hold still for long enough, unless there is another window function using the same WindowObject. I'm content to always perform a datumCopy when there's more than one such function. On net, these changes should provide small speed improvements as well as removing problematic code. Tom Lane and David Rowley Discussion: https://postgr.es/m/1913788.1664898906@sss.pgh.pa.us	2022-10-06 13:27:34 -04:00
David Rowley	2d0bbedda7	Rename shadowed local variables In a similar effort to `f01592f91`, here we mostly rename shadowed local variables to remove the warnings produced when compiling with -Wshadow=compatible-local. This fixes 63 warnings and leaves just 5. Author: Justin Pryzby, David Rowley Reviewed-by: Justin Pryzby Discussion https://postgr.es/m/20220817145434.GC26426%40telsasoft.com	2022-10-05 21:01:41 +13:00
Peter Geoghegan	bfcf1b3480	Harmonize parameter names in storage and AM code. Make sure that function declarations use names that exactly match the corresponding names from function definitions in storage, catalog, access method, executor, and logical replication code, as well as in miscellaneous utility/library code. Like other recent commits that cleaned up function parameter names, this commit was written with help from clang-tidy. Later commits will do the same for other parts of the codebase. Author: Peter Geoghegan <pg@bowt.ie> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com	2022-09-19 19:18:36 -07:00
David Rowley	421892a192	Further reduce warnings with -Wshadow=compatible-local In a similar effort to `f01592f91`, here we're targetting fixing the warnings that -Wshadow=compatible-local produces that we can fix by moving a variable to an inner scope to stop that variable from being shadowed by another variable declared somewhere later in the function. All of the warnings being fixed here are changing the scope of variables which are being used as an iterator for a "for" loop. In each instance, the fix happens to be changing the for loop to use the C99 type initialization. Much of this code likely pre-dates our use of C99. Reducing the scope of the outer scoped variable seems like the safest way to fix these. Renaming seems more likely to risk patches using the wrong variable. Reducing the scope is more likely to result in a compilation failure after applying some future patch rather than introducing bugs with it. By my count, this takes the warning count from 129 down to 114. Author: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvrwLGBP%2BYw9vriayyf%3DXR4uPWP5jr6cQhP9au_kaDUhbA%40mail.gmail.com	2022-08-24 12:27:12 +12:00
David Rowley	1349d2790b	Improve performance of ORDER BY / DISTINCT aggregates ORDER BY / DISTINCT aggreagtes have, since implemented in Postgres, been executed by always performing a sort in nodeAgg.c to sort the tuples in the current group into the correct order before calling the transition function on the sorted tuples. This was not great as often there might be an index that could have provided pre-sorted input and allowed the transition functions to be called as the rows come in, rather than having to store them in a tuplestore in order to sort them once all the tuples for the group have arrived. Here we change the planner so it requests a path with a sort order which supports the most amount of ORDER BY / DISTINCT aggregate functions and add new code to the executor to allow it to support the processing of ORDER BY / DISTINCT aggregates where the tuples are already sorted in the correct order. Since there can be many ORDER BY / DISTINCT aggregates in any given query level, it's very possible that we can't find an order that suits all of these aggregates. The sort order that the planner chooses is simply the one that suits the most aggregate functions. We take the most strictly sorted variation of each order and see how many aggregate functions can use that, then we try again with the order of the remaining aggregates to see if another order would suit more aggregate functions. For example: SELECT agg(a ORDER BY a),agg2(a ORDER BY a,b) ... would request the sort order to be {a, b} because {a} is a subset of the sort order of {a,b}, but; SELECT agg(a ORDER BY a),agg2(a ORDER BY c) ... would just pick a plan ordered by {a} (we give precedence to aggregates which are earlier in the targetlist). SELECT agg(a ORDER BY a),agg2(a ORDER BY b),agg3(a ORDER BY b) ... would choose to order by {b} since two aggregates suit that vs just one that requires input ordered by {a}. Author: David Rowley Reviewed-by: Ronan Dunklau, James Coleman, Ranier Vilela, Richard Guo, Tom Lane Discussion: https://postgr.es/m/CAApHDvpHzfo92%3DR4W0%2BxVua3BUYCKMckWAmo-2t_KiXN-wYH%3Dw%40mail.gmail.com	2022-08-02 23:11:45 +12:00
Tom Lane	8821054210	Remove stray references to lefttree/righttree in the executor. The general convention in the executor is to refer to child plans and planstates via the outerPlan[State] and innerPlan[State] macros, but a few places didn't do it like that. For consistency and readability, convert all the stragglers to use the macros. (See also commit `40f42d2a3`, which did some similar cleanup a few years ago, but missed these cases.) Richard Guo Discussion: https://postgr.es/m/CAMbWs4-vYhh1xsa_veah4PUed2Xq=Ed_YH3=Mqt5A3Y=EgfCEg@mail.gmail.com	2022-07-07 11:23:40 -04:00
Alvaro Herrera	c4f113e8fe	Clean up newlines following left parentheses Like commit `c9d2977519`.	2022-05-13 23:52:35 +02:00
Alvaro Herrera	24d2b2680a	Remove extraneous blank lines before block-closing braces These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch	2022-04-13 19:16:02 +02:00
David Rowley	77bae396df	Adjust tuplesort API to have bitwise option flags This replaces the bool flag for randomAccess. An upcoming patch requires adding another option, so instead of breaking the API for that, then breaking it again one day if we add more options, let's just break it once. Any boolean options we add in the future will just make use of an unused bit in the flags. Any extensions making use of tuplesorts will need to update their code to pass TUPLESORT_RANDOMACCESS instead of true for randomAccess. TUPLESORT_NONE can be used for a set of empty options. Author: David Rowley Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvoH4ASzsAOyHcxkuY01Qf%2B%2B8JJ0paw%2B03dk%2BW25tQEcNQ%40mail.gmail.com	2022-04-04 22:24:59 +12:00
Bruce Momjian	27b77ecf9f	Update copyright for 2022 Backpatch-through: 10	2022-01-07 19:04:57 -05:00
Tom Lane	e9d9ba2a4d	Avoid some other O(N^2) hazards in list manipulation. In the same spirit as `6301c3ada`, fix some more places where we were using list_delete_first() in a loop and thereby risking O(N^2) behavior. It's not clear that the lists manipulated in these spots can get long enough to be really problematic ... but it's not clear that they can't, either, and the fixes are simple enough. As before, back-patch to v13. Discussion: https://postgr.es/m/CD2F0E7F-9822-45EC-A411-AE56F14DEA9F@amazon.com	2021-11-01 16:24:39 -04:00
Heikki Linnakangas	c4649cce39	Refactor LogicalTapeSet/LogicalTape interface. All the tape functions, like LogicalTapeRead and LogicalTapeWrite, now take a LogicalTape as argument, instead of LogicalTapeSet+tape number. You can create any number of LogicalTapes in a single LogicalTapeSet, and you don't need to decide the number upfront, when you create the tape set. This makes the tape management in hash agg spilling in nodeAgg.c simpler. Discussion: https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi Reviewed-by: Peter Geoghegan, Zhihong Yu, John Naylor	2021-10-18 14:46:01 +03:00
Tom Lane	28d936031a	Get rid of artificial restriction on hash table sizes on Windows. The point of introducing the hash_mem_multiplier GUC was to let users reproduce the old behavior of hash aggregation, i.e. that it could use more than work_mem at need. However, the implementation failed to get the job done on Win64, where work_mem is clamped to 2GB to protect various places that calculate memory sizes using "long int". As written, the same clamp was applied to hash_mem. This resulted in severe performance regressions for queries requiring a bit more than 2GB for hash aggregation, as they now spill to disk and there's no way to stop that. Getting rid of the work_mem restriction seems like a good idea, but it's a big job and could not conceivably be back-patched. However, there's only a fairly small number of places that are concerned with the hash_mem value, and it turns out to be possible to remove the restriction there without too much code churn or any ABI breaks. So, let's do that for now to fix the regression, and leave the larger task for another day. This patch does introduce a bit more infrastructure that should help with the larger task, namely pg_bitutils.h support for working with size_t values. Per gripe from Laurent Hasson. Back-patch to v13 where the behavior change came in. Discussion: https://postgr.es/m/997817.1627074924@sss.pgh.pa.us Discussion: https://postgr.es/m/MN2PR15MB25601E80A9B6D1BA6F592B1985E39@MN2PR15MB2560.namprd15.prod.outlook.com	2021-07-25 14:02:27 -04:00
David Rowley	63b1af9437	Cleanup some aggregate code in the executor Here we alter the code that calls build_pertrans_for_aggref() so that the function no longer needs to special-case whether it's dealing with an aggtransfn or an aggcombinefn. This allows us to reuse the build_aggregate_transfn_expr() function and just get rid of the build_aggregate_combinefn_expr() completely. All of the special case code that was in build_pertrans_for_aggref() has been moved up to the calling functions. This saves about a dozen lines of code in nodeAgg.c and a few dozen more in parse_agg.c Also, rename a few variables in nodeAgg.c to try to make it more clear that we're working with either a aggtransfn or an aggcombinefn. Some of the old names would have you believe that we were always working with an aggtransfn. Discussion: https://postgr.es/m/CAApHDvptMQ9FmF0D67zC_w88yVnoNVR2+kkOQGUrCmdxWxLULQ@mail.gmail.com	2021-07-04 18:47:31 +12:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Michael Paquier	bcf2667bf6	Fix some typos, grammar and style in docs and comments The portions fixing the documentation are backpatched where needed. Author: Justin Pryzby Discussion: https://postgr.es/m/20210210235557.GQ20012@telsasoft.com backpatch-through: 9.6	2021-02-24 16:13:17 +09:00
Tom Lane	0ff865fbe5	Fix bug in HashAgg's selective-column-spilling logic. Commit `230230223` taught nodeAgg.c that, when spilling tuples from memory in an oversized hash aggregation, it only needed to spill input columns referenced in the node's tlist and quals. Unfortunately, that's wrong: we also have to save the grouping columns. The error is masked in common cases because the grouping columns also appear in the tlist, but that's not necessarily true. The main category of plans where it's not true seem to come from semijoins ("WHERE outercol IN (SELECT innercol FROM innertable)") where the innercol needs an implicit promotion to make it comparable to the outercol. The grouping column will be "innercol::promotedtype", but that expression appears nowhere in the Agg node's own tlist and quals; only the bare "innercol" is found in the tlist. I spent quite a bit of time looking for a suitable regression test case for this, without much success. If the number of distinct values of the innercol is large enough to make spilling happen, the planner tends to prefer a non-HashAgg plan, at least for problem sizes that are reasonable to use in the regression tests. So, no new regression test. However, this patch does demonstrably fix the originally-reported test case. Per report from s.p.e (at) gmx-topmail.de. Backpatch to v13 where the troublesome code came in. Discussion: https://postgr.es/m/trinity-1c565d44-159f-488b-a518-caf13883134f-1611835701633@3c-app-gmx-bap78	2021-02-04 23:01:37 -05:00
Bruce Momjian	ca3b37487b	Update copyright for 2021 Backpatch-through: 9.5	2021-01-02 13:06:25 -05:00
Jeff Davis	05c0258966	Fix bug #16784 in Disk-based Hash Aggregation. Before processing tuples, agg_refill_hash_table() was setting all pergroup pointers to NULL to signal to advance_aggregates() that it should not attempt to advance groups that had spilled. The problem was that it also set the pergroups for sorted grouping sets to NULL, which caused rescanning to fail. Instead, change agg_refill_hash_table() to only set the pergroups for hashed grouping sets to NULL; and when compiling the expression, pass doSort=false. Reported-by: Alexander Lakhin Discussion: https://postgr.es/m/16784-7ff169bf2c3d1588%40postgresql.org Backpatch-through: 13	2020-12-26 17:25:30 -08:00
Heikki Linnakangas	0a2bc5d61e	Move per-agg and per-trans duplicate finding to the planner. This has the advantage that the cost estimates for aggregates can count the number of calls to transition and final functions correctly. Bump catalog version, because views can contain Aggrefs. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/b2e3536b-1dbc-8303-c97e-89cb0b4a9a48%40iki.fi	2020-11-24 10:45:00 +02:00
Heikki Linnakangas	2cccb627f1	Skip allocating hash table in EXPLAIN-only mode. Author: Alexey Bashtanov Discussion: https://www.postgresql.org/message-id/36823f65-050d-ae24-aa4d-a37726998240%40imap.cc	2020-11-18 12:39:15 +02:00

1 2 3 4 5 ...

365 Commits