postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-19 17:02:53 +03:00

Author	SHA1	Message	Date
Tom Lane	ef5f559b95	Fix jsonb_object_agg crash after eliminating null-valued pairs. In commit `b61aa76e4` I added an assumption in jsonb_object_agg_finalfn that it'd be okay to apply uniqueifyJsonbObject repeatedly to a JsonbValue. I should have studied that code more closely first, because in skip_nulls mode it removed leading nulls by changing the "pairs" array start pointer. This broke the data structure's invariants in two ways: pairs no longer references a repalloc-able chunk, and the distance from pairs to the end of its array is less than parseState->size. So any subsequent addition of more pairs is at high risk of clobbering memory and/or causing repalloc to crash. Unfortunately, adding more pairs is exactly what will happen when the aggregate is being used as a window function. Fix by rewriting uniqueifyJsonbObject to not do that. The prior coding had little to recommend it anyway. Reported-by: Alexander Lakhin <exclusion@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/ec5e96fb-ee49-4e5f-8a09-3f72b4780538@gmail.com	2025-12-13 16:18:29 -05:00
Álvaro Herrera	630a93799d	Reject opclass options in ON CONFLICT clause It's as pointless as ASC/DESC and NULLS FIRST/LAST are, so reject all of them in the same way. While at it, normalize the others' error messages to have less translatable strings. Add tests for these errors. Noticed while reviewing recent INSERT ON CONFLICT patches. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://postgr.es/m/202511271516.oiefpvn3z27m@alvherre.pgsql	2025-12-12 14:26:42 +01:00
Álvaro Herrera	81f72115cf	Fix infer_arbiter_index for partitioned tables The fix for concurrent index operations in `bc32a12e0d` started considering indexes that are not yet marked indisvalid as arbiters for INSERT ON CONFLICT. For partitioned tables, this leads to including indexes that may not exist in partitions, causing a trivially reproducible "invalid arbiter index list" error to be thrown because of failure to match the index. To fix, it suffices to ignore !indisvalid indexes on partitioned tables. There should be no risk that the set of indexes will change for concurrent transactions, because in order for such an index to be marked valid, an ALTER INDEX ATTACH PARTITION must run which requires AccessExclusiveLock. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/17622f79-117a-4a44-aa8e-0374e53faaf0%40gmail.com	2025-12-11 20:56:37 +01:00
Heikki Linnakangas	820343bab3	Fix bogus extra arguments to query_safe in test The test seemed to incorrectly think that query_safe() takes an argument that describes what the query does, similar to e.g. command_ok(). Until commit `bd8d9c9bdf` the extra arguments were harmless and were just ignored, but when commit `bd8d9c9bdf` introduced a new optional argument to query_safe(), the extra arguments started clashing with that, causing the test to fail. Backpatch to v17, that's the oldest branch where the test exists. The extra arguments didn't cause any trouble on the older branches, but they were clearly bogus anyway.	2025-12-10 19:38:07 +02:00
Heikki Linnakangas	343693c3c1	Improve DDL deparsing test 1. The test initially focuses on the "parent" table, then switches to the "part" table, and goes back to the "parent" table. That seems a little weird, so move the tests around so that all the commands on the "parent" table are done first, followed by the "part" table. 2. ALTER TABLE ALTER COLUMN SET EXPRESSION was not tested, so add that. Author: jian he <jian.universality@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/CACJufxFDi7fnwB-8xXd_ExML7-7pKbTaK4j46AJ=4-14DXvtVg@mail.gmail.com	2025-12-10 19:27:02 +02:00
Michael Paquier	1d7b00dc14	Fix failures with cross-version pg_upgrade tests Buildfarm members skimmer and crake have reported that pg_upgrade running from v18 fails due to the changes of `d52c24b0f8`, with the expectations that the objects removed in the test module injection_points should still be present post upgrades, but the test module does not have them anymore. The origin of the issue is that the following test modules depend on injection_points, but they do not drop the extension once the tests finish, leaving its traces in the dumps used for the upgrades: - gin, down to v17 - typcache, down to v18 - nbtree, HEAD-only Test modules have no upgrade requirements, as they are used only for.. Tests, so there is no point in keeping them around. An alternative solution would be to drop the databases created by these modules in AdjustUpgrade.pm, but the solution of this commit to drop the extension is simpler. Note that there would be a catch if using a solution based on AdjustUpgrade.pm as the database name used for the test runs differs between configure and meson: - configure relies on USE_MODULE_DB for the database name unicity, that would build a database name based on the first entry of REGRESS, that lists all the SQL tests. - meson relies on a "name" field. For example, for the test module "gin", the regression database is named "regression_gin" under meson, while it is more complex for configure, as of "contrib_regression_gin_incomplete_splits". So a AdjustUpgrade.pm would need a set of DROP DATABASE IF EXISTS to solve this issue, to cope with each build system. The failure has been caused by `d52c24b0f8`, and the problem can happen with upgrade dumps from v17 and v18 to HEAD. This problem is not currently reachable in the back-branches, but it could be possible that a future change in injection_points in stable branches invalidates this theory, so this commit is applied down to v17 in the test modules that matter. Per discussion with Tom Lane and Heikki Linnakangas. Discussion: https://postgr.es/m/2899652.1765167313@sss.pgh.pa.us Backpatch-through: 17	2025-12-10 12:46:45 +09:00
Michael Paquier	06817fc8a4	Fix two issues with recently-introduced nbtree test REGRESS has forgotten about the test nbtree_half_dead_pages, and a .gitignore was missing from the module. Oversights in `c085aab278` for REGRESS and `1e4e5783e7` for the missing .gitignore. Discussion: https://postgr.es/m/aTipJA1Y1zVSmH3H@paquier.xyz	2025-12-10 11:56:42 +09:00
Thomas Munro	c507ba55f5	Fix O_CLOEXEC flag handling in Windows port. PostgreSQL's src/port/open.c has always set bInheritHandle = TRUE when opening files on Windows, making all file descriptors inheritable by child processes. This meant the O_CLOEXEC flag, added to many call sites by commit `1da569ca1f` (v16), was silently ignored. The original commit included a comment suggesting that our open() replacement doesn't create inheritable handles, but it was a mis- understanding of the code path. In practice, the code was creating inheritable handles in all cases. This hasn't caused widespread problems because most child processes (archive_command, COPY PROGRAM, etc.) operate on file paths passed as arguments rather than inherited file descriptors. Even if a child wanted to use an inherited handle, it would need to learn the numeric handle value, which isn't passed through our IPC mechanisms. Nonetheless, the current behavior is wrong. It violates documented O_CLOEXEC semantics, contradicts our own code comments, and makes PostgreSQL behave differently on Windows than on Unix. It also creates potential issues with future code or security auditing tools. To fix, define O_CLOEXEC to _O_NOINHERIT in master, previously used by O_DSYNC. We use different values in the back branches to preserve existing values. In pgwin32_open_handle() we set bInheritHandle according to whether O_CLOEXEC is specified, for the same atomic semantics as POSIX in multi-threaded programs that create processes. Backpatch-through: 16 Author: Bryan Green <dbryan.green@gmail.com> Co-authored-by: Thomas Munro <thomas.munro@gmail.com> (minor adjustments) Discussion: https://postgr.es/m/e2b16375-7430-4053-bda3-5d2194ff1880%40gmail.com	2025-12-10 09:01:35 +13:00
Masahiko Sawada	ab40db3852	Add started_by column to pg_stat_progress_analyze view. The new column, started_by, indicates the initiator of the analyze ('manual' or 'autovacuum'), helping users and monitoring tools to better understand ANALYZE behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAA5RZ0suoicwxFeK_eDkUrzF7s0BVTaE7M%2BehCpYcCk5wiECpw%40mail.gmail.com	2025-12-09 11:23:45 -08:00
Masahiko Sawada	0d78952061	Add mode and started_by columns to pg_stat_progress_vacuum view. The new columns, mode and started_by, indicate the vacuum mode ('normal', 'aggressive', or 'failsafe') and the initiator of the vacuum ('manual', 'autovacuum', or 'autovacuum_wraparound'), respectively. This allows users and monitoring tools to better understand VACUUM behavior. Bump catalog version. Author: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Robert Treat <rob@xzilla.net> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Yu Wang <wangyu_runtime@163.com> Discussion: https://postgr.es/m/CAOzEurQcOY-OBL_ouEVfEaFqe_md3vB5pXjR_m6L71Dcp1JKCQ@mail.gmail.com	2025-12-09 10:51:14 -08:00
Heikki Linnakangas	bd8d9c9bdf	Widen MultiXactOffset to 64 bits This eliminates MultiXactOffset wraparound and the 2^32 limit on the total number of multixid members. Multixids are still limited to 2^31, but this is a nice improvement because 'members' can grow much faster than the number of multixids. On such systems, you can now run longer before hitting hard limits or triggering anti-wraparound vacuums. Not having to deal with MultiXactOffset wraparound also simplifies the code and removes some gnarly corner cases. We no longer need to perform emergency anti-wraparound freezing because of running out of 'members' space, so the offset stop limit is gone. But you might still not want 'members' to consume huge amounts of disk space. For that reason, I kept the logic for lowering vacuum's multixid freezing cutoff if a large amount of 'members' space is used. The thresholds for that are roughly the same as the "safe" and "danger" thresholds used before, 2 billion transactions and 4 billion transactions. This keeps the behavior for the freeze cutoff roughly the same as before. It might make sense to make this smarter or configurable, now that the threshold is only needed to manage disk usage, but that's left for the future. Add code to pg_upgrade to convert multitransactions from the old to the new format, rewriting the pg_multixact SLRU files. Because pg_upgrade now rewrites the files, we can get rid of some hacks we had put in place to deal with old bugs and upgraded clusters. Bump catalog version for the pg_multixact/offsets format change. Author: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com> Reviewed-by: wenhui qiu <qiuwenhuifx@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3DezaWg7_nt-8ey4aKv2w9LcuLthHknwCawmBgEeTnJrJTcw@mail.gmail.com	2025-12-09 13:53:03 +02:00
Richard Guo	f00484c170	Fix distinctness check for queries with grouping sets query_is_distinct_for() is intended to determine whether a query never returns duplicates of the specified columns. For queries using grouping sets, if there are no grouping expressions, the query may contain one or more empty grouping sets. The goal is to detect whether there is exactly one empty grouping set, in which case the query would return a single row and thus be distinct. The previous logic in query_is_distinct_for() was incomplete because the check was insufficiently thorough and could return false when it could have returned true. It failed to consider cases where the DISTINCT clause is used on the GROUP BY, in which case duplicate empty grouping sets are removed, leaving only one. It also did not correctly handle all possible structures of GroupingSet nodes that represent a single empty grouping set. To fix, add a check for the groupDistinct flag, and expand the query's groupingSets tree into a flat list, then verify that the expanded list contains only one element. No backpatch as this could result in plan changes. Author: Richard Guo <guofenglinux@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/CAMbWs480Z04NtP8-O55uROq2Zego309+h3hhaZhz6ztmgWLEBw@mail.gmail.com	2025-12-09 17:09:27 +09:00
Richard Guo	c925ad30b0	Fix const-simplification for index expressions and predicate Similar to the issue with constraint and statistics expressions fixed in `317c117d6`, index expressions and predicate can also suffer from incorrect reduction of NullTest clauses during const-simplification, due to unfixed varnos and the use of a NULL root. It has been reported that this issue can cause the planner to fail to pick up a partial index that it previously matched successfully. Because we need to cache the const-simplified index expressions and predicate in the relcache entry, we cannot fix the Vars before applying eval_const_expressions. To ensure proper reduction of NullTest clauses, this patch runs eval_const_expressions a second time -- after the Vars have been fixed and with a valid root. It could be argued that the additional call to eval_const_expressions might increase planning time, but I don't think that's a concern. It only runs when index expressions and predicate are present; it is relatively cheap when run on small expression trees (which is typically the case for index expressions and predicate), and it runs on expressions that have already been const-simplified once, making the second pass even cheaper. In return, in cases like the one reported, it allows the planner to match and use partial indexes, which can lead to significant execution-time improvements. Bug: #19007 Reported-by: Bryan Fox <bryfox@gmail.com> Author: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/19007-4cc6e252ed8aa54a@postgresql.org	2025-12-09 16:56:26 +09:00
Michael Paquier	0c3c5c3b06	Use palloc_object() and palloc_array() in more areas of the tree The idea is to encourage more the use of these new routines across the tree, as these offer stronger type safety guarantees than palloc(). The following paths are included in this batch, treating all the areas proposed by the author for the most trivial changes, except src/backend (by far the largest batch): src/bin/ src/common/ src/fe_utils/ src/include/ src/pl/ src/test/ src/tutorial/ Similar work has been done in `31d3847a37`. The code compiles the same before and after this commit, with the following exceptions due to changes in line numbers because some of the new allocation formulas are shorter: blkreftable.c pgfnames.c pl_exec.c Author: David Geier <geidav.pg@gmail.com> Discussion: https://postgr.es/m/ad0748d4-3080-436e-b0bc-ac8f86a3466a@gmail.com	2025-12-09 14:53:17 +09:00
Álvaro Herrera	d0d0ba6cf6	Unify some more messages No backpatch here because of message wording changes. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/202512081537.ahw5gwoencou@alvherre.pgsql	2025-12-08 19:25:36 +01:00
Michael Paquier	31280d96a6	test_custom_stats: Test module for custom cumulative statistics This test module acts as a replacement that existed prior to `d52c24b0f8` in the test module injection_points. It uses a more flexible structure than its ancestor: - Two libraries are built, one for fixed-sized stats and one for variable-sized stats. - No GUCs required. The stats are enabled only if one or both libraries are loaded with shared_preload_libraries. - Same kind IDs reserved: 25 (variable-sized) and 26 (fixed-sized) The goal of this redesign is to be able to easier extend the code coverage provided by this module for other changes that are currently under discussion, and injection_points was not suited for these. Injection points are also now widely used in the tree now, so extending more the test coverage for custom pgstats in the test module injection_points would be a riskier long-term move. The new code is mostly a copy of what existed previously in the test module injection_points, with the same callbacks defined for fixed-sized and variable-sized stats, but a simpler overall structure in terms of the stats counters updated. The test coverage should remain the same as previously: one TAP test is used to check data reports, crash recovery and clean restart scenarios. Tests are added for the manual reset of fixed-sized stats, something not tested until now. Author: Sami Imseih <samimseih@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com	2025-12-08 15:23:09 +09:00
Michael Paquier	d52c24b0f8	injection_points: Remove portions related to custom pgstats The test module injection_points has been used as a landing spot to provide coverage for the custom pgstats APIs, for both fixed-sized and variable-sized stats kinds. Some recent work related to pgstats is proving that this structure makes the implementation of new tests harder. This commit removes the code related to pgstats from injection_points, and an equivalent will be reintroduced as a separate test module in a follow-up commit. This removal is done in its own commit for clarity. Using injection_points for this test coverage was perhaps not the best way to design things, but this was good enough while working on the first flavor of the custom pgstats APIs. Using a new test module will make easier the introduction of new tests, and we will not need to worry about the impact of new changes related to custom pgstats could have with the internals of injection_points. Author: Sami Imseih <samimseih@gmail.com> Discussion: https://postgr.es/m/CAA5RZ0sJgO6GAwgFxmzg9MVP=rM7Us8KKcWpuqxe-f5qxmpE0g@mail.gmail.com	2025-12-08 12:45:20 +09:00
Michael Paquier	f68597ee77	Improve error messages of input functions for pg_dependencies and pg_ndistinct The error details updated in this commit can be reached in the regression tests. They did not follow the project style, and they should be written them as full sentences. Some of the errors are switched to use an elog(), for cases that involve paths that cannot be reached based on the previous state of the parser processing the input data (array start, object end, etc.). The error messages for these cases use now a more consistent style across the board, with the state of the parser reported for debugging. Reported-by: Tom Lane <tgl@sss.pgh.pa.us> Author: Michael Paquier <michael@paquier.xyz> Co-authored-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://postgr.es/m/1353179.1764901790@sss.pgh.pa.us	2025-12-08 10:23:48 +09:00
Tom Lane	6498287696	Handle constant inputs to corr() and related aggregates more precisely. The SQL standard says that corr() and friends should return NULL in the mathematically-undefined case where all the inputs in one of the columns have the same value. We were checking that by seeing if the sums Sxx and Syy were zero, but that approach is very vulnerable to roundoff error: if a sum is close to zero but not exactly that, we'd come out with a pretty silly non-NULL result. Instead, directly track whether the inputs are all equal by remembering the common value in each column. Once we detect that a new input is different from before, represent that by storing NaN for the common value. (An objection to this scheme is that if the inputs are all NaN, we will consider that they were not all equal. But under IEEE float arithmetic rules, one NaN is never equal to another, so this behavior is arguably correct. Moreover it matches what we did before in such cases.) Then, leave the sums at their exact value of zero for as long as we haven't detected different input values. This solution requires the aggregate transition state to contain 8 float values not 6, which is not problematic, and it seems to add less than 1% to the aggregates' runtime, which seems acceptable. While we're here, improve corr()'s final function to cope with overflow/underflow in the final calculation, and to clamp its result to [-1, 1] in case of roundoff error. Although this is arguably a bug fix, it requires a catversion bump due to the change in aggregates' initial states, so it can't be back-patched. Patch written by me, but many of the ideas are due to Dean Rasheed, who also did a deal of testing. Bug: #19340 Reported-by: Oleg Ivanov <o15611@gmail.com> Author: Tom Lane <tgl@sss.pgh.pa.us> Co-authored-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://postgr.es/m/19340-6fb9f6637f562092@postgresql.org	2025-12-06 18:31:26 -05:00
Michael Paquier	47da198934	Improve error reporting of recovery test 027_stream_regress Previously, the 027_stream_regress test reported the full contents of regression.diffs upon a test failure, when the standby and the primary were still alive. If a test fails quite badly, the amount of information reported can be really high, bloating the reports in the buildfarm, the CI, or even local runs. In most cases, we have noticed that having all this information is not necessary when attempting to identify the source of a problem in this test. This commit changes the situation by including the head and tail of regression.diffs in the reports generated on failure rather than its full contents, building upon `b93f4e2f98` to optionally control the size of the reports with the new environment variable PG_TEST_FILE_READ_LINES. This will perhaps require some more tuning, but the hope is to reduce some of the buildfarm report bloat while making the information good enough to deduce what is happening when something is going wrong, be it in the buildfarm or some tests run in the CI, at least. Suggested-by: Andres Freund <andres@anarazel.de> Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-12-06 14:41:29 +09:00
Michael Paquier	b93f4e2f98	Add PostgreSQL::Test::Cluster::read_head_tail() helper to PostgreSQL/Utils.pm This function reads the lines from a file and filters its contents to report its head and tail contents. The amount of contents to read from a file can be tuned by the environment variable PG_TEST_FILE_READ_LINES, that can be used to override the default of 50 lines. If the file whose content is read has less lines than two times PG_TEST_FILE_READ_LINES, the whole file is returned. This will be used in a follow-up commit to limit the amount of information reported by some of the TAP tests on failure, where we have noticed that the contents reported by the buildfarm can be heavily bloated in some cases, with the head and tail contents of a report being able to provide enough information to be useful for debugging. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Co-authored-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAN55FZ1D6KXvjSs7YGsDeadqCxNF3UUhjRAfforzzP0k-cE=bA@mail.gmail.com	2025-12-06 14:27:53 +09:00
Tom Lane	6dfce8420e	Fix text substring search for non-deterministic collations. Due to an off-by-one error, the code failed to find matches at the end of the haystack. Fix by rewriting the loop. While at it, fix a comment that claimed that the function could find a zero-length match. Such a match could send a caller into an endless loop. However, zero-length matches only make sense with an empty search string, and that case is explicitly excluded by all callers. To make sure it stays that way, add an Assert and a comment. Bug: #19341 Reported-by: Adam Warland <adam.warland@infor.com> Author: Laurenz Albe <laurenz.albe@cybertec.at> Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/19341-1d9a22915edfec58@postgresql.org Backpatch-through: 18	2025-12-05 20:10:33 -05:00
Heikki Linnakangas	7c2061bdfb	Fix test to work with non-8kB block sizes Author: Maxim Orlov <orlovmg@gmail.com> Discussion: https://www.postgresql.org/message-id/CACG%3Dezbtm%2BLOzEMyLX7rzGcAv3ez3F6nNpSJjvZeMzed0Oe6Pw%40mail.gmail.com	2025-12-05 23:39:01 +02:00
Robert Haas	014f9a831a	Don't reset the pathlist of partitioned joinrels. apply_scanjoin_target_to_paths wants to avoid useless work and platform-specific dependencies by throwing away the path list created prior to applying the final scan/join target and constructing a whole new one using the final scan/join target. However, this is only valid when we'll consider all the same strategies after the pathlist reset as before. After resetting the path list, we reconsider Append and MergeAppend paths with the modified target list; therefore, it's only valid for a partitioned relation. However, what the previous coding missed is that it cannot be a partitioned join relation, because that also has paths that are not Append or MergeAppend paths and will not be reconsidered. Thus, before this patch, we'd sometimes choose a partitionwise strategy with a higher total cost than cheapest non-partitionwise strategy, which is not good. We had a surprising number of tests cases that were relying on this bug to work as they did. A big part of the reason for this is that row counts in regression test cases tend to be low, which brings the cost of partitionwise and non-partitionwise strategies very close together, especially for merge joins, where the real and perceived advantages of a partitionwise approach are minimal. In addition, one test case included a row-count-inflating join. In such cases, a partitionwise join can easily be a loser on cost, because the total number of tuples passing through an Append node is much higher than it is with a non-partitionwise strategy. That test case is adjusted by adding additional join clauses to avoid the row count inflation. Although the failure of the planner to choose the lowest-cost path is a bug, we generally do not back-patch fixes of this type, because planning is not an exact science and there is always a possibility that some user will end up with a plan that has a lower estimated cost but actually runs more slowly. Hence, no backpatch here, either. The code change here is exactly what was originally proposed by Ashutosh, but the changes to the comments and test cases have been very heavily rewritten by me, helped along by some very useful advice from Richard Guo. Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com> Author: Robert Haas <rhaas@postgresql.org> Reviewed-by: Jakub Wartak <jakub.wartak@enterprisedb.com> Reviewed-by: Arne Roland <arne.roland@malkut.net> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Discussion: http://postgr.es/m/CAExHW5toze58+jL-454J3ty11sqJyU13Sz5rJPQZDmASwZgWiA@mail.gmail.com	2025-12-05 12:00:18 -05:00
Tom Lane	8f1791c618	Fix some cases of indirectly casting away const. Newest versions of gcc are able to detect cases where code implicitly casts away const by assigning the result of strchr() or a similar function applied to a "const char " value to a target variable that's just "char ". This of course creates a hazard of not getting a compiler warning about scribbling on a string one was not supposed to, so fixing up such cases is good. This patch fixes a dozen or so places where we were doing that. Most are trivial additions of "const" to the target variable, since no actually-hazardous change was occurring. There is one place in ecpg.trailer where we were indeed violating the intention of not modifying a string passed in as "const char *". I believe that's harmless not a live bug, but let's fix it by copying the string before modifying it. There is a remaining trouble spot in ecpg/preproc/variable.c, which requires more complex surgery. I've left that out of this commit because I want to study that code a bit more first. We probably will want to back-patch this once compilers that detect this pattern get into wider circulation, but for now I'm just going to apply it to master to see what the buildfarm says. Thanks to Bertrand Drouvot for finding a couple more spots than I had. Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://postgr.es/m/1324889.1764886170@sss.pgh.pa.us	2025-12-05 11:17:23 -05:00
Álvaro Herrera	a4a0fa0c75	Stabilize tests some more Tests added by commits `90eae926ab`, `2bc7e886fc`, `bc32a12e0d` have occasionally failed, depending on timing. Add some dependency markers to the spec to try and remove the instability. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/202512041739.sgg3tb2yobe2@alvherre.pgsql	2025-12-05 16:16:27 +01:00
Michael Paquier	2f04110225	Improve test output of extended statistics for ndistinct and dependencies Corey Huinker has come up with a recipe that is more compact and more pleasant to the eye for extended stats because we know that all of them are 1-dimension JSON arrays. This commit switches the extended stats tests to use replace() instead of jsonb_pretty(), splitting the data so as one line is used for each item in the extended stats object. This results in the removal of a good chunk of test output, that is now easier to debug with one line used for each item in a stats object. This patch has not been provided by Corey. This is some post-commit cleanup work that I have noticed as good enough to do on its own while reviewing the rest of the patch set Corey has posted. Discussion: https://postgr.es/m/CADkLM=csMd52i39Ye8-PUUHyzBb3546eSCUTh-FBQ7bzT2uZ4Q@mail.gmail.com	2025-12-05 14:15:21 +09:00
Amit Kapila	5db6a344ab	Rename column slotsync_skip_at to slotsync_last_skip. Commit `76b78721ca` introduced two new columns in pg_stat_replication_slots to improve monitoring of slot synchronization. One of these columns was named slotsync_skip_at, which is inconsistent with the naming convention used for similar columns in other system views. Columns that store timestamps of the most recent event typically use the 'last_' in the column name (e.g., last_autovacuum, checksum_last_failure). Renaming slotsync_skip_at to slotsync_last_skip aligns with this pattern, making the purpose of the column clearer and improving overall consistency across the views. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Michael Banck <mbanck@gmx.net> Discussion: https://postgr.es/m/20251128091552.GB13635@p46.dedyn.io;lightning.p46.dedyn.io Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com	2025-12-05 04:12:55 +00:00
Michael Paquier	83f2f8413e	Show version of nodes in output of TAP tests This commit adds the version information of a node initialized by Cluster.pm, that may vary depending on the install_path given by the test. The code was written so as the node information, that includes the version number, was dumped before the version number was set. This is particularly useful for the pg_upgrade TAP tests, that may mix several versions for cross-version runs. The TAP infrastructure also allows mixing nodes with different versions, so this information can be useful for out-of-core tests. Backpatch down to v15, where Cluster.pm and the pg_upgrade TAP tests have been introduced. Author: Potapov Alexander <a.potapov@postgrespro.com> Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/e59bb-692c0a80-5-6f987180@170377126 Backpatch-through: 15	2025-12-05 09:21:13 +09:00
Andres Freund	6c5c393b74	Rename BUFFERPIN wait event class to BUFFER In an upcoming patch more wait events will be added to the wait event class (for buffer locking), making the current name too specific. Alternatively we could introduce a dedicated wait event class for those, but it seems somewhat confusing to have a BUFFERPIN and a BUFFER wait event class. Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Discussion: https://postgr.es/m/fvfmkr5kk4nyex56ejgxj3uzi63isfxovp2biecb4bspbjrze7@az2pljabhnff	2025-12-03 18:38:20 -05:00
Tom Lane	8d61228717	Make stats_ext test faster under cache-clobbering test conditions. Commit `1eccb9315` added a test case that will cause a large number of evaluations of a plpgsql function. With -DCLOBBER_CACHE_ALWAYS, that takes an unreasonable amount of time (hours) because the function's cache entries are repeatedly deleted and rebuilt. That doesn't add any useful test coverage --- other test cases already exercise plpgsql well enough --- and it's not part of what this test intended to cover. We can get the same planner coverage, if not more, by making the test directly invoke numeric_lt(). Reported-by: Tomas Vondra <tomas@vondra.me> Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/baf1ae02-83bd-4f5d-872a-1d04f11a9073@vondra.me	2025-12-03 13:23:50 -05:00
Heikki Linnakangas	7b81be9b42	Add test for multixid wraparound Author: Andrey Borodin <amborodin@acm.org> Discussion: https://www.postgresql.org/message-id/7de697df-d74d-47db-9f73-e069b7349c4b@iki.fi	2025-12-03 19:39:34 +02:00
Heikki Linnakangas	789d65364c	Set next multixid's offset when creating a new multixid With this commit, the next multixid's offset will always be set on the offsets page, by the time that a backend might try to read it, so we no longer need the waiting mechanism with the condition variable. In other words, this eliminates "corner case 2" mentioned in the comments. The waiting mechanism was broken in a few scenarios: - When nextMulti was advanced without WAL-logging the next multixid. For example, if a later multixid was already assigned and WAL-logged before the previous one was WAL-logged, and then the server crashed. In that case the next offset would never be set in the offsets SLRU, and a query trying to read it would get stuck waiting for it. Same thing could happen if pg_resetwal was used to forcibly advance nextMulti. - In hot standby mode, a deadlock could happen where one backend waits for the next multixid assignment record, but WAL replay is not advancing because of a recovery conflict with the waiting backend. The old TAP test used carefully placed injection points to exercise the old waiting code, but now that the waiting code is gone, much of the old test is no longer relevant. Rewrite the test to reproduce the IPC/MultixactCreation hang after crash recovery instead, and to verify that previously recorded multixids stay readable. Backpatch to all supported versions. In back-branches, we still need to be able to read WAL that was generated before this fix, so in the back-branches this includes a hack to initialize the next offsets page when replaying XLOG_MULTIXACT_CREATE_ID for the last multixid on a page. On 'master', bump XLOG_PAGE_MAGIC instead to indicate that the WAL is not compatible. Author: Andrey Borodin <amborodin@acm.org> Reviewed-by: Dmitry Yurichev <dsy.075@yandex.ru> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Reviewed-by: Ivan Bykov <i.bykov@modernsys.ru> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://www.postgresql.org/message-id/172e5723-d65f-4eec-b512-14beacb326ce@yandex.ru Backpatch-through: 14	2025-12-03 19:15:08 +02:00
Álvaro Herrera	be25c77677	Put back alternative-output expected files These were removed in `5dee7a603f`, but that was too optimistic, per buildfarm member prion as reported by Tom Lane. Mea (Álvaro's) culpa. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/570630.1764737028@sss.pgh.pa.us	2025-12-03 16:37:06 +01:00
Peter Eisentraut	8c6bbd674e	Use more appropriate DatumGet* function Use DatumGetCString() instead of DatumGetPointer() for returning a C string. Right now, they are the same, but that doesn't always have to be so. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Reviewed-by: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Discussion: https://www.postgresql.org/message-id/4154950a-47ae-4223-bd01-1235cc50e933%40eisentraut.org	2025-12-03 08:52:28 +01:00
Heikki Linnakangas	cbe04e5d72	Fix amcheck's handling of half-dead B-tree pages amcheck incorrectly reported the following error if there were any half-dead pages in the index: ERROR: mismatch between parent key and child high key in index "amchecktest_id_idx" It's expected that a half-dead page does not have a downlink in the parent level, so skip the test. Reported-by: Konstantin Knizhnik <knizhnik@garret.ru> Reviewed-by: Peter Geoghegan <pg@bowt.ie> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru Backpatch-through: 14	2025-12-02 21:11:15 +02:00
Heikki Linnakangas	c085aab278	Add a test for half-dead pages in B-tree indexes To increase our test coverage in general, and because I will use this in the next commit to test a bug we currently have in amcheck. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/33e39552-6a2a-46f3-8b34-3f9f8004451f@garret.ru	2025-12-02 21:11:05 +02:00
Heikki Linnakangas	6c05ef5729	Fix amcheck's handling of incomplete root splits in B-tree When the root page is being split, it's normal that root page according to the metapage is not marked BTP_ROOT. Fix bogus error in amcheck about that case. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi Backpatch-through: 14	2025-12-02 21:10:51 +02:00
Heikki Linnakangas	1e4e5783e7	Add a test for incomplete splits in B-tree indexes To increase our test coverage in general, and because I will add onto this in the next commit to also test amcheck with incomplete splits. This is copied from the similar test we had for GIN indexes. B-tree's incomplete splits work similarly to GIN's, so with small changes, the same test works for B-tree too. Reviewed-by: Peter Geoghegan <pg@bowt.ie> Discussion: https://www.postgresql.org/message-id/abd65090-5336-42cc-b768-2bdd66738404@iki.fi	2025-12-02 21:10:47 +02:00
Nathan Bossart	f894acb24a	Show size of DSAs and dshashes in pg_dsm_registry_allocations. Presently, this view reports NULL for the size of DSAs and dshash tables because 1) the current backend might not be attached to them and 2) the registry doesn't save the pointers to the dsa_area or dshash_table in local memory. Also, the view doesn't show partially-initialized entries to avoid ambiguity, since those entries would report a NULL size as well. This commit introduces a function that looks up the size of a DSA given its handle (transiently attaching to the control segment if needed) and teaches pg_dsm_registry_allocations to use it to show the size of successfully-initialized DSA and dshash entries. Furthermore, the view now reports partially-initialized entries with a NULL size. Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aSeEDeznAsHR1_YF%40nathan	2025-12-02 10:29:45 -06:00
Álvaro Herrera	5dee7a603f	Avoid use of NOTICE to wait for snapshot invalidation This idea (implemented in commits and `bc32a12e0d` and `9e8fa05d34`) of using notices to detect that a session is sleeping was unreliable, so simplify the concurrency controller session to just look at pg_stat_activity for a process sleeping on the injection point we want it to hit. This change allows us to remove a secondary injection point and the alternative expected output files. Reproduced by Alexander Lakhin following a report in buildfarm member skink (which runs the server under valgrind). Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reported-by: Alexander Lakhin <exclusion@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/3e302c96-cdd2-45ec-af84-03dbcdccde4a@gmail.com	2025-12-02 16:43:27 +01:00
Álvaro Herrera	90eae926ab	Fix ON CONFLICT with REINDEX CONCURRENTLY and partitions When planning queries with ON CONFLICT on partitioned tables, the indexes to consider as arbiters for each partition are determined based on those found in the parent table. However, it's possible for an index on a partition to be reindexed, and in that case, the auxiliary indexes created on the partition must be considered as arbiters as well; failing to do that may result in spurious "duplicate key" errors given sufficient bad luck. We fix that in this commit by matching every index that doesn't have a parent to each initially-determined arbiter index. Every unparented matching index is considered an additional arbiter index. Closely related to the fixes in `bc32a12e0d` and `2bc7e886fc`, and for identical reasons, not backpatched (for now) even though it's a longstanding issue. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com	2025-12-02 13:51:53 +01:00
Peter Eisentraut	4f941d432b	Remove useless casting to same type This removes some casts where the input already has the same type as the type specified by the cast. Their presence could cause risks of hiding actual type mismatches in the future or silently discarding qualifiers. It also improves readability. Same kind of idea as `7f798aca1d` and `ef8fe69360`. (This does not change all such instances, but only those hand-picked by the author.) Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Peter Eisentraut <peter@eisentraut.org> Discussion: https://www.postgresql.org/message-id/flat/aSQy2JawavlVlEB0%40ip-10-97-1-34.eu-west-3.compute.internal	2025-12-02 10:09:32 +01:00
Álvaro Herrera	2bc7e886fc	Fix ON CONFLICT ON CONSTRAINT during REINDEX CONCURRENTLY When REINDEX CONCURRENTLY is processing the index that supports a constraint, there are periods during which multiple indexes match the constraint index's definition. Those must all be included in the set of inferred index for INSERT ON CONFLICT, in order to avoid spurious "duplicate key" errors. To fix, we set things up to match all indexes against attributes, expressions and predicates of the constraint index, then return all indexes that match those, rather than just the one constraint index. This is more onerous than before, where we would just test the named constraint for validity, but it's not more onerous than processing "conventional" inference (where a list of attribute names etc is given). This is closely related to the misbehaviors fixed by `bc32a12e0d`, for a different situation. We're not backpatching this one for now either, for the same reasons. Author: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Discussion: https://postgr.es/m/CANtu0ojXmqjmEzp-=aJSxjsdE76iAsRgHBoK0QtYHimb_mEfsg@mail.gmail.com	2025-12-01 17:34:13 +01:00
Dean Rasheed	3881561d77	Avoid rewriting data-modifying CTEs more than once. Formerly, when updating an auto-updatable view, or a relation with rules, if the original query had any data-modifying CTEs, the rewriter would rewrite those CTEs multiple times as RewriteQuery() recursed into the product queries. In most cases that was harmless, because RewriteQuery() is mostly idempotent. However, if the CTE involved updating an always-generated column, it would trigger an error because any subsequent rewrite would appear to be attempting to assign a non-default value to the always-generated column. This could perhaps be fixed by attempting to make RewriteQuery() fully idempotent, but that looks quite tricky to achieve, and would probably be quite fragile, given that more generated-column-type features might be added in the future. Instead, fix by arranging for RewriteQuery() to rewrite each CTE exactly once (by tracking the number of CTEs already rewritten as it recurses). This has the advantage of being simpler and more efficient, but it does make RewriteQuery() dependent on the order in which rewriteRuleAction() joins the CTE lists from the original query and the rule action, so care must be taken if that is ever changed. Reported-by: Bernice Southey <bernice.southey@gmail.com> Author: Bernice Southey <bernice.southey@gmail.com> Author: Dean Rasheed <dean.a.rasheed@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CAEDh4nyD6MSH9bROhsOsuTqGAv_QceU_GDvN9WcHLtZTCYM1kA@mail.gmail.com Backpatch-through: 14	2025-11-29 12:28:59 +00:00
Amit Kapila	e68b6adad9	Add slotsync_skip_reason column to pg_replication_slots view. Introduce a new column, slotsync_skip_reason, in the pg_replication_slots view. This column records the reason why the last slot synchronization was skipped. It is primarily relevant for logical replication slots on standby servers where the 'synced' field is true. The value is NULL when synchronization succeeds. Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: shveta malik <shveta.malik@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Hou Zhijie <houzj.fnst@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/CAE9k0PkhfKrTEAsGz4DjOhEj1nQ+hbQVfvWUxNacD38ibW3a1g@mail.gmail.com	2025-11-28 05:21:35 +00:00
Tom Lane	5528e8d104	Allow indexscans on partial hash indexes with implied quals. Normally, if a WHERE clause is implied by the predicate of a partial index, we drop that clause from the set of quals used with the index, since it's redundant to test it if we're scanning that index. However, if it's a hash index (or any !amoptionalkey index), this could result in dropping all available quals for the index's first key, preventing us from generating an indexscan. It's fair to question the practical usefulness of this case. Since hash only supports equality quals, the situation could only arise if the index's predicate is "WHERE indexkey = constant", implying that the index contains only one hash value, which would make hash a really poor choice of index type. However, perhaps there are other !amoptionalkey index AMs out there with which such cases are more plausible. To fix, just don't filter the candidate indexquals this way if the index is !amoptionalkey. That's a bit hokey because it may result in testing quals we didn't need to test, but to do it more accurately we'd have to redundantly identify which candidate quals are actually usable with the index, something we don't know at this early stage of planning. Doesn't seem worth the effort. Reported-by: Sergei Glukhov <s.glukhov@postgrespro.ru> Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/e200bf38-6b45-446a-83fd-48617211feff@postgrespro.ru Backpatch-through: 14	2025-11-27 13:09:59 -05:00
Álvaro Herrera	9e8fa05d34	Fix new test for CATCACHE_FORCE_RELEASE builds Two of the isolation tests introduce by commit `bc32a12e0d` had a problem under CATCACHE_FORCE_RELEASE, as evidenced by buildfarm member prion. An injection point is hit ahead of what the test spec expects, so a session goes to sleep and there's no one there to wait it up. Fix in the simplest possible way, which is to conditionally wake the process up if it's waiting. An alternative output file is necessary to cover both cases. This suggests a couple of possible improvements to the injection points infrastructure: a conditional wakeup (doing nothing if no one is sleeping, as opposed to throwing an error), as well as a way to attach to a point in "deactivated" mode, activated later. Author: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Mihail Nikalayeu <mihailnikalayeu@gmail.com> Discussion: https://postgr.es/m/202511261817.fyixgtt3hqdr@alvherre.pgsql	2025-11-27 13:10:56 +01:00
Amit Langote	519fa0433b	Fix error reporting for SQL/JSON path type mismatches transformJsonFuncExpr() used exprType()/exprLocation() on the possibly coerced path expression, which could be NULL when coercion to jsonpath failed, leading to "cache lookup failed for type 0" errors. Preserve the original expression node so that type and location in the "must be of type jsonpath" error are reported correctly. Add regression tests to cover these cases. Reported-by: Jian He <jian.universality@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Kirill Reshke <reshkekirill@gmail.com> Discussion: https://postgr.es/m/CACJufxHunVg81JMuNo8Yvv_hJD0DicgaVN2Wteu8aJbVJPBjZA@mail.gmail.com Backpatch-through: 17	2025-11-27 12:07:01 +09:00
David Rowley	0ca3b16973	Add parallelism support for TID Range Scans In v14, `bb437f995` added support for scanning for ranges of TIDs using a dedicated executor node for the purpose. Here, we allow these scans to be parallelized. The range of blocks to scan is divvied up similarly to how a Parallel Seq Scans does that, where 'chunks' of blocks are allocated to each worker and the size of those chunks is slowly reduced down to 1 block per worker by the time we're nearing the end of the scan. Doing that means workers finish at roughly the same time. Allowing TID Range Scans to be parallelized removes the dilemma from the planner as to whether a Parallel Seq Scan will cost less than a non-parallel TID Range Scan due to the CPU concurrency of the Seq Scan (disk costs are not divided by the number of workers). It was possible the planner could choose the Parallel Seq Scan which would result in reading additional blocks during execution than the TID Scan would have. Allowing Parallel TID Range Scans removes the trade-off the planner makes when choosing between reduced CPU costs due to parallelism vs additional I/O from the Parallel Seq Scan due to it scanning blocks from outside of the required TID range. There is also, of course, the traditional parallelism performance benefits to be gained as well, which likely doesn't need to be explained here. Author: Cary Huang <cary.huang@highgo.ca> Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Junwang Zhao <zhjwpku@gmail.com> Reviewed-by: Rafia Sabih <rafia.pghackers@gmail.com> Reviewed-by: Steven Niu <niushiji@gmail.com> Discussion: https://postgr.es/m/18f2c002a24.11bc2ab825151706.3749144144619388582@highgo.ca	2025-11-27 14:05:04 +13:00

1 2 3 4 5 ...

8620 Commits