postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-04 20:11:56 +03:00

Author	SHA1	Message	Date
Andres Freund	3f059dd397	ci: Fix Windows and MinGW task names They use Windows Server 2022, not 2019. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Discussion: https://postgr.es/m/flat/CAN55FZ1OsaM+852BMQDJ+Kgfg+07knJ6dM3PjbGbtYaK4qwfqA@mail.gmail.com	2025-10-30 13:07:03 -04:00
Michael Paquier	f3fb6bc9fe	Fix regression with slot invalidation checks This commit reverts `818fefd8fd`, that has been introduced to address a an instability in some of the TAP tests due to the presence of random standby snapshot WAL records, when slots are invalidated by InvalidatePossiblyObsoleteSlot(). Anyway, this commit had also the consequence of introducing a behavior regression. After `818fefd8fd`, the code may determine that a slot needs to be invalidated while it may not require one: the slot may have moved from a conflicting state to a non-conflicting state between the moment when the mutex is released and the moment when we recheck the slot, in InvalidatePossiblyObsoleteSlot(). Hence, the invalidations may be more aggressive than they actually have to. `105b2cb336` has tackled the test instability in a way that should be hopefully sufficient for the buildfarm, even for slow members: - In v18, the test relies on an injection point that bypasses the creation of the random records generated for standby snapshots, eliminating the random factor that impacted the test. This option was not available when `818fefd8fd` was discussed. - In v16 and v17, the problem was bypassed by disallowing a slot to become active in some of the scenarios tested. While on it, this commit adds a comment to document that it is fine for a recheck to use xmin and LSN values stored in the slot, without storing and reusing them across multiple checks. Reported-by: "suyu.cmj" <mengjuan.cmj@alibaba-inc.com> Author: Bertrand Drouvot <bertranddrouvot.pg@gmail.com> Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/f492465f-657e-49af-8317-987460cb68b0.mengjuan.cmj@alibaba-inc.com Backpatch-through: 16	2025-10-30 13:13:34 +09:00
David Rowley	bd6f986c9e	Fix bogus use of "long" in AllocSetCheck() Because long is 32-bit on 64-bit Windows, it isn't a good datatype to store the difference between 2 pointers. The under-sized type could overflow and lead to scary warnings in MEMORY_CONTEXT_CHECKING builds, such as: WARNING: problem in alloc set ExecutorState: bad single-chunk %p in block %p However, the problem lies only in the code running the check, not from an actual memory accounting bug. Fix by using "Size" instead of "long". This means using an unsigned type rather than the previous signed type. If the block's freeptr was corrupted, we'd still catch that if the unsigned type wrapped. Unsigned allows us to avoid further needless complexities around comparing signed and unsigned types. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvo-RmiT4s33J=aC9C_-wPZjOXQ232V-EZFgKftSsNRi4w@mail.gmail.com	2025-10-30 14:49:42 +13:00
Amit Kapila	0024f5a102	Fix GUC check_hook validation for synchronized_standby_slots. Previously, the check_hook for synchronized_standby_slots attempted to validate that each specified slot existed and was physical. However, these checks were not performed during server startup. As a result, if users configured non-existent slots before startup, the misconfiguration would go undetected initially. This could later cause parallel query failures, as newly launched workers would detect the issue and raise an ERROR. This patch improves the check_hook by validating the syntax and format of slot names. Validation of slot existence and type is deferred to the WAL sender process, aligning with the behavior of the check_hook for primary_slot_name. Reported-by: Fabrice Chapuis <fabrice636861@gmail.com> Author: Shlok Kyal <shlok.kyal.oss@gmail.com> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com> Reviewed-by: Rahila Syed <rahilasyed90@gmail.com> Backpatch-through: 17, where it was introduced Discussion: https://postgr.es/m/CAA5-nLCeO4MQzWipCXH58qf0arruiw0OeUc1+Q=Z=4GM+=v1NQ@mail.gmail.com	2025-10-27 06:34:29 +00:00
David Rowley	0d30746153	Fix incorrect logic for caching ResultRelInfos for triggers When dealing with ResultRelInfos for partitions, there are cases where there are mixed requirements for the ri_RootResultRelInfo. There are cases when the partition itself requires a NULL ri_RootResultRelInfo and in the same query, the same partition may require a ResultRelInfo with its parent set in ri_RootResultRelInfo. This could cause the column mapping between the partitioned table and the partition not to be done which could result in crashes if the column attnums didn't match exactly. The fix is simple. We now check that the ri_RootResultRelInfo matches what the caller passed to ExecGetTriggerResultRel() and only return a cached ResultRelInfo when the ri_RootResultRelInfo matches what the caller wants, otherwise we'll make a new one. Author: David Rowley <dgrowleyml@gmail.com> Author: Amit Langote <amitlangote09@gmail.com> Reported-by: Dmitry Fomin <fomin.list@gmail.com> Discussion: https://postgr.es/m/7DCE78D7-0520-4207-822B-92F60AEA14B4@gmail.com Backpatch-through: 15	2025-10-26 11:01:53 +13:00
Daniel Gustafsson	256698d267	doc: Remove mention of Git protocol support The project Git server hasn't supported cloning with the Git protocol in a very long time, but the documentation never got the memo. Remove the mention of using the Git protocol, and while there wrap a mention of Git in <productname> tags. Backpatch down to all supported versions. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Jacob Champion <jacob.champion@enterprisedb.com> Reviewed-by: Gurjeet Singh <gurjeet@singh.im> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CABwTF4WMiMb-KT2NRcib5W0C8TQF6URMb+HK9a_=rnZnY8Q42w@mail.gmail.com Backpatch-through: 13	2025-10-23 21:26:15 +02:00
Tom Lane	39d24475c1	Fix off-by-one Asserts in FreePageBtreeInsertInternal/Leaf. These two functions expect there to be room to insert another item in the FreePageBtree's array, but their assertions were too weak to guarantee that. This has little practical effect granting that the callers are not buggy, but it seems to be misleading late-model Coverity into complaining about possible array overrun. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/799984.1761150474@sss.pgh.pa.us Backpatch-through: 13	2025-10-23 12:32:26 -04:00
Tom Lane	fbc41a145a	Fix resource leaks in PL/Python error reporting, redux. Commit `c6f7f11d8` intended to prevent leaking any PyObject reference counts in edge cases (such as out-of-memory during string construction), but actually it introduced a leak in the normal case. Repeating an error-trapping operation often enough would lead to session-lifespan memory bloat. The problem is that I failed to think about the fact that PyObject_GetAttrString() increments the refcount of the returned PyObject, so that simply walking down the list of error frame objects causes all but the first one to have their refcount incremented. I experimented with several more-or-less-complex ways around that, and eventually concluded that the right fix is simply to drop the newly-obtained refcount as soon as we walk to the next frame object in PLy_traceback. This sounds unsafe, but it's perfectly okay because the caller holds a refcount on the first frame object and each frame object holds a refcount on the next one; so the current frame object can't disappear underneath us. By the same token, we can simplify the caller's cleanup back to simply dropping its refcount on the first object. Cleanup of each frame object will lead in turn to the refcount of the next one going to zero. I also added a couple of comments explaining why PLy_elog_impl() doesn't try to free the strings acquired from PLy_get_spi_error_data() or PLy_get_error_data(). That's because I got here by looking at a Coverity complaint about how those strings might get leaked. They are not leaked, but in testing that I discovered this other leak. Back-patch, as `c6f7f11d8` was. It's a bit nervous-making to be putting such a fix into v13, which is only a couple weeks from its final release; but I can't see that leaving a recently-introduced leak in place is a better idea. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/1203918.1761184159@sss.pgh.pa.us Backpatch-through: 13	2025-10-23 11:47:46 -04:00
Fujii Masao	8ceab82ca5	Add comments explaining overflow entries in the replication lag tracker. Commit `883a95646a` introduced overflow entries in the replication lag tracker to fix an issue where lag columns in pg_stat_replication could stall when the replay LSN stopped advancing. This commit adds comments clarifying the purpose and behavior of overflow entries to improve code readability and understanding. Since commit `883a95646a` was recently applied and backpatched to all supported branches, this follow-up commit is also backpatched accordingly. Author: Xuneng Zhou <xunengzhou@gmail.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/CABPTF7VxqQA_DePxyZ7Y8V+ErYyXkmwJ1P6NC+YC+cvxMipWKw@mail.gmail.com Backpatch-through: 13	2025-10-23 13:26:30 +09:00
Masahiko Sawada	9310be78ae	Add copyright notice to vacuum_horizon_floor.pl test. Fix oversight in commit `303ba0573`, which was backpatched through 14. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAD21AoBeFdTJcwUfUYPcEgONab3TS6i1PB9S5cSXcBAmdAdQKw%40mail.gmail.com Backpatch-through: 14	2025-10-22 17:17:43 -07:00
David Rowley	10945148ee	Fix incorrect zero extension of Datum in JIT tuple deform code When JIT deformed tuples (controlled via the jit_tuple_deforming GUC), types narrower than sizeof(Datum) would be zero-extended up to Datum width. This wasn't the same as what fetch_att() does in the standard tuple deforming code. Logically the values are the same when fetching via the DatumGet*() marcos, but negative numbers are not the same in binary form. In the report, the problem was manifesting itself with: ERROR: could not find memoization table entry in a query which had a "Cache Mode: binary" Memoize node. However, it's currently unclear what else is affected. Anything that uses datum_image_eq() or datum_image_hash() on a Datum from a tuple deformed by JIT could be affected, but it may not be limited to that. The fix for this is simple: use signed extension instead of zero extension. Many thanks to Emmanuel Touzery for reporting this issue and providing steps and backup which allowed the problem to easily be recreated. Reported-by: Emmanuel Touzery <emmanuel.touzery@plandela.si> Author: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/DB8P194MB08532256D5BAF894F241C06393F3A@DB8P194MB0853.EURP194.PROD.OUTLOOK.COM Backpatch-through: 13	2025-10-23 13:12:49 +13:00
Tom Lane	4eb6992af1	Fix memory leaks in pg_combinebackup/reconstruct.c. One code path forgot to free the separately-malloc'd filename part of a struct rfile. Another place freed the filename but forgot the struct rfile itself. These seem worth fixing because with a large backup we could be dealing with many files. Coverity found the bug in make_rfile(). I found the other one by manual inspection.	2025-10-22 13:38:37 -04:00
Tom Lane	5c659da980	Add commit `24f6c1bd4` to v17 .abi-compliance-history. Per buildfarm. The breakage was discussed and deemed acceptable in the patch thread, but it was not mentioned in the commit log, so I missed spotting this one yesterday. Discussion: https://postgr.es/m/787065.1761144050@sss.pgh.pa.us Backpatch-through: 17 only	2025-10-22 11:13:00 -04:00
Fujii Masao	1db2870bb5	Make invalid primary_slot_name follow standard GUC error reporting. Previously, if primary_slot_name was set to an invalid slot name and the configuration file was reloaded, both the postmaster and all other backend processes reported a WARNING. With many processes running, this could produce a flood of duplicate messages. The problem was that the GUC check hook for primary_slot_name reported errors at WARNING level via ereport(). This commit changes the check hook to use GUC_check_errdetail() and GUC_check_errhint() for error reporting. As with other GUC parameters, this causes non-postmaster processes to log the message at DEBUG3, so by default, only the postmaster's message appears in the log file. Backpatch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de> Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com> Discussion: https://postgr.es/m/CAHGQGwFud-cvthCTfusBfKHBS6Jj6kdAPTdLWKvP2qjUX6L_wA@mail.gmail.com Backpatch-through: 13	2025-10-22 20:11:47 +09:00
Fujii Masao	62d5ee75bb	Fix stalled lag columns in pg_stat_replication when replay LSN stops advancing. Previously, when the replay LSN reported in feedback messages from a standby stopped advancing, for example, due to a recovery conflict, the write_lag and flush_lag columns in pg_stat_replication would initially update but then stop progressing. This prevented users from correctly monitoring replication lag. The problem occurred because when any LSN stopped updating, the lag tracker's cyclic buffer became full (the write head reached the slowest read head). In that state, the lag tracker could no longer compute round-trip lag values correctly. This commit fixes the issue by handling the slowest read entry (the one causing the buffer to fill) as a separate overflow entry and freeing space so the write and other read heads can continue advancing in the buffer. As a result, write_lag and flush_lag now continue updating even if the reported replay LSN remains stalled. Backpatch to all supported versions. Author: Fujii Masao <masao.fujii@gmail.com> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Shinya Kato <shinya11.kato@gmail.com> Reviewed-by: Xuneng Zhou <xunengzhou@gmail.com> Discussion: https://postgr.es/m/CAHGQGwGdGQ=1-X-71Caee-LREBUXSzyohkoQJd4yZZCMt24C0g@mail.gmail.com Backpatch-through: 13	2025-10-22 11:28:57 +09:00
Nathan Bossart	a91f2d8225	Add .abi-compliance-history to back-branches. This file was previously added to v18 by commits `a72f7d97be` and `93fb76ca4e`. Unlike the v18 version of the file, the back-branch versions set the original baseline point to the most recent ABI break documented in the git commit history. While we'd ordinarily set it to something just before the .0 release, we're unlikely to act upon ABI breaks in released minor versions, so it doesn't seem worth the trouble to construct a comprehensive history. Discussion: https://postgr.es/m/aPfDOD6F4FaJJd7M%40nathan Backpatch-through: 13-17	2025-10-21 16:37:29 -05:00
Nathan Bossart	e9514a2f73	Add previous commit to .git-blame-ignore-revs. Backpatch-through: 13	2025-10-21 10:02:19 -05:00
Nathan Bossart	c089712517	Re-pgindent brin.c. Backpatch-through: 13	2025-10-21 09:56:26 -05:00
David Rowley	c4f5a59ab1	Fix BRIN 32-bit counter wrap issue with huge tables A BlockNumber (32-bit) might not be large enough to add bo_pagesPerRange to when the table contains close to 2^32 pages. At worst, this could result in a cancellable infinite loop during the BRIN index scan with power-of-2 pagesPerRange, and slow (inefficient) BRIN index scans and scanning of unneeded heap blocks for non power-of-2 pagesPerRange. Backpatch to all supported versions. Author: sunil s <sunilfeb26@gmail.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CAOG6S4-tGksTQhVzJM19NzLYAHusXsK2HmADPZzGQcfZABsvpA@mail.gmail.com Backpatch-through: 13	2025-10-21 20:47:10 +13:00
Michael Paquier	199a0ecbcf	Fix POSIX compliance in pgwin32_unsetenv() for "name" argument pgwin32_unsetenv() (compatibility routine of unsetenv() on Windows) lacks the input validation that its sibling pgwin32_setenv() has. Without these checks, calling unsetenv() with incorrect names crashes on WIN32. However, invalid names should be handled, failing on EINVAL. This commit adds the same checks as setenv() to fail with EINVAL for a "name" set to NULL, an empty string, or if '=' is included in the value, per POSIX requirements. Like `7ca37fb040`, backpatch down to v14. pgwin32_unsetenv() is defined on REL_13_STABLE, but with the branch going EOL soon and the lack of setenv() there for WIN32, nothing is done for v13. Author: Bryan Green <dbryan.green@gmail.com> Discussion: https://postgr.es/m/b6a1e52b-d808-4df7-87f7-2ff48d15003e@gmail.com Backpatch-through: 14	2025-10-21 08:08:35 +09:00
Tom Lane	2fa8f94db1	Fix thinko in commit `7d129ba54`. The revised logic in 001_ssltests.pl would fail if openssl doesn't work or if Perl is a 32-bit build, because it had already overwritten $serialno with something inappropriate to use in the eventual match. We could go back to the previous code layout, but it seems best to introduce a separate variable for the output of openssl. Per failure on buildfarm member mamba, which has a 32-bit Perl.	2025-10-20 08:45:57 -04:00
Tom Lane	2efca16339	Don't rely on zlib's gzgetc() macro. It emerges that zlib's configuration logic is not robust enough to guarantee that the macro will have the same ideas about struct field layout as the library itself does, leading to corruption of zlib's state struct followed by unintelligible failure messages. This hazard has existed for a long time, but we'd not noticed for several reasons: (1) We only use gzgetc() when trying to read a manually-compressed TOC file within a directory-format dump, which is a rarely-used scenario that we weren't even testing before `20ec99589`. (2) No corruption actually occurs unless sizeof(long) is different from sizeof(off_t) and the platform is big-endian. (3) Some platforms have already fixed the configuration instability, at least sufficiently for their environments. Despite (3), it seems foolish to assume that the problem isn't going to be present in some environments for a long time to come. Hence, avoid relying on this macro. We can just #undef it and fall back on the underlying function of the same name. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/2122679.1760846783@sss.pgh.pa.us Backpatch-through: 13	2025-10-19 14:36:58 -04:00
Tom Lane	d4e8c37cca	Allow role created by new test to log in on Windows. We must tell init about each role name we plan to connect as, else SSPI auth fails. Similar to previous patches such as `14793f471`, `973542866`. Oversight in `208927e65`, per buildfarm member drongo. (Although that was back-patched to v13, the test script only exists in v16 and up.)	2025-10-18 18:36:21 -04:00
Álvaro Herrera	c945b06d5f	Fix determination of not-null constraint "locality" for inherited columns It is possible to have a non-inherited not-null constraint on an inherited column, but we were failing to preserve such constraints during pg_upgrade where the source is 17 or older, because of a bug in the pg_dump query for it. Oversight in commit `14e87ffa5c`. Fix that query. In passing, touch-up a bogus nearby comment introduced by the same commit. In version 17, make the regression tests leave a table in this situation, so that this scenario is tested in the cross-version upgrade tests of 18 and up. Author: Dilip Kumar <dilipbalaut@gmail.com> Reported-by: Andrew Bille <andrewbille@gmail.com> Bug: #19074 Backpatch-through: 18 Discussion: https://postgr.es/m/19074-ae2548458cf0195c@postgresql.org	2025-10-18 18:18:19 +02:00
Álvaro Herrera	7419c99a25	Fix pg_dump sorting of foreign key constraints Apparently, commit `04bc2c42f7` failed to notice that DO_FK_CONSTRAINT objects require identical handling as DO_CONSTRAINT ones, which causes some pg_upgrade tests in debug builds to fail spuriously. Add that. Author: Álvaro Herrera <alvherre@kurilemu.de> Backpatch-through: 13 Discussion: https://postgr.es/m/202510181201.k6y75v2tpf5r@alvherre.pgsql	2025-10-18 17:50:10 +02:00
Nathan Bossart	a0551bc573	Fix privilege checks for pg_prewarm() on indexes. pg_prewarm() currently checks for SELECT privileges on the target relation. However, indexes do not have access rights of their own, so a role may be denied permission to prewarm an index despite having the SELECT privilege on its parent table. This commit fixes this by locking the parent table before the index (to avoid deadlocks) and checking for SELECT on the parent table. Note that the code is largely borrowed from amcheck_lock_relation_and_check(). An obvious downside of this change is the extra AccessShareLock on the parent table during prewarming, but that isn't expected to cause too much trouble in practice. Author: Ayush Vatsa <ayushvatsa1810@gmail.com> Co-authored-by: Nathan Bossart <nathandbossart@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Jeff Davis <pgsql@j-davis.com> Discussion: https://postgr.es/m/CACX%2BKaMz2ZoOojh0nQ6QNBYx8Ak1Dkoko%3DD4FSb80BYW%2Bo8CHQ%40mail.gmail.com Backpatch-through: 13	2025-10-17 11:36:50 -05:00
Daniel Gustafsson	f01c4eb4e9	Avoid warnings in tests when openssl binary isn't available The SSL tests for pg_stat_ssl tries to exactly match the serial from the certificate by extracting it with the openssl binary. If that fails due to the binary not being available, a fallback match is used, but the attempt to execute a missing binary adds a warning to the output which can confuse readers for a failure in the test. Fix by only attempting if the openssl binary was found by autoconf/meson. Backpatch down to v16 where commit `c8e4030d1b` made the test use the OPENSSL variable from autoconf/meson instead of a hard- coded value. Author: Daniel Gustafsson <daniel@yesql.se> Reported-by: Christoph Berg <myon@debian.org> Discussion: https://postgr.es/m/aNPSp1-RIAs3skZm@msg.df7cb.de Backpatch-through: 16	2025-10-17 14:21:26 +02:00
Michael Paquier	61b5fa029a	Fix matching check in recovery test 042_low_level_backup 042_low_level_backup compared the result of a query two times with a comparison operator based on an integer, while the result should be compared with a string. The outcome of the tests is currently not impacted by this change. However, it could be possible that the tests fail to detect future issues if the query results become different, for some reason. Oversight in `99b4a63bef`. Author: Sadhuprasad Patro <b.sadhu@gmail.com> Discussion: https://postgr.es/m/CAFF0-CHhwNx_Cv2uy7tKjODUbeOgPrJpW4Rpf1jqB16_1bU2sg@mail.gmail.com Backpatch-through: 17	2025-10-17 13:06:11 +09:00
Álvaro Herrera	a8933194e5	Fix update-po for the PGXS case The original formulation failed to take into account the fact that for the PGXS case, the source dir is not $(top_srcdir), so it ended up not doing anything. Handle it explicitly. Author: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Bryan Green <dbryan.green@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/TYCPR01MB113164770FB0B0BE6ED21E68EE8DCA@TYCPR01MB11316.jpnprd01.prod.outlook.com	2025-10-16 20:21:05 +02:00
Etsuro Fujita	2bb84ea7ef	Fix EvalPlanQual handling of foreign/custom joins in ExecScanFetch. If inside an EPQ recheck, ExecScanFetch would run the recheck method function for foreign/custom joins even if they aren't descendant nodes in the EPQ recheck plan tree, which is problematic at least in the foreign-join case, because such a foreign join isn't guaranteed to have an alternative local-join plan required for running the recheck method function; in the postgres_fdw case this could lead to a segmentation fault or an assert failure in an assert-enabled build when running the recheck method function. Even if inside an EPQ recheck, any scan nodes that aren't descendant ones in the EPQ recheck plan tree should be normally processed by using the access method function; fix by modifying ExecScanFetch so that if inside an EPQ recheck, it runs the recheck method function for foreign/custom joins that are descendant nodes in the EPQ recheck plan tree as before and runs the access method function for foreign/custom joins that aren't. This fix also adds to postgres_fdw an isolation test for an EPQ recheck that caused issues stated above. Oversight in commit `385f337c9`. Reported-by: Kristian Lejao <kristianlejao@gmail.com> Author: Masahiko Sawada <sawada.mshk@gmail.com> Co-authored-by: Etsuro Fujita <etsuro.fujita@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Reviewed-by: Etsuro Fujita <etsuro.fujita@gmail.com> Discussion: https://postgr.es/m/CAD21AoBpo6Gx55FBOW+9s5X=nUw3Xpq64v35fpDEKsTERnc4TQ@mail.gmail.com Backpatch-through: 13	2025-10-15 17:15:02 +09:00
Michael Paquier	a734677081	Fix version number calculation for data folder flush in pg_combinebackup The version number calculated by read_pg_version_file() is multiplied once by 10000, to be able to do comparisons based on PG_VERSION_NUM or equivalents with a minor version included. However, the version number given sync_pgdata() was multiplied by 10000 a second time, leading to an overestimated number. This issue was harmless (still incorrect) as pg_combinebackup does not support versions of Postgres older than v10, and sync_pgdata() only includes a version check due to the rename of pg_xlog/ to pg_wal/. This folder rename happened in the development cycle of v10. This would become a problem if in the future sync_pgdata() is changed to have more version-specific checks. Oversight in `dc21234005`, so backpatch down to v17. Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/aOil5d0y87ZM_wsZ@paquier.xyz Backpatch-through: 17	2025-10-14 08:31:25 +09:00
Tom Lane	4c53519e15	Fix incorrect message-printing in win32security.c. log_error() would probably fail completely if used, and would certainly print garbage for anything that needed to be interpolated into the message, because it was failing to use the correct printing subroutine for a va_list argument. This bug likely went undetected because the error cases this code is used for are rarely exercised - they only occur when Windows security API calls fail catastrophically (out of memory, security subsystem corruption, etc). The FRONTEND variant can be fixed just by calling vfprintf() instead of fprintf(). However, there was no va_list variant of write_stderr(), so create one by refactoring that function. Following the usual naming convention for such things, call it vwrite_stderr(). Author: Bryan Green <dbryan.green@gmail.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAF+pBj8goe4fRmZ0V3Cs6eyWzYLvK+HvFLYEYWG=TzaM+tWPnw@mail.gmail.com Backpatch-through: 13	2025-10-13 17:56:45 -04:00
David Rowley	f428b1231c	Doc: clarify n_distinct_inherited setting There was some confusion around how to adjust the n_distinct estimates for partitioned tables. Here we try and clarify that n_distinct_inherited needs to be adjusted rather than n_distinct. Also fix some slightly misleading text which was talking about table size rather than table rows, fix a grammatical error, and adjust some text which indicated that ANALYZE was performing calculations based on the n_distinct settings. Really it's the query planner that does this and ANALYZE only stores the overridden n_distinct estimate value in pg_statistic. Author: David Rowley <dgrowleyml@gmail.com> Reviewed-by: David G. Johnston <david.g.johnston@gmail.com> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Backpatch-through: 13 Discussion: https://postgr.es/m/CAApHDvrL7a-ZytM1SP8Uk9nEw9bR2CPzVb+uP+bcNj=_q-ZmVw@mail.gmail.com	2025-10-14 09:25:57 +13:00
Tom Lane	bf18e9bd70	Fix issue with reading zero bytes in Gzip_read. pg_dump expects a read request of zero bytes to be a no-op; see for example ReadStr(). Gzip_read got this wrong and falsely supposed that the resulting gzret == 0 indicated an error. We could complicate that error-checking logic some more, but it seems best to just fall out immediately when passed size == 0. This bug breaks the nominally-supported case of manually gzip'ing the toc.dat file within a directory-style dump, so back-patch to v16 where this code came in. (Prior branches already have a short-circuit for size == 0 before their only gzread call.) Author: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Chao Li <li.evan.chao@gmail.com> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us Backpatch-through: 16	2025-10-13 12:44:20 -04:00
Tom Lane	eac2b1697d	Restore test coverage of LZ4Stream_gets(). In commit `a45c78e32` I removed the only regression test case that reaches this function, because it turns out that we only use it if reading an LZ4-compressed blobs.toc file in a directory dump, and that is a state that has to be created manually. That seems like a bad thing to not test, not so much for LZ4Stream_gets() itself as because it means the squirrely eol_flag logic in LZ4Stream_read_internal() is not tested. The reason for the change was that I thought the lz4 program did not have any way to perform compression without explicit specification of the output file name. However, it turns out that the syntax synopsis in its man page is a lie, and if you read enough of the man page you find out that with "-m" it will do what's needful. So restore the manual compression step in that test case. Noted while testing some proposed changes in pg_dump's compression logic. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/3515357.1760128017@sss.pgh.pa.us Backpatch-through: 17	2025-10-11 16:33:55 -04:00
Álvaro Herrera	ea06f97eeb	Stop creating constraints during DETACH CONCURRENTLY Commit `71f4c8c6f7` (which implemented DETACH CONCURRENTLY) added code to create a separate table constraint when a table is detached concurrently, identical to the partition constraint, on the theory that such a constraint was needed in case the optimizer had constructed any query plans that depended on the constraint being there. However, that theory was apparently bogus because any such plans would be invalidated. For hash partitioning, those constraints are problematic, because their expressions reference the OID of the parent partitioned table, to which the detached table is no longer related; this causes all sorts of problems (such as inability of restoring a pg_dump of that table, and the table no longer working properly if the partitioned table is later dropped). We'd like to get rid of all those constraints. In fact, for branch master, do that -- no longer create any substitute constraints. However, out of fear that some users might somehow depend on these constraints for other partitioning strategies, for stable branches (back to 14, which added DETACH CONCURRENTLY), only do it for hash partitioning. (If you repeatedly DETACH CONCURRENTLY and then ATTACH a partition, then with this constraint addition you don't need to scan the table in the ATTACH step, which presumably is good. But if users really valued this feature, they would have requested that it worked for non-concurrent DETACH also.) Author: Haiyang Li <mohen.lhy@alibaba-inc.com> Reported-by: Fei Changhong <feichanghong@qq.com> Reported-by: Haiyang Li <mohen.lhy@alibaba-inc.com> Backpatch-through: 14 Discussion: https://postgr.es/m/18371-7fef49f63de13f02@postgresql.org Discussion: https://postgr.es/m/19070-781326347ade7c57@postgresql.org	2025-10-11 20:30:12 +02:00
Álvaro Herrera	f9993ac646	dbase_redo: Fix Valgrind-reported memory leak Introduced by my (Álvaro's) commit `9e4f914b5e`, which was itself backpatched to pg10, though only pg15 and up contain the problem because of commit `9c08aea6a3`. This isn't a particularly significant leak, but given the fix is trivial, we might as well backpatch to all branches where it applies, so do that. Author: Nathan Bossart <nathandbossart@gmail.com> Reported-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/x4odfdlrwvsjawscnqsqjpofvauxslw7b4oyvxgt5owoyf4ysn@heafjusodrz7	2025-10-11 16:39:22 +02:00
Peter Geoghegan	ae15cebc20	Remove overzealous _bt_killitems assertion. An assertion in _bt_killitems expected the scan's currPos state to contain a valid LSN, saved from when currPos's page was initially read. The assertion failed to account for the fact that even logged relations can have leaf pages with an invalid LSN when built with wal_level set to "minimal". Remove the faulty assertion. Oversight in commit `e6eed40e` (though note that the assertion was backpatched to stable branches before 18 by commit `7c319f54`). Author: Peter Geoghegan <pg@bowt.ie> Reported-By: Matthijs van der Vleuten <postgresql@zr40.nl> Bug: #19082 Discussion: https://postgr.es/m/19082-628e62160dbbc1c1@postgresql.org Backpatch-through: 13	2025-10-10 14:52:21 -04:00
Michael Paquier	ede2f6b893	Fix two typos in xlogstats.h and xlogstats.c Issue found while browsing this area of the code, introduced and copy-pasted around by `2258e76f90`. Backpatch-through: 15	2025-10-10 11:51:51 +09:00
Michael Paquier	42348839d8	Remove state.tmp when failing to save a replication slot An error happening while a slot data is saved on disk in SaveSlotToPath() could cause a state.tmp file (temporary file holding the slot state data, renamed to its permanent name at the end of the function) to remain around after it has been created. This temporary file is created with O_EXCL, meaning that if an existing state.tmp is found, its creation would fail. This would prevent the slot data to be saved, requiring a manual intervention to remove state.tmp before being able to save again a slot. Possible scenarios where this temporary file could remain on disk is for example a ENOSPC case (no disk space) while writing, syncing or renaming it. The bug reports point to a write failure as the principal cause of the problems. Using O_TRUNC has been argued back in 2019 as a potential solution to discard any temporary file that could exist. This solution was rejected as O_EXCL can also act as a safety measure when saving the slot state, crash recovery offering cleanup guarantees post-crash. This commit uses the alternative approach that has been suggested by Andres Freund back in 2019. When the temporary state file cannot be written, synced, closed or renamed (note: not when created!), an unlink() is used to remove the temporary state file while holding the in-progress I/O LWLock, so as any follow-up attempts to save a slot's data would not choke on an existing file that remained around because of a previous failure. This problem has been reported a few times across the years, going back to 2019, but for some reason I have never come back to do something about it and it has been forgotten. A recent report has reminded me that this was still a problem. Reported-by: Kevin K Biju <kevinkbiju@gmail.com> Reported-by: Sergei Kornilov <sk@zsrv.org> Reported-by: Grigory Smolkin <g.smolkin@postgrespro.ru> Discussion: https://postgr.es/m/CAM45KeHa32soKL_G8Vk38CWvTBeOOXcsxAPAs7Jt7yPRf2mbVA@mail.gmail.com Discussion: https://postgr.es/m/3559061693910326@qy4q4a6esb2lebnz.sas.yp-c.yandex.net Discussion: https://postgr.es/m/08bbfab1-a61d-3750-fc18-4ab2c1aa7f09@postgrespro.ru Backpatch-through: 13	2025-10-10 09:24:50 +09:00
Masahiko Sawada	a61592253e	Fix access-to-already-freed-memory issue in pgoutput. While pgoutput caches relation synchronization information in RelationSyncCache that resides in CacheMemoryContext, each entry's information (such as row filter expressions and column lists) is stored in the entry's private memory context (entry_cxt in RelationSyncEntry), which is a descendant memory context of the decoding context. If a logical decoding invoked via SQL functions like pg_logical_slot_get_binary_changes fails with an error, subsequent logical decoding executions could access already-freed memory of the entry's cache, resulting in a crash. With this change, it's ensured that RelationSyncCache is cleaned up even in error cases by using a memory context reset callback function. Backpatch to 15, where entry_cxt was introduced for column filtering and row filtering. While the backbranches v13 and v14 have a similar issue where RelationSyncCache persists even after an error when pgoutput is used via SQL API, we decided not to backport this fix. This decision was made because v13 is approaching its final minor release, and we won't have an chance to fix any new issues that might arise. Additionally, since using pgoutput via SQL API is not a common use case, the risk outwights the benefit. If we receive bug reports, we can consider backporting the fixes then. Author: vignesh C <vignesh21@gmail.com> Co-authored-by: Masahiko Sawada <sawada.mshk@gmail.com> Reviewed-by: Zhijie Hou <houzj.fnst@fujitsu.com> Reviewed-by: Euler Taveira <euler@eulerto.com> Discussion: https://postgr.es/m/CALDaNm0x-aCehgt8Bevs2cm=uhmwS28MvbYq1=s2Ekf0aDPkOA@mail.gmail.com Backpatch-through: 15	2025-10-09 10:59:31 -07:00
Amit Langote	09f86a42f2	Fix internal error from CollateExpr in SQL/JSON DEFAULT expressions SQL/JSON functions such as JSON_VALUE could fail with "unrecognized node type" errors when a DEFAULT clause contained an explicit COLLATE expression. That happened because assign_collations_walker() could invoke exprSetCollation() on a JsonBehavior expression whose DEFAULT still contained a CollateExpr, which exprSetCollation() does not handle. For example: SELECT JSON_VALUE('{"a":1}', '$.c' RETURNING text DEFAULT 'A' COLLATE "C" ON EMPTY); Fix by validating in transformJsonBehavior() that the DEFAULT expression's collation matches the enclosing JSON expression’s collation. In exprSetCollation(), replace the recursive call on the JsonBehavior expression with an assertion that its collation already matches the target, since the parser now enforces that condition. Reported-by: Jian He <jian.universality@gmail.com> Author: Jian He <jian.universality@gmail.com> Reviewed-by: Amit Langote <amitlangote09@gmail.com> Discussion: https://postgr.es/m/CACJufxHVwYYSyiVQ6o+PsRX6zQ7rAFinh_fv1kCfTsT1xG4Zeg@mail.gmail.com Backpatch-through: 17	2025-10-09 01:07:36 -04:00
Tom Lane	1c4671f7b7	Use SOCK_ERRNO[_SET] in fe-secure-gssapi.c. On Windows, this code did not handle error conditions correctly at all, since it looked at "errno" which is not used for socket-related errors on that platform. This resulted, for example, in failure to connect to a PostgreSQL server with GSSAPI enabled. We have a convention for dealing with this within libpq, which is to use SOCK_ERRNO and SOCK_ERRNO_SET rather than touching errno directly; but the GSSAPI code is a relative latecomer and did not get that memo. (The equivalent backend code continues to use errno, because the backend does this differently. Maybe libpq's approach should be rethought someday.) Apparently nobody tries to build libpq with GSSAPI support on Windows, or we'd have heard about this before, because it's been broken all along. Back-patch to all supported branches. Author: Ning Wu <ning94803@gmail.com> Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAFGqpvg-pRw=cdsUpKYfwY6D3d-m9tw8WMcAEE7HHWfm-oYWvw@mail.gmail.com Backpatch-through: 13	2025-10-05 16:27:47 -04:00
John Naylor	3549ffb6af	Fix reuse-after-free hazard in dead_items_reset In similar vein to commit `ccc8194e42`, a reset instance of a shared memory TID store happened to occupy the same private memory as the old one for the entry point, since the chunk freed after the last round of index vacuuming was put on the context's freelist. The failure to update the vacrel->dead_items pointer was evident by nudging the system to allocate memory in a different area. This was not discovered at the time of the earlier commit since our regression tests didn't cover multiple index passes with parallel vacuum. Backpatch to v17, when TidStore came in. Author: Kevin Oommen Anish <kevin.o@zohocorp.com> Reviewed-by: Richard Guo <guofenglinux@gmail.com> Tested-by: Richard Guo <guofenglinux@gmail.com> Discussion: https://postgr.es/m/199a07cbdfc.7a1c4aac25838.1675074408277594551%40zohocorp.com Backpatch-through: 17	2025-10-03 16:07:34 +07:00
Michael Paquier	de6de069d9	pgbench: Fail cleanly when finding a COPY result state Currently, pgbench aborts when a COPY response is received in readCommandResponse(). However, as PQgetResult() returns an empty result when there is no asynchronous result, through getCopyResult(), the logic done at the end of readCommandResponse() for the error path leads to an infinite loop. This commit forcefully exits the COPY state with PQendcopy() before moving to the error handler when fiding a COPY state, avoiding the infinite loop. The COPY protocol is not supported by pgbench anyway, as an error is assumed in this case, so giving up is better than having the tool be stuck forever. pgbench was interruptible in this state. A TAP test is added to check that an error happens if trying to use COPY. Author: Anthonin Bonnefoy <anthonin.bonnefoy@datadoghq.com> Discussion: https://postgr.es/m/CAO6_XqpHyF2m73ifV5a=5jhXxH2chk=XrgefY+eWWPe2Eft3=A@mail.gmail.com Backpatch-through: 13	2025-10-03 14:04:01 +09:00
Michael Paquier	036decbba2	pgstattuple: Improve reports generated for indexes (hash, gist, btree) pgstattuple checks the state of the pages retrieved for gist and hash using some check functions from each index AM, respectively gistcheckpage() and _hash_checkpage(). When these are called, they would fail when bumping on data that is found as incorrect (like opaque area size not matching, or empty pages), contrary to btree that simply discards these cases and continues to aggregate data. Zero pages can happen after a crash, with these AMs being able to do an internal cleanup when these are seen. Also, sporadic failures are annoying when doing for example a large-scale diagnostic query based on pgstattuple with a join of pg_class, as it forces one to use tricks like quals to discard hash or gist indexes, or use a PL wrapper able to catch errors. This commit changes the reports generated for btree, gist and hash to be more user-friendly; - When seeing an empty page, report it as free space. This new rule applies to gist and hash, and already applied to btree. - For btree, a check based on the size of BTPageOpaqueData is added. - For gist indexes, gistcheckpage() is not called anymore, replaced by a check based on the size of GISTPageOpaqueData. - For hash indexes, instead of _hash_getbuf_with_strategy(), use a direct call to ReadBufferExtended(), coupled with a check based on HashPageOpaqueData. The opaque area size check was already used. - Pages that do not match these criterias are discarded from the stats reports generated. There have been a couple of bug reports over the years that complained about the current behavior for hash and gist, as being not that useful, with nothing being done about it. Hence this change is backpatched down to v13. Reported-by: Noah Misch <noah@leadboat.com> Author: Nitin Motiani <nitinmotiani@google.com> Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com> Discussion: https://postgr.es/m/CAH5HC95gT1J3dRYK4qEnaywG8RqjbwDdt04wuj8p39R=HukayA@mail.gmail.com Backpatch-through: 13	2025-10-02 11:09:12 +09:00
Jacob Champion	c335775908	test_json_parser: Speed up 002_inline.pl Some macOS machines are having trouble with 002_inline, which executes the JSON parser test executables hundreds of times in a nested loop. Both developer machines and buildfarm critters have shown excessive test durations, upwards of 20 seconds. Push the innermost loop of 002_inline, which iterates through differing chunk sizes, down into the test executable. (I'd eventually like to push all of the JSON unit tests down into C, but this is an easy win in the short term.) Testers have reported a speedup between 4-9x. Reported-by: Robert Haas <robertmhaas@gmail.com> Suggested-by: Andres Freund <andres@anarazel.de> Tested-by: Andrew Dunstan <andrew@dunslane.net> Tested-by: Tom Lane <tgl@sss.pgh.pa.us> Tested-by: Robert Haas <robertmhaas@gmail.com> Discussion: https://postgr.es/m/CA%2BTgmobKoG%2BgKzH9qB7uE4MFo-z1hn7UngqAe9b0UqNbn3_XGQ%40mail.gmail.com Backpatch-through: 17	2025-10-01 09:25:21 -07:00
Fujii Masao	a912118c60	pgbench: Fix error reporting in readCommandResponse(). pgbench uses readCommandResponse() to process server responses. When readCommandResponse() encounters an error during a call to PQgetResult() to fetch the current result, it attempts to report it with an additional error message from PQerrorMessage(). However, previously, this extra error message could be lost or become incorrect. The cause was that after fetching the current result (and detecting an error), readCommandResponse() called PQgetResult() again to peek at the next result. This second call could overwrite the libpq connection's error message before the original error was reported, causing the error message retrieved from PQerrorMessage() to be lost or overwritten. This commit fixes the issue by updating readCommandResponse() to use PQresultErrorMessage() instead of PQerrorMessage() to retrieve the error message generated when the PQgetResult() for the current result causes an error, ensuring the correct message is reported. Backpatch to all supported versions. Author: Yugo Nagata <nagata@sraoss.co.jp> Reviewed-by: Chao Li <lic@highgo.com> Reviewed-by: Fujii Masao <masao.fujii@gmail.com> Discussion: https://postgr.es/m/20250925110940.ebacc31725758ec47d5432c6@sraoss.co.jp Backpatch-through: 13	2025-09-30 23:53:46 +09:00
Noah Misch	6778fbca63	Fix StatisticsObjIsVisibleExt() for pg_temp. Neighbor get_statistics_object_oid() ignores objects in pg_temp, as has been the standard for non-relation, non-type namespace searches since CVE-2007-2138. Hence, most operations that name a statistics object correctly decline to map an unqualified name to a statistics object in pg_temp. StatisticsObjIsVisibleExt() did not. Consequently, pg_statistics_obj_is_visible() wrongly returned true for such objects, psql \dX wrongly listed them, and getObjectDescription()-based ereport() and pg_describe_object() wrongly omitted namespace qualification. Any malfunction beyond that would depend on how a human or application acts on those wrong indications. Commit `d99d58cdc8` introduced this. Back-patch to v13 (all supported versions). Reviewed-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/20250920162116.2e.nmisch@google.com Backpatch-through: 13	2025-09-29 11:15:48 -07:00
Tom Lane	3fc9aa5b02	Fix missed copying of groupDistinct in transformPLAssignStmt. Because we failed to do this, DISTINCT in GROUP BY DISTINCT would be ignored in PL/pgSQL assignment statements. It's not surprising that no one noticed, since such statements will throw an error if the query produces more than one row. That eliminates most scenarios where advanced forms of GROUP BY could be useful, and indeed makes it hard even to find a simple test case. Nonetheless it's wrong. This is directly the fault of `be45be9c3` which added the groupDistinct field, but I think much of the blame has to fall on `c9d529848`, in which I incautiously supposed that we'd manage to keep two copies of a big chunk of parse-analysis logic in sync. As a follow-up, I plan to refactor so that there's only one copy. But that seems useful only in master, so let's use this one-line fix for the back branches. Author: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/31027.1758919078@sss.pgh.pa.us Backpatch-through: 14	2025-09-27 14:29:41 -04:00

1 2 3 4 5 ...

59735 Commits