postgres

mirror of https://github.com/postgres/postgres.git synced 2025-10-28 11:55:03 +03:00

Author	SHA1	Message	Date
Tom Lane	3fd334795e	Don't depend on -fwrapv semantics in pgbench's random() function. Instead use the common/int.h functions to check for integer overflow in a more C-standard-compliant fashion. This is motivated by recent failures on buildfarm member moonjelly, where it appears that development-tip gcc is optimizing without regard to the -fwrapv switch. Presumably that's a gcc bug that will be fixed soon, but we might as well install cleaner coding here rather than wait. (This does not address the question of whether we'll ever be able to get rid of using -fwrapv. Testing shows that this spot is the only place where doing so creates visible regression test failures, but unfortunately that proves very little.) Back-patch to v12. The common/int.h functions exist in v11, but that branch doesn't use them in any client-side code. I judge that this case isn't interesting enough in the real world to take even a small risk of issues from being the first such use. Tom Lane and Fabien Coelho Discussion: https://postgr.es/m/73927.1624815543@sss.pgh.pa.us	2021-06-28 12:40:37 -04:00
Amit Kapila	b75c1f6879	Fix race condition in TransactionGroupUpdateXidStatus(). When we cannot immediately acquire CLogControlLock in exclusive mode at commit time, we add ourselves to a list of processes that need their XIDs status update. We do this if the clog page where we need to update the current transaction status is the same as the group leader's clog page, otherwise, we allow the caller to clear it by itself. Now, when we can't add ourselves to any group, we were not clearing the current proc if it has already become a member of some group which was leading to an assertion failure when the same proc was assigned to another backend after the current backend exits. Reported-by: Alexander Lakhin Bug: 17072 Author: Amit Kapila Tested-By: Alexander Lakhin Backpatch-through: 11, where it was introduced Discussion: https://postgr.es/m/17072-2f8764857ef2c92a@postgresql.org	2021-06-28 09:17:10 +05:30
Michael Paquier	ce8949c4b6	Add test for CREATE INDEX CONCURRENTLY with not-so-immutable predicate `83158f7` has improved index_set_state_flags() so as it is possible to use transactional updates when updating pg_index state flags, but there was not really a test case which stressed directly the possibility it fixed. This commit adds such a test, using a predicate that looks valid in appearance but calls a stable function. Author: Andrey Lepikhov Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru Backpatch-through: 9.6	2021-06-28 11:17:16 +09:00
Michael Paquier	e52f7cbec1	Make index_set_state_flags() transactional `3c84046` is the original commit that introduced index_set_state_flags(), where the presence of SnapshotNow made necessary the use of an in-place update. SnapshotNow has been removed in `813fb03`, so there is no actual reasons to not make this operation transactional. As reported by Andrey, it is possible to trigger the assertion of this routine expecting no transactional updates when switching the pg_index state flags, using a predicate mark as immutable but calling stable or volatile functions. `83158f7` has been around for a couple of months on HEAD now with no issues found related to it, so it looks safe enough for a backpatch. Reported-by: Andrey Lepikhov Author: Michael Paquier Reviewed-by: Anastasia Lubennikova Discussion: https://postgr.es/m/20200903080440.GA8559@paquier.xyz Discussion: https://postgr.es/m/9b905019-5297-7372-0ad2-e1a4bb66a719@postgrespro.ru Backpatch-through: 9.6	2021-06-28 10:43:01 +09:00
Tom Lane	bc031cf133	Remove memory leaks in isolationtester. specscanner.l leaked a kilobyte of memory per token of the spec file. Apparently somebody thought that the introductory code block would be executed once; but it's once per yylex() call. A couple of functions in isolationtester.c leaked small amounts of memory due to not bothering to free one-time allocations. Might as well improve these so that valgrind gives this program a clean bill of health. Also get rid of an ugly static variable. Coverity complained about one of the one-time leaks, which led me to try valgrind'ing isolationtester, which led to discovery of the larger leak.	2021-06-27 12:45:04 -04:00
Michael Paquier	aa2734f5bc	Remove some useless logs from the TAP tests of pgbench 002_pgbench_no_server was printing some array pointers instead of the actual contents of those arrays for the expected outputs of stdout and stderr for a tested command. This does not add any new information that can help with debugging as the test names allow to track failure locations, if any. This commit simply removes those logs as the rest of the printed information is redundant with command_checks_all(). Per discussion with Andrew Dunstan and Álvaro Herrera. Discussion: https://postgr.es/m/YNXNFaG7IgkzZanD@paquier.xyz Backpatch-through: 11	2021-06-26 12:40:07 +09:00
Tom Lane	f851696a21	Remove unnecessary failure cases in RemoveRoleFromObjectPolicy(). It's not really necessary for this function to open or lock the relation associated with the pg_policy entry it's modifying. The error checks it's making on the rel are if anything counterproductive (e.g., if we don't want to allow installation of policies on system catalogs, here is not the place to prevent that). In particular, it seems just wrong to insist on an ownership check. That has the net effect of forcing people to use superuser for DROP OWNED BY, which surely is not an effect we want. Also there is no point in rebuilding the dependencies of the policy expressions, which aren't being changed. Lastly, locking the table also seems counterproductive; it's not helping to prevent race conditions, since we failed to re-read the pg_policy row after acquiring the lock. That means that concurrent DDL would likely result in "tuple concurrently updated/deleted" errors; which is the same behavior this code will produce, with less overhead. Per discussion of bug #17062. Back-patch to all supported versions, as the failure cases this eliminates seem just as undesirable in 9.6 as in HEAD. Discussion: https://postgr.es/m/1573181.1624220108@sss.pgh.pa.us	2021-06-25 13:59:38 -04:00
Tom Lane	04a4760572	Make walsenders show their replication commands in pg_stat_activity. A walsender process that has executed a SQL command left the text of that command in pg_stat_activity.query indefinitely, which is quite confusing if it's in RUNNING state but not doing that query. An easy and useful fix is to treat replication commands as if they were SQL queries, and show them in pg_stat_activity according to the same rules as for regular queries. While we're at it, it seems also sensible to set debug_query_string, allowing error logging and debugging to see the replication command. While here, clean up assorted silliness in exec_replication_command: * Clean up SQLCmd code path, and fix its only-accidentally-not-buggy memory management. * Remove useless duplicate call of SnapBuildClearExportedSnapshot(). * replication_scanner_finish() was never called. Back-patch of commit `f560209c6` into v10-v13. I'd originally felt that this didn't merit back-patching, but subsequent confusion while debugging walsender problems suggests that it'll be useful. Also, the original commit has now aged long enough to provide some comfort that it won't cause problems. Discussion: https://postgr.es/m/2673480.1624557299@sss.pgh.pa.us Discussion: https://postgr.es/m/880181.1600026471@sss.pgh.pa.us	2021-06-25 10:46:10 -04:00
Michael Paquier	79c24a0c21	Cleanup some code related to pgbench log checks in TAP tests This fixes a couple of problems within the so-said code of this commit subject: - Replace the use of open() with slurp_file(), fixing an issue reported by buildfarm member fairywren whose perl installation keep around CRLF characters, causing the matching patterns for the logs to fail. - Remove the eval block, which is not really necessary. This set of issues has come into light after fixing a different issue with `c13585fe`, and this is wrong since this code has been introduced. Reported-by: Andrew Dunstan, and buildfarm member fairywren Author: Michael Paquier Reviewed-by: Andrew Dunstan Discussion: https://postgr.es/m/0f49303e-7784-b3ee-200b-cbf67be2eb9e@dunslane.net Backpatch-through: 11	2021-06-25 20:15:36 +09:00
Thomas Munro	47d22649e6	Prepare for forthcoming LLVM 13 API change. LLVM 13 (due out in September) has changed the semantics of LLVMOrcAbsoluteSymbols(), so we need to bump some reference counts to avoid a double-free that causes crashes and bad query results. A proactive change seems necessary to avoid having a window of time where our respective latest releases would interact badly. It's possible that the situation could change before then, though. Thanks to Fabien Coelho for monitoring bleeding edge LLVM and Andres Freund for tracking down the change. Back-patch to 11, where the JIT code arrived. Discussion: https://postgr.es/m/CA%2BhUKGLEy8mgtN7BNp0ooFAjUedDTJj5dME7NxLU-m91b85siA%40mail.gmail.com	2021-06-25 11:29:20 +12:00
Michael Paquier	79ff96aa9d	Fix pattern matching logic for logs in TAP tests of pgbench The logic checking for the format of per-thread logs used grep() with directly "$re", which would cause the test to consider all the logs as a match without caring about their format at all. Using "/$re/" makes grep() perform a regex test, which is what we want here. While on it, improve some of the tests to be more picky with the patterns expected and add more comments to describe the tests. Issue discovered while digging into a separate patch. Author: Fabien Coelho, Michael Paquier Discussion: https://postgr.es/m/YNPsPAUoVDCpPOGk@paquier.xyz Backpatch-through: 11	2021-06-25 06:52:52 +09:00
Tom Lane	7a48dfbb88	Stabilize results of insert-conflict-toast.spec. This back-branch test script was later absorbed into insert-conflict-specconflict.spec, which required some stabilization in commit `741d7f104`, so perhaps it's not surprising that it needs a bit of love too. It's odd though that we hadn't seen it fail before now, because I thought that `741d7f104` did not change isolationtester's timing behavior for scripts without any annotation markers. In any case, this script is racy on its face, so add an annotation to force stable reporting order. Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=piculet&dt=2021-06-24%2009%3A54%3A56 Report: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2021-06-24%2010%3A10%3A00	2021-06-24 11:30:32 -04:00
Amit Kapila	72b51e214a	Fix ABI break introduced by commit `4daa140a2f`. Move the newly defined enum value REORDER_BUFFER_CHANGE_INTERNAL_SPEC_ABORT at the end to avoid ABI break in the back branches. We need to back-patch this till v11 because before that it is already at the end. Reported-by: Tomas Vondra Backpatch-through: 11 Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com	2021-06-24 15:25:13 +05:30
Heikki Linnakangas	8b01a403c5	Another fix to relmapper race condition. In previous commit, I missed that relmap_redo() was also not acquiring the RelationMappingLock. Thanks to Thomas Munro for pointing that out. Backpatch-through: 9.6, like previous commit. Discussion: https://www.postgresql.org/message-id/CA%2BhUKGLev%3DPpOSaL3WRZgOvgk217et%2BbxeJcRr4eR-NttP1F6Q%40mail.gmail.com	2021-06-24 11:19:34 +03:00
Heikki Linnakangas	caac196507	Prevent race condition while reading relmapper file. Contrary to the comment here, POSIX does not guarantee atomicity of a read(), if another process calls write() concurrently. Or at least Linux does not. Add locking to load_relmap_file() to avoid the race condition. Fixes bug #17064. Thanks to Alexander Lakhin for the report and test case. Backpatch-through: 9.6, all supported versions. Discussion: https://www.postgresql.org/message-id/17064-bb0d7904ef72add3@postgresql.org	2021-06-24 10:45:43 +03:00
Amit Kapila	a73bd49c69	Doc: Update caveats in synchronous logical replication. Reported-by: Simon Riggs Author: Takamichi Osumi Reviewed-by: Amit Kapila Backpatch-through: 9.6 Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky@alap3.anarazel.de	2021-06-24 09:37:57 +05:30
Tom Lane	35e6b3bbf7	Allow non-quoted identifiers as isolation test session/step names. For no obvious reason, isolationtester has always insisted that session and step names be written with double quotes. This is fairly tedious and does little for test readability, especially since the names that people actually choose almost always look like normal identifiers. Hence, let's tweak the lexer to allow SQL-like identifiers not only double-quoted strings. (They're SQL-like, not exactly SQL, because I didn't add any case-folding logic. Also there's no provision for U&"..." names, not that anyone's likely to care.) There is one incompatibility introduced by this change: if you write "foo""bar" with no space, that used to be taken as two identifiers, but now it's just one identifier with an embedded quote mark. I converted all the src/test/isolation/ specfiles to remove unnecessary double quotes, but stopped there because my eyes were glazing over already. Like `741d7f104`, back-patch to all supported branches, so that this isn't a stumbling block for back-patching isolation test changes. Discussion: https://postgr.es/m/759113.1623861959@sss.pgh.pa.us	2021-06-23 18:41:39 -04:00
Tom Lane	47018af55a	Doc: fix confusion about LEAKPROOF in syntax summaries. The syntax summaries for CREATE FUNCTION and allied commands made it look like LEAKPROOF is an alternative to IMMUTABLE/STABLE/VOLATILE, when of course it is an orthogonal option. Improve that. Per gripe from aazamrafeeque0. Thanks to David Johnston for suggestions. Discussion: https://postgr.es/m/162444349581.694.5818572718530259025@wrigleys.postgresql.org	2021-06-23 14:27:13 -04:00
Tom Lane	7eaf654514	Don't assume GSSAPI result strings are null-terminated. Our uses of gss_display_status() and gss_display_name() assumed that the gss_buffer_desc strings returned by those functions are null-terminated. It appears that they generally are, given the lack of field complaints up to now. However, the available documentation does not promise this, and some man pages for gss_display_status() show examples that rely on the gss_buffer_desc.length field instead of expecting null termination. Also, we now have a report that on some implementations, clang's address sanitizer is of the opinion that the byte after the specified length is undefined. Hence, change the code to rely on the length field instead. This might well be cosmetic rather than fixing any real bug, but it's hard to be sure, so back-patch to all supported branches. While here, also back-patch the v12 changes that made pg_GSS_error deal honestly with multiple messages available from gss_display_status. Per report from Sudheer H R. Discussion: https://postgr.es/m/5372B6D4-8276-42C0-B8FB-BD0918826FC3@tekenlight.com	2021-06-23 14:01:32 -04:00
Tom Lane	d7da3ef089	Improve display of query results in isolation tests. Previously, isolationtester displayed SQL query results using some ad-hoc code that clearly hadn't had much effort expended on it. Field values longer than 14 characters weren't separated from the next field, and usually caused misalignment of the columns too. Also there was no visual separation of a query's result from subsequent isolationtester output. This made test result files confusing and hard to read. To improve matters, let's use libpq's PQprint() function. Although that's long since unused by psql, it's still plenty good enough for the purpose here. Like `741d7f104`, back-patch to all supported branches, so that this isn't a stumbling block for back-patching isolation test changes. Discussion: https://postgr.es/m/582362.1623798221@sss.pgh.pa.us	2021-06-23 11:12:31 -04:00
Tom Lane	f228c401b8	Use annotations to reduce instability of isolation-test results. We've long contended with isolation test results that aren't entirely stable. Some test scripts insert long delays to try to force stable results, which is not terribly desirable; but other erratic failure modes remain, causing unrepeatable buildfarm failures. I've spent a fair amount of time trying to solve this by improving the server-side support code, without much success: that way is fundamentally unable to cope with diffs that stem from chance ordering of arrival of messages from different server processes. We can improve matters on the client side, however, by annotating the test scripts themselves to show the desired reporting order of events that might occur in different orders. This patch adds three types of annotations to deal with (a) test steps that might or might not complete their waits before the isolationtester can see them waiting; (b) test steps in different sessions that can legitimately complete in either order; and (c) NOTIFY messages that might arrive before or after the completion of a step in another session. We might need more annotation types later, but this seems to be enough to deal with the instabilities we've seen in the buildfarm. It also lets us get rid of all the long delays that were previously used, cutting more than a minute off the runtime of the isolation tests. Back-patch to all supported branches, because the buildfarm instabilities affect all the branches, and because it seems desirable to keep isolationtester's capabilities the same across all branches to simplify possible future back-patching of tests. Discussion: https://postgr.es/m/327948.1623725828@sss.pgh.pa.us	2021-06-22 21:43:12 -04:00
Tom Lane	29d5d5761a	Restore the portal-level snapshot for simple expressions, too. Commits `84f5c2908` et al missed the need to cover plpgsql's "simple expression" code path. If the first thing we execute after a COMMIT/ROLLBACK is one of those, rather than a full-fledged SPI command, we must explicitly do EnsurePortalSnapshotExists() to make sure we have an outer snapshot. Note that it wouldn't be good enough to just push a snapshot for the duration of the expression execution: what comes back might be toasted, so we'd better have a snapshot protecting it. The test case demonstrating this fact cheats a bit by marking a SQL function immutable even though it fetches from a table. That's nothing that users haven't been seen to do, though. Per report from Jim Nasby. Back-patch to v11, like the previous fix. Discussion: https://postgr.es/m/378885e4-f85f-fc28-6c91-c4d1c080bf26@amazon.com	2021-06-22 17:48:39 -04:00
Tom Lane	c58a41605f	Fix misbehavior of DROP OWNED BY with duplicate polroles entries. Ordinarily, a pg_policy.polroles array wouldn't list the same role more than once; but CREATE POLICY does not prevent that. If we perform DROP OWNED BY on a role that is listed more than once, RemoveRoleFromObjectPolicy either suffered an assertion failure or encountered a tuple-updated-by-self error. Rewrite it to cope correctly with duplicate entries, and add a CommandCounterIncrement call to prevent the other problem. Per discussion, there's other cleanup that ought to happen here, but this seems like the minimum essential fix. Per bug #17062 from Alexander Lakhin. It's been broken all along, so back-patch to all supported branches. Discussion: https://postgr.es/m/17062-11f471ae3199ca23@postgresql.org	2021-06-18 18:00:09 -04:00
Tom Lane	b2c740c426	Avoid scribbling on input node tree in CREATE/ALTER DOMAIN. This works fine in the "simple Query" code path; but if the statement is in the plan cache then it's corrupted for future re-execution. Apply copyObject() to protect the original tree from modification, as we've done elsewhere. This narrow fix is applied only to the back branches. In HEAD, the problem was fixed more generally by commit 7c337b6b5; but that changed ProcessUtility's API, so it's infeasible to back-patch. Per bug #17053 from Charles Samborski. Discussion: https://postgr.es/m/931771.1623893989@sss.pgh.pa.us Discussion: https://postgr.es/m/17053-3ca3f501bbc212b4@postgresql.org	2021-06-18 12:09:22 -04:00
Andrew Dunstan	6432bfe8a3	Don't set a fast default for anything but a plain table The fast default code added in Release 11 omitted to check that the table a fast default was being added to was a plain table. Thus one could be added to a foreign table, which predicably blows up. Here we perform that check. In addition, on the back branches, since some of these might have escaped into the wild, if we encounter a missing value for an attribute of something other than a plain table we ignore it. Fixes bug #17056 Backpatch to release 11, Reviewed by: Andres Freund, Álvaro Herrera and Tom Lane	2021-06-18 07:46:21 -04:00
Peter Eisentraut	70293e946e	Update plpython_subtransaction alternative expected files The original patch only targeted Python 2.6 and newer, since that is what we have supported in PostgreSQL 13 and newer. For older branches, we need to fix it up for older Python versions.	2021-06-17 16:39:13 +02:00
Heikki Linnakangas	6765cbd22a	Tidy up GetMultiXactIdMembers()'s behavior on error One of the error paths left members uninitialized. That's not a live bug, because most callers don't look at members when the function returns -1, but let's be tidy. One caller, in heap_lock_tuple(), does "if (members != NULL) pfree(members)", but AFAICS it never passes an invalid 'multi' value so it should not reach that error case. The callers are also a bit inconsistent in their expectations. heap_lock_tuple() pfrees the 'members' array if it's not-NULL, others pfree() it if "nmembers >= 0", and others if "nmembers > 0". That's not a live bug either, because the function should never return 0, but add an Assert for that to make it more clear. I left the callers alone for now. I also moved the line where we set *nmembers. It wasn't wrong before, but I like to do that right next to the 'return' statement, to make it clear that it's always set on return. Also remove one unreachable return statement after ereport(ERROR), for brevity and for consistency with the similar if-block right after it. Author: Greg Nancarrow with the additional changes by me Backpatch-through: 9.6, all supported versions	2021-06-17 14:52:48 +03:00
Peter Eisentraut	9438962ce3	Fix subtransaction test for Python 3.10 Starting with Python 3.10, the stacktrace looks differently: - PL/Python function "subtransaction_exit_subtransaction_in_with", line 3, in <module> - s.__exit__(None, None, None) + PL/Python function "subtransaction_exit_subtransaction_in_with", line 2, in <module> + with plpy.subtransaction() as s: Using try/except specifically makes the error look always the same. (See https://github.com/python/cpython/pull/25719 for the discussion of this change in Python.) Author: Honza Horak <hhorak@redhat.com> Discussion: https://www.postgresql.org/message-id/flat/853083.1620749597%40sss.pgh.pa.us RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1959080	2021-06-17 09:02:20 +02:00
Amit Kapila	eb231dbd80	Document a few caveats in synchronous logical replication. In a synchronous logical setup, locking [user] catalog tables can cause deadlock. This is because logical decoding of transactions can lock catalog tables to access them so exclusively locking those in transactions can lead to deadlock. To avoid this users must refrain from having exclusive locks on catalog tables. Author: Takamichi Osumi Reviewed-by: Vignesh C, Amit Kapila Backpatch-through: 9.6 Discussion: https://www.postgresql.org/message-id/20210222222847.tpnb6eg3yiykzpky%40alap3.anarazel.de	2021-06-17 10:40:56 +05:30
Michael Paquier	96f3661e45	Detect unused steps in isolation specs and do some cleanup This is useful for developers to find out if an isolation spec is over-engineered or if it needs more work by warning at the end of a test run if a step is not used, generating a failure with extra diffs. While on it, clean up all the specs which include steps not used in any permutations to simplify them. This is a backpatch of `989d23b` and `06fdc4e`, as it is becoming useful to make all the branches consistent for an upcoming patch that will improve the output generated by isolationtester. Author: Michael Paquier Reviewed-by: Asim Praveen, Melanie Plageman Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us Backpatch-through: 9.6	2021-06-17 11:57:21 +09:00
Michael Paquier	a8f687927e	Remove dry-run mode from isolationtester The original purpose of the dry-run mode is to be able to print all the possible permutations from a spec file, but it has become less useful since isolation tests have improved regarding deadlock detection as one step not wanted by the author could block indefinitely now (originally the step blocked would have been detected rather quickly). Per discussion, let's remove it. This is a backpatch of `9903338` for 9.6~12. It is proving to become useful to have on those branches so as the code gets consistent across all supported versions, as a matter of improving the output generated by isolationtester. Author: Michael Paquier Reviewed-by: Asim Praveen, Melanie Plageman Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us Backpatch-through: 9.6	2021-06-17 11:01:16 +09:00
Tom Lane	17d962ccae	Fix plancache refcount leak after error in ExecuteQuery. When stuffing a plan from the plancache into a Portal, one is not supposed to risk throwing an error between GetCachedPlan and PortalDefineQuery; if that happens, the plan refcount incremented by GetCachedPlan will be leaked. I managed to break this rule while refactoring code in `9dbf2b7d7`. There is no visible consequence other than some memory leakage, and since nobody is very likely to trigger the relevant error conditions many times in a row, it's not surprising we haven't noticed. Nonetheless, it's a bug, so rearrange the order of operations to remove the hazard. Noted on the way to looking for a better fix for bug #17053. This mistake is pretty old, so back-patch to all supported branches.	2021-06-16 19:30:17 -04:00
Heikki Linnakangas	f98635ad62	Fix outdated comment that talked about seek position of WAL file. Since commit `c24dcd0cfd`, we have been using pg_pread() to read the WAL file, which doesn't change the seek position (unless we fall back to the implementation in src/port/pread.c). Update comment accordingly. Backpatch-through: 12, where we started to use pg_pread()	2021-06-16 12:37:13 +03:00
Andrew Dunstan	fb3d6b0e1a	Further refinement of stuck_on_old_timeline recovery test TestLib::perl2host can take a file argument as well as a directory argument, so that code becomes substantially simpler. Also add comments on why we're using forward slashes, and why we're setting PERL_BADLANG=0. Discussion: https://postgr.es/m/e9947bcd-20ee-027c-f0fe-01f736b7e345@dunslane.net	2021-06-15 15:36:59 -04:00
Amit Kapila	40ad7ebff6	Fix decoding of speculative aborts. During decoding for speculative inserts, we were relying for cleaning toast hash on confirmation records or next change records. But that could lead to multiple problems (a) memory leak if there is neither a confirmation record nor any other record after toast insertion for a speculative insert in the transaction, (b) error and assertion failures if the next operation is not an insert/update on the same table. The fix is to start queuing spec abort change and clean up toast hash and change record during its processing. Currently, we are queuing the spec aborts for both toast and main table even though we perform cleanup while processing the main table's spec abort record. Later, if we have a way to distinguish between the spec abort record of toast and the main table, we can avoid queuing the change for spec aborts of toast tables. Reported-by: Ashutosh Bapat Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 9.6, where it was introduced Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com	2021-06-15 08:50:12 +05:30
Tom Lane	b7c5823ac3	Work around portability issue with newer versions of mktime(). Recent glibc versions have made mktime() fail if tm_isdst is inconsistent with the prevailing timezone; in particular it fails for tm_isdst = 1 when the zone is UTC. (This seems wildly inconsistent with the POSIX-mandated treatment of "incorrect" values for the other fields of struct tm, so if you ask me it's a bug, but I bet they'll say it's intentional.) This has been observed to cause cosmetic problems when pg_restore'ing an archive created in a different timezone. To fix, do mktime() using the field values from the archive, and if that fails try again with tm_isdst = -1. This will give a result that's off by the UTC-offset difference from the original zone, but that was true before, too. It's not terribly critical since we don't do anything with the result except possibly print it. (Someday we should flush this entire bit of logic and record a standard-format timestamp in the archive instead. That's not okay for a back-patched bug fix, though.) Also, guard our only other use of mktime() by having initdb's build_time_t() set tm_isdst = -1 not 0. This case could only have an issue in zones that are DST year-round; but I think some do exist, or could in future. Per report from Wells Oliver. Back-patch to all supported versions, since any of them might need to run with a newer glibc. Discussion: https://postgr.es/m/CAOC+FBWDhDHO7G-i1_n_hjRzCnUeFO+H-Czi1y10mFhRWpBrew@mail.gmail.com	2021-06-13 14:32:42 -04:00
Andrew Dunstan	914c716ca0	Further tweaks to stuck_on_old_timeline recovery test Translate path slashes on target directory path. This was confusing old branches, but is applied to all branches for the sake of uniformity. Perl is perfectly able to understand paths with forward slashes. Along the way, restore the previous archive_wait query, for the sake of uniformity with other tests, per gripe from Tom Lane.	2021-06-13 07:19:35 -04:00
Michael Paquier	37d4edef00	Ignore more environment variables in pg_regress.c This is similar to the work done in `8279f68` for TestLib.pm, where environment variables set may cause unwanted failures if using a temporary installation with pg_regress. The list of variables reset is adjusted in each stable branch depending on what is supported. Comments are added to remember that the lists in TestLib.pm and pg_regress.c had better be kept in sync. Reviewed-by: Álvaro Herrera Discussion: https://postgr.es/m/YMNR9GYDn+fHlMta@paquier.xyz Backpatch-through: 9.6	2021-06-13 20:07:50 +09:00
Tom Lane	0b619df6da	Restore robustness of TAP tests that wait for postmaster restart. Several TAP tests use poll_query_until() to wait for the postmaster to restart. They were checking to see if a trivial query (e.g. "SELECT 1") succeeds. However, that's problematic in the wake of commit `11e9caff8`, because now that we feed said query to psql via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql quits before reading the query. Hence, we can't use a nonempty query in cases where we need to wait for connection failures to stop happening. Per the precedent of commits `c757a3da0` and `6d41dd045`, we can pass "undef" as the query in such cases to ensure that IPC::Run has nothing to write. However, then we have to say that the expected output is empty, and this exposes a deficiency in poll_query_until: if psql fails altogether and returns empty stdout, poll_query_until will treat that as a success! That's because, contrary to its documentation, it makes no actual check for psql failure, looking neither at the exit status nor at stderr. To fix that, adjust poll_query_until to insist on empty stderr as well as a stdout match. (I experimented with checking exit status instead, but it seems that psql often does exit(1) in cases that we need to consider successes. That might be something to fix someday, but it would be a non-back-patchable behavior change.) Back-patch to v10. The test cases needing this exist only as far back as v11, but it seems wise to keep poll_query_until's behavior the same in v10, in case we back-patch another such test case in future. (9.6 does not currently need this change, because in that branch poll_query_until can't be told to accept empty stdout as a success case.) Per assorted buildfarm failures, mostly on hoverfly. Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com	2021-06-12 15:12:10 -04:00
Tom Lane	7681b78fba	Ensure pg_filenode_relation(0, 0) returns NULL. Previously, a zero value for the relfilenode resulted in a confusing error message about "unexpected duplicate". This function returns NULL for other invalid relfilenode values, so zero should be treated likewise. It's been like this all along, so back-patch to all supported branches. Justin Pryzby Discussion: https://postgr.es/m/20210612023324.GT16435@telsasoft.com	2021-06-12 13:29:24 -04:00
Tom Lane	1f280e8331	Don't use Asserts to check for violations of replication protocol. Using an Assert to check the validity of incoming messages is an extremely poor decision. In a debug build, it should not be that easy for a broken or malicious remote client to crash the logrep worker. The consequences could be even worse in non-debug builds, which will fail to make such checks at all, leading to who-knows-what misbehavior. Hence, promote every Assert that could possibly be triggered by wrong or out-of-order replication messages to a full test-and-ereport. To avoid bloating the set of messages the translation team has to cope with, establish a policy that replication protocol violation error reports don't need to be translated. Hence, all the new messages here use errmsg_internal(). A couple of old messages are changed likewise for consistency. Along the way, fix some non-idiomatic or outright wrong uses of hash_search(). Most of these mistakes are new with the "streaming replication" patch (commit `464824323`), but a couple go back a long way. Back-patch as appropriate. Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us	2021-06-12 12:59:15 -04:00
Andrew Dunstan	1730a3334e	Fix new recovery test for use under msys Commit `caba8f0d43` wasn't quite right for msys, as demonstrated by several buildfarm animals, including jacana and fairywren. We need to use the msys perl in the archive command, but call it in such a way that Windows will understand the path. Furthermore, inside the copy script we need to convert a Windows path to an msys path.	2021-06-12 08:54:28 -04:00
Michael Paquier	b56ef31bf9	Remove PGSSLCRLDIR from the list of variables ignored in TAP tests This variable was present in the list added by `9d660670`, but it is not supported by this branch. Issue noticed while diving into a similar change for pg_regress.c. Backpatch-through: 9.6	2021-06-12 10:39:28 +09:00
Alvaro Herrera	0c7efd9757	Report sort phase progress in parallel btree build We were already reporting it, but only after the parallel workers were finished, which is visibly much later than what happens in a serial build. With this change we report it when the leader starts its own sort phase when participating in the build (the normal case). Now this might happen a little later than when the workers start their sorting phases, but a) communicating the actual phase start from workers is likely to be a hassle, and b) the sort phase start is pretty fuzzy anyway, since sorting per se is actually initiated by tuplesort.c internally earlier than tuplesort_performsort() is called. Backpatch to pg12, where the progress reporting code for CREATE INDEX went in. Reported-by: Tomas Vondra <tomas.vondra@enterprisedb.com> Author: Matthias van de Meent <boekewurm+postgres@gmail.com> Reviewed-by: Greg Nancarrow <gregn4422@gmail.com> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/1128176d-1eee-55d4-37ca-e63644422adb	2021-06-11 19:07:32 -04:00
Tom Lane	26383da7d4	Rearrange logrep worker's snapshot handling some more. It turns out that worker.c's code path for TRUNCATE was also careless about establishing a snapshot while executing user-defined code, allowing the checks added by commit `84f5c2908` to fail when a trigger is fired in that context. We could just wrap Push/PopActiveSnapshot around the truncate call, but it seems better to establish a policy of holding a snapshot throughout execution of a replication step. To help with that and possible future requirements, replace the previous ensure_transaction calls with pairs of begin/end_replication_step calls. Per report from Mark Dilger. Back-patch to v11, like the previous changes. Discussion: https://postgr.es/m/B4A3AF82-79ED-4F4C-A4E5-CD2622098972@enterprisedb.com	2021-06-10 12:27:27 -04:00
Robert Haas	2208d71a00	Adjust new test case to set wal_keep_segments. Per buildfarm member conchuela and Kyotaro Horiguchi, it's possible for the WAL segment that the cascading standby needs to be removed too quickly. Hopefully this will prevent that. Kyotaro Horiguchi Discussion: http://postgr.es/m/20210610.101240.1270925505780628275.horikyota.ntt@gmail.com	2021-06-10 09:34:39 -04:00
Robert Haas	5d950c721d	Fix corner case failure of new standby to follow new primary. This only happens if (1) the new standby has no WAL available locally, (2) the new standby is starting from the old timeline, (3) the promotion happened in the WAL segment from which the new standby is starting, (4) the timeline history file for the new timeline is available from the archive but the WAL files for are not (i.e. this is a race), (5) the WAL files for the new timeline are available via streaming, and (6) recovery_target_timeline='latest'. Commit `ee994272ca` introduced this logic and was an improvement over the previous code, but it mishandled this case. If recovery_target_timeline='latest' and restore_command is set, validateRecoveryParameters() can change recoveryTargetTLI to be different from receiveTLI. If streaming is then tried afterward, expectedTLEs gets initialized with the history of the wrong timeline. It's supposed to be a list of entries explaining how to get to the target timeline, but in this case it ends up with a list of entries explaining how to get to the new standby's original timeline, which isn't right. Dilip Kumar and Robert Haas, reviewed by Kyotaro Horiguchi. Discussion: http://postgr.es/m/CAFiTN-sE-jr=LB8jQuxeqikd-Ux+jHiXyh4YDiZMPedgQKup0g@mail.gmail.com	2021-06-09 16:19:52 -04:00
Robert Haas	6fc2febc28	Allow PostgresNode.pm's backup method to accept backup_options. Partial back-port of commit `081876d75e`. A test case for a pending bug fix needs this capability, but the code on 9.6 is significantly different, so I'm only back-patching this change as far as v10. We'll have to work around the problem another way in v9.6. Discussion: http://postgr.es/m/CAFiTN-tcivNvL0Rg6rD7_CErNfE75H7+gh9WbMxjbgsattja1Q@mail.gmail.com	2021-06-09 12:29:00 -04:00
Michael Paquier	01e9e935a1	Fix inconsistencies in psql --help=commands The set of subcommands supported by \dAp, \do and \dy was described incorrectly in psql's --help. The documentation was already consistent with the code. Reported-by: inoas, from IRC Author: Matthijs van der Vleuten Reviewed-by: Neil Chen Discussion: https://postgr.es/m/6a984e24-2171-4039-9050-92d55e7b23fe@www.fastmail.com Backpatch-through: 9.6	2021-06-09 16:25:57 +09:00
Tom Lane	1823233006	Force NO SCROLL for plpgsql's implicit cursors. Further thought about bug #17050 suggests that it's a good idea to use CURSOR_OPT_NO_SCROLL for the implicit cursor opened by a plpgsql FOR-over-query loop. This ensures that, if somebody commits inside the loop, PersistHoldablePortal won't try to rewind and re-read the cursor. While we'd have selected NO_SCROLL anyway if FOR UPDATE/SHARE appears in the query, there are other hazards with volatile functions; and in any case, it's silly to expend effort storing rows that we know for certain won't be needed. (While here, improve the comment in exec_run_select, which was a bit confused about the rationale for when we can use parallel mode. Cursor operations aren't a hazard for nameless portals.) This wasn't an issue until v11, which introduced the possibility of persisting such cursors. Hence, back-patch to v11. Per bug #17050 from Алексей Булгаков. Discussion: https://postgr.es/m/17050-f77aa827dc85247c@postgresql.org	2021-06-08 18:40:06 -04:00

1 2 3 4 5 ...

48679 Commits