The original purpose of dry-run mode was to be able to print all the
possible permutations from a spec file, but it has become less useful
now that isolation testing's deadlock detection has improved: a step
not wanted by the author can now block indefinitely, whereas originally
such a blocked step would have been detected rather quickly. Per
discussion, let's remove it.
This is a backpatch of 9903338 for 9.6~12. Having it on those branches
is proving useful, as it keeps the code consistent across all supported
versions and improves the output generated by isolationtester.
Author: Michael Paquier
Reviewed-by: Asim Praveen, Melanie Plageman
Discussion: https://postgr.es/m/20190819080820.GG18166@paquier.xyz
Discussion: https://postgr.es/m/794820.1623872009@sss.pgh.pa.us
Backpatch-through: 9.6
isolationtester.c had a hard-wired limit of 3 minutes per test step.
It now emerges that this isn't quite enough for some of the slowest
buildfarm animals. This isn't the first time we've had to raise
this limit (cf. 1db439ad4), so let's make it configurable. This
patch raises the default to 5 minutes, and introduces an environment
variable PGISOLATIONTIMEOUT that can be set if more time is needed,
following the precedent of PGCTLTIMEOUT.
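For illustration, the lookup can be as small as the sketch below; the
environment-variable name and 300-second default match this patch, while
the helper name and surrounding code are only illustrative, not the code
in isolationtester.c.

    #include <stdlib.h>
    #include <stdio.h>

    /*
     * Illustrative helper: return the per-step timeout in seconds, taking
     * PGISOLATIONTIMEOUT if it is set to a positive integer and falling
     * back to the 5-minute default otherwise.
     */
    static int
    step_timeout_secs(void)
    {
        const char *env = getenv("PGISOLATIONTIMEOUT");

        if (env != NULL)
        {
            int         val = atoi(env);

            if (val > 0)
                return val;
        }
        return 300;             /* default: 5 minutes */
    }

    int
    main(void)
    {
        printf("using a %d second timeout per step\n", step_timeout_secs());
        return 0;
    }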
Also, modify isolationtester so that when the timeout is hit,
it explicitly reports having sent a cancel. This makes the regression
failure log considerably more intelligible. (In the worst case, a
timed-out test might actually be reported as "passing" without this
extra output, so arguably this is a bug fix in itself.)
In passing, update the README file, which had apparently not gotten
touched when we added "make check" support here.
Back-patch to 9.6; older versions don't have comparable timeout logic.
Discussion: https://postgr.es/m/22964.1575842935@sss.pgh.pa.us
Back-patch commit b10f40bf0 into older branches. This adds reporting
of NOTIFY messages to isolationtester.c, and extends the async-notify
test to include direct tests of basic NOTIFY functionality.
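The libpq side of such reporting follows the standard PQnotifies()
pattern; a minimal sketch (the real code interleaves this with normal
result processing, and the function name here is just for illustration):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Drain and print any pending NOTIFY messages on one connection. */
    static void
    report_notifies(PGconn *conn, const char *session_name)
    {
        PGnotify   *notify;

        if (!PQconsumeInput(conn))      /* pull pending data off the socket */
            fprintf(stderr, "%s: %s", session_name, PQerrorMessage(conn));

        while ((notify = PQnotifies(conn)) != NULL)
        {
            printf("%s: received NOTIFY \"%s\" with payload \"%s\" from PID %d\n",
                   session_name, notify->relname, notify->extra, notify->be_pid);
            PQfreemem(notify);
        }
    }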
This provides useful infrastructure for testing a bug fix I'm about
to back-patch, and there seems no good reason not to have better tests
of LISTEN/NOTIFY in the back branches. The commit's survived long
enough in HEAD to make it unlikely that it will cause problems.
Back-patch as far as 9.6. isolationtester.c changed too much in 9.6
to make it sane to try to fix older branches this way, and I don't
really want to back-patch those changes too.
Discussion: https://postgr.es/m/31304.1564246011@sss.pgh.pa.us
Back-patch relevant parts of these commits:
30717637c Fix isolationtester race condition for notices sent before blocking
ebd499282 Don't drop NOTICE messages in isolation tests
a28e10e82 Indicate session name in isolationtester notices
This ensures that older versions of the isolationtester will handle
NOTICE/WARNING messages the same way as HEAD and v12 do. While this
isn't fixing any critical problem right now, it seems like a prudent
change to prevent surprises (like we had yesterday...) with
back-patches of future isolation test changes.
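One way to implement this with libpq is a notice receiver; a rough
sketch of tagging each NOTICE/WARNING with the session it arrived on
(names and output format here are illustrative, not the exact ones used):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Notice receiver that prefixes each NOTICE/WARNING with a session name. */
    static void
    session_notice_receiver(void *arg, const PGresult *res)
    {
        const char *session_name = (const char *) arg;
        const char *severity = PQresultErrorField(res, PG_DIAG_SEVERITY);
        const char *message = PQresultErrorField(res, PG_DIAG_MESSAGE_PRIMARY);

        printf("%s: %s:  %s\n",
               session_name,
               severity ? severity : "NOTICE",
               message ? message : "(no message available)");
    }

    /* Installed once per connection, right after PQconnectdb() succeeds: */
    /*   PQsetNoticeReceiver(conn, session_notice_receiver, (void *) name); */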
Back-patch as far as 9.6. Due to the significant changes we made in
isolationtester in 9.6, back-patching isolation tests further than
that is going to be risky anyway; besides, this patch doesn't apply
cleanly before that.
Discussion: https://postgr.es/m/E1i7IqC-0000Uc-5H@gemulon.postgresql.org
Slow runs of buildfarm members chipmunk, hornet and mandrill saw the
shorter timeouts expire. The 180s timeout in poll_query_until has been
trouble-free since 2a0f89cd717ce6d49cdc47850577823682167e87 introduced
it two years ago, so use 180s more widely. Back-patch to 9.6, where the
first of these timeouts was introduced.
Reviewed by Michael Paquier.
Discussion: https://postgr.es/m/20181209001601.GC2973271@rfd.leadboat.com
If the lock wait query failed, isolationtester would report the
PQerrorMessage from some other connection, meaning there would be
no message or an unrelated one. This seems like a pretty unlikely
occurrence, but if it did happen, this bug could make it really
difficult/confusing to figure out what happened. That seems to
justify patching all the way back.
In passing, clean up another place where the "wrong" conn was used
for an error report. That one's not actually buggy because it's
a different alias for the same connection, but it's still confusing
to the reader.
Doing this suppresses Coverity warnings and might allow improved
code in some cases. The prospects of that are not so bright as
to warrant back-patching, though.
Michael Paquier, per Coverity
Don't move parenthesized lines to the left, even if that means they
flow past the right margin.
By default, BSD indent lines up statement continuation lines that are
within parentheses so that they start just to the right of the preceding
left parenthesis. However, traditionally, if that resulted in the
continuation line extending to the right of the desired right margin,
then indent would push it left just far enough to not overrun the margin,
if it could do so without making the continuation line start to the left of
the current statement indent. That makes for a weird mix of indentations
unless one has been completely rigid about never violating the 80-column
limit.
This behavior has been pretty universally panned by Postgres developers.
Hence, disable it with indent's new -lpl switch, so that parenthesized
lines are always lined up with the preceding left paren.
This patch is much less interesting than the first round of indent
changes, but also bulkier, so I thought it best to separate the effects.
Discussion: https://postgr.es/m/E1dAmxK-0006EE-1r@gemulon.postgresql.org
Discussion: https://postgr.es/m/30527.1495162840@sss.pgh.pa.us
Commit 4deb41381 modified isolationtester's query to see whether a
session is blocked to also check for waits occurring in GetSafeSnapshot.
However, it did that in a way that enormously increased the query's
runtime under CLOBBER_CACHE_ALWAYS, causing the buildfarm members
that use that to run about four times slower than before, and in some
cases fail entirely. To fix, push the entire logic into a dedicated
backend function. This should actually reduce the CLOBBER_CACHE_ALWAYS
runtime from what it was previously, though I've not checked that.
In passing, expose a SQL function to check for safe-snapshot blockage,
comparable to pg_blocking_pids. This is more or less free given the
infrastructure built to solve the other problem, so we might as well.
Thomas Munro
Discussion: https://postgr.es/m/20170407165749.pstcakbc637opkax@alap3.anarazel.de
This improves code coverage and lays a foundation for testing
similar issues in a distributed environment.
Author: Thomas Munro <thomas.munro@enterprisedb.com>
Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
c.h #includes a number of core libc header files, such as <stdio.h>.
There's no point in re-including these after having read postgres.h,
postgres_fe.h, or c.h; so remove code that did so.
While at it, also fix some places that were ignoring our standard pattern
of "include postgres[_fe].h, then system header files, then other Postgres
header files". While there's not any great magic in doing it that way
rather than system headers last, it's silly to have just a few files
deviating from the general pattern. (But I didn't attempt to enforce this
globally, only in files I was touching anyway.)
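For a frontend file, the pattern looks schematically like this (a
skeleton only, not a file from the tree):

    /* 1. The appropriate umbrella header comes first ... */
    #include "postgres_fe.h"        /* "postgres.h" for backend code */

    /* 2. ... then system headers that c.h does not already pull in ... */
    #include <sys/stat.h>
    #include <unistd.h>

    /* 3. ... then other Postgres headers. */
    #include "libpq-fe.h"
    #include "pqexpbuffer.h"

    /*
     * <stdio.h>, <stdlib.h>, <string.h> and friends need not be re-included:
     * c.h has already read them.
     */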
I'd be the first to say that this is mostly compulsive neatnik-ism,
but over time it might save enough compile cycles to be useful.
Where possible, use palloc or pg_malloc instead; otherwise, insert
explicit NULL checks.
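The two shapes of the fix look roughly like the sketch below; pg_malloc
and friends exit on allocation failure, and the stand-in here is local
so the example compiles on its own.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    /* Local stand-in for pg_malloc(): allocate or exit, never return NULL. */
    static void *
    xmalloc(size_t size)
    {
        void       *p = malloc(size);

        if (p == NULL)
        {
            fprintf(stderr, "out of memory\n");
            exit(1);
        }
        return p;
    }

    int
    main(void)
    {
        /* Preferred form: the error-exiting wrapper. */
        char       *buf = xmalloc(64);

        /* Otherwise: keep the bare call, but check its result explicitly. */
        char       *copy = strdup("some value");

        if (copy == NULL)
        {
            fprintf(stderr, "out of memory\n");
            return 1;
        }
        snprintf(buf, 64, "copied: %s", copy);
        puts(buf);
        free(copy);
        free(buf);
        return 0;
    }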
Generally speaking, these are places where an actual OOM is quite
unlikely, either because they're in client programs that don't
allocate all that much, or they're very early in process startup
so that we'd likely have had a fork() failure instead. Hence,
no back-patch, even though this is nominally a bug fix.
Michael Paquier, with some adjustments by me
Discussion: <CAB7nPqRu07Ot6iht9i9KRfYLpDaF2ZuUv5y_+72uP23ZAGysRg@mail.gmail.com>
The previous coding here was formally undefined, though it seems to
accidentally work on most platforms in the buildfarm. Caught on some
OpenBSD platforms, whose libc contains an assertion check for
overlapping areas passed to memcpy().
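The usual remedy, and presumably the one applied here, is memmove(),
which is defined for overlapping regions; a small self-contained
demonstration:

    #include <stdio.h>
    #include <string.h>

    int
    main(void)
    {
        char        buf[] = "step1 step2 step3";

        /*
         * Drop the first element by shifting the remainder down.  Source and
         * destination overlap, so memcpy() would be undefined behavior here;
         * memmove() handles the overlap correctly.
         */
        memmove(buf, buf + 6, strlen(buf + 6) + 1);
        puts(buf);              /* prints "step2 step3" */
        return 0;
    }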
Thomas Munro
This patch introduces "pg_blocking_pids(int) returns int[]", which returns
the PIDs of any sessions that are blocking the session with the given PID.
Historically people have obtained such information using a self-join on
the pg_locks view, but it's unreasonably tedious to do it that way with any
modicum of correctness, and the addition of parallel queries has pretty
much broken that approach altogether. (Given some more columns in the view
than there are today, you could imagine handling parallel-query cases with
a 4-way join; but ugh.)
The new function has the following behaviors that are painful or impossible
to get right via pg_locks:
1. Correctly understands which lock modes block which other ones.
2. In soft-block situations (two processes both waiting for conflicting lock
modes), only the one that's in front in the wait queue is reported to
block the other.
3. In parallel-query cases, reports all sessions blocking any member of
the given PID's lock group, and reports a session by naming its leader
process's PID, which will be the pg_backend_pid() value visible to
clients.
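As a quick illustration of consuming the function from a client, a
pared-down libpq sketch (connection settings come from the environment,
and error handling is minimal):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(int argc, char **argv)
    {
        const char *target_pid = (argc > 1) ? argv[1] : "12345";
        const char *params[1] = {target_pid};
        PGconn     *conn = PQconnectdb("");     /* use PG* environment defaults */
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        /* One row per PID blocking the given session; no rows if not blocked. */
        res = PQexecParams(conn,
                           "SELECT unnest(pg_blocking_pids($1::int))",
                           1, NULL, params, NULL, NULL, 0);
        if (PQresultStatus(res) == PGRES_TUPLES_OK)
        {
            for (int i = 0; i < PQntuples(res); i++)
                printf("blocked by PID %s\n", PQgetvalue(res, i, 0));
        }
        else
            fprintf(stderr, "query failed: %s", PQerrorMessage(conn));

        PQclear(res);
        PQfinish(conn);
        return 0;
    }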
The motivation for doing this right now is mostly to fix the isolation
tests. Commit 38f8bdcac4982215beb9f65a19debecaf22fd470 lobotomized
isolationtester's is-it-waiting query by removing its ability to recognize
nonconflicting lock modes, as a crude workaround for the inability to
handle soft-block situations properly. But even without the lock mode
tests, the old query was excessively slow, particularly in
CLOBBER_CACHE_ALWAYS builds; some of our buildfarm animals fail the new
deadlock-hard test because the deadlock timeout elapses before they can
probe the waiting status of all eight sessions. Replacing the pg_locks
self-join with use of pg_blocking_pids() is not only much more correct, but
a lot faster: I measure it at about 9X faster in a typical dev build with
Asserts, and 3X faster in CLOBBER_CACHE_ALWAYS builds. That should provide
enough headroom for the slower CLOBBER_CACHE_ALWAYS animals to pass the
test, without having to lengthen deadlock_timeout yet more and thus slow
down the test for everyone else.
This mostly reverts commit 9c9782f066e0ce5424b8706df2cce147cb78170f.
I left in the parts that rearranged removal of completed waiting steps;
but the idea of not rechecking a step's blocked-ness isn't working.
If we're retrying a step, then we already decided it was blocked on a lock,
and there's no need to recheck that. The original coding of commit
38f8bdcac4982215beb9f65a19debecaf22fd470 resulted in a large number of
is-it-waiting queries when dealing with multiple concurrently-blocked
sessions, which is fairly pointless and also results in test failures in
CLOBBER_CACHE_ALWAYS builds, where the is-it-waiting query is quite slow.
This definition also permits appending pg_sleep() calls to steps where it's
needed to control the order of finish of concurrent steps. Before, that
did not work nicely because we'd decide that a step performing a sleep was
not blocked and hang up waiting for it to finish, rather than noticing the
completion of the concurrent step we're supposed to notice first.
In passing, revise handling of removal of completed waiting steps
to make it a bit less messy.
Fix a few oversights in 38f8bdcac4982215beb9f65a19debecaf22fd470:
don't leak memory in run_permutation(), remember when we've issued
a cancel rather than issuing another one every 10ms,
fix some typos in comments.
This allows testing of deadlock scenarios. Scenarios that would
previously have been considered invalid are now simply taken as a
scenario in which more than one backend will wait.
We used to have externs for getopt() and its API variables scattered
all over the place. Now that we find we're going to need to tweak the
variable declarations for Cygwin, it seems like a good idea to have
just one place to tweak.
In this commit, the variables are declared "#ifndef HAVE_GETOPT_H".
That may or may not work everywhere, but we'll soon find out.
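The consolidated declarations amount to something like the following
sketch (not the header verbatim):

    /*
     * If the platform supplies <getopt.h>, it already declares these;
     * otherwise, provide the externs ourselves, in exactly one place.
     */
    #ifndef HAVE_GETOPT_H
    extern char *optarg;
    extern int  optind;
    extern int  opterr;
    extern int  optopt;
    #endif                          /* !HAVE_GETOPT_H */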
Andres Freund
This ensures that all stdout output is flushed immediately, to match
stderr. This eliminates the need for fflush(stdout) calls sprinkled all
over the place.
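One standard way to get that effect is to disable stdout buffering once
at startup; a sketch (not necessarily the exact call used):

    #include <stdio.h>

    int
    main(void)
    {
        /*
         * Make stdout unbuffered, like stderr, so every line appears
         * immediately without needing fflush(stdout) after each write.
         */
        setvbuf(stdout, NULL, _IONBF, 0);

        printf("this appears right away\n");
        return 0;
    }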
Per Daniel Wood in message 519A79C6.90308@salesforce.com
Add asprintf(), pg_asprintf(), and psprintf() to simplify string
allocation and composition. Replacement implementations taken from
NetBSD.
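For flavor, plain asprintf() already shows the core convenience;
pg_asprintf() and psprintf() add error-on-failure behavior and, in
psprintf's case, return the new string directly. The sketch below uses
glibc's asprintf(), hence the _GNU_SOURCE define.

    #define _GNU_SOURCE             /* for asprintf() on glibc */
    #include <stdio.h>
    #include <stdlib.h>

    int
    main(void)
    {
        char       *msg;

        /*
         * asprintf() formats into a freshly allocated buffer of exactly the
         * right size, so the caller never has to guess at buffer lengths.
         */
        if (asprintf(&msg, "permutation %d of %d", 3, 24) < 0)
        {
            fprintf(stderr, "out of memory\n");
            return 1;
        }
        puts(msg);
        free(msg);
        return 0;
    }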
Reviewed-by: Álvaro Herrera <alvherre@2ndquadrant.com>
Reviewed-by: Asif Naeem <anaeem.it@gmail.com>
Previously, isolationtester would forbid returning tuples in
session-specific teardown (but not global teardown), as well as in
global setup. Allow these places to return tuples, too.
Notice and complain about PQcancel() failures. Also, don't dump core if
an error PGresult doesn't contain severity and message subfields, as it
might not if it was generated by libpq itself. (We have a longstanding
TODO item to improve that, but in the meantime isolationtester had better
cope.)
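Both points boil down to defensive libpq usage; a rough sketch (helper
name and messages are illustrative):

    #include <stdio.h>
    #include <libpq-fe.h>

    /* Cancel a query, complaining on failure, then report an error result. */
    static void
    cancel_and_report(PGconn *conn, const PGresult *errres)
    {
        PGcancel   *cancel = PQgetCancel(conn);
        char        errbuf[256];

        if (cancel == NULL || !PQcancel(cancel, errbuf, sizeof(errbuf)))
            fprintf(stderr, "could not send cancel request: %s\n",
                    cancel ? errbuf : "out of memory");
        if (cancel)
            PQfreeCancel(cancel);

        /*
         * Error results generated by libpq itself may lack the severity and
         * primary-message fields, so cope with NULLs rather than crashing.
         */
        if (errres != NULL)
        {
            const char *sev = PQresultErrorField(errres, PG_DIAG_SEVERITY);
            const char *msg = PQresultErrorField(errres, PG_DIAG_MESSAGE_PRIMARY);

            fprintf(stderr, "%s: %s\n",
                    sev ? sev : "ERROR",
                    msg ? msg : PQresultErrorMessage(errres));
        }
    }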
I tripped across the latter item while investigating a trouble report on
buildfarm member spoonbill. As for the former, there's no evidence that
PQcancel failure is actually involved in spoonbill's problem, but it still
seems like a bad idea to ignore an error return code.
This patch introduces two additional lock modes for tuples: "SELECT FOR
KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each
other, in contrast with already existing "SELECT FOR SHARE" and "SELECT
FOR UPDATE". UPDATE commands that do not modify the values stored in
the columns that are part of the key of the tuple now grab a SELECT FOR
NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently
with tuple locks of the FOR KEY SHARE variety.
Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this
means the concurrency improvement applies to them, which is the whole
point of this patch.
The added tuple lock semantics require some rejiggering of the multixact
module, so that the locking level that each transaction is holding can
be stored alongside its Xid. Also, multixacts now need to persist
across server restarts and crashes, because they can now represent not
only tuple locks, but also tuple updates. This means we need more
careful tracking of lifetime of pg_multixact SLRU files; since they now
persist longer, we require more infrastructure to figure out when they
can be removed. pg_upgrade also needs to be careful to copy
pg_multixact files over from the old server to the new, or at least part
of multixact.c state, depending on the versions of the old and new
servers.
Tuple time qualification rules (HeapTupleSatisfies routines) need to be
careful not to consider tuples with the "is multi" infomask bit set as
being only locked; they might need to look up MultiXact values (i.e.
possibly do pg_multixact I/O) to find out the Xid that updated a tuple,
whereas they previously were assured to only use information readily
available from the tuple header. This is considered acceptable, because
the extra I/O would involve cases that would previously cause some
commands to block waiting for concurrent transactions to finish.
Another important change is the fact that locking tuples that have
previously been updated causes the future versions to be marked as
locked, too; this is essential for correctness of foreign key checks.
This also causes additional WAL-logging (there was previously a single
WAL record for a locked tuple; now there is one for each updated copy
of the tuple that exists).
With all this in place, contention related to tuples being checked by
foreign key rules should be much reduced.
As a bonus, an old misbehavior has been fixed: previously, if a
subtransaction grabbed a stronger tuple lock than the parent
(sub)transaction held on a given tuple and later aborted, the weaker
lock was lost.
Many new spec files were added for the isolation tester framework, to
ensure overall behavior is sane. There's probably room for several more
tests.
There were several reviewers of this patch; in particular, Noah Misch
and Andres Freund spent considerable time on it. Original idea for the
patch came from Simon Riggs, after a problem report by Joel Jacobson.
Most code is from me, with contributions from Marti Raudsepp, Alexander
Shulgin, Noah Misch and Andres Freund.
This patch was discussed in several pgsql-hackers threads; the most
important start at the following message-ids:
AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com
1290721684-sup-3951@alvh.no-ip.org
1294953201-sup-2099@alvh.no-ip.org
1320343602-sup-2290@alvh.no-ip.org
1339690386-sup-8927@alvh.no-ip.org
4FE5FF020200002500048A3D@gw.wicourts.gov
4FEAB90A0200002500048B7D@gw.wicourts.gov
Each setup block is run as a single PQexec submission, and some
statements such as VACUUM cannot be combined with others in such a
block.
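The restriction comes straight from libpq/server behavior: all
statements in a single PQexec() string run inside one implicit
transaction block, and VACUUM refuses to run inside a transaction block.
A small sketch (connection settings from the environment):

    #include <stdio.h>
    #include <libpq-fe.h>

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("");     /* rely on PG* environment vars */
        PGresult   *res;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            return 1;
        }

        res = PQexec(conn, "CREATE TABLE IF NOT EXISTS iso_demo (i int)");
        PQclear(res);

        /*
         * Both statements travel in one PQexec() call, so they execute inside
         * a single implicit transaction block; the VACUUM therefore fails
         * with "VACUUM cannot run inside a transaction block".
         */
        res = PQexec(conn, "INSERT INTO iso_demo VALUES (1); VACUUM iso_demo");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "combined submission: %s", PQerrorMessage(conn));
        PQclear(res);

        /* Submitted on its own, the same VACUUM succeeds. */
        res = PQexec(conn, "VACUUM iso_demo");
        if (PQresultStatus(res) != PGRES_COMMAND_OK)
            fprintf(stderr, "standalone VACUUM: %s", PQerrorMessage(conn));
        PQclear(res);

        PQfinish(conn);
        return 0;
    }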
Backpatch to 9.2.
Kevin Grittner and Tom Lane
Much more could be done here, but at least now we have *some* automated
test coverage of that mechanism. In particular this tests the writable-CTE
case reported by Phil Sorber.
In passing, remove isolationtester's arbitrary restriction on the number of
steps in a permutation list. I used this so that a single spec file could
be used to run several related test scenarios, but there are other possible
reasons to want a step series that's not exactly a permutation. Improve
documentation and fix a couple other nits as well.
isolationtester is now able to continue running other permutations when
it detects that one of them is invalid, which is useful during initial
development of spec files.
Author: Alexander Shulgin
A permutation that specifies more steps than are defined causes
isolationtester to crash, so avoid that. Using fewer steps than defined
should probably not be a problem, but no spec currently does that.
I broke it in a previous commit because I neglected to install the
necessary incantations to have getopt() work on Windows.
Per red blots in buildfarm.
This mode prints out the permutations that would be run by the given
spec file, in the same format used by the permutation lines in spec
files. This helps in building new spec files.
Author: Alexander Shulgin, with some tweaks by me
We now report errors raised by the just-unblocked and the unblocking
transactions identically; this should fix the relatively common
buildfarm failures in which an animal reports the error in the "wrong"
session.
The new test exercises all permutations of the order of begin, prepare,
and commit of three concurrent transactions that have conflicts between
them.
The test runs for quite a long time, and the expected output file is
huge, but it caught some serious bugs during development, so it seems
worthwhile to keep. The test uses prepared transactions, so it fails if the
server has max_prepared_transactions=0. Because of that, it's marked as
"ignore" in the schedule file.
Dan Ports
Noah Misch diagnosed the buildfarm problems in the isolation tests
partly as a failure to differentiate backends properly; the old code was
using backend IDs, which is not good enough because a new backend might
reuse an already-used ID. Use PIDs instead.
Also, the code was purposely careless about other concurrent activity,
because none is expected; and indeed, in the vast majority of cases it
causes no trouble. However, autovacuum can be observed to hold locks on
tables for long enough to cause sporadic failures. The new code
accounts for that by ignoring locks held by processes not explicitly
declared in our spec file.
Author: Noah Misch
This enables us to test that blocking commands (such as foreign key
checks that conflict with some other lock) act as intended. The set of
tests that this adds is pretty minimal, but can easily be extended by
adding new specs.
The intention is that this will serve as a basis for ensuring that
further tweaks of locking implementation preserve (or improve) existing
behavior.
Author: Noah Misch