postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-06 07:49:08 +03:00

Author	SHA1	Message	Date
Masahiko Sawada	bd88404d3c	Fix crash with RemoveFromWaitQueue() when detecting a deadlock. Commit `5764f611e` used dclist_delete_from() to remove the proc from the wait queue. However, since it doesn't clear dist_node's next/prev to NULL, it could call RemoveFromWaitQueue() twice: when the process detects a deadlock and then when cleaning up locks on aborting the transaction. The waiting lock information is cleared in the first call, so it led to a crash in the second call. Backpatch to v16, where the change was introduced. Bug: #18031 Reported-by: Justin Pryzby, Alexander Lakhin Reviewed-by: Andres Freund Discussion: https://postgr.es/m/ZKy4AdrLEfbqrxGJ%40telsasoft.com Discussion: https://postgr.es/m/18031-ebe2d08cb405f6cc@postgresql.org Backpatch-through: 16	2023-07-26 14:41:26 +09:00
Amit Kapila	cc32ec24fd	Revert the commits related to allowing page lock to conflict among parallel group members. This commit reverts the work done by commits `3ba59ccc89` and `72e78d831a`. Those commits were incorrect in asserting that we never acquire any other heavy-weight lock after acquring page lock other than relation extension lock. We can acquire a lock on catalogs while doing catalog look up after acquring page lock. This won't impact any existing feature but we need to think some other way to achieve this before parallelizing other write operations or even improving the parallelism in vacuum (like allowing multiple workers for an index). Reported-by: Jaime Casanova Author: Amit Kapila Backpatch-through: 13 Discussion: https://postgr.es/m/CAJKUy5jffnRKNvRHKQ0LynRb0RJC-o4P8Ku3x9vGAVLwDBWumQ@mail.gmail.com	2023-07-06 08:52:10 +05:30
Peter Eisentraut	3ad5f07c0f	Error message refactoring Take some untranslatable things out of the message and replace by format placeholders, to reduce translatable strings and reduce translation mistakes.	2023-06-23 16:36:17 +02:00
Tom Lane	0245f8db36	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. This set of diffs is a bit larger than typical. We've updated to pg_bsd_indent 2.1.2, which properly indents variable declarations that have multi-line initialization expressions (the continuation lines are now indented one tab stop). We've also updated to perltidy version 20230309 and changed some of its settings, which reduces its desire to add whitespace to lines to make assignments etc. line up. Going forward, that should make for fewer random-seeming changes to existing code. Discussion: https://postgr.es/m/20230428092545.qfb3y5wcu4cm75ur@alvherre.pgsql	2023-05-19 17:24:48 -04:00
David Rowley	2cb82e2acf	Speedup and increase usability of set proc title functions The setting of the process title could be seen on profiles of very fast-to-execute queries. In many locations where we call set_ps_display() we pass along a string constant, the length of which is known during compilation. Here we effectively rename set_ps_display() to set_ps_display_with_len() and then add a static inline function named set_ps_display() which calls strlen() on the given string. This allows the compiler to optimize away the strlen() call when dealing with call sites passing a string constant. We can then also use memcpy() instead of strlcpy() to copy the string into the destination buffer. That's significantly faster than strlcpy's byte-at-a-time way of copying. Here we also take measures to improve some code which was adjusting the process title to add a " waiting" suffix to it. Call sites which require this can now just call set_ps_display_suffix() to add or adjust the suffix and call set_ps_display_remove_suffix() to remove it again. Reviewed-by: Andres Freund Discussion: https://postgr.es/m/CAApHDvocBvvk-0gWNA2Gohe+sv9fMcv+fK_G+siBKJrgDG4O7g@mail.gmail.com	2023-02-20 16:18:27 +13:00
Peter Eisentraut	54a177a948	Remove useless casts to (void ) in hash_search() calls Some of these appear to be leftovers from when hash_search() took a char argument (changed in `5999e78fc4`). Since after this there is some more horizontal space available, do some light reformatting where suitable. Reviewed-by: Corey Huinker <corey.huinker@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/fd9adf5d-b1aa-e82f-e4c7-263c30145807%40enterprisedb.com	2023-02-06 09:41:01 +01:00
Andres Freund	5764f611e1	Use dlist/dclist instead of PROC_QUEUE / SHM_QUEUE for heavyweight locks Part of a series to remove SHM_QUEUE. ilist.h style lists are more widely used and have an easier to use interface. As PROC_QUEUE is now unused, remove it. Reviewed-by: Thomas Munro <thomas.munro@gmail.com> (in an older version) Discussion: https://postgr.es/m/20221120055930.t6kl3tyivzhlrzu2@awork3.anarazel.de Discussion: https://postgr.es/m/20200211042229.msv23badgqljrdg2@alap3.anarazel.de	2023-01-18 11:41:14 -08:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
David Rowley	2d0bbedda7	Rename shadowed local variables In a similar effort to `f01592f91`, here we mostly rename shadowed local variables to remove the warnings produced when compiling with -Wshadow=compatible-local. This fixes 63 warnings and leaves just 5. Author: Justin Pryzby, David Rowley Reviewed-by: Justin Pryzby Discussion https://postgr.es/m/20220817145434.GC26426%40telsasoft.com	2022-10-05 21:01:41 +13:00
Robert Haas	7fc0e7de9f	Revert the addition of GetMaxBackends() and related stuff. This reverts commits `0147fc7`, `4567596`, `aa64f23`, and `5ecd018`. There is no longer agreement that introducing this function was the right way to address the problem. The consensus now seems to favor trying to make a correct value for MaxBackends available to mdules executing their _PG_init() functions. Nathan Bossart Discussion: http://postgr.es/m/20220323045229.i23skfscdbvrsuxa@jrouhaud	2022-04-12 14:45:23 -04:00
Michael Paquier	4567596316	Reduce more the number of calls to GetMaxBackends() Some of the code paths changed by `aa64f23` can reduce the number of times GetMaxBackends() is called. The performance gain is marginal, but most of the code changed by this commit already did that. Hence, let's be clean and apply the same rule everywhere, for consistency. Some of the code paths, like in deadlock.c, involve only assertions. These are left unchanged. Reviewed-by: Nathan Bossart, Robert Haas Discussion: https://postgr.es/m/YgMpGZhPOjNfS7er@paquier.xyz	2022-02-10 10:27:29 +09:00
Robert Haas	aa64f23b02	Remove MaxBackends variable in favor of GetMaxBackends() function. Previously, it was really easy to write code that accessed MaxBackends before we'd actually initialized it, especially when coding up an extension. To make this less error-prune, introduce a new function GetMaxBackends() which should be used to obtain the correct value. This will ERROR if called too early. Demote the global variable to a file-level static, so that nobody can peak at it directly. Nathan Bossart. Idea by Andres Freund. Review by Greg Sabino Mullane, by Michael Paquier (who had doubts about the approach), and by me. Discussion: http://postgr.es/m/20210802224204.bckcikl45uezv5e4@alap3.anarazel.de	2022-02-08 15:53:19 -05:00
Bruce Momjian	27b77ecf9f	Update copyright for 2022 Backpatch-through: 10	2022-01-07 19:04:57 -05:00
Tom Lane	ba2bc4a7ba	Use MaxLockMode symbol in more places. As long as we have this macro, it makes sense to use it in the LockMethodData structures. Julien Rouhaud Discussion: https://postgr.es/m/20220103064722.ewdv4evlez5m7mdn@jrouhaud	2022-01-03 12:24:44 -05:00
Noah Misch	3cd9c3b921	Fix CREATE INDEX CONCURRENTLY for the newest prepared transactions. The purpose of commit `8a54e12a38` was to fix this, and it sufficed when the PREPARE TRANSACTION completed before the CIC looked for lock conflicts. Otherwise, things still broke. As before, in a cluster having used CIC while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. It may be necessary to reindex to recover from past occurrences; REINDEX CONCURRENTLY suffices. Fix this for future index builds by making CIC wait for arbitrarily-recent prepared transactions and for ordinary transactions that may yet PREPARE TRANSACTION. As part of that, have PREPARE TRANSACTION transfer locks to its dummy PGPROC before it calls ProcArrayClearTransaction(). Back-patch to 9.6 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Andres Freund. Discussion: https://postgr.es/m/01824242-AA92-4FE9-9BA7-AEBAFFEA3D0C@yandex-team.ru	2021-10-23 18:36:38 -07:00
Heikki Linnakangas	317632f307	Move InRecovery and standbyState global vars to xlogutils.c. They are used in code that runs both during normal operation and during WAL replay, and needs to behave differently during replay. Move them to xlogutils.c, because that's where we have other helper functions used by redo routines. Reviewed-by: Andres Freund Discussion: https://www.postgresql.org/message-id/b3b71061-4919-e882-4857-27e370ab134a%40iki.fi	2021-07-31 09:50:26 +03:00
Tom Lane	6310809c4a	Fix check for conflicting session- vs transaction-level locks. We have an implementation restriction that PREPARE TRANSACTION can't handle cases where both session-lifespan and transaction-lifespan locks are held on the same lockable object. (That's because we'd otherwise need to acquire a new PROCLOCK entry during post-prepare cleanup, which is an operation that might fail. The situation can only arise with odd usages of advisory locks, so removing the restriction is probably not worth the amount of effort it would take.) AtPrepare_Locks attempted to enforce this, but its logic was many bricks shy of a load, because it only detected cases where the session and transaction locks had the same lockmode. Locks of different modes on the same object would lead to the rather unhelpful message "PANIC: we seem to have dropped a bit somewhere". To fix, build a transient hashtable with one entry per locktag, not one per locktag + mode, and use that to detect conflicts. Per bug #17122 from Alexander Pyhalov. This bug is ancient, so back-patch to all supported branches. Discussion: https://postgr.es/m/17122-04f3c32098a62233@postgresql.org	2021-07-24 18:35:52 -04:00
Fujii Masao	46d6e5f567	Display the time when the process started waiting for the lock, in pg_locks, take 2 This commit adds new column "waitstart" into pg_locks view. This column reports the time when the server process started waiting for the lock if the lock is not held. This information is useful, for example, when examining the amount of time to wait on a lock by subtracting "waitstart" in pg_locks from the current time, and identify the lock that the processes are waiting for very long. This feature uses the current time obtained for the deadlock timeout timer as "waitstart" (i.e., the time when this process started waiting for the lock). Since getting the current time newly can cause overhead, we reuse the already-obtained time to avoid that overhead. Note that "waitstart" is updated without holding the lock table's partition lock, to avoid the overhead by additional lock acquisition. This can cause "waitstart" in pg_locks to become NULL for a very short period of time after the wait started even though "granted" is false. This is OK in practice because we can assume that users are likely to look at "waitstart" when waiting for the lock for a long time. The first attempt of this patch (commit `3b733fcd04`) caused the buildfarm member "rorqual" (built with --disable-atomics --disable-spinlocks) to report the failure of the regression test. It was reverted by commit `890d2182a2`. The cause of this failure was that the atomic variable for "waitstart" in the dummy process entry created at the end of prepare transaction was not initialized. This second attempt fixes that issue. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com	2021-02-15 15:13:37 +09:00
Fujii Masao	890d2182a2	Revert "Display the time when the process started waiting for the lock, in pg_locks." This reverts commit `3b733fcd04`. Per buildfarm members prion and rorqual.	2021-02-09 18:30:40 +09:00
Fujii Masao	3b733fcd04	Display the time when the process started waiting for the lock, in pg_locks. This commit adds new column "waitstart" into pg_locks view. This column reports the time when the server process started waiting for the lock if the lock is not held. This information is useful, for example, when examining the amount of time to wait on a lock by subtracting "waitstart" in pg_locks from the current time, and identify the lock that the processes are waiting for very long. This feature uses the current time obtained for the deadlock timeout timer as "waitstart" (i.e., the time when this process started waiting for the lock). Since getting the current time newly can cause overhead, we reuse the already-obtained time to avoid that overhead. Note that "waitstart" is updated without holding the lock table's partition lock, to avoid the overhead by additional lock acquisition. This can cause "waitstart" in pg_locks to become NULL for a very short period of time after the wait started even though "granted" is false. This is OK in practice because we can assume that users are likely to look at "waitstart" when waiting for the lock for a long time. Bump catalog version. Author: Atsushi Torikoshi Reviewed-by: Ian Lawrence Barwick, Robert Haas, Justin Pryzby, Fujii Masao Discussion: https://postgr.es/m/a96013dc51cdc56b2a2b84fa8a16a993@oss.nttdata.com	2021-02-09 18:10:19 +09:00
Noah Misch	8a54e12a38	Fix CREATE INDEX CONCURRENTLY for simultaneous prepared transactions. In a cluster having used CREATE INDEX CONCURRENTLY while having enabled prepared transactions, queries that use the resulting index can silently fail to find rows. Fix this for future CREATE INDEX CONCURRENTLY by making it wait for prepared transactions like it waits for ordinary transactions. This expands the VirtualTransactionId structure domain to admit prepared transactions. It may be necessary to reindex to recover from past occurrences. Back-patch to 9.5 (all supported versions). Andrey Borodin, reviewed (in earlier versions) by Tom Lane and Michael Paquier. Discussion: https://postgr.es/m/2E712143-97F7-4890-B470-4A35142ABC82@yandex-team.ru	2021-01-30 00:00:27 -08:00
Bruce Momjian	ca3b37487b	Update copyright for 2021 Backpatch-through: 9.5	2021-01-02 13:06:25 -05:00
Tom Lane	b3817f5f77	Improve hash_create()'s API for some added robustness. Invent a new flag bit HASH_STRINGS to specify C-string hashing, which was formerly the default; and add assertions insisting that exactly one of the bits HASH_STRINGS, HASH_BLOBS, and HASH_FUNCTION be set. This is in hopes of preventing recurrences of the type of oversight fixed in commit `a1b8aa1e4` (i.e., mistakenly omitting HASH_BLOBS). Also, when HASH_STRINGS is specified, insist that the keysize be more than 8 bytes. This is a heuristic, but it should catch accidental use of HASH_STRINGS for integer or pointer keys. (Nearly all existing use-cases set the keysize to NAMEDATALEN or more, so there's little reason to think this restriction should be problematic.) Tweak hash_create() to insist that the HASH_ELEM flag be set, and remove the defaults it had for keysize and entrysize. Since those defaults were undocumented and basically useless, no callers omitted HASH_ELEM anyway. Also, remove memset's zeroing the HASHCTL parameter struct from those callers that had one. This has never been really necessary, and while it wasn't a bad coding convention it was confusing that some callers did it and some did not. We might as well save a few cycles by standardizing on "not". Also improve the documentation for hash_create(). In passing, improve reinit.c's usage of a hash table by storing the key as a binary Oid rather than a string; and, since that's a temporary hash table, allocate it in CurrentMemoryContext for neatness. Discussion: https://postgr.es/m/590625.1607878171@sss.pgh.pa.us	2020-12-15 11:38:53 -05:00
Andres Freund	941697c3c1	snapshot scalability: Introduce dense array of in-progress xids. The new array contains the xids for all connected backends / in-use PGPROC entries in a dense manner (in contrast to the PGPROC/PGXACT arrays which can have unused entries interspersed). This improves performance because GetSnapshotData() always needs to scan the xids of all live procarray entries and now there's no need to go through the procArray->pgprocnos indirection anymore. As the set of running top-level xids changes rarely, compared to the number of snapshots taken, this substantially increases the likelihood of most data required for a snapshot being in l2 cache. In read-mostly workloads scanning the xids[] array will sufficient to build a snapshot, as most backends will not have an xid assigned. To keep the xid array dense ProcArrayRemove() needs to move entries behind the to-be-removed proc's one further up in the array. Obviously moving array entries cannot happen while a backend sets it xid. I.e. locking needs to prevent that array entries are moved while a backend modifies its xid. To avoid locking ProcArrayLock in GetNewTransactionId() - a fairly hot spot already - ProcArrayAdd() / ProcArrayRemove() now needs to hold XidGenLock in addition to ProcArrayLock. Adding / Removing a procarray entry is not a very frequent operation, even taking 2PC into account. Due to the above, the dense array entries can only be read or modified while holding ProcArrayLock and/or XidGenLock. This prevents a concurrent ProcArrayRemove() from shifting the dense array while it is accessed concurrently. While the new dense array is very good when needing to look at all xids it is less suitable when accessing a single backend's xid. In particular it would be problematic to have to acquire a lock to access a backend's own xid. Therefore a backend's xid is not just stored in the dense array, but also in PGPROC. This also allows a backend to only access the shared xid value when the backend had acquired an xid. The infrastructure added in this commit will be used for the remaining PGXACT fields in subsequent commits. They are kept separate to make review easier. Author: Andres Freund <andres@anarazel.de> Reviewed-By: Robert Haas <robertmhaas@gmail.com> Reviewed-By: Thomas Munro <thomas.munro@gmail.com> Reviewed-By: David Rowley <dgrowleyml@gmail.com> Discussion: https://postgr.es/m/20200301083601.ews6hz5dduc3w2se@alap3.anarazel.de	2020-08-14 15:33:35 -07:00
Peter Eisentraut	a513f1dfbf	Remove STATUS_WAITING Add a separate enum for use in the locking APIs, which were the only user. Discussion: https://www.postgresql.org/message-id/flat/a6f91ead-0ce4-2a34-062b-7ab9813ea308%402ndquadrant.com	2020-06-17 09:14:37 +02:00
Peter Eisentraut	0fd2a79a63	Spelling adjustments	2020-06-07 15:06:51 +02:00
Tom Lane	36ac359d36	Rename assorted LWLock tranches. Choose names that fit into the conventions for wait event names (particularly, that multi-word names are in the style MultiWordName) and hopefully convey more information to non-hacker users than the previous names did. Also rename SerializablePredicateLockListLock to SerializablePredicateListLock; the old name was long enough to cause table formatting problems, plus the double occurrence of "Lock" seems confusing/error-prone. Also change a couple of particularly opaque LWLock field names. Discussion: https://postgr.es/m/28683.1589405363@sss.pgh.pa.us	2020-05-15 18:11:07 -04:00
Tom Lane	5cbfce562f	Initial pgindent and pgperltidy run for v13. Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit `c9d297751`, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.	2020-05-14 13:06:50 -04:00
Noah Misch	c6b92041d3	Skip WAL for new relfilenodes, under wal_level=minimal. Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Bump XLOG_PAGE_MAGIC, since this introduces XLOG_GIST_ASSIGN_LSN. Future servers accept older WAL, so this bump is discretionary. Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org	2020-04-04 12:25:34 -07:00
Peter Eisentraut	552fcebff0	Revert "Improve handling of parameter differences in physical replication" This reverts commit `246f136e76`. That patch wasn't quite complete enough. Discussion: https://www.postgresql.org/message-id/flat/E1jIpJu-0007Ql-CL%40gemulon.postgresql.org	2020-04-04 09:08:12 +02:00
Peter Eisentraut	246f136e76	Improve handling of parameter differences in physical replication When certain parameters are changed on a physical replication primary, this is communicated to standbys using the XLOG_PARAMETER_CHANGE WAL record. The standby then checks whether its own settings are at least as big as the ones on the primary. If not, the standby shuts down with a fatal error. The correspondence of settings between primary and standby is required because those settings influence certain shared memory sizings that are required for processing WAL records that the primary might send. For example, if the primary sends a prepared transaction, the standby must have had max_prepared_transaction set appropriately or it won't be able to process those WAL records. However, fatally shutting down the standby immediately upon receipt of the parameter change record might be a bit of an overreaction. The resources related to those settings are not required immediately at that point, and might never be required if the activity on the primary does not exhaust all those resources. If we just let the standby roll on with recovery, it will eventually produce an appropriate error when those resources are used. So this patch relaxes this a bit. Upon receipt of XLOG_PARAMETER_CHANGE, we still check the settings but only issue a warning and set a global flag if there is a problem. Then when we actually hit the resource issue and the flag was set, we issue another warning message with relevant information. At that point we pause recovery, so a hot standby remains usable. We also repeat the last warning message once a minute so it is harder to miss or ignore. Reviewed-by: Sergei Kornilov <sk@zsrv.org> Reviewed-by: Masahiko Sawada <masahiko.sawada@2ndquadrant.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/4ad69a4c-cc9b-0dfe-0352-8b1b0cd36c7b@2ndquadrant.com	2020-03-30 09:53:45 +02:00
Noah Misch	de9396326e	Revert "Skip WAL for new relfilenodes, under wal_level=minimal." This reverts commit `cb2fd7eac2`. Per numerous buildfarm members, it was incompatible with parallel query, and a test case assumed LP64. Back-patch to 9.5 (all supported versions). Discussion: https://postgr.es/m/20200321224920.GB1763544@rfd.leadboat.com	2020-03-22 09:24:09 -07:00
Noah Misch	cb2fd7eac2	Skip WAL for new relfilenodes, under wal_level=minimal. Until now, only selected bulk operations (e.g. COPY) did this. If a given relfilenode received both a WAL-skipping COPY and a WAL-logged operation (e.g. INSERT), recovery could lose tuples from the COPY. See src/backend/access/transam/README section "Skipping WAL for New RelFileNode" for the new coding rules. Maintainers of table access methods should examine that section. To maintain data durability, just before commit, we choose between an fsync of the relfilenode and copying its contents to WAL. A new GUC, wal_skip_threshold, guides that choice. If this change slows a workload that creates small, permanent relfilenodes under wal_level=minimal, try adjusting wal_skip_threshold. Users setting a timeout on COMMIT may need to adjust that timeout, and log_min_duration_statement analysis will reflect time consumption moving to COMMIT from commands like COPY. Internally, this requires a reliable determination of whether RollbackAndReleaseCurrentSubTransaction() would unlink a relation's current relfilenode. Introduce rd_firstRelfilenodeSubid. Amend the specification of rd_createSubid such that the field is zero when a new rel has an old rd_node. Make relcache.c retain entries for certain dropped relations until end of transaction. Back-patch to 9.5 (all supported versions). This introduces a new WAL record type, XLOG_GIST_ASSIGN_LSN, without bumping XLOG_PAGE_MAGIC. As always, update standby systems before master systems. This changes sizeof(RelationData) and sizeof(IndexStmt), breaking binary compatibility for affected extensions. (The most recent commit to affect the same class of extensions was 089e4d405d0f3b94c74a2c6a54357a84a681754b.) Kyotaro Horiguchi, reviewed (in earlier, similar versions) by Robert Haas. Heikki Linnakangas and Michael Paquier implemented earlier designs that materially clarified the problem. Reviewed, in earlier designs, by Andrew Dunstan, Andres Freund, Alvaro Herrera, Tom Lane, Fujii Masao, and Simon Riggs. Reported by Martijn van Oosterhout. Discussion: https://postgr.es/m/20150702220524.GA9392@svana.org	2020-03-21 09:38:26 -07:00
Amit Kapila	3ba59ccc89	Allow page lock to conflict among parallel group members. This is required as it is no safer for two related processes to perform clean up in gin indexes at a time than for unrelated processes to do the same. After acquiring page locks, we can acquire relation extension lock but reverse never happens which means these will also not participate in deadlock. So, avoid checking wait edges from this lock. Currently, the parallel mode is strictly read-only, but after this patch we have the infrastructure to allow parallel inserts and parallel copy. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com	2020-03-21 08:48:06 +05:30
Amit Kapila	85f6b49c2c	Allow relation extension lock to conflict among parallel group members. This is required as it is no safer for two related processes to extend the same relation at a time than for unrelated processes to do the same. We don't acquire a heavyweight lock on any other object after relation extension lock which means such a lock can never participate in the deadlock cycle. So, avoid checking wait edges from this lock. This provides an infrastructure to allow parallel operations like insert, copy, etc. which were earlier not possible as parallel group members won't conflict for relation extension lock. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com	2020-03-20 08:20:56 +05:30
Amit Kapila	72e78d831a	Add assert to ensure that page locks don't participate in deadlock cycle. Assert that we don't acquire any other heavyweight lock while holding the page lock except for relation extension. However, these locks are never taken in reverse order which implies that page locks will never participate in the deadlock cycle. Similar to relation extension, page locks are also held for a short duration, so imposing such a restriction won't hurt. Author: Dilip Kumar, with few changes by Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com	2020-03-19 08:11:45 +05:30
Amit Kapila	15ef6ff4b9	Assert that we don't acquire a heavyweight lock on another object after relation extension lock. The only exception to the rule is that we can try to acquire the same relation extension lock more than once. This is allowed as we are not creating any new lock for this case. This restriction implies that the relation extension lock won't ever participate in the deadlock cycle because we can never wait for any other heavyweight lock after acquiring this lock. Such a restriction is okay for relation extension locks as unlike other heavyweight locks these are not held till the transaction end. These are taken for a short duration to extend a particular relation and then released. Author: Dilip Kumar, with few changes by Amit Kapila Reviewed-by: Amit Kapila, Kuntal Ghosh and Sawada Masahiko Discussion: https://postgr.es/m/CAD21AoCmT3cFQUN4aVvzy5chw7DuzXrJCbrjTU05B+Ss=Gn1LA@mail.gmail.com	2020-03-18 07:20:17 +05:30
Peter Eisentraut	bf68b79e50	Refactor ps_status.c API The init_ps_display() arguments were mostly lies by now, so to match typical usage, just use one argument and let the caller assemble it from multiple sources if necessary. The only user of the additional arguments is BackendInitialize(), which was already doing string assembly on the caller side anyway. Remove the second argument of set_ps_display() ("force") and just handle that in init_ps_display() internally. BackendInitialize() also used to set the initial status as "authentication", but that was very far from where authentication actually happened. So now it's set to "initializing" and then "authentication" just before the actual call to ClientAuthentication(). Reviewed-by: Julien Rouhaud <rjuju123@gmail.com> Reviewed-by: Kuntal Ghosh <kuntalghosh.2007@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Discussion: https://www.postgresql.org/message-id/flat/c65e5196-4f04-4ead-9353-6088c19615a3@2ndquadrant.com	2020-03-11 16:38:31 +01:00
Peter Eisentraut	c096a804d9	Remove STATUS_FOUND Replace the solitary use with a bool. Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://www.postgresql.org/message-id/flat/a6f91ead-0ce4-2a34-062b-7ab9813ea308%402ndquadrant.com	2020-01-11 07:48:57 +01:00
Bruce Momjian	7559d8ebfa	Update copyrights for 2020 Backpatch-through: update all files in master, backpatch legal files through 9.4	2020-01-01 12:21:45 -05:00
Michael Paquier	8548ddc61b	Fix inconsistencies and typos in the tree, take 9 This addresses more issues with code comments, variable names and unreferenced variables. Author: Alexander Lakhin Discussion: https://postgr.es/m/7ab243e0-116d-3e44-d120-76b3df7abefd@gmail.com	2019-08-05 12:14:58 +09:00
Michael Paquier	0896ae561b	Fix inconsistencies and typos in the tree This is numbered take 7, and addresses a set of issues around: - Fixes for typos and incorrect reference names. - Removal of unneeded comments. - Removal of unreferenced functions and structures. - Fixes regarding variable name consistency. Author: Alexander Lakhin Discussion: https://postgr.es/m/10bfd4ac-3e7c-40ab-2b2e-355ed15495e8@gmail.com	2019-07-16 13:23:53 +09:00
Tom Lane	8255c7a5ee	Phase 2 pgindent run for v12. Switch to 2.1 version of pg_bsd_indent. This formats multiline function declarations "correctly", that is with additional lines of parameter declarations indented to match where the first line's left parenthesis is. Discussion: https://postgr.es/m/CAEepm=0P3FeTXRcU5B2W3jv3PgRVZ-kGUXLGfd42FFhUROO3ug@mail.gmail.com	2019-05-22 13:04:48 -04:00
Alvaro Herrera	ab0dfc961b	Report progress of CREATE INDEX operations This uses the progress reporting infrastructure added by `c16dc1aca5`, adding support for CREATE INDEX and CREATE INDEX CONCURRENTLY. There are two pieces to this: one is index-AM-agnostic, and the other is AM-specific. The latter is fairly elaborate for btrees, including reportage for parallel index builds and the separate phases that btree index creation uses; other index AMs, which are much simpler in their building procedures, have simplistic reporting only, but that seems sufficient, at least for non-concurrent builds. The index-AM-agnostic part is fairly complete, providing insight into the CONCURRENTLY wait phases as well as block-based progress during the index validation table scan. (The index validation index scan requires patching each AM, which has not been included here.) Reviewers: Rahila Syed, Pavan Deolasee, Tatsuro Yamada Discussion: https://postgr.es/m/20181220220022.mg63bhk26zdpvmcj@alvherre.pgsql	2019-04-02 15:18:08 -03:00
Michael Paquier	effe7d9552	Make release of 2PC identifier and locks consistent in COMMIT PREPARED When preparing a transaction in two-phase commit, a dummy PGPROC entry holding the GID used for the transaction is registered, which gets released once COMMIT PREPARED is run. Prior releasing its shared memory state, all the locks taken in the prepared transaction are released using a dedicated set of callbacks (pgstat and multixact having similar callbacks), which may cause the locks to be released before the GID is set free. Hence, there is a small window where lock conflicts could happen, for example: - Transaction A releases its locks, still holding its GID in shared memory. - Transaction B held a lock which conflicted with locks of transaction A. - Transaction B continues its processing, reusing the same GID as transaction A. - Transaction B fails because of a conflicting GID, already in use by transaction A. This commit changes the shared memory state release so as post-commit callbacks and predicate lock cleanup happen consistently with the shared memory state cleanup for the dummy PGPROC entry. The race window is small and 2PC had this issue from the start, so no backpatch is done. On top if that fixes discussed involved ABI breakages, which are not welcome in stable branches. Reported-by: Oleksii Kliukin, Ildar Musin Diagnosed-by: Oleksii Kliukin, Ildar Musin Author: Michael Paquier Reviewed-by: Masahiko Sawada, Oleksii Kliukin Discussion: https://postgr.es/m/BF9B38A4-2BFF-46E8-BA87-A2D00A8047A6@hintbits.com	2019-02-25 14:19:34 +09:00
Bruce Momjian	97c39498e5	Update copyright for 2019 Backpatch-through: certain files through 9.4	2019-01-02 12:44:25 -05:00
Tom Lane	b04aeb0a05	Add assertions that we hold some relevant lock during relation open. Opening a relation with no lock at all is unsafe; there's no guarantee that we'll see a consistent state of the relevant catalog entries. While use of MVCC scans to read the catalogs partially addresses that complaint, it's still possible to switch to a new catalog snapshot partway through loading the relcache entry. Moreover, whether or not you trust the reasoning behind sometimes using less than AccessExclusiveLock for ALTER TABLE, that reasoning is certainly not valid if concurrent users of the table don't hold a lock corresponding to the operation they want to perform. Hence, add some assertion-build-only checks that require any caller of relation_open(x, NoLock) to hold at least AccessShareLock. This isn't a full solution, since we can't verify that the lock level is semantically appropriate for the action --- but it's definitely of some use, because it's already caught two bugs. We can also assert that callers of addRangeTableEntryForRelation() hold at least the lock level specified for the new RTE. Amit Langote and Tom Lane Discussion: https://postgr.es/m/16565.1538327894@sss.pgh.pa.us	2018-10-01 12:43:21 -04:00
Tom Lane	8f0de712c3	Don't ignore locktable-full failures in StandbyAcquireAccessExclusiveLock. Commit `37c54863c` removed the code in StandbyAcquireAccessExclusiveLock that checked the return value of LockAcquireExtended. That created a bug, because it's still passing reportMemoryError = false to LockAcquireExtended, meaning that LOCKACQUIRE_NOT_AVAIL will be returned if we're out of shared memory for the lock table. In such a situation, the startup process would believe it had acquired an exclusive lock even though it hadn't, with potentially dire consequences. To fix, just drop the use of reportMemoryError = false, which allows us to simplify the call into a plain LockAcquire(). It's unclear that the locktable-full situation arises often enough that it's worth having a better recovery method than crash-and-restart. (I strongly suspect that the only reason the code path existed at all was that it was relatively simple to do in the pre-37c54863c implementation. But now it's not.) LockAcquireExtended's reportMemoryError parameter is now dead code and could be removed. I refrained from doing so, however, because there was some interest in resurrecting the behavior if we do get reports of locktable-full failures in the field. Also, it seems unwise to remove the parameter concurrently with shipping commit `f868a8143`, which added a parameter; if there are any third-party callers of LockAcquireExtended, we want them to get a wrong-number-of-parameters compile error rather than a possibly-silent misinterpretation of its last parameter. Back-patch to 9.6 where the bug was introduced. Discussion: https://postgr.es/m/6202.1536359835@sss.pgh.pa.us	2018-09-19 12:43:51 -04:00
Tom Lane	f868a8143a	Fix longstanding recursion hazard in sinval message processing. LockRelationOid and sibling routines supposed that, if our session already holds the lock they were asked to acquire, they could skip calling AcceptInvalidationMessages on the grounds that we must have already read any remote sinval messages issued against the relation being locked. This is normally true, but there's a critical special case where it's not: processing inside AcceptInvalidationMessages might attempt to access system relations, resulting in a recursive call to acquire a relation lock. Hence, if the outer call had acquired that same system catalog lock, we'd fall through, despite the possibility that there's an as-yet-unread sinval message for that system catalog. This could, for example, result in failure to access a system catalog or index that had just been processed by VACUUM FULL. This is the explanation for buildfarm failures we've been seeing intermittently for the past three months. The bug is far older than that, but commits `a54e1f158` et al added a new recursion case within AcceptInvalidationMessages that is apparently easier to hit than any previous case. To fix this, we must not skip calling AcceptInvalidationMessages until we have finished a call to it since acquiring a relation lock, not merely acquired the lock. (There's already adequate logic inside AcceptInvalidationMessages to deal with being called recursively.) Fortunately, we can implement that at trivial cost, by adding a flag to LOCALLOCK hashtable entries that tracks whether we know we have completed such a call. There is an API hazard added by this patch for external callers of LockAcquire: if anything is testing for LOCKACQUIRE_ALREADY_HELD, it might be fooled by the new return code LOCKACQUIRE_ALREADY_CLEAR into thinking the lock wasn't already held. This should be a fail-soft condition, though, unless something very bizarre is being done in response to the test. Also, I added an additional output argument to LockAcquireExtended, assuming that that probably isn't called by any outside code given the very limited usefulness of its additional functionality. Back-patch to all supported branches. Discussion: https://postgr.es/m/12259.1532117714@sss.pgh.pa.us	2018-09-07 18:04:54 -04:00
Bruce Momjian	9d4649ca49	Update copyright for 2018 Backpatch-through: certain files through 9.3	2018-01-02 23:30:12 -05:00

1 2 3 4 5 ...

317 Commits