This is similar to 2065ddf5e34c, but this time for pg_logical/ itself
and its contents, like the paths for snapshots, mappings or origin
checkpoints.
Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Yugo Nagata, Michael Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
This commit replaces most of the hardcoded values of "pg_replslot" by a
new PG_REPLSLOT_DIR #define. This makes the style more consistent with
the existing PG_STAT_TMP_DIR, for example. More places will follow a
similar change.
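The pattern of the change looks like this (a minimal sketch; the call
site shown is illustrative):

    #define PG_REPLSLOT_DIR "pg_replslot"

    /* Before: */
    snprintf(path, sizeof(path), "pg_replslot/%s",
             NameStr(slot->data.name));

    /* After: */
    snprintf(path, sizeof(path), "%s/%s",
             PG_REPLSLOT_DIR, NameStr(slot->data.name));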
Author: Bertrand Drouvot
Reviewed-by: Ashutosh Bapat, Yugo Nagata, Michael Paquier
Discussion: https://postgr.es/m/ZryVvjqS9SnV1GPP@ip-10-97-1-34.eu-west-3.compute.internal
The conflict types 'update_differ' and 'delete_differ' indicate that a row
to be modified was previously altered by another origin. Rename those to
'update_origin_differs' and 'delete_origin_differs' to clarify their
meaning.
Author: Hou Zhijie
Reviewed-by: Shveta Malik, Peter Smith
Discussion: https://postgr.es/m/CAA4eK1+HEKwG_UYt4Zvwh5o_HoCKCjEGesRjJX38xAH3OxuuYA@mail.gmail.com
Commit 5bec1d6bc5e changed the memory usage updates of the
ReorderBufferTXN to zero all at once by subtracting txn->size, rather
than updating it for each change. However, if TOAST reconstruction
data remained in the transaction when freeing it, there were cases
where the memory counter was decremented below zero, resulting in
an assertion failure.
This change calculates the memory size for each change and updates the
memory usage to precisely the amount that has been freed.
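A minimal sketch of the idea, using the usual reorderbuffer helpers
(simplified; the real change may differ in detail):

    dlist_mutable_iter iter;

    /* Free each remaining TOAST chunk change individually so the memory
     * counter is decremented by exactly that change's size, instead of
     * relying on one bulk subtraction of txn->size. */
    dlist_foreach_modify(iter, &ent->chunks)
    {
        ReorderBufferChange *change =
            dlist_container(ReorderBufferChange, node, iter.cur);

        ReorderBufferReturnChange(rb, change, true);   /* upd_mem = true */
    }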
Backpatch to v17, where this was introduced.
Reviewed-by: Amit Kapila, Shlok Kyal
Discussion: https://postgr.es/m/CAD21AoAqkNUvicgKPT_dXzNoOwpPkVTg0QPPxEcWmzT0moCJ1g%40mail.gmail.com
Backpatch-through: 17
We advance origin progress during abort on successful streaming and
application of ROLLBACK in parallel streaming mode. But the origin
shouldn't be advanced during an error or unsuccessful apply due to
shutdown. Otherwise, it will result in a transaction loss as such a
transaction won't be sent again by the server.
Reported-by: Hou Zhijie
Author: Hayato Kuroda and Shveta Malik
Reviewed-by: Amit Kapila
Backpatch-through: 16
Discussion: https://postgr.es/m/TYAPR01MB5692FAC23BE40C69DA8ED4AFF5B92@TYAPR01MB5692.jpnprd01.prod.outlook.com
This patch provides additional logging information in the following
conflict scenarios while applying changes:
insert_exists: Inserting a row that violates a NOT DEFERRABLE unique constraint.
update_differ: Updating a row that was previously modified by another origin.
update_exists: The updated row value violates a NOT DEFERRABLE unique constraint.
update_missing: The tuple to be updated is missing.
delete_differ: Deleting a row that was previously modified by another origin.
delete_missing: The tuple to be deleted is missing.
For insert_exists and update_exists conflicts, the log can include the origin
and commit timestamp details of the conflicting key with track_commit_timestamp
enabled.
update_differ and delete_differ conflicts can only be detected when
track_commit_timestamp is enabled on the subscriber.
We do not offer additional logging for exclusion constraint violations because
these constraints can specify rules that are more complex than simple equality
checks. Resolving such conflicts won't be straightforward. This area can be
further enhanced if required.
Author: Hou Zhijie
Reviewed-by: Shveta Malik, Amit Kapila, Nisha Moond, Hayato Kuroda, Dilip Kumar
Discussion: https://postgr.es/m/OS0PR01MB5716352552DFADB8E9AD1D8994C92@OS0PR01MB5716.jpnprd01.prod.outlook.com
The apply worker was using XactLastCommitEnd as local end_lsn for applying
prepare and rollback_prepare. The XactLastCommitEnd value is the end LSN
of the last commit applied before the prepared transaction, which makes
no sense. This LSN is used to decide whether we can send the acknowledgment
of the corresponding remote LSN to the server.
It is okay not to set the local_end LSN with the actual WAL position for
the prepare because we always flush the prepare record. So, we can send
the acknowledgment of the remote_end LSN as soon as prepare is finished.
The current code is misleading but as such doesn't create any problem, so
we decided not to backpatch.
Author: Hayato Kuroda
Reviewed-by: Shveta Malik, Amit Kapila
Discussion: https://postgr.es/m/TYAPR01MB5692FA4926754B91E9D7B5F0F5AA2@TYAPR01MB5692.jpnprd01.prod.outlook.com
Before Bison 3.4, the generated parser implementation files run afoul
of -Wmissing-variable-declarations (in spite of commit ab61c40bfa2)
because declarations for yylval and possibly yylloc are missing. The
generated header files contain an extern declaration, but the
implementation files don't include the header files. Since Bison 3.4,
the generated implementation files automatically include the generated
header files, so then it works.
To make this work with older Bison versions as well, include the
generated header file from the .y file.
(With older Bison versions, the generated implementation file contains
effectively a copy of the header file pasted in, so including the
header file is redundant. But we know this works anyway because the
core grammar uses this arrangement already.)
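The arrangement looks roughly like this in a .y file (a sketch; the
header name here is hypothetical, and the prologue normally contains
other includes as well):

    %{
    #include "postgres.h"

    /* Include the Bison-generated header so that yylval (and yylloc,
     * if used) carry extern declarations even with Bison < 3.4. */
    #include "myparser.h"    /* hypothetical generated header name */
    %}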
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
This adds extern declarations for some global variables produced by
Bison that are not already declared in its generated header file.
This is a workaround to be able to add -Wmissing-variable-declarations
to the global set of warning options in the near future.
Another longer-term solution would be to convert these grammars to
"pure" parsers in Bison, to avoid global variables altogether. Note
that the core grammar is already pure, so this patch did not need to
touch it.
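The added declarations are essentially of this shape (a sketch; the
actual names carry each grammar's prefix):

    /* Bison's generated header declares yylval/yylloc, but not these
     * globals defined in the generated .c file; declare them to satisfy
     * -Wmissing-variable-declarations. */
    extern int yychar;
    extern int yynerrs;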
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org
This allows altering the 'two_phase' option of a subscription. The option
is controlled by both the publisher (as a slot option) and the subscriber
(as a subscription option), so when it is changed on the subscriber the
slot option must also be modified.
Changing the 'two_phase' option for a subscription from 'true' to 'false'
is permitted only when there are no pending prepared transactions
corresponding to that subscription. Otherwise, the changes of already
prepared transactions can be replicated again along with their corresponding
commit, leading to duplicate data or errors.
To avoid data loss, the 'two_phase' option for a subscription can only be
changed from 'false' to 'true' once the initial data synchronization is
completed. Therefore this is performed later by the logical replication worker.
Author: Hayato Kuroda, Ajin Cherian, Amit Kapila
Reviewed-by: Peter Smith, Hou Zhijie, Amit Kapila, Vitaly Davydov, Vignesh C
Discussion: https://postgr.es/m/8fab8-65d74c80-1-2f28e880@39088166
Commit f4b54e1ed9, which introduced macros for protocol characters,
missed updating a few places. It also did not introduce macros for
messages sent from parallel workers to their leader processes.
This commit adds a new section in protocol.h for those.
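A sketch of the style, following f4b54e1ed9 (the worker-to-leader name
shown is illustrative):

    /* Existing macros in protocol.h: */
    #define PqMsg_Query         'Q'
    #define PqMsg_Terminate     'X'

    /* New section (sketch): messages sent by parallel workers to their
     * leader processes. */
    #define PqMsg_Progress      'P'    /* illustrative */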
Author: Aleksander Alekseev
Discussion: https://postgr.es/m/CAJ7c6TNTd09AZq8tGaHS3LDyH_CCnpv0oOz2wN1dGe8zekxrdQ%40mail.gmail.com
Backpatch-through: 17
When creating and initializing a logical slot, the restart_lsn is set
to the latest WAL insertion point (or the latest replay point on
standbys). Subsequently, WAL records are decoded from that point to
find the start point for extracting changes in the
DecodingContextFindStartpoint() function. Since the initial
restart_lsn could be in the middle of a transaction, the start point
must be a consistent point where we won't see the data for partial
transactions.
Previously, when not building a full snapshot, serialized snapshots
were restored, and SnapBuild jumped to the consistent state even
while finding the start point. Consequently, the slot's restart_lsn
and confirmed_flush could be set to the middle of a transaction. This
could lead to various unexpected consequences. Specifically, there
were reports of logical decoding emitting partial transactions, and
of assertion failures caused by decoding only subtransactions, without
their top-level transaction, until the commit record was decoded.
To resolve this issue, the changes prevent restoring the serialized
snapshot and jumping to the consistent state while finding the start
point.
On v17 and HEAD, a flag indicating whether snapshot restores should be
skipped has been added to the SnapBuild struct, and SNAPBUILD_VERSION
has been bumped.
On backbranches, the flag is stored in the LogicalDecodingContext
instead, preserving on-disk compatibility.
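On v17 and HEAD the guard amounts to something like this (a sketch; the
flag name here is hypothetical):

    /* In the snapshot-restore path (sketch): while the slot is still
     * being created, don't restore a serialized snapshot, so SnapBuild
     * cannot jump to consistency in the middle of a transaction. */
    if (builder->in_slot_creation)     /* hypothetical SnapBuild flag */
        return false;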
Backpatch to all supported versions.
Reported-by: Drew Callahan
Reviewed-by: Amit Kapila, Hayato Kuroda
Discussion: https://postgr.es/m/2444AA15-D21B-4CCE-8052-52C7C2DAFE5C%40amazon.com
Backpatch-through: 12
Previously, the comment incorrectly stated that libpqrcv_check_conninfo()
returns true or false based on the connection string check.
However, this function actually has a void return type and
raises an error if the check fails.
Author: Rintaro Ikeda
Reviewed-by: Jelte Fennema-Nio, Fujii Masao
Discussion: https://postgr.es/m/6a1ca81b27fec4da0ccdfaaaec787982@oss.nttdata.com
Commit 0fdab27ad6 changed the code to wait for WAL to be available before
determining the timeline but forgot to move the failure check.
This change is to make the related code easier to understand and enhance;
otherwise, there is no bug in the current code.
In passing, improve the nearby comments to explain why we determine
am_cascading_walsender after waiting for the required WAL.
Author: Peter Smith
Reviewed-by: Bertrand Drouvot, Amit Kapila
Discussion: https://postgr.es/m/CAHut+PvqX49fusLyXspV1Mmd_EekPtXG0oT146vZjcb9XDvNgw@mail.gmail.com
All the errors triggered in the code paths patched here would cause the
backend to issue an internal_error errcode, which is a state that should
be used only for "can't happen" situations. However, these code paths
are reachable by the regression tests, and could be seen by users in
valid cases. Some regression tests expect internal errcodes as they
manipulate the backend state to cause corruption (like checksums), or
use elog() because it is more convenient (like injection points); these
have no need to change.
This reduces the number of internal failures triggered in a check-world
by more than half, while providing correct errcodes for these valid
cases.
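The typical change looks like this (a sketch; the message text and the
chosen errcode vary by call site):

    /* Before: elog() implies ERRCODE_INTERNAL_ERROR (XX000). */
    elog(ERROR, "could not find replication slot \"%s\"", name);

    /* After: a proper errcode for a condition users can reach. */
    ereport(ERROR,
            (errcode(ERRCODE_UNDEFINED_OBJECT),
             errmsg("replication slot \"%s\" does not exist", name)));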
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/Zic_GNgos5sMxKoa@paquier.xyz
After several refactoring iterations, auxiliary processes are no
longer initialized from the bootstrapper. Using the InitProcessing
mode for initializing auxiliary processes is more appropriate. Since
the global variable Mode is initialized to InitProcessing, we can just
remove the redundant calls of SetProcessingMode(InitProcessing).
Author: Xing Guo <higuoxing@gmail.com>
Discussion: https://www.postgresql.org/message-id/CACpMh%2BDBHVT4xPGimzvex%3DwMdMLQEu9PYhT%2BkwwD2x2nu9dU_Q%40mail.gmail.com
Up to now, committing a transaction has caused CurrentMemoryContext to
get set to TopMemoryContext. Most callers did not pay any particular
heed to this, which is problematic because TopMemoryContext is a
long-lived context that never gets reset. If the caller assumes it
can leak memory because it's running in a limited-lifespan context,
that behavior translates into a session-lifespan memory leak.
The first-reported instance of this involved ProcessIncomingNotify,
which is called from the main processing loop that normally runs in
MessageContext. That outer-loop code assumes that whatever it
allocates will be cleaned up when we're done processing the current
client message --- but if we service a notify interrupt, then whatever
gets allocated before the next switch to MessageContext will be
permanently leaked in TopMemoryContext. sinval catchup interrupts
have a similar problem, and I strongly suspect that some places in
logical replication do too.
To fix this in a generic way, let's redefine the behavior as
"CommitTransactionCommand restores the memory context that was current
at entry to StartTransactionCommand". This clearly fixes the issue
for the notify and sinval cases, and it seems to match the mental
model that's in use in the logical replication code, to the extent
that anybody thought about it there at all.
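In effect, callers can now rely on the following pattern (a sketch using
the xact.c entry points):

    MemoryContext oldcxt = CurrentMemoryContext;   /* e.g. MessageContext */

    StartTransactionCommand();
    /* ... allocations here go into transaction-lifetime contexts ... */
    CommitTransactionCommand();

    /* Previously CurrentMemoryContext would be TopMemoryContext here;
     * now it is restored to oldcxt. */
    Assert(CurrentMemoryContext == oldcxt);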
For consistency, likewise make subtransaction exit restore the context
that was current at subtransaction start (rather than always selecting
the CurTransactionContext of the parent transaction level). This case
has less risk of resulting in a permanent leak than the outer-level
behavior has, but it would not meet the principle of least surprise
for some CommitTransactionCommand calls to restore the previous
context while others don't.
While we're here, also change xact.c so that we reset
TopTransactionContext at transaction exit and then re-use it in later
transactions, rather than dropping and recreating it in each cycle.
This probably doesn't save a lot given the context recycling mechanism
in aset.c, but it should save a little bit. Per suggestion from David
Rowley. (Parenthetically, the text in src/backend/utils/mmgr/README
implies that this is how I'd planned to implement it as far back as
commit 1aebc3618 --- but the code actually added in that commit just
drops and recreates it each time.)
This commit also cleans up a few places outside xact.c that were
needlessly making TopMemoryContext current, and thus risking more
leaks of the same kind. I don't think any of them represent
significant leak risks today, but let's deal with them while the
issue is top-of-mind.
Per bug #18512 from wizardbrony. Commit to HEAD only, as this change
seems to have some risk of breaking things for some callers. We'll
apply a narrower fix for the known-broken cases in the back branches.
Discussion: https://postgr.es/m/3478884.1718656625@sss.pgh.pa.us
The standby_slot_names GUC allows the specification of physical standby
slots that must be synchronized before the logical walsenders associated
with logical failover slots. However, for this purpose, the GUC name is
too generic, so it is renamed to synchronized_standby_slots.
Author: Hou Zhijie
Reviewed-by: Bertrand Drouvot, Masahiko Sawada
Backpatch-through: 17
Discussion: https://postgr.es/m/ZnWeUgdHong93fQN@momjian.us
In pgoutput, when converting the child table's tuple format to match the
parent table's, we temporarily create a new slot to store the converted
tuple. However, we failed to drop these temporary slots, leading to a
resource leak.
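The shape of the fix is roughly this (a sketch based on pgoutput's
conversion path; simplified):

    TupleTableSlot *new_slot;

    new_slot = MakeTupleTableSlot(RelationGetDescr(relation),
                                  &TTSOpsVirtual);
    new_slot = execute_attr_map_slot(relentry->attrmap, old_slot, new_slot);

    /* ... send the converted tuple ... */

    /* The missing cleanup: drop the temporary slot once it is used. */
    ExecDropSingleTupleTableSlot(new_slot);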
Reported-by: Bowen Shi
Author: Hou Zhijie
Reviewed-by: Amit Kapila
Backpatch-through: 15
Discussion: https://postgr.es/m/CAM_vCudv8dc3sjWiPkXx5F2b27UV7_YRKRbtSCcE-pv=cVACGA@mail.gmail.com
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced during Postgres 17 development.
pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.
This commit was written with help from clang-tidy, by mechanically
applying the same rules as similar clean-up commits (the earliest such
commit was commit 035ce1fe).
Commit e0b2eed047 assumed that the confirmed_flush LSN can't go backward.
However, it is possible that confirmed_flush LSN can go backward
temporarily when the client acknowledges a prior value of flush location.
This can happen when the client (subscriber in this case) acknowledges an
LSN it doesn't have to do anything for (say for DDLs) and thus didn't
store persistently. After restart, the client sends the prior value of
flush LSN which it had stored persistently and the server updates the
confirmed_flush LSN with that value.
The fix is to remove the assumption and not allow the prior value of
confirmed_flush LSN to be flushed to the disk.
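A minimal sketch of the idea, assuming the slot's
last_saved_confirmed_flush value is available (simplified; the real fix
may differ in detail):

    /* When persisting the slot, skip writing a confirmed_flush that has
     * moved backward; keep the newer value already on disk. */
    if (slot->data.confirmed_flush < slot->last_saved_confirmed_flush)
        return;    /* don't flush the prior value (and don't assert) */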
Author: Vignesh C
Reviewed-by: Amit Kapila, Shlok Kyal
Discussion: https://postgr.es/m/CALDaNm3hgow2+oEov5jBk4iYP5eQrUCF1yZtW7+dV3J__p4KLQ@mail.gmail.com
After further review, we want to move in the direction of always
quoting GUC names in error messages, rather than the previous (PG16)
wildly mixed practice or the intermittent (mid-PG17) idea of doing
this depending on how possibly confusing the GUC name is.
This commit applies appropriate quotes to (almost?) all mentions of
GUC names in error messages. It partially supersedes a243569bf65 and
8d9978a7176, which had moved things a bit in the opposite direction
but which then were abandoned in a partial state.
Author: Peter Smith <smithpb2250@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/CAHut%2BPv-kSN8SkxSdoHano_wPubqcg5789ejhCDZAcLFceBR-w%40mail.gmail.com
Allow pg_sync_replication_slots() to error out during promotion of standby.
This makes the behavior of the SQL function consistent with the slot sync
worker. We also ensured that pg_sync_replication_slots() cannot be
executed if sync_replication_slots is enabled and the slotsync worker is
already running to perform the synchronization of slots. Previously, it
would have succeeded when the worker was idle and failed when it was
performing a sync, which could confuse users.
This patch fixes another issue in the slot sync worker where
SignalHandlerForShutdownRequest() needs to be registered *before* setting
SlotSyncCtx->pid; otherwise, the slotsync worker could miss handling a
SIGINT sent by the startup process (ShutDownSlotSync) before the worker
could register SignalHandlerForShutdownRequest(). To be consistent, the
registration of all signal handlers is moved to a point before we set
the worker's pid.
Ensure that we clean up synced temp slots at the end of
pg_sync_replication_slots() to avoid such slots being left over after
promotion.
Ensure that ShutDownSlotSync() captures SlotSyncCtx->pid under spinlock to
avoid accessing invalid value as it can be reset by concurrent slot sync
exit due to an error.
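The capture now looks like this (a sketch of ShutDownSlotSync()'s read;
simplified):

    pid_t   pid;

    SpinLockAcquire(&SlotSyncCtx->mutex);
    pid = SlotSyncCtx->pid;
    SpinLockRelease(&SlotSyncCtx->mutex);

    if (pid != InvalidPid)
        kill(pid, SIGINT);    /* not reading a value reset concurrently */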
Author: Shveta Malik
Reviewed-by: Hou Zhijie, Bertrand Drouvot, Amit Kapila, Masahiko Sawada
Discussion: https://postgr.es/m/CAJpy0uBefXUS_TSz=oxmYKHdg-fhxUT0qfjASW3nmqnzVC3p6A@mail.gmail.com
We missed performing table sync if an invalidation happened while the
non-ready tables list was being prepared. This occurred because the sync
state was set to valid at the end of the non-ready table list preparation,
irrespective of any invalidations processed while the list was being
prepared.
Fix it by changing the boolean variable to a tri-state enum and by setting
table state to valid only if no invalidations have occurred while the list
is being prepared.
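A sketch of the tri-state (names approximate; see tablesync.c for the
real definition):

    typedef enum
    {
        SYNC_TABLE_STATE_NEEDS_REBUILD,    /* list must be rebuilt */
        SYNC_TABLE_STATE_REBUILD_STARTED,  /* rebuild in progress */
        SYNC_TABLE_STATE_VALID,            /* list is up to date */
    } SyncingTablesState;

    /* An invalidation arriving mid-build moves the state back to
     * NEEDS_REBUILD; it is set to VALID only if the state is still
     * REBUILD_STARTED when the list is complete. */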
Reported-by: Alexander Lakhin
Diagnosed-by: Alexander Lakhin
Author: Vignesh C
Reviewed-by: Hou Zhijie, Alexander Lakhin, Ajin Cherian, Amit Kapila
Backpatch-through: 15
Discussion: https://postgr.es/m/711a6afe-edb7-1211-cc27-1bef8239eec7@gmail.com
This fixes various typos, duplicated words, and tiny bits of whitespace
mainly in code comments but also in docs.
Author: Daniel Gustafsson <daniel@yesql.se>
Author: Heikki Linnakangas <hlinnaka@iki.fi>
Author: Alexander Lakhin <exclusion@gmail.com>
Author: David Rowley <dgrowleyml@gmail.com>
Author: Nazir Bilal Yavuz <byavuz81@gmail.com>
Discussion: https://postgr.es/m/3F577953-A29E-4722-98AD-2DA9EFF2CBB8@yesql.se
Ensure that when updating the catalog_xmin of the synced slots, it is
first written to disk before changing the in-memory value
(effective_catalog_xmin). This is to prevent a scenario where the
in-memory value change triggers a vacuum to remove catalog tuples before
the catalog_xmin is written to disk. In the event of a crash before the
catalog_xmin is persisted, we would not know that some required catalog
tuples have been removed and the synced slot would be invalidated.
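The required ordering amounts to this (a sketch; the real code operates
on the acquired slot):

    slot->data.catalog_xmin = new_catalog_xmin;
    ReplicationSlotMarkDirty();
    ReplicationSlotSave();          /* write catalog_xmin to disk first */

    /* Only now let vacuum see the new horizon. */
    slot->effective_catalog_xmin = new_catalog_xmin;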
Change the sanity check to ensure that remote_slot's confirmed_flush LSN
can't precede the local/synced slot's during slot sync. Note that the
restart_lsn of the synced/local slot can be ahead of the remote_slot's. This
can happen when the slot advancing machinery finds a running xacts record
after reaching the consistent state at a later point than the primary,
where it serializes the snapshot and updates the restart_lsn.
Make the check to sync slots robust by allowing the sync only when the
confirmed_lsn, restart_lsn, or catalog_xmin of the remote slot is ahead of
the synced/local slot.
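As a sketch, the condition becomes (field names as in slotsync.c's
RemoteSlot; simplified):

    if (remote_slot->confirmed_lsn > slot->data.confirmed_flush ||
        remote_slot->restart_lsn > slot->data.restart_lsn ||
        TransactionIdFollows(remote_slot->catalog_xmin,
                             slot->data.catalog_xmin))
    {
        /* remote slot is ahead in at least one dimension: sync it */
    }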
Reported-by: Amit Kapila and Shveta Malik
Author: Hou Zhijie, Shveta Malik
Reviewed-by: Amit Kapila, Bertrand Drouvot
Discussion: https://postgr.es/m/OS0PR01MB57162B67D3CB01B2756FBA6D94062@OS0PR01MB5716.jpnprd01.prod.outlook.com
Discussion: https://postgr.es/m/CAJpy0uCSS5zmdyUXhvw41HSdTbRqX1hbYqkOfHNj7qQ+2zn0AQ@mail.gmail.com
This reverts commits b840508644 and bcb14f4abc. These commits were made
for commit 5bec1d6bc5 (Improve eviction algorithm in ReorderBuffer
using max-heap for many subtransactions). However, per discussion,
commit efb8acc0d0 replaced binary heap + index with pairing heap, and
made these commits unnecessary.
Reported-by: Jeff Davis
Discussion: https://postgr.es/m/12747c15811d94efcc5cda72d6b35c80d7bf3443.camel%40j-davis.com
A pairing heap can perform the same operations as the binary heap +
index, with as good or better algorithmic complexity, and it is an
existing data structure, so we don't need to invent anything new
compared to v16. This commit makes the new binaryheap functionality
that was added in commits b840508644 and bcb14f4abc unnecessary, but
they will be reverted separately.
Remove the optimization to only build and maintain the heap when the
amount of memory used is close to the limit, because the bookkeeping
overhead with the pairing heap seems to be small enough that it
doesn't matter in practice.
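The usage pattern with PostgreSQL's existing pairing heap
(lib/pairingheap.h) is roughly this (the comparator and member names
are illustrative):

    pairingheap *heap = pairingheap_allocate(txn_size_compare, NULL);

    pairingheap_add(heap, &txn->heap_node);     /* O(1) insert */
    node = pairingheap_first(heap);             /* O(1) peek at largest */
    pairingheap_remove(heap, &txn->heap_node);  /* remove arbitrary node */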
Reported-by: Jeff Davis
Author: Heikki Linnakangas
Reviewed-by: Michael Paquier, Hayato Kuroda, Masahiko Sawada
Discussion: https://postgr.es/m/12747c15811d94efcc5cda72d6b35c80d7bf3443.camel%40j-davis.com
This adjusts various appendStringInfo* function calls to use a more
appropriate and efficient function with the same behavior. For example,
use appendStringInfoChar() when appending a single character rather
than appendStringInfo(), and use appendStringInfoString() when no
formatting is required rather than appendStringInfo().
All adjustments made here are in code that's new to v17, so it makes
sense to fix these now rather than wait a few years and make
backpatching harder.
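For example (sketch):

    appendStringInfo(buf, "(");             /* before */
    appendStringInfoChar(buf, '(');         /* after */

    appendStringInfo(buf, "%s", label);     /* before */
    appendStringInfoString(buf, label);     /* after: no format parsing */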
Discussion: https://postgr.es/m/CAApHDvojY2UvMiO+9_55ArTj10P1LBNJyyoGB+C65BLDNT0GsQ@mail.gmail.com
Reviewed-by: Nathan Bossart, Tom Lane
This patch generalizes libpq's existing single-row mode to allow
individual partial-result PGresults to contain up to N rows, rather
than always one row. This reduces malloc overhead compared to plain
single-row mode, and it is very useful for psql's FETCH_COUNT feature,
since otherwise we'd have to add code (and cycles) to either merge
single-row PGresults into a bigger one or teach psql's
results-printing logic to accept arrays of PGresults.
To avoid API breakage, PQsetSingleRowMode() remains the same, and we
add a new function PQsetChunkedRowsMode() to invoke the more general
case. Also, PGresults obtained the old way continue to carry the
PGRES_SINGLE_TUPLE status code, while if PQsetChunkedRowsMode() is
used then their status code is PGRES_TUPLES_CHUNK. The underlying
logic is the same either way, though.
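Typical use looks like this (a sketch; the chunk size is illustrative):

    PGresult *res;

    if (PQsendQuery(conn, "SELECT * FROM big_table") == 0 ||
        PQsetChunkedRowsMode(conn, 1000) == 0)   /* up to 1000 rows each */
    {
        /* handle error */
    }

    while ((res = PQgetResult(conn)) != NULL)
    {
        if (PQresultStatus(res) == PGRES_TUPLES_CHUNK)
        {
            /* process PQntuples(res) rows; at most 1000 here */
        }
        PQclear(res);
    }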
Daniel Vérité, reviewed by Laurenz Albe and myself (and whacked
around a bit by me, so any remaining bugs are my fault)
Discussion: https://postgr.es/m/CAKZiRmxsVTkO928CM+-ADvsMyePmU3L9DQCa9NwqjvLPcEe5QA@mail.gmail.com
This commit does two things:
1) Maintains inactive_since for sync slots whenever the slot is released
just like any other regular slot.
2) Ensures the value is set to the current timestamp during the promotion
of standby to help correctly interpret the time after promotion. We don't
want the slots to appear inactive for a long time after promotion if they
haven't been synchronized recently. This would also avoid the invalidation
of such slots immediately after promotion if tomorrow we have a feature
that invalidates slots based on their inactivity time. Whoever acquires
the slot, i.e., makes the slot active, will reset it to NULL.
Author: Bharath Rupireddy
Reviewed-by: Bertrand Drouvot, Amit Kapila, Shveta Malik, Masahiko Sawada
Discussion: https://postgr.es/m/CAA4eK1KrPGwfZV9LYGidjxHeW+rxJ=E2ThjXvwRGLO=iLNuo=Q@mail.gmail.com
Discussion: https://postgr.es/m/CALj2ACW4aUe-_uFQOjdWCEN-xXoLGhmvRFnL8SNw_TZ5nJe+aw@mail.gmail.com
Discussion: https://postgr.es/m/CA+Tgmob_Ta-t2ty8QrKHBGnNLrf4ZYcwhGHGFsuUoFrAEDw4sA@mail.gmail.com
We were directly copying the LSN locations while syncing the slots on the
standby. Now, it is possible that at some particular restart_lsn there are
some running xacts, which means if we start reading the WAL from that
location after promotion, we won't reach a consistent snapshot state at
that point. However, on the primary, we would have already been in a
consistent snapshot state at that restart_lsn so we would have just
serialized the existing snapshot.
To avoid this problem we will use the advance_slot functionality unless
the snapshot already exists at the synced restart_lsn location. This will
help us to ensure that snapbuilder/slot statuses are updated properly
without generating any changes. Note that a synced slot will remain
RS_TEMPORARY until decoding from the corresponding restart_lsn can reach a
consistent snapshot state, after which it will be marked
RS_PERSISTENT.
Per buildfarm
Author: Hou Zhijie
Reviewed-by: Bertrand Drouvot, Shveta Malik, Bharath Rupireddy, Amit Kapila
Discussion: https://postgr.es/m/OS0PR01MB5716B3942AE49F3F725ACA92943B2@OS0PR01MB5716.jpnprd01.prod.outlook.com
Previously, when selecting the transaction to evict during logical
decoding, we checked all transactions to find the largest
one. This could lead to significant replication lag,
especially when there are many subtransactions.
This commit improves the eviction algorithm in ReorderBuffer using the
max-heap with transaction size as the key to efficiently find the
largest transaction.
The max-heap starts empty. While the max-heap is empty, we don't do
anything with it when updating the memory
counter. Therefore, we get the largest transaction in O(N) time, where
N is the number of transactions including top-level transactions and
subtransactions.
We build the max-heap just before selecting the largest transaction
if the number of transactions being decoded is higher than the
threshold, MAX_HEAP_TXN_COUNT_THRESHOLD. After building the max-heap,
we also update the max-heap when updating the memory counter. The
intention is to efficiently find the largest transaction in O(1) time,
at the cost of O(log N) memory counter updates. Once the number of
transactions falls below the threshold, we reset the max-heap.
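In outline, the bookkeeping looks like this (a sketch with illustrative
names; see reorderbuffer.c for the real logic):

    if (!rb->heap_built && rb->txn_count >= MAX_HEAP_TXN_COUNT_THRESHOLD)
    {
        build_max_heap(rb);            /* one-time O(N) build */
        rb->heap_built = true;
    }

    if (rb->heap_built)
    {
        /* Each later memory-counter update also adjusts the heap,
         * costing O(log N), in exchange for O(1) lookups here. */
        largest = max_heap_first(rb);
    }
    else
        largest = scan_all_txns(rb);   /* O(N); fine below the threshold */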
The performance benchmark results showed a significant speedup (more
than 30x on my machine) when decoding a transaction with 100k
subtransactions, with no visible overhead in other cases.
Reviewed-by: Amit Kapila, Hayato Kuroda, Vignesh C, Ajin Cherian,
Tomas Vondra, Shubham Khanna, Peter Smith, Álvaro Herrera,
Euler Taveira
Discussion: https://postgr.es/m/CAD21AoAfKTgrBrLq96GcTv9d6k97zaQcDM-rxfKEt4GSe0qnaQ%40mail.gmail.com
Previously, binaryheap didn't support updating a key and removing a
node in an efficient way. For example, in order to remove a node from
the binaryheap, the caller had to pass the node's position within the
array that the binaryheap internally has. Removing a node from the
binaryheap is done in O(log n) but searching for the key's position is
done in O(n).
This commit adds a hash table to binaryheap in order to track the
position of each node in the binaryheap. That way, by using newly
added functions such as binaryheap_update_up() etc., both updating a
key and removing a node can be done in O(1) on average and O(log n)
in the worst case. This is known as an indexed binary heap. The caller
can specify to use the indexed binaryheap by passing indexed = true.
The current code does not use the new indexing logic, but it will be
used by an upcoming patch.
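With the extended API, usage looks roughly like this (a sketch; the
exact function list is per commits b840508644 and bcb14f4abc):

    /* indexed = true makes the heap track each node's array position
     * in a hash table, keyed by the Datum value. */
    binaryheap *heap = binaryheap_allocate(capacity, compare_fn,
                                           true /* indexed */, NULL);

    binaryheap_add(heap, PointerGetDatum(txn));

    /* After txn's key increases, restore the heap property in O(log n): */
    binaryheap_update_up(heap, PointerGetDatum(txn));

    /* Remove an arbitrary node without an O(n) search: */
    binaryheap_remove_node_ptr(heap, PointerGetDatum(txn));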
Reviewed-by: Vignesh C, Peter Smith, Hayato Kuroda, Ajin Cherian,
Tomas Vondra, Shubham Khanna
Discussion: https://postgr.es/m/CAD21AoDffo37RC-eUuyHJKVEr017V2YYDLyn1xF_00ofptWbkg%40mail.gmail.com