postgres

mirror of https://github.com/postgres/postgres.git synced 2025-08-25 20:23:07 +03:00

Author	SHA1	Message	Date
Alvaro Herrera	515e3d84a0	Avoid creating archive status ".ready" files too early WAL records may span multiple segments, but XLogWrite() does not wait for the entire record to be written out to disk before creating archive status files. Instead, as soon as the last WAL page of the segment is written, the archive status file is created, and the archiver may process it. If PostgreSQL crashes before it is able to write and flush the rest of the record (in the next WAL segment), the wrong version of the first segment file lingers in the archive, which causes operations such as point-in-time restores to fail. To fix this, keep track of records that span across segments and ensure that segments are only marked ready-for-archival once such records have been completely written to disk. This has always been wrong, so backpatch all the way back. Author: Nathan Bossart <bossartn@amazon.com> Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: Ryo Matsumura <matsumura.ryo@fujitsu.com> Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru> Discussion: https://postgr.es/m/CBDDFA01-6E40-46BB-9F98-9340F4379505@amazon.com	2021-08-23 15:50:35 -04:00
Michael Paquier	a3fcbcda75	Fix backup manifests to generate correct WAL-Ranges across timelines In a backup manifest, WAL-Ranges stores the range of WAL that is required for the backup to be valid. pg_verifybackup would then internally use pg_waldump for the checks based on this data. When the timeline where the backup started was more than 1 with a history file looked at for the manifest data generation, the calculation of the WAL range for the first timeline to check was incorrect. The previous logic used as start LSN the start position of the first timeline, but it needs to use the start LSN of the backup. This would cause failures with pg_verifybackup, or any tools making use of the backup manifests. This commit adds a test based on a logic using a self-promoted node, making it rather cheap. Author: Kyotaro Horiguchi Discussion: https://postgr.es/m/20210818.143031.1867083699202617521.horikyota.ntt@gmail.com Backpatch-through: 13	2021-08-23 11:09:33 +09:00
Amit Kapila	4cd7a18968	Rename LOGICAL_REP_MSG_STREAM_END to LOGICAL_REP_MSG_STREAM_STOP. In the code, most places used the term "Stream Stop" for the logical stream message. This commit improves consistency by renaming LogicalRepMsgType "LOGICAL_REP_MSG_STREAM_END" to "LOGICAL_REP_MSG_STREAM_STOP". Author: Masahiko Sawada Reviewed-by: Hou Zhijie, Amit Kapila Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2021-08-19 09:34:26 +05:30
Michael Paquier	2576dcfb76	Revert refactoring of hex code to src/common/ This is a combined revert of the following commits: - `c3826f8`, a refactoring piece that moved the hex decoding code to src/common/. This code was cleaned up by `aef8948`, as it originally included no overflow checks in the same way as the base64 routines in src/common/ used by SCRAM, making it unsafe for its purpose. - `aef8948`, a more advanced refactoring of the hex encoding/decoding code to src/common/ that added sanity checks on the result buffer for hex decoding and encoding. As reported by Hans Buschmann, those overflow checks are expensive, and it is possible to see a performance drop in the decoding/encoding of bytea or LOs the longer they are. Simple SQLs working on large bytea values show a clear difference in perf profile. - `ccf4e27`, a cleanup made possible by `aef8948`. The reverts of all those commits bring back the performance of hex decoding and encoding back to what it was in ~13. Fow now and post-beta3, this is the simplest option. Reported-by: Hans Buschmann Discussion: https://postgr.es/m/1629039545467.80333@nidsa.net Backpatch-through: 14	2021-08-19 09:20:13 +09:00
Fujii Masao	93d573d865	Remove unused argument "txn" in maybe_send_schema(). Commit `464824323e` added the argument "txn" into maybe_send_schema() though it was not used. Author: Hou Zhijie Reviewed-by: Fujii Masao Discussion: https://postgr.es/m/OS0PR01MB5716146E43928FB92D45D29794EC9@OS0PR01MB5716.jpnprd01.prod.outlook.com	2021-08-05 17:49:51 +09:00
Amit Kapila	63cf61cdeb	Add prepare API support for streaming transactions in logical replication. Commit `a8fd13cab0` added support for prepared transactions to built-in logical replication via a new option "two_phase" for a subscription. The "two_phase" option was not allowed with the existing streaming option. This commit permits the combination of "streaming" and "two_phase" subscription options. It extends the pgoutput plugin and the subscriber side code to add the prepare API for streaming transactions which will apply the changes accumulated in the spool-file at prepare time. Author: Peter Smith and Ajin Cherian Reviewed-by: Vignesh C, Amit Kapila, Greg Nancarrow Tested-By: Haiying Tang Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru Discussion: https://postgr.es/m/CAMGcDxeqEpWj3fTXwqhSwBdXd2RS9jzwWscO-XbeCfso6ts3+Q@mail.gmail.com	2021-08-04 07:47:06 +05:30
Jeff Davis	14d474e079	Improve documentation for START_REPLICATION ... LOGICAL. The starting point may not be exactly what the client requested; it may be at the slot's confirmed_flush_lsn. Also, upgrade the message from DEBUG1 to LOG when this happens. Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/c5c861d576f2511732f8002c76245da587110b1c.camel%40j-davis.com	2021-07-30 15:12:20 -07:00
Amit Kapila	16bd4becee	Remove unused argument in apply_handle_commit_internal(). Oversight in commit `0926e96c49`. Author: Masahiko Sawada Reviewed-By: Amit Kapila Backpatch-through: 14, where it was introduced Discussion: https://postgr.es/m/CAD21AoDeScrsHhLyEPYqN3sydg6PxAPVBboK=30xJfUVihNZDA@mail.gmail.com	2021-07-30 08:17:38 +05:30
Amit Kapila	91f9861242	Refactor to make common functions in proto.c and worker.c. This is a non-functional change only to refactor code to extract some replication logic into static functions. This is done as preparation for the 2PC streaming patch which also shares this common logic. Author: Peter Smith Reviewed-By: Amit Kapila Discussion: https://postgr.es/m/CAHut+PuiSA8AiLcE2N5StzSKs46SQEP_vDOUD5fX2XCVtfZ7mQ@mail.gmail.com	2021-07-29 15:51:45 +05:30
Michael Paquier	7b7fbe1e8b	Clarify some comments making use of leetspeak term "up2date" Most of these are new, as of `a8fd13c`, and "up-to-date" is much easier to parse for the average reader. Author: Peter Smith Discussion: https://postgr.es/m/CAHut+PtHbHvgOjs_R9LyDF21j-Wn8SxoTtWMQNP2ifXN6t2cSg@mail.gmail.com	2021-07-28 10:31:24 +09:00
Amit Kapila	01c3adcdd8	Fix potential buffer overruns in proto.c. Prevent potential buffer overruns when using strcpy to gid buffer. This has been introduced by commit `a8fd13cab0`. Reported-by: Tom Lane as per coverity Author: Peter Smith Reviewed-by: Amit Kapila Discussion: https://www.postgresql.org/message-id/161029.1626639923%40sss.pgh.pa.us	2021-07-20 08:15:01 +05:30
Alvaro Herrera	ead9e51e82	Advance old-segment horizon properly after slot invalidation When some slots are invalidated due to the max_slot_wal_keep_size limit, the old segment horizon should move forward to stay within the limit. However, in commit `c655077639` we forgot to call KeepLogSeg again to recompute the horizon after invalidating replication slots. In cases where other slots remained, the limits would be recomputed eventually for other reasons, but if all slots were invalidated, the limits would not move at all afterwards. Repair. Backpatch to 13 where the feature was introduced. Author: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reported-by: Marcin Krupowicz <mk@071.ovh> Discussion: https://postgr.es/m/17103-004130e8f27782c9@postgresql.org	2021-07-16 12:07:30 -04:00
Amit Kapila	a8fd13cab0	Add support for prepared transactions to built-in logical replication. To add support for streaming transactions at prepare time into the built-in logical replication, we need to do the following things: * Modify the output plugin (pgoutput) to implement the new two-phase API callbacks, by leveraging the extended replication protocol. * Modify the replication apply worker, to properly handle two-phase transactions by replaying them on prepare. * Add a new SUBSCRIPTION option "two_phase" to allow users to enable two-phase transactions. We enable the two_phase once the initial data sync is over. We however must explicitly disable replication of two-phase transactions during replication slot creation, even if the plugin supports it. We don't need to replicate the changes accumulated during this phase, and moreover, we don't have a replication connection open so we don't know where to send the data anyway. The streaming option is not allowed with this new two_phase option. This can be done as a separate patch. We don't allow to toggle two_phase option of a subscription because it can lead to an inconsistent replica. For the same reason, we don't allow to refresh the publication once the two_phase is enabled for a subscription unless copy_data option is false. Author: Peter Smith, Ajin Cherian and Amit Kapila based on previous work by Nikhil Sontakke and Stas Kelvich Reviewed-by: Amit Kapila, Sawada Masahiko, Vignesh C, Dilip Kumar, Takamichi Osumi, Greg Nancarrow Tested-By: Haiying Tang Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru Discussion: https://postgr.es/m/CAA4eK1+opiV4aFTmWWUF9h_32=HfPOW9vZASHarT0UA5oBrtGw@mail.gmail.com	2021-07-14 07:33:50 +05:30
Jeff Davis	8e7811e952	Eliminate replication protocol error related to IDENTIFY_SYSTEM. The requirement that IDENTIFY_SYSTEM be run before START_REPLICATION was both undocumented and unnecessary. Remove the error and ensure that ThisTimeLineID is initialized in START_REPLICATION. Elect not to backport because this requirement was expected behavior (even if inconsistently enforced), and is not likely to cause any major problem. Author: Jeff Davis Reviewed-by: Andres Freund Discussion: https://postgr.es/m/de4bbf05b7cd94227841c433ea6ff71d2130c713.camel%40j-davis.com	2021-07-09 11:37:45 -07:00
Daniel Gustafsson	387925893e	Fix typos in pgstat.c, reorderbuffer.c and pathnodes.h Reviewed-by: Thomas Munro <thomas.munro@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/50250765-5B87-4AD7-9770-7FCED42A6175@yesql.se	2021-07-08 12:45:09 +02:00
Tom Lane	50371df266	Don't try to print data type names in slot_store_error_callback(). The existing code tried to do syscache lookups in an already-failed transaction, which is problematic to say the least. After some consideration of alternatives, the best fix seems to be to just drop type names from the error message altogether. The table and column names seem like sufficient localization. If the user is unsure what types are involved, she can check the local and remote table definitions. Having done that, we can also discard the LogicalRepTypMap hash table, which had no other use. Arguably, LOGICAL_REP_MSG_TYPE replication messages are now obsolete as well; but we should probably keep them in case some other use emerges. (The complexity of removing something from the replication protocol would likely outweigh any savings anyhow.) Masahiko Sawada and Bharath Rupireddy, per complaint from Andres Freund. Back-patch to v10 where this code originated. Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de	2021-07-02 16:04:54 -04:00
Amit Kapila	52d26d560e	Allow streaming the changes after speculative aborts. Until now, we didn't allow to stream the changes in logical replication till we receive speculative confirm or the next DML change record after speculative inserts. The reason was that we never use to process speculative aborts but after commit `4daa140a2f` it is possible to process them so we can allow streaming once we receive speculative abort after speculative insertion. We decided to backpatch to 14 where the feature for streaming in progress transactions have been introduced as this is a minor change and makes that functionality better. Author: Amit Kapila Reviewed-By: Dilip Kumar Backpatch-through: 14 Discussion: https://postgr.es/m/CAA4eK1KdqmTCtrBR6oFfGELrLLbDLDedL6zACcsUOQuTJBj1vw@mail.gmail.com	2021-06-30 09:37:59 +05:30
Amit Kapila	cda03cfed6	Allow enabling two-phase option via replication protocol. Extend the replication command CREATE_REPLICATION_SLOT to support the TWO_PHASE option. This will allow decoding commands like PREPARE TRANSACTION, COMMIT PREPARED and ROLLBACK PREPARED for slots created with this option. The decoding of the transaction happens at prepare command. This patch also adds support of two-phase in pg_recvlogical via a new option --two-phase. This option will also be used by future patches that allow streaming of transactions at prepare time for built-in logical replication. With this, the out-of-core logical replication solutions can enable replication of two-phase transactions via replication protocol. Author: Ajin Cherian Reviewed-By: Jeff Davis, Vignesh C, Amit Kapila Discussion: https://postgr.es/m/02DA5F5E-CECE-4D9C-8B4B-418077E2C010@postgrespro.ru https://postgr.es/m/64b9f783c6e125f18f88fbc0c0234e34e71d8639.camel@j-davis.com	2021-06-30 08:45:47 +05:30
Noah Misch	2b3e4672f7	Don't ERROR on PreallocXlogFiles() race condition. Before a restartpoint finishes PreallocXlogFiles(), a startup process KeepFileRestoredFromArchive() call can unlink the preallocated segment. If a CHECKPOINT sql command had elicited the restartpoint experiencing the race condition, that sql command failed. Moreover, the restartpoint omitted its log_checkpoints message and some inessential resource reclamation. Prevent the ERROR by skipping open() of the segment. Since these consequences are so minor, no back-patch. Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com	2021-06-28 18:34:56 -07:00
Noah Misch	421484f79c	Remove XLogFileInit() ability to unlink a pre-existing file. Only initdb used it. initdb refuses to operate on a non-empty directory and generally does not cope with pre-existing files of other kinds. Hence, use the opportunity to simplify. Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com	2021-06-28 18:34:56 -07:00
Noah Misch	c53c6b98d3	Remove XLogFileInit() ability to skip ControlFileLock. Cold paths, initdb and end-of-recovery, used it. Don't optimize them. Discussion: https://postgr.es/m/20210202151416.GB3304930@rfd.leadboat.com	2021-06-28 18:34:55 -07:00
Andrew Dunstan	e1c1c30f63	Pre branch pgindent / pgperltidy run Along the way make a slight adjustment to src/include/utils/queryjumble.h to avoid an unused typedef.	2021-06-28 11:05:54 -04:00
Peter Eisentraut	c302a61390	Error message refactoring Take some untranslatable things out of the message and replace by format placeholders, to reduce translatable strings and reduce translation mistakes.	2021-06-27 09:41:16 +02:00
Tom Lane	6b787d9e32	Improve SQLSTATE reporting in some replication-related code. I started out with the goal of reporting ERRCODE_CONNECTION_FAILURE when walrcv_connect() fails, but as I looked around I realized that whoever wrote this code was of the opinion that errcodes are purely optional. That's not my understanding of our project policy. Hence, make sure that an errcode is provided in each ereport that (a) is ERROR or higher level and (b) isn't arguably an internal logic error. Also fix some very dubious existing errcode assignments. While this is not per policy, it's also largely cosmetic, since few of these cases could get reported to applications. So I don't feel a need to back-patch. Discussion: https://postgr.es/m/2189704.1623512522@sss.pgh.pa.us	2021-06-16 11:52:05 -04:00
Amit Kapila	4daa140a2f	Fix decoding of speculative aborts. During decoding for speculative inserts, we were relying for cleaning toast hash on confirmation records or next change records. But that could lead to multiple problems (a) memory leak if there is neither a confirmation record nor any other record after toast insertion for a speculative insert in the transaction, (b) error and assertion failures if the next operation is not an insert/update on the same table. The fix is to start queuing spec abort change and clean up toast hash and change record during its processing. Currently, we are queuing the spec aborts for both toast and main table even though we perform cleanup while processing the main table's spec abort record. Later, if we have a way to distinguish between the spec abort record of toast and the main table, we can avoid queuing the change for spec aborts of toast tables. Reported-by: Ashutosh Bapat Author: Dilip Kumar Reviewed-by: Amit Kapila Backpatch-through: 9.6, where it was introduced Discussion: https://postgr.es/m/CAExHW5sPKF-Oovx_qZe4p5oM6Dvof7_P+XgsNAViug15Fm99jA@mail.gmail.com	2021-06-15 08:28:36 +05:30
Alvaro Herrera	33c5099567	Fix logic bug in `1632ea4368` I overlooked that one condition was logically inverted. The fix is a little bit more involved than simply negating the condition, to make the code easier to read. Fix some outdated comments left by the same commit, while at it. Author: Masahiko Sawada <sawada.mshk@gmail.com> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Reviewed-by: Amit Kapila <amit.kapila16@gmail.com> Discussion: https://postgr.es/m/YMRlmB3/lZw8YBH+@paquier.xyz	2021-06-14 16:31:12 -04:00
Tom Lane	fe6a20ce54	Don't use Asserts to check for violations of replication protocol. Using an Assert to check the validity of incoming messages is an extremely poor decision. In a debug build, it should not be that easy for a broken or malicious remote client to crash the logrep worker. The consequences could be even worse in non-debug builds, which will fail to make such checks at all, leading to who-knows-what misbehavior. Hence, promote every Assert that could possibly be triggered by wrong or out-of-order replication messages to a full test-and-ereport. To avoid bloating the set of messages the translation team has to cope with, establish a policy that replication protocol violation error reports don't need to be translated. Hence, all the new messages here use errmsg_internal(). A couple of old messages are changed likewise for consistency. Along the way, fix some non-idiomatic or outright wrong uses of hash_search(). Most of these mistakes are new with the "streaming replication" patch (commit `464824323`), but a couple go back a long way. Back-patch as appropriate. Discussion: https://postgr.es/m/1719083.1623351052@sss.pgh.pa.us	2021-06-12 12:59:15 -04:00
Tom Lane	ab55d742eb	Fix multiple crasher bugs in partitioned-table replication logic. apply_handle_tuple_routing(), having detected and reported that the tuple it needed to update didn't exist, tried to update that tuple anyway, leading to a null-pointer dereference. logicalrep_partition_open() failed to ensure that the LogicalRepPartMapEntry it built for a partition was fully independent of that for the partition root, leading to trouble if the root entry was later freed or rebuilt. Meanwhile, on the publisher's side, pgoutput_change() sometimes attempted to apply execute_attr_map_tuple() to a NULL tuple. The first of these was reported by Sergey Bernikov in bug #17055; I found the other two while developing some test cases for this sadly under-tested code. Diagnosis and patch for the first issue by Amit Langote; patches for the others by me; new test cases by me. Back-patch to v13 where this logic came in. Discussion: https://postgr.es/m/17055-9ba800ec8522668b@postgresql.org	2021-06-11 16:12:41 -04:00
Alvaro Herrera	1632ea4368	Return ReplicationSlotAcquire API to its original form Per 96540f80f833; the awkward API introduced by `c655077639` is no longer needed. Author: Andres Freund <andres@anarazel.de> Reviewed-by: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20210408020913.zzprrlvqyvlt5cyy@alap3.anarazel.de	2021-06-11 15:48:26 -04:00
Alvaro Herrera	96540f80f8	Fix race condition in invalidating obsolete replication slots The code added to mark replication slots invalid in commit `c655077639` had the race condition that a slot can be dropped or advanced concurrently with checkpointer trying to invalidate it. Rewrite the code to close those races. The changes to ReplicationSlotAcquire's API added with `c655077639` are not necessary anymore. To avoid an ABI break in released branches, this commit leaves that unchanged; it'll be changed in a master-only commit separately. Backpatch to 13, where this code first appeared. Reported-by: Andres Freund <andres@anarazel.de> Author: Andres Freund <andres@anarazel.de> Author: Álvaro Herrera <alvherre@alvh.no-ip.org> Discussion: https://postgr.es/m/20210408001037.wfmk6jud36auhfqm@alap3.anarazel.de	2021-06-11 12:16:14 -04:00
Tom Lane	3a09d75b4f	Rearrange logrep worker's snapshot handling some more. It turns out that worker.c's code path for TRUNCATE was also careless about establishing a snapshot while executing user-defined code, allowing the checks added by commit `84f5c2908` to fail when a trigger is fired in that context. We could just wrap Push/PopActiveSnapshot around the truncate call, but it seems better to establish a policy of holding a snapshot throughout execution of a replication step. To help with that and possible future requirements, replace the previous ensure_transaction calls with pairs of begin/end_replication_step calls. Per report from Mark Dilger. Back-patch to v11, like the previous changes. Discussion: https://postgr.es/m/B4A3AF82-79ED-4F4C-A4E5-CD2622098972@enterprisedb.com	2021-06-10 12:27:27 -04:00
Peter Eisentraut	b29fa951ec	Add some const decorations One of these functions is new in PostgreSQL 14; might as well start it out right.	2021-06-10 16:21:48 +02:00
Amit Kapila	be57f21650	Remove two_phase variable from CreateReplicationSlotCmd struct. Commit `19890a064e` added the option to enable two_phase commits via pg_create_logical_replication_slot but didn't extend the support of same in replication protocol. However, by mistake, it added the two_phase variable in CreateReplicationSlotCmd which is required only when we extend the replication protocol. Reported-by: Jeff Davis Author: Ajin Cherian Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/64b9f783c6e125f18f88fbc0c0234e34e71d8639.camel@j-davis.com	2021-06-07 09:32:06 +05:30
Amit Kapila	eb89cb43a0	pgoutput: Fix memory leak due to RelationSyncEntry.map. Release memory allocated when creating the tuple-conversion map and its component TupleDescs when its owning sync entry is invalidated. TupleDescs must also be freed when no map is deemed necessary, to begin with. Reported-by: Andres Freund Author: Amit Langote Reviewed-by: Takamichi Osumi, Amit Kapila Backpatch-through: 13, where it was introduced Discussion: https://postgr.es/m/MEYP282MB166933B1AB02B4FE56E82453B64D9@MEYP282MB1669.AUSP282.PROD.OUTLOOK.COM	2021-06-01 14:27:14 +05:30
Amit Kapila	6f4bdf8152	Fix assertion during streaming of multi-insert toast changes. While decoding the multi-insert WAL we can't clean the toast untill we get the last insert of that WAL record. Now if we stream the changes before we get the last change, the memory for toast chunks won't be released and we expect the txn to have streamed all changes after streaming. This restriction is mainly to ensure the correctness of streamed transactions and it doesn't seem worth uplifting such a restriction just to allow this case because anyway we will stream the transaction once such an insert is complete. Previously we were using two different flags (one for toast tuples and another for speculative inserts) to indicate partial changes. Now instead we replaced both of them with a single flag to indicate partial changes. Reported-by: Pavan Deolasee Author: Dilip Kumar Reviewed-by: Pavan Deolasee, Amit Kapila Discussion: https://postgr.es/m/CABOikdN-_858zojYN-2tNcHiVTw-nhxPwoQS4quExeweQfG1Ug@mail.gmail.com	2021-05-27 07:59:43 +05:30
Tom Lane	b39630fd41	Fix access to no-longer-open relcache entry in logical-rep worker. If we redirected a replicated tuple operation into a partition child table, and then tried to fire AFTER triggers for that event, the relation cache entry for the child table was already closed. This has no visible ill effects as long as the entry is still there and still valid, but an unluckily-timed cache flush could result in a crash or other misbehavior. To fix, postpone the ExecCleanupTupleRouting call (which is what closes the child table) until after we've fired triggers. This requires a bit of refactoring so that the cleanup function can have access to the necessary state. In HEAD, I took the opportunity to simplify some of worker.c's function APIs based on use of the new ApplyExecutionData struct. However, it doesn't seem safe/practical to back-patch that aspect, at least not without a lot of analysis of possible interactions with `a04daa97a`. In passing, add an Assert to afterTriggerInvokeEvents to catch such cases. This seems worthwhile because we've grown a number of fairly unstructured ways of calling AfterTriggerEndQuery. Back-patch to v13, where worker.c grew the ability to deal with partitioned target tables. Discussion: https://postgr.es/m/3382681.1621381328@sss.pgh.pa.us	2021-05-22 21:25:29 -04:00
Tom Lane	84f5c2908d	Restore the portal-level snapshot after procedure COMMIT/ROLLBACK. COMMIT/ROLLBACK necessarily destroys all snapshots within the session. The original implementation of intra-procedure transactions just cavalierly did that, ignoring the fact that this left us executing in a rather different environment than normal. In particular, it turns out that handling of toasted datums depends rather critically on there being an outer ActiveSnapshot: otherwise, when SPI or the core executor pop whatever snapshot they used and return, it's unsafe to dereference any toasted datums that may appear in the query result. It's possible to demonstrate "no known snapshots" and "missing chunk number N for toast value" errors as a result of this oversight. Historically this outer snapshot has been held by the Portal code, and that seems like a good plan to preserve. So add infrastructure to pquery.c to allow re-establishing the Portal-owned snapshot if it's not there anymore, and add enough bookkeeping support that we can tell whether it is or not. We can't, however, just re-establish the Portal snapshot as part of COMMIT/ROLLBACK. As in normal transaction start, acquiring the first snapshot should wait until after SET and LOCK commands. Hence, teach spi.c about doing this at the right time. (Note that this patch doesn't fix the problem for any PLs that try to run intra-procedure transactions without using SPI to execute SQL commands.) This makes SPI's no_snapshots parameter rather a misnomer, so in HEAD, rename that to allow_nonatomic. replication/logical/worker.c also needs some fixes, because it wasn't careful to hold a snapshot open around AFTER trigger execution. That code doesn't use a Portal, which I suspect someday we're gonna have to fix. But for now, just rearrange the order of operations. This includes back-patching the recent addition of finish_estate() to centralize the cleanup logic there. This also back-patches commit `2ecfeda3e` into v13, to improve the test coverage for worker.c (it was that test that exposed that worker.c's snapshot management is wrong). Per bug #15990 from Andreas Wicht. Back-patch to v11 where intra-procedure COMMIT was added. Discussion: https://postgr.es/m/15990-eee2ac466b11293d@postgresql.org	2021-05-21 14:03:59 -04:00
Amit Kapila	6d0eb38557	Fix deadlock for multiple replicating truncates of the same table. While applying the truncate change, the logical apply worker acquires RowExclusiveLock on the relation being truncated. This allowed truncate on the relation at a time by two apply workers which lead to a deadlock. The reason was that one of the workers after updating the pg_class tuple tries to acquire SHARE lock on the relation and started to wait for the second worker which has acquired RowExclusiveLock on the relation. And when the second worker tries to update the pg_class tuple, it starts to wait for the first worker which leads to a deadlock. Fix it by acquiring AccessExclusiveLock on the relation before applying the truncate change as we do for normal truncate operation. Author: Peter Smith, test case by Haiying Tang Reviewed-by: Dilip Kumar, Amit Kapila Backpatch-through: 11 Discussion: https://postgr.es/m/CAHut+PsNm43p0jM+idTvWwiGZPcP0hGrHMPK9TOAkc+a4UpUqw@mail.gmail.com	2021-05-21 07:54:27 +05:30
Alvaro Herrera	db16c65647	Rename the logical replication global "wrconn" The worker.c global wrconn is only meant to be used by logical apply/ tablesync workers, but there are other variables with the same name. To reduce future confusion rename the global from "wrconn" to "LogRepWorkerWalRcvConn". While this is just cosmetic, it seems better to backpatch it all the way back to 10 where this code appeared, to avoid future backpatching issues. Author: Peter Smith <smithpb2250@gmail.com> Discussion: https://postgr.es/m/CAHut+Pu7Jv9L2BOEx_Z0UtJxfDevQSAUW2mJqWU+CtmDrEZVAg@mail.gmail.com	2021-05-12 19:13:54 -04:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Thomas Munro	c2dc19342e	Revert recovery prefetching feature. This set of commits has some bugs with known fixes, but at this late stage in the release cycle it seems best to revert and resubmit next time, along with some new automated test coverage for this whole area. Commits reverted: `dc88460c`: Doc: Review for "Optionally prefetch referenced data in recovery." `1d257577`: Optionally prefetch referenced data in recovery. `f003d9f8`: Add circular WAL decoding buffer. `323cbe7c`: Remove read_page callback from XLogReader. Remove the new GUC group WAL_RECOVERY recently added by `a55a9847`, as the corresponding section of config.sgml is now reverted. Discussion: https://postgr.es/m/CAOuzzgrn7iKnFRsB4MHp3UisEQAGgZMbk_ViTN4HV4-Ksq8zCg%40mail.gmail.com	2021-05-10 16:06:09 +12:00
Amit Kapila	592f00f8de	Update replication statistics after every stream/spill. Currently, replication slot statistics are updated at prepare, commit, and rollback. Now, if the transaction is interrupted the stats might not get updated. Fixed this by updating replication statistics after every stream/spill. In passing update the docs to change the description of some of the slot stats. Author: Vignesh C, Sawada Masahiko Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-05-06 11:21:26 +05:30
Amit Kapila	2ce353fc19	Tighten the concurrent abort check during decoding. During decoding of an in-progress or prepared transaction, we detect concurrent abort with an error code ERRCODE_TRANSACTION_ROLLBACK. That is not sufficient because a callback can decide to throw that error code at other times as well. Reported-by: Tom Lane Author: Amit Kapila Reviewed-by: Dilip Kumar Discussion: https://postgr.es/m/CAA4eK1KCjPRS4aZHB48QMM4J8XOC1+TD8jo-4Yu84E+MjwqVhA@mail.gmail.com	2021-05-06 08:26:42 +05:30
Amit Kapila	205f466282	Fix the computation of slot stats for 'total_bytes'. Previously, we were using the size of all the changes present in ReorderBuffer to compute total_bytes after decoding a transaction and that can lead to counting some of the transactions' changes more than once. Fix it by using the size of the changes decoded for a transaction to compute 'total_bytes'. Author: Sawada Masahiko Reviewed-by: Vignesh C, Amit Kapila Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-05-03 07:22:08 +05:30
Amit Kapila	ee4ba01dbb	Fix the bugs in selecting the transaction for streaming. There were two problems: a. We were always selecting the next available txn instead of selecting it when it is larger than the previous transaction. b. We were selecting the transactions which haven't made any changes to the database (base snapshot is not set). Later it was hitting an Assert because we don't decode such transactions and the changes in txn remain as it is. It is better not to choose such transactions for streaming in the first place. Reported-by: Haiying Tang Author: Dilip Kumar Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/OS0PR01MB61133B94E63177040F7ECDA1FB429@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-04-30 10:49:52 +05:30
Tom Lane	9626325da5	Add heuristic incoming-message-size limits in the server. We had a report of confusing server behavior caused by a client bug that sent junk to the server: the server thought the junk was a very long message length and waited patiently for data that would never come. We can reduce the risk of that by being less trusting about message lengths. For a long time, libpq has had a heuristic rule that it wouldn't believe large message size words, except for a small number of message types that are expected to be (potentially) long. This provides some defense against loss of message-boundary sync and other corrupted-data cases. The server does something similar, except that up to now it only limited the lengths of messages received during the connection authentication phase. Let's do the same as in libpq and put restrictions on the allowed length of all messages, while distinguishing between message types that are expected to be long and those that aren't. I used a limit of 10000 bytes for non-long messages. (libpq's corresponding limit is 30000 bytes, but given the asymmetry of the FE/BE protocol, there's no good reason why the numbers should be the same.) Experimentation suggests that this is at least a factor of 10, maybe a factor of 100, more than we really need; but plenty of daylight seems desirable to avoid false positives. In any case we can adjust the limit based on beta-test results. For long messages, set a limit of MaxAllocSize - 1, which is the most that we can absorb into the StringInfo buffer that the message is collected in. This just serves to make sure that a bogus message size is reported as such, rather than as a confusing gripe about not being able to enlarge a string buffer. While at it, make sure that non-mainline code paths (such as COPY FROM STDIN) are as paranoid as SocketBackend is, and validate the message type code before believing the message length. This provides an additional guard against getting stuck on corrupted input. Discussion: https://postgr.es/m/2003757.1619373089@sss.pgh.pa.us	2021-04-28 15:50:46 -04:00
Fujii Masao	8e9ea08bae	Don't pass "ONLY" options specified in TRUNCATE to foreign data wrapper. Commit `8ff1c94649` allowed TRUNCATE command to truncate foreign tables. Previously the information about "ONLY" options specified in TRUNCATE command were passed to the foreign data wrapper. Then postgres_fdw constructed the TRUNCATE command to issue the remote server and included "ONLY" options in it based on the passed information. On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE have no effect when accessing or modifying the remote table, i.e., are not passed to the foreign data wrapper. So it's inconsistent to make only TRUNCATE command pass the "ONLY" options to the foreign data wrapper. Therefore this commit changes the TRUNCATE command so that it doesn't pass the "ONLY" options to the foreign data wrapper, for the consistency with other statements. Also this commit changes postgres_fdw so that it always doesn't include "ONLY" options in the TRUNCATE command that it constructs. Author: Fujii Masao Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com	2021-04-27 14:41:27 +09:00
Amit Kapila	3fa17d3771	Use HTAB for replication slot statistics. Previously, we used to use the array of size max_replication_slots to store stats for replication slots. But that had two problems in the cases where a message for dropping a slot gets lost: 1) the stats for the new slot are not recorded if the array is full and 2) writing beyond the end of the array if the user reduces the max_replication_slots. This commit uses HTAB for replication slot statistics, resolving both problems. Now, pgstat_vacuum_stat() search for all the dead replication slots in stats hashtable and tell the collector to remove them. To avoid showing the stats for the already-dropped slots, pg_stat_replication_slots view searches slot stats by the slot name taken from pg_replication_slots. Also, we send a message for creating a slot at slot creation, initializing the stats. This reduces the possibility that the stats are accumulated into the old slot stats when a message for dropping a slot gets lost. Reported-by: Andres Freund Author: Sawada Masahiko, test case by Vignesh C Reviewed-by: Amit Kapila, Vignesh C, Dilip Kumar Discussion: https://postgr.es/m/20210319185247.ldebgpdaxsowiflw@alap3.anarazel.de	2021-04-27 09:09:11 +05:30
Amit Kapila	e7eea52b2d	Fix Logical Replication of Truncate in synchronous commit mode. The Truncate operation acquires an exclusive lock on the target relation and indexes. It then waits for logical replication of the operation to finish at commit. Now because we are acquiring the shared lock on the target index to get index attributes in pgoutput while sending the changes for the Truncate operation, it leads to a deadlock. Actually, we don't need to acquire a lock on the target index as we build the cache entry using a historic snapshot and all the later changes are absorbed while decoding WAL. So, we wrote a special purpose function for logical replication to get a bitmap of replica identity attribute numbers where we get that information without locking the target index. We decided not to backpatch this as there doesn't seem to be any field complaint about this issue since it was introduced in commit `5dfd1e5a` in v11. Reported-by: Haiying Tang Author: Takamichi Osumi, test case by Li Japin Reviewed-by: Amit Kapila, Ajin Cherian Discussion: https://postgr.es/m/OS0PR01MB6113C2499C7DC70EE55ADB82FB759@OS0PR01MB6113.jpnprd01.prod.outlook.com	2021-04-27 08:28:26 +05:30
Amit Kapila	f25a4584c6	Avoid sending prepare multiple times while decoding. We send the prepare for the concurrently aborted xacts so that later when rollback prepared is decoded and sent, the downstream should be able to rollback such a xact. For 'streaming' case (when we send changes for in-progress transactions), we were sending prepare twice when concurrent abort was detected. Author: Peter Smith Reviewed-by: Amit Kapila Discussion: https://postgr.es/m/f82133c6-6055-b400-7922-97dae9f2b50b@enterprisedb.com	2021-04-26 11:27:44 +05:30

1 2 3 4 5 ...

1127 Commits