1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-15 19:21:59 +03:00

Revert pg_wal_replay_wait() stored procedure

This commit reverts 3c5db1d6b0, and subsequent improvements and fixes
including 8036d73ae3, 867d396ccd, 3ac3ec580c, 0868d7ae70, 85b98b8d5a,
2520226c95, 014f9f34d2, e658038772, e1555645d7, 5035172e4a, 6cfebfe88b,
73da6b8d1b, and e546989a26.

The reason for reverting is a set of remaining issues.  Most notably, the
stored procedure appears to need more effort than the utility statement
to turn the backend into a "snapshot-less" state.  This makes an approach
to use stored procedures questionable.

Catversion is bumped.

Discussion: https://postgr.es/m/Zyhj2anOPRKtb0xW%40paquier.xyz
This commit is contained in:
Alexander Korotkov
2024-11-04 22:43:08 +02:00
parent 3293b718a0
commit 3a7ae6b3d9
22 changed files with 9 additions and 1020 deletions

View File

@ -29000,176 +29000,6 @@ postgres=# SELECT '0/0'::pg_lsn + pd.segment_number * ps.setting::int + :offset
the pause, the rate of WAL generation and available disk space.
</para>
<para>
The procedure shown in <xref linkend="recovery-synchronization-procedure-table"/>
can be executed only during recovery.
</para>
<table id="recovery-synchronization-procedure-table">
<title>Recovery Synchronization Procedure and Function</title>
<tgroup cols="1">
<thead>
<row>
<entry role="func_table_entry"><para role="func_signature">
Procedure or Function
</para>
<para>
Type
</para>
<para>
Description
</para></entry>
</row>
</thead>
<tbody>
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
<primary>pg_wal_replay_wait</primary>
</indexterm>
<function>pg_wal_replay_wait</function> (
<parameter>target_lsn</parameter> <type>pg_lsn</type>,
<parameter>timeout</parameter> <type>bigint</type> <literal>DEFAULT</literal> <literal>0</literal>,
<parameter>no_error</parameter> <type>bool</type> <literal>DEFAULT</literal> <literal>false</literal>)
</para>
<para>
Procedure
</para>
<para>
Waits until recovery replays <literal>target_lsn</literal>.
If no <parameter>timeout</parameter> is specified or it is set to
zero, this procedure waits indefinitely for the
<literal>target_lsn</literal>. If the <parameter>timeout</parameter>
is specified (in milliseconds) and is greater than zero, the
procedure waits until <literal>target_lsn</literal> is reached or
the specified <parameter>timeout</parameter> has elapsed.
On timeout, or if the server is promoted before
<literal>target_lsn</literal> is reached, an error is emitted,
as soon as <parameter>no_error</parameter> is false.
If <parameter>no_error</parameter> is set to true, then the procedure
doesn't throw errors. The last result status could be read
with <function>pg_wal_replay_wait_status</function>.
</para></entry>
</row>
<row>
<entry role="func_table_entry"><para role="func_signature">
<indexterm>
<primary>pg_wal_replay_wait_status</primary>
</indexterm>
<function>pg_wal_replay_wait_status</function> ()
<returnvalue>text</returnvalue>
</para>
<para>
Function
</para>
<para>
Returns the last result status for
<function>pg_wal_replay_wait</function> procedure. The possible
values are <literal>success</literal>, <literal>timeout</literal>,
and <literal>not in recovery</literal>.
</para></entry>
</row>
</tbody>
</tgroup>
</table>
<para>
<function>pg_wal_replay_wait</function> waits till
<parameter>target_lsn</parameter> to be replayed on standby.
That is, after this function execution, the value returned by
<function>pg_last_wal_replay_lsn</function> should be greater or equal
to the <parameter>target_lsn</parameter> value. This is useful to achieve
read-your-writes-consistency, while using async replica for reads and
primary for writes. In that case <acronym>lsn</acronym> of the last
modification should be stored on the client application side or the
connection pooler side.
</para>
<para>
<function>pg_wal_replay_wait</function> should be called on standby.
If a user calls <function>pg_wal_replay_wait</function> on primary, it
will error out as soon as <parameter>no_error</parameter> is false.
However, if <function>pg_wal_replay_wait</function> is
called on primary promoted from standby and <literal>target_lsn</literal>
was already replayed, then <function>pg_wal_replay_wait</function> just
exits immediately.
</para>
<para>
You can use <function>pg_wal_replay_wait</function> to wait for
the <type>pg_lsn</type> value. For example, an application could update
the <literal>movie</literal> table and get the <acronym>lsn</acronym> after
changes just made. This example uses <function>pg_current_wal_insert_lsn</function>
on primary server to get the <acronym>lsn</acronym> given that
<varname>synchronous_commit</varname> could be set to
<literal>off</literal>.
<programlisting>
postgres=# UPDATE movie SET genre = 'Dramatic' WHERE genre = 'Drama';
UPDATE 100
postgres=# SELECT pg_current_wal_insert_lsn();
pg_current_wal_insert_lsn
--------------------
0/306EE20
(1 row)
</programlisting>
Then an application could run <function>pg_wal_replay_wait</function>
with the <acronym>lsn</acronym> obtained from primary. After that the
changes made on primary should be guaranteed to be visible on replica.
<programlisting>
postgres=# CALL pg_wal_replay_wait('0/306EE20');
CALL
postgres=# SELECT * FROM movie WHERE genre = 'Drama';
genre
-------
(0 rows)
</programlisting>
It may also happen that target <acronym>lsn</acronym> is not reached
within the timeout. In that case the error is thrown.
<programlisting>
postgres=# CALL pg_wal_replay_wait('0/306EE20', 100);
ERROR: timed out while waiting for target LSN 0/306EE20 to be replayed; current replay LSN 0/306EA60
</programlisting>
The same example uses <function>pg_wal_replay_wait</function> with
<parameter>no_error</parameter> set to true. In this case, the result
status must be read with <function>pg_wal_replay_wait_status</function>.
<programlisting>
postgres=# CALL pg_wal_replay_wait('0/306EE20', 100, true);
CALL
postgres=# SELECT pg_wal_replay_wait_status();
pg_wal_replay_wait_status
---------------------------
timeout
(1 row)
</programlisting>
</para>
<para>
<function>pg_wal_replay_wait</function> can't be used within
a transaction with an isolation level higher than
<literal>READ COMMITTED</literal>, another procedure, or a function.
All the cases above imply holding a snapshot, which could prevent
WAL records from replaying (see <xref linkend="hot-standby-conflict"/>)
and cause an indirect deadlock.
<programlisting>
postgres=# BEGIN;
BEGIN
postgres=*# CALL pg_wal_replay_wait('0/306EE20');
ERROR: pg_wal_replay_wait() must be only called without an active or registered snapshot
DETAIL: Make sure pg_wal_replay_wait() isn't called within a transaction with an isolation level higher than READ COMMITTED, another procedure, or a function.
</programlisting>
</para>
</sect2>
<sect2 id="functions-snapshot-synchronization">