1
0
mirror of https://github.com/postgres/postgres.git synced 2025-06-17 17:02:08 +03:00

Fix walsender failure at promotion.

If a standby server has a cascading standby server connected to it, it's
possible that WAL has already been sent up to the next WAL page boundary,
splitting a WAL record in the middle, when the first standby server is
promoted. Don't throw an assertion failure or error in walsender if that
happens.

Also, fix a variant of the same bug in pg_receivexlog: if it had already
received WAL on previous timeline up to a segment boundary, when the
upstream standby server is promoted so that the timeline switch record falls
on the previous segment, pg_receivexlog would miss the segment containing
the timeline switch. To fix that, have walsender send the position of the
timeline switch at end-of-streaming, in addition to the next timeline's ID.
It was previously assumed that the switch happened exactly where the
streaming stopped.

Note: this is an incompatible change in the streaming protocol. You might
get an error if you try to stream over timeline switches, if the client is
running 9.3beta1 and the server is more recent. It should be fine after a
reconnect, however.

Reported by Fujii Masao.
This commit is contained in:
Heikki Linnakangas
2013-05-08 20:10:17 +03:00
parent cb953d8b1b
commit 2ffa66f497
7 changed files with 142 additions and 43 deletions

View File

@ -1423,10 +1423,15 @@ The commands accepted in walsender mode are:
<para>
After streaming all the WAL on a timeline that is not the latest one,
the server will end streaming by exiting the COPY mode. When the client
acknowledges this by also exiting COPY mode, the server sends a
single-row, single-column result set indicating the next timeline in
this server's history. That is followed by a CommandComplete message,
and the server is ready to accept a new command.
acknowledges this by also exiting COPY mode, the server sends a result
set with one row and two columns, indicating the next timeline in this
server's history. The first column is the next timeline's ID, and the
second column is the XLOG position where the switch happened. Usually,
the switch position is the end of the WAL that was streamed, but there
are corner cases where the server can send some WAL from the old
timeline that it has not itself replayed before promoting. Finally, the
server sends CommandComplete message, and is ready to accept a new
command.
</para>
<para>