1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Introduce Streaming Replication.

This includes two new kinds of postmaster processes, walsenders and
walreceiver. Walreceiver is responsible for connecting to the primary server
and streaming WAL to disk, while walsender runs in the primary server and
streams WAL from disk to the client.

Documentation still needs work, but the basics are there. We will probably
pull the replication section to a new chapter later on, as well as the
sections describing file-based replication. But let's do that as a separate
patch, so that it's easier to see what has been added/changed. This patch
also adds a new section to the chapter about FE/BE protocol, documenting the
protocol used by walsender/walreceivxer.

Bump catalog version because of two new functions,
pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for
monitoring the progress of replication.

Fujii Masao, with additional hacking by me
This commit is contained in:
Heikki Linnakangas
2010-01-15 09:19:10 +00:00
parent 4cbe473938
commit 40f908bdcd
53 changed files with 3567 additions and 220 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.135 2010/01/12 02:33:45 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/backup.sgml,v 2.136 2010/01/15 09:18:56 heikki Exp $ -->
<chapter id="backup">
<title>Backup and Restore</title>
@ -1088,6 +1088,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
file, and apply only for the duration of the recovery. They must be
reset for any subsequent recovery you wish to perform. They cannot be
changed once recovery has begun.
The parameters for streaming replication are described in <xref
linkend="replication-config-settings">.
</para>
<variablelist>
@ -1097,7 +1099,8 @@ restore_command = 'cp /mnt/server/archivedir/%f %p'
<listitem>
<para>
The shell command to execute to retrieve an archived segment of
the WAL file series. This parameter is required.
the WAL file series. This parameter is required for archive recovery,
but optional for streaming replication.
Any <literal>%f</> in the string is
replaced by the name of the file to retrieve from the archive,
and any <literal>%p</> is replaced by the path name to copy
@ -1843,6 +1846,227 @@ if (!triggered)
cooperation of the <varname>restore_command</> script with the data
copying program.
</para>
<para>
Starting with <productname>PostgreSQL</> version 8.5, you can use
streaming replication (see <xref linkend="streaming-replication">) to
achieve the same with less effort.
</para>
</sect2>
<sect2 id="streaming-replication">
<title>Streaming Replication</title>
<para>
<productname>PostgreSQL</> includes a simple streaming replication
mechanism, which lets the standby server to stay more up-to-date than
file-based replication allows. The standby connects to the primary
and the primary starts streaming WAL records from where the standby
left off, and continues streaming them as they are generated, without
waiting for the WAL file to be filled. So with streaming replication,
<varname>archive_timeout</> does not need to be configured.
</para>
<para>
Streaming replication relies on file-based continuous archiving for
making the base backup and for allowing a standby to catch up if it's
disconnected from the primary for long enough for the primary to
delete old WAL files still required by the standby.
</para>
<sect3 id="streaming-replication-setup">
<title>Setup</title>
<para>
The short procedure for configuring streaming replication is as follows.
For full details of each step, refer to other sections as noted.
<orderedlist>
<listitem>
<para>
Set up primary and standby systems as near identically as possible,
including two identical copies of <productname>PostgreSQL</> at the
same release level.
</para>
</listitem>
<listitem>
<para>
Set up continuous archiving from the primary to a WAL archive located
in a directory on the standby server. Ensure that
<xref linkend="guc-archive-mode">,
<xref linkend="guc-archive-command"> and
<xref linkend="guc-archive-timeout">
are set appropriately on the primary
(see <xref linkend="backup-archiving-wal">).
</para>
</listitem>
<listitem>
<para>
Set up connections and authentication so that the standby server can
successfully connect to the pseudo <literal>replication</> database of
the primary server (see
<xref linkend="streaming-replication-authentication">). Ensure that
<xref linkend="guc-listen-addresses"> and <filename>pg_hba.conf</> are
configured appropriately on the primary.
</para>
<para>
On systems that support the keepalive socket option, setting
<xref linkend="guc-tcp-keepalives-idle">,
<xref linkend="guc-tcp-keepalives-interval"> and
<xref linkend="guc-tcp-keepalives-count"> helps you to find the
troubles with replication (e.g., the network outage or the failure of
the standby server) as soon as possible.
</para>
</listitem>
<listitem>
<para>
Set the maximum number of concurrent connections from the standby servers
(see <xref linkend="guc-max-wal-senders"> for details).
</para>
</listitem>
<listitem>
<para>
Enable WAL archiving in the primary server because we need to make a base
backup of it later (see <xref linkend="guc-archive-mode"> and
<xref linkend="guc-archive-command"> for details).
</para>
</listitem>
<listitem>
<para>
Start the <productname>PostgreSQL</> server on the primary.
</para>
</listitem>
<listitem>
<para>
Make a base backup of the primary server (see
<xref linkend="backup-base-backup">), and load this data onto the
standby. Note that all files present in <filename>pg_xlog</>
and <filename>pg_xlog/archive_status</> on the <emphasis>standby</>
server should be removed because they might be obsolete.
</para>
</listitem>
<listitem>
<para>
Set up WAL archiving, connections and authentication like the primary
server, because the standby server might work as a primary server after
failover. Ensure that your settings are consistent with the
<emphasis>future</> environment after the primary and the standby
server are interchanged by failover. If you're setting up the standby
server for e.g reporting purposes, with no plans to fail over to it,
configure the standby accordingly.
</para>
</listitem>
<listitem>
<para>
Create a recovery command file <filename>recovery.conf</> in the data
directory on the standby server.
</para>
<variablelist id="replication-config-settings" xreflabel="Replication Settings">
<varlistentry id="standby-mode" xreflabel="standby_mode">
<term><varname>standby_mode</varname> (<type>boolean</type>)</term>
<listitem>
<para>
Specifies whether to start the <productname>PostgreSQL</> server as
a standby. If this parameter is <literal>on</>, the streaming
replication is enabled and the standby server will try to connect
to the primary to receive and apply WAL records continuously. The
default is <literal>off</>, which allows only an archive recovery
without replication. So, streaming replication requires this
parameter to be explicitly set to <literal>on</>.
</para>
</listitem>
</varlistentry>
<varlistentry id="primary-conninfo" xreflabel="primary_conninfo">
<term><varname>primary_conninfo</varname> (<type>string</type>)</term>
<listitem>
<para>
Specifies a connection string which is used for the standby server
to connect with the primary. This string is in the same format as
described in <xref linkend="libpq-connect">. If any option is
unspecified in this string, then the corresponding environment
variable (see <xref linkend="libpq-envars">) is checked. If the
environment variable is not set either, then the indicated built-in
defaults are used.
</para>
<para>
The built-in replication requires that a host name (or host address)
or port number which the primary server listens on should be
specified in this string, respectively. Also ensure that a role with
the <literal>SUPERUSER</> and <literal>LOGIN</> privileges on the
primary is set (see
<xref linkend="streaming-replication-authentication">). Note that
the password needs to be set if the primary demands password
authentication.
</para>
</listitem>
</varlistentry>
<varlistentry id="trigger-file" xreflabel="trigger_file">
<term><varname>trigger_file</varname> (<type>string</type>)</term>
<listitem>
<para>
Specifies a trigger file whose presence activates the standby.
If no trigger file is specified, the standby never exits
recovery.
</para>
</listitem>
</varlistentry>
</variablelist>
</listitem>
<listitem>
<para>
Start the <productname>PostgreSQL</> server on the standby. The standby
server will go into recovery mode and proceeds to receive WAL records
from the primary and apply them continuously.
</para>
</listitem>
</orderedlist>
</para>
</sect3>
<sect3 id="streaming-replication-authentication">
<title>Authentication</title>
<para>
It's very important that the access privilege for replication are set
properly so that only trusted users can read the WAL stream, because it's
easy to extract serious information from it.
</para>
<para>
Only superuser is allowed to connect to the primary as the replication
standby. So a role with the <literal>SUPERUSER</> and <literal>LOGIN</>
privileges needs to be created in the primary.
</para>
<para>
Client authentication for replication is controlled by the
<filename>pg_hba.conf</> record specifying <literal>replication</> in the
<replaceable>database</> field. For example, if the standby is running on
host IP <literal>192.168.1.100</> and the superuser's name for replication
is <literal>foo</>, the administrator can add the following line to the
<filename>pg_hba.conf</> file on the primary.
<programlisting>
# Allow the user "foo" from host 192.168.1.100 to connect to the primary
# as a replication standby if the user's password is correctly supplied.
#
# TYPE DATABASE USER CIDR-ADDRESS METHOD
host replication foo 192.168.1.100/32 md5
</programlisting>
</para>
<para>
The host name and port number of the primary, user name to connect as,
and password are specified in the <filename>recovery.conf</> file or
the corresponding environment variable on the standby.
For example, if the primary is running on host IP <literal>192.168.1.50</>,
port <literal>5432</literal>, the superuser's name for replication is
<literal>foo</>, and the password is <literal>foopass</>, the administrator
can add the following line to the <filename>recovery.conf</> file on the
standby.
<programlisting>
# The standby connects to the primary that is running on host 192.168.1.50
# and port 5432 as the user "foo" whose password is "foopass".
primary_conninfo = 'host=192.168.1.50 port=5432 user=foo password=foopass'
</programlisting>
</para>
</sect3>
</sect2>
<sect2 id="backup-incremental-updated">

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.125 2009/12/12 21:35:21 mha Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/client-auth.sgml,v 1.126 2010/01/15 09:18:56 heikki Exp $ -->
<chapter id="client-authentication">
<title>Client Authentication</title>
@ -181,6 +181,8 @@ hostnossl <replaceable>database</replaceable> <replaceable>user</replaceable>
the requested user must be a member of the role with the same
name as the requested database. (<literal>samegroup</> is an
obsolete but still accepted spelling of <literal>samerole</>.)
The value <literal>replication</> specifies that the record
matches if streaming replication is requested.
Otherwise, this is the name of
a specific <productname>PostgreSQL</productname> database.
Multiple database names can be supplied by separating them with

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.243 2010/01/06 02:41:37 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.244 2010/01/15 09:18:58 heikki Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@ -1746,6 +1746,51 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
</variablelist>
</sect2>
<sect2 id="runtime-config-replication">
<title>Streaming Replication</title>
<para>
These settings control the behavior of the built-in
<firstterm>streaming replication</> feature.
</para>
<variablelist>
<varlistentry id="guc-max-wal-senders" xreflabel="max_wal_senders">
<term><varname>max_wal_senders</varname> (<type>integer</type>)</term>
<indexterm>
<primary><varname>max_wal_senders</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies the maximum number of concurrent connections from standby
servers (i.e., the maximum number of simultaneously running WAL sender
processes). The default is zero. This parameter can only be set at
server start.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-wal-sender-delay" xreflabel="wal_sender_delay">
<term><varname>wal_sender_delay</varname> (<type>integer</type>)</term>
<indexterm>
<primary><varname>wal_sender_delay</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Specifies the delay between activity rounds for the WAL sender.
In each round the WAL sender sends any WAL accumulated since last
round to the standby server. It then sleeps for
<varname>wal_sender_delay</> milliseconds, and repeats. The default
value is 200 milliseconds (<literal>200ms</>).
Note that on many systems, the effective resolution of sleep delays is
10 milliseconds; setting <varname>wal_sender_delay</> to a value that
is not a multiple of 10 might have the same results as setting it to
the next higher multiple of 10. This parameter can only be set in the
<filename>postgresql.conf</> file or on the server command line.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="runtime-config-standby">
<title>Standby Servers</title>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.495 2009/12/19 17:49:50 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/func.sgml,v 1.496 2010/01/15 09:18:58 heikki Exp $ -->
<chapter id="functions">
<title>Functions and Operators</title>
@ -12984,7 +12984,8 @@ SELECT set_config('log_statement_stats', 'off', false);
<para>
The functions shown in <xref
linkend="functions-admin-backup-table"> assist in making on-line backups.
Use of the first three functions is restricted to superusers.
Use of the first three functions is restricted to superusers. The first
five functions cannot be executed during recovery.
</para>
<table id="functions-admin-backup-table">
@ -13135,11 +13136,17 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<indexterm>
<primary>pg_is_in_recovery</primary>
</indexterm>
<indexterm>
<primary>pg_last_xlog_receive_location</primary>
</indexterm>
<indexterm>
<primary>pg_last_xlog_replay_location</primary>
</indexterm>
<para>
The functions shown in <xref
linkend="functions-recovery-info-table"> provide information
about the current status of Hot Standby.
about the current status of the standby.
These functions may be executed during both recovery and in normal running.
</para>
@ -13160,6 +13167,33 @@ postgres=# SELECT * FROM pg_xlogfile_name_offset(pg_stop_backup());
<entry>True if recovery is still in progress.
</entry>
</row>
<row>
<entry>
<literal><function>pg_last_xlog_receive_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
<entry>Get last transaction log location received and synced to disk during
streaming recovery. If streaming recovery is still in progress
this will increase monotonically. If streaming recovery has completed
then this value will remain static at the value of the last WAL record
received and synced to disk during that recovery. When the server has
been started without a streaming recovery then the return value will be
InvalidXLogRecPtr (0/0).
</entry>
</row>
<row>
<entry>
<literal><function>pg_last_xlog_replay_location</function>()</literal>
</entry>
<entry><type>text</type></entry>
<entry>Get last transaction log location replayed during recovery.
If recovery is still in progress this will increase monotonically.
If recovery has completed then this value will remain static at
the value of the last WAL record applied during that recovery.
When the server has been started normally without a recovery
then the return value will be InvalidXLogRecPtr (0/0).
</entry>
</row>
</tbody>
</tgroup>
</table>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.35 2009/04/27 16:27:35 momjian Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/high-availability.sgml,v 1.36 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="high-availability">
<title>High Availability, Load Balancing, and Replication</title>
@ -146,11 +146,16 @@ protocol to make nodes agree on a serializable transactional order.
made the new master database server. This is asynchronous and
can only be done for the entire database server.
</para>
<para>
A PITR warm standby server can be kept more up-to-date using the
streaming replication feature built into <productname>PostgreSQL</> 8.5
onwards.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Master-Slave Replication</term>
<term>Trigger-Based Master-Slave Replication</term>
<listitem>
<para>
@ -278,7 +283,7 @@ protocol to make nodes agree on a serializable transactional order.
<entry>Shared Disk Failover</entry>
<entry>File System Replication</entry>
<entry>Warm Standby Using PITR</entry>
<entry>Master-Slave Replication</entry>
<entry>Trigger-Based Master-Slave Replication</entry>
<entry>Statement-Based Replication Middleware</entry>
<entry>Asynchronous Multimaster Replication</entry>
<entry>Synchronous Multimaster Replication</entry>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.72 2009/08/07 20:54:31 alvherre Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/perform.sgml,v 1.73 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="performance-tips">
<title>Performance Tips</title>
@ -836,8 +836,9 @@ SELECT * FROM x, y, a, b, c WHERE something AND somethingelse;
needs to be written, because in case of an error, the files
containing the newly loaded data will be removed anyway.
However, this consideration does not apply when
<xref linkend="guc-archive-mode"> is on, as all commands
must write WAL in that case.
<xref linkend="guc-archive-mode"> is on or streaming replication
is allowed (i.e., <xref linkend="guc-max-wal-senders"> is more
than or equal to one), as all commands must write WAL in that case.
</para>
</sect2>

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.76 2009/12/02 04:54:10 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/protocol.sgml,v 1.77 2010/01/15 09:18:59 heikki Exp $ -->
<chapter id="protocol">
<title>Frontend/Backend Protocol</title>
@ -4140,6 +4140,66 @@ not line breaks.
</sect1>
<sect1 id="protocol-replication">
<title>Streaming Replication Protocol</title>
<para>
To initiate streaming replication, the frontend sends the "replication"
parameter in the startup message. This tells the backend to go into
walsender mode, where a small set of replication commands can be issued
instead of SQL statements. Only the simple query protocol can be used in
walsender mode.
The commands accepted in walsender mode are:
<variablelist>
<varlistentry>
<term>IDENTIFY_SYSTEM</term>
<listitem>
<para>
Requests the server to idenfity itself. Server replies with a result
set of a single row, and two fields:
systemid: The unique system identifier identifying the cluster. This
can be used to check that the base backup used to initialize the
slave came from the same cluster.
timeline: Current TimelineID. Also used to check that the slave is
consistent with the master.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>START_REPLICATION XXX/XXX</term>
<listitem>
<para>
Instructs backend to start streaming WAL, starting at point XXX/XXX.
Server can reply with an error e.g if the requested piece of WAL has
already been recycled. On success, server responds with a
CopyOutResponse message, and backend starts to stream WAL as CopyData
messages.
</para>
<para>
The payload in each CopyData message consists of an XLogRecPtr,
indicating the starting point of the WAL in the message, immediately
followed by the WAL data itself.
</para>
<para>
A single WAL record is never split across two CopyData messages. When
a WAL record crosses a WAL page boundary, however, and is therefore
already split using continuation records, it can be split at the page
boundary. In other words, the first main WAL record and its
continuation records can be split across different CopyData messages.
</para>
</listitem>
</varlistentry>
</variablelist>
</para>
</sect1>
<sect1 id="protocol-changes">
<title>Summary of Changes since Protocol 2.0</title>