max_slot_wal_keep_size that was added in v13 and wal_keep_segments are
the GUC parameters to specify how much WAL files to retain for
the standby servers. While max_slot_wal_keep_size accepts the number of
bytes of WAL files, wal_keep_segments accepts the number of WAL files.
This difference of setting units between those similar parameters could
be confusing to users.
To alleviate this situation, this commit renames wal_keep_segments to
wal_keep_size, and make users specify the WAL size in it instead of
the number of WAL files.
There was also the idea to rename max_slot_wal_keep_size to
max_slot_wal_keep_segments, in the discussion. But we have been moving
away from measuring in segments, for example, checkpoint_segments was
replaced by max_wal_size. So we concluded to rename wal_keep_segments
to wal_keep_size.
Back-patch to v13 where max_slot_wal_keep_size was added.
Author: Fujii Masao
Reviewed-by: Álvaro Herrera, Kyotaro Horiguchi, David Steele
Discussion: https://postgr.es/m/574b4ea3-e0f9-b175-ead2-ebea7faea855@oss.nttdata.com
The previous approach was to search-and-destroy all \r occurrences
no matter what. That seems more likely to hide bugs than anything
else; indeed it seems to be hiding one now. Fix things so that
we only transform \r\n to \n.
Side effects: must do this before, not after, chomp'ing if we're
going to chomp, else we'd fail to clean up a trailing \r\n. Also,
remove safe_psql's redundant repetition of what psql already did;
else it might reduce \r\r\n to \n, which is exactly the scenario
I'm hoping to expose.
Perhaps this should be back-patched, but for now I'm content to
see what happens in HEAD.
Discussion: https://postgr.es/m/412ae8da-76bb-640f-039a-f3513499e53d@gmx.net
pg_rewind needs to copy from the source cluster to the target cluster a
set of relation blocks changed from the previous checkpoint where WAL
forked up to the end of WAL on the target. Building this list of
relation blocks requires a range of WAL segments that may not be present
anymore on the target's pg_wal, causing pg_rewind to fail. It is
possible to work around this issue by copying manually the WAL segments
needed but this may lead to some extra and actually useless work.
This commit introduces a new option allowing pg_rewind to use a
restore_command while doing the rewind by grabbing the parameter value
of restore_command from the target cluster configuration. This allows
the rewind operation to be more reliable, so as only the WAL segments
needed by the rewind are restored from the archives.
In order to be able to do that, a new routine is added to src/common/ to
allow frontend tools to restore files from archives using an
already-built restore command. This version is more simple than the
backend equivalent as there is no need to handle the non-recovery case.
Author: Alexey Kondratov
Reviewed-by: Andrey Borodin, Andres Freund, Alvaro Herrera, Alexander
Korotkov, Michael Paquier
Discussion: https://postgr.es/m/a3acff50-5a0d-9a2c-b3b2-ee36168955c1@postgrespro.ru
This includes a couple of changes around the new behavior of pg_rewind
which enforces recovery to happen once on a cluster not shut down
cleanly:
- Some comments and documentation improvements.
- Shutdown the cluster to rewind with immediate mode in all the tests,
this allows to check after the forced recovery behavior which is wanted
as new default.
- Use -F for the forced recovery step, so as postgres does not use
fsync. This was useless as a final sync is done once the tool is done.
Author: Michael Paquier
Reviewed-by: Alexey Kondratov
Discussion: https://postgr.es/m/20191004083721.GA1829@paquier.xyz
Up to now the tests of pg_rewind have been using a superuser for all its
tests (which is the default of many tests actually, and something that
ought to be reviewed) when involving an online source server, still it
is possible to use a non-superuser role to do that as long as this role
is granted permissions to execute all the source-side functions used for
the rewind. This is possible since v11, and was already documented as
of bfc8068.
PostgresNode::init is extended so as callers of this routine can add
extra options to configure the authentication of a new node, which gets
used by this commit, and allows the tests to work properly on Windows
where SSPI is used.
This will allow to catch up easily any change in pg_rewind if the tool
begins to use more backend-side functions, so as the properties
introduced by v11 are kept.
Per suggestion from Peter Eisentraut.
Author: Michael Paquier
Reviewed-by: Magnus Hagander
Discussion: https://postgr.es/m/20190411041336.GM2728@paquier.xyz
This reverts commit d4e2a84, which added a new user with limited
permissions to run the TAP tests of pg_rewind. Buildfarm machine
members on Windows jacana and bowerbird have been complaining about
that, the new role not being able to run the rewind because SSPI is not
configured to allow it.
Fixing the test requires passing down directly the new user to
pg_regress with --create-role so as SSPI can work properly.
Reported-by: Andrew Dunstan
Discussion: https://postgr.es/m/3cd43d33-f415-cc41-ade3-7230ab15b2c9@2ndQuadrant.com
Up to now the tests of pg_rewind have been using a superuser for all the
tests (which is the default of many tests actually, and something that
ought to be reviewed) when involving an online source server, still it
is possible to use a non-superuser role to do that as long as this role
is granted permissions to execute all the source-side functions used for
the rewind. This is possible since v11, and was already documented as
of bfc8068.
This will allow to catch up easily any change in pg_rewind if the tool
begins to use more backend-side functions, so as the properties
introduced by v11 are kept.
Per suggestion from Peter Eisentraut.
Author: Michael Paquier
Reviewed-by: Magnus Hagander
Discussion: https://postgr.es/m/20190411041336.GM2728@paquier.xyz
This fixes an issue introduced by 266b6ac, which has added filters to
exclude file patterns on the target and source data directories to
reduce the number of files transferred. Filters get applied to both
the target and source data files, and include pg_internal.init which is
present for each database once relations are created on it. However, if
the target differed from the source with at least one new database with
relations, the rewind would fail due to the exclusion filters applied on
the target files, causing pg_internal.init to still be present on the
target database folder, while its contents should have been completely
removed so as there is nothing remaining inside at the time of the
folder deletion.
Applying exclusion filters on the source files is fine, because this way
the amount of data copied from the source to the target is reduced. And
actually, not applying the filters on the target is what pg_rewind
should do, because this causes such files to be automatically removed
during the rewind on the target. Exclusion filters apply to paths which
are removed or recreated automatically at startup, so removing all those
files on the target during the rewind is a win.
The existing set of TAP tests already stresses the rewind of databases,
but it did not include any tables on those newly-created databases.
Creating extra tables in this case is enough to reproduce the failure,
so the existing tests are extended to close the gap.
Reported-by: Mithun Cy
Author: Michael Paquier
Discussion: https://postgr.es/m/CADq3xVYt6_pO7ZzmjOqPgY9HWsL=kLd-_tNyMtdfjKqEALDyTA@mail.gmail.com
Backpatch-through: 11
When libpq is loaded in the server (for instance, by
libpqwalreceiver), it may use libpq environment variables set in the
postmaster environment for connection parameter defaults. This has
some confusing effects in our test suites. For example, the TAP test
infrastructure sets PGAPPNAME to allow identifying clients in the
server log. But this environment variable is also inherited by
temporary servers started with pg_ctl and is then in turn used by
libpqwalreceiver as the application_name for connecting to remote
servers where it then shows up in pg_stat_replication and is relevant
for things like synchronous_standby_names. Replication already has a
suitable default for application_name, and overriding that
accidentally then requires the individual test cases to re-override
that, which is all very confusing and unnecessary.
To fix, unset PGAPPNAME temporarily before running pg_ctl start or
restart in the tests.
More comprehensive approaches like unsetting all environment variables
in pg_ctl were considered but might be too complicated to achieve
portably.
The now unnecessary re-overriding of application_name by test cases is
also removed.
Reviewed-by: Noah Misch <noah@leadboat.com>
Discussion: https://www.postgresql.org/message-id/flat/33383613-690e-6f1b-d5ba-4957ff40f6ce@2ndquadrant.com
The modules RewindTest.pm and ServerSetup.pm are really only useful for
TAP tests, so they really belong in the TAP test directories. In
addition, ServerSetup.pm is renamed to SSLServer.pm.
The test scripts have their own directories added to the search path so
that the relocated modules will be found, regardless of where the tests
are run from, even on modern perl where "." is no longer in the
searchpath.
Discussion: https://postgr.es/m/e4b0f366-269c-73c3-9c90-d9cb0f4db1f9@2ndQuadrant.com
Backpatch as appropriate to 9.5