1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-19 13:42:17 +03:00
Commit Graph

137 Commits

Author SHA1 Message Date
Simon Riggs
8585ad3625 Fix error code for canceling statement due to conflict with recovery.
All retryable conflict errors now have an error code that indicates that
a retry is possible, correcting my incomplete fix of 2010/05/12

Tatsuo Ishii and Simon Riggs, input from Robert Haas and Florian Pflug
2011-01-31 19:20:23 +00:00
Bruce Momjian
5d950e3b0c Stamp copyrights for year 2011. 2011-01-01 13:18:15 -05:00
Robert Haas
8bd4b89e24 Try to save a kernel call in ResolveRecoveryConflictWithVirtualXIDs.
If there's no work to be done, just exit quickly, before initialization.
2010-12-17 11:32:02 -05:00
Robert Haas
611fed3712 Reset 'ps' display just once when resolving VXID conflicts.
This prevents the word "waiting" from briefly disappearing from the ps
status line when ResolveRecoveryConflictWithVirtualXIDs begins a new
iteration of the outer loop.

Along the way, remove some useless pgstat_report_waiting() calls;
the startup process doesn't appear in pg_stat_activity.

Fujii Masao
2010-12-17 08:30:57 -05:00
Heikki Linnakangas
5a031a5556 Fix bugs in the hot standby known-assigned-xids tracking logic. If there's
an old transaction running in the master, and a lot of transactions have
started and finished since, and a WAL-record is written in the gap between
the creating the running-xacts snapshot and WAL-logging it, recovery will fail
with "too many KnownAssignedXids" error. This bug was reported by
Joachim Wieland on Nov 19th.

In the same scenario, when fewer transactions have started so that all the
xids fit in KnownAssignedXids despite the first bug, a more serious bug
arises. We incorrectly initialize the clog code with the oldest still running
transaction, and when we see the WAL record belonging to a transaction with
an XID larger than one that committed already before the checkpoint we're
recovering from, we zero the clog page containing the already committed
transaction, leading to data loss.

In hindsight, trying to track xids in the known-assigned-xids array before
seeing the running-xacts record was too complicated. To fix that, hold
XidGenLock while the running-xacts snapshot is taken and WAL-logged. That
ensures that no transaction can begin or end in that gap, so that in recvoery
we know that the snapshot contains all transactions running at that point in
WAL.
2010-12-07 09:23:30 +01:00
Simon Riggs
ed78384acd Move call to GetTopTransactionId() earlier in LockAcquire(),
removing an infrequently occurring race condition in Hot Standby.
An xid must be assigned before a lock appears in shared memory,
rather than immediately after, else GetRunningTransactionLocks()
may see InvalidTransactionId, causing assertion failures during
lock processing on standby.

Bug report and diagnosis by Fujii Masao, fix by me.
2010-11-29 01:08:02 +00:00
Simon Riggs
52010027ef Avoid spurious Hot Standby conflicts from btree delete records.
Similar conflicts were already avoided for related record types.
Massive over-caution resulted in a usability bug. Clear theoretical
basis for doing this is now confirmed by me.
Request to remove from Heikki (twice), over-caution by me.
2010-11-15 09:30:13 +00:00
Magnus Hagander
9f2e211386 Remove cvs keywords from all files. 2010-09-20 22:08:53 +02:00
Tom Lane
79dc97a401 Bring some sanity to the trace_recovery_messages code and docs.
Per gripe from Fujii Masao, though this is not exactly his proposed patch.
Categorize as DEVELOPER_OPTIONS and set context PGC_SIGHUP, as per Fujii,
but set the default to LOG because higher values aren't really sensible
(see the code for trace_recovery()).  Fix the documentation to agree with
the code and to try to explain what the variable actually does.  Get rid
of no-op calls trace_recovery(LOG), which accomplish nothing except to
demonstrate that this option confuses even its author.
2010-08-19 22:55:01 +00:00
Robert Haas
30c22eb8fc Correct sundry errors in Hot Standby-related comments.
Fujii Masao
2010-08-12 23:24:54 +00:00
Bruce Momjian
239d769e7e pgindent run for 9.0, second run 2010-07-06 19:19:02 +00:00
Tom Lane
e76c1a0f4d Replace max_standby_delay with two parameters, max_standby_archive_delay and
max_standby_streaming_delay, and revise the implementation to avoid assuming
that timestamps found in WAL records can meaningfully be compared to clock
time on the standby server.  Instead, the delay limits are compared to the
elapsed time since we last obtained a new WAL segment from archive or since
we were last "caught up" to WAL data arriving via streaming replication.
This avoids problems with clock skew between primary and standby, as well
as other corner cases that the original coding would misbehave in, such
as the primary server having significant idle time between transactions.
Per my complaint some time ago and considerable ensuing discussion.

Do some desultory editing on the hot standby documentation, too.
2010-07-03 20:43:58 +00:00
Itagaki Takahiro
9e3cd37576 Remove max_standby_delay message from ps display of recovery process
in waiting status. The parameter is not so interesting in ps display
because it is referable in postgresql.conf.
2010-06-14 00:49:24 +00:00
Simon Riggs
f9dbac9476 HS Defer buffer pin deadlock check until deadlock_timeout has expired.
During Hot Standby we need to check for buffer pin deadlocks when the
Startup process begins to wait, in case it never wakes up again. We
previously made the deadlock check immediately on the basis it was
cheap, though clearer thinking and prima facie evidence shows that
was too simple. Refactor existing code to make it easy to add in
deferral of deadlock check until deadlock_timeout allowing a good
reduction in deadlock checks since far few buffer pins are held for
that duration. It's worth doing anyway, though major goal is to
prevent further reports of context switching with high numbers of
users on occasional tests.
2010-05-26 19:52:52 +00:00
Simon Riggs
fd34374b17 Add many new Asserts in code and fix simple bug that slipped through
without them, related to previous commit. Report by Bruce Momjian.
2010-05-14 07:11:49 +00:00
Simon Riggs
8431e296ea Cleanup initialization of Hot Standby. Clarify working with reanalysis
of requirements and documentation on LogStandbySnapshot(). Fixes
two minor bugs reported by Tom Lane that would lead to an incorrect
snapshot after transaction wraparound. Also fix two other problems
discovered that would give incorrect snapshots in certain cases.
ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor
refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().
2010-05-13 11:15:38 +00:00
Tom Lane
f9ed327f76 Clean up some awkward, inaccurate, and inefficient processing around
MaxStandbyDelay.  Use the GUC units mechanism for the value, and choose more
appropriate timestamp functions for performing tests with it.  Make the
ps_activity manipulation in ResolveRecoveryConflictWithVirtualXIDs have
behavior similar to ps_activity code elsewhere, notably not updating the
display when update_process_title is off and not truncating the display
contents at an arbitrarily-chosen length.  Improve the docs to be explicit
about what MaxStandbyDelay actually measures, viz the difference between
primary and standby servers' clocks, and the possible hazards if their clocks
aren't in sync.
2010-05-02 02:10:33 +00:00
Heikki Linnakangas
9b8a73326e Introduce wal_level GUC to explicitly control if information needed for
archival or hot standby should be WAL-logged, instead of deducing that from
other options like archive_mode. This replaces recovery_connections GUC in
the primary, where it now has no effect, but it's still used in the standby
to enable/disable hot standby.

Remove the WAL-logging of "unlogged operations", like creating an index
without WAL-logging and fsyncing it at the end. Instead, we keep a copy of
the wal_mode setting and the settings that affect how much shared memory a
hot standby server needs to track master transactions (max_connections,
max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings
change, at server restart, write a WAL record noting the new settings and
update pg_control. This allows us to notice the change in those settings in
the standby at the right moment, they used to be included in checkpoint
records, but that meant that a changed value was not reflected in the
standby until the first checkpoint after the change.

Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to
the sequence it used to follow, before hot standby and subsequent patches
changed it to 0x9003.
2010-04-28 16:10:43 +00:00
Robert Haas
33980a0640 Fix various instances of "the the".
Two of these were pointed out by Erik Rijkers; the rest I found.
2010-04-23 23:21:44 +00:00
Simon Riggs
a2555571fb Optimise btree delete processing when no active backends.
Clarify comments, downgrade a message to DEBUG and remove some
debug counters. Direct from ideas by Heikki Linnakangas.
2010-04-22 08:04:25 +00:00
Simon Riggs
0192abc4d7 Relax locking during GetCurrentVirtualXIDs(). Earlier improvements
to handling of btree delete records mean that all snapshot
conflicts on standby now have a valid, useful latestRemovedXid.
Our earlier approach using LW_EXCLUSIVE was useful when we didnt
always have a valid value, though is no longer useful or necessary.
Asserts added to code path to prove and ensure this is the case.
This will reduce contention and improve performance of larger Hot
Standby servers.
2010-04-21 19:08:14 +00:00
Simon Riggs
19c7a59b56 Change some debug ereports to elogs, as requested by translation team. 2010-04-06 10:50:57 +00:00
Heikki Linnakangas
daaeac88aa Fix comment which was apparently copy-pasted from another function. 2010-03-11 09:10:25 +00:00
Bruce Momjian
65e806cba1 pgindent run for 9.0 2010-02-26 02:01:40 +00:00
Simon Riggs
8eccf7614b Improvements to ps message of startup process during Hot Standby.
Message is reset earlier and potential bug avoided.

Andres Freund
2010-02-13 16:29:38 +00:00
Simon Riggs
b95a720a48 Re-enable max_standby_delay = -1 using deadlock detection on startup
process. If startup waits on a buffer pin we send a request to all
backends to cancel themselves if they are holding the buffer pin
required and they are also waiting on a lock. If not, startup waits
until max_standby_delay before cancelling any backend waiting for
the requested buffer pin.
2010-02-13 01:32:20 +00:00
Simon Riggs
5cbf6dceea Fix typo bug in Hot Standby from recent refactoring. Bug introduced
into code recently patched by Andres Freund, so quickly fixed by him
when bug report from Tatsuo Ishii arrived.
2010-02-11 19:35:22 +00:00
Tom Lane
ab7c49c988 Fix assorted poorly-thought-out message strings: use %u not %d for printing
OIDs, avoid random line breaks in strings somebody might grep for.
2010-02-02 22:01:53 +00:00
Simon Riggs
c85c941470 Detect early deadlock in Hot Standby when Startup is already waiting. First
stage of required deadlock detection to allow re-enabling max_standby_delay
setting of -1, which is now essential in the absence of improved relation-
specific conflict resoluton. Requested by Greg Stark et al.
2010-01-31 19:01:11 +00:00
Simon Riggs
76be0c81cc Filter recovery conflicts based upon dboid from relfilenode of WAL
records for heap and btree. Minor change, mostly API changes to
pass through the required values. This is a simple change though
also provides the refactoring required for further enhancements
to conflict processing using the relOid. Changes only have effect
during Hot Standby.
2010-01-29 17:10:05 +00:00
Simon Riggs
959ac58c04 In HS, Startup process sets SIGALRM when waiting for buffer pin. If
woken by alarm we send SIGUSR1 to all backends requesting that they
check to see if they are blocking Startup process. If so, they throw
ERROR/FATAL as for other conflict resolutions. Deadlock stop gap
removed. max_standby_delay = -1 option removed to prevent deadlock.
2010-01-23 16:37:12 +00:00
Simon Riggs
2edc31c439 Message mentions msec when it should be seconds, so use s instead of ms.
Noticed by Andres Freund
2010-01-16 10:13:04 +00:00
Simon Riggs
a8ce974cdd Teach standby conflict resolution to use SIGUSR1
Conflict reason is passed through directly to the backend, so we can
take decisions about the effect of the conflict based upon the local
state. No specific changes, as yet, though this prepares for later work.
CancelVirtualTransaction() sends signals while holding ProcArrayLock.
Introduce errdetail_abort() to give message detail explaining that the
abort was caused by conflict processing. Remove CONFLICT_MODE states
in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.
2010-01-16 10:05:59 +00:00
Simon Riggs
e99767bc28 First part of refactoring of code for ResolveRecoveryConflict. Purposes
of this are to centralise the conflict code to allow further change,
as well as to allow passing through the full reason for the conflict
through to the conflicting backends. Backend state alters how we
can handle different types of conflict so this is now required.
As originally suggested by Heikki, no longer optional.
2010-01-14 11:08:02 +00:00
Bruce Momjian
0239800893 Update copyright for the year 2010. 2010-01-02 16:58:17 +00:00
Tom Lane
bd8a35655b Suppress compiler warning (pid_t isn't int everywhere) 2009-12-31 22:07:36 +00:00
Simon Riggs
efc16ea520 Allow read only connections during recovery, known as Hot Standby.
Enabled by recovery_connections = on (default) and forcing archive recovery using a recovery.conf. Recovery processing now emulates the original transactions as they are replayed, providing full locking and MVCC behaviour for read only queries. Recovery must enter consistent state before connections are allowed, so there is a delay, typically short, before connections succeed. Replay of recovering transactions can conflict and in some cases deadlock with queries during recovery; these result in query cancellation after max_standby_delay seconds have expired. Infrastructure changes have minor effects on normal running, though introduce four new types of WAL record.

New test mode "make standbycheck" allows regression tests of static command behaviour on a standby server while in recovery. Typical and extreme dynamic behaviours have been checked via code inspection and manual testing. Few port specific behaviours have been utilised, though primary testing has been on Linux only so far.

This commit is the basic patch. Additional changes will follow in this release to enhance some aspects of behaviour, notably improved handling of conflicts, deadlock detection and query cancellation. Changes to VACUUM FULL are also required.

Simon Riggs, with significant and lengthy review by Heikki Linnakangas, including streamlined redesign of snapshot creation and two-phase commit.

Important contributions from Florian Pflug, Mark Kirkwood, Merlin Moncure, Greg Stark, Gianni Ciolli, Gabriele Bartolini, Hannu Krosing, Robert Haas, Tatsuo Ishii, Hiroyuki Yamada plus support and feedback from many other community members.
2009-12-19 01:32:45 +00:00