postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-15 03:41:20 +03:00

Author	SHA1	Message	Date
Robert Haas	105d4c5ffe	Fix assorted dtrace breakage caused by patch to include backend IDs in temp relpaths. Per buildfarm.	2010-08-13 22:54:17 +00:00
Robert Haas	debcec7dc3	Include the backend ID in the relpath of temporary relations. This allows us to reliably remove all leftover temporary relation files on cluster startup without reference to system catalogs or WAL; therefore, we no longer include temporary relations in XLOG_XACT_COMMIT and XLOG_XACT_ABORT WAL records. Since these changes require including a backend ID in each SharedInvalSmgrMsg, the size of the SharedInvalidationMessage.id field has been reduced from two bytes to one, and the maximum number of connections has been reduced from INT_MAX / 4 to 2^23-1. It would be possible to remove these restrictions by increasing the size of SharedInvalidationMessage by 4 bytes, but right now that doesn't seem like a good trade-off. Review by Jaime Casanova and Tom Lane.	2010-08-13 20:10:54 +00:00
Robert Haas	30c22eb8fc	Correct sundry errors in Hot Standby-related comments. Fujii Masao	2010-08-12 23:24:54 +00:00
Robert Haas	20be0d480a	Make log_temp_files based on kB, and revert docs & comments to match. Per extensive discussion on pgsql-hackers. We are deliberately not back-patching this even though the behavior of 8.3 and 8.4 is unquestionably broken, for fear of breaking existing users of this parameter. This incompatibility should be release-noted.	2010-07-06 22:55:26 +00:00
Bruce Momjian	239d769e7e	pgindent run for 9.0, second run	2010-07-06 19:19:02 +00:00
Tom Lane	aceedd88f6	Make vacuum_defer_cleanup_age be PGC_SIGHUP level, since it's not sensible to have different values in different processes of the primary server. Also put it into the "Streaming Replication" GUC category; it doesn't belong in "Standby Servers" because you use it on the master not the standby. In passing also correct guc.c's idea of wal_keep_segments' category.	2010-07-03 21:23:58 +00:00
Tom Lane	e76c1a0f4d	Replace max_standby_delay with two parameters, max_standby_archive_delay and max_standby_streaming_delay, and revise the implementation to avoid assuming that timestamps found in WAL records can meaningfully be compared to clock time on the standby server. Instead, the delay limits are compared to the elapsed time since we last obtained a new WAL segment from archive or since we were last "caught up" to WAL data arriving via streaming replication. This avoids problems with clock skew between primary and standby, as well as other corner cases that the original coding would misbehave in, such as the primary server having significant idle time between transactions. Per my complaint some time ago and considerable ensuing discussion. Do some desultory editing on the hot standby documentation, too.	2010-07-03 20:43:58 +00:00
Robert Haas	bb0fe9feb9	Move copydir.c from src/port to src/backend/storage/file. The previous commit to make copydir() interruptible prevented postgres.exe from linking on MinGW and Cygwin, because on those platforms libpgport_srv.a can't freely reference symbols defined by the backend. Since that code is already backend-specific anyway, just move the whole file into the backend rather than adding further kludges to deal with the symbols needed by CHECK_FOR_INTERRUPTS(). This probably needs some further cleanup, but this commit just moves the file as-is, which should hopefully be enough to turn the buildfarm green again.	2010-07-02 17:03:30 +00:00
Itagaki Takahiro	9e3cd37576	Remove max_standby_delay message from ps display of recovery process in waiting status. The parameter is not so interesting in ps display because it is referable in postgresql.conf.	2010-06-14 00:49:24 +00:00
Simon Riggs	f9dbac9476	HS Defer buffer pin deadlock check until deadlock_timeout has expired. During Hot Standby we need to check for buffer pin deadlocks when the Startup process begins to wait, in case it never wakes up again. We previously made the deadlock check immediately on the basis it was cheap, though clearer thinking and prima facie evidence shows that was too simple. Refactor existing code to make it easy to add in deferral of deadlock check until deadlock_timeout allowing a good reduction in deadlock checks since far fewer buffer pins are held for that duration. It's worth doing anyway, though major goal is to prevent further reports of context switching with high numbers of users on occasional tests.	2010-05-26 19:52:52 +00:00
Simon Riggs	fd34374b17	Add many new Asserts in code and fix simple bug that slipped through without them, related to previous commit. Report by Bruce Momjian.	2010-05-14 07:11:49 +00:00
Simon Riggs	8431e296ea	Cleanup initialization of Hot Standby. Clarify working with reanalysis of requirements and documentation on LogStandbySnapshot(). Fixes two minor bugs reported by Tom Lane that would lead to an incorrect snapshot after transaction wraparound. Also fix two other problems discovered that would give incorrect snapshots in certain cases. ProcArrayApplyRecoveryInfo() substantially rewritten. Some minor refactoring of xact_redo_apply() and ExpireTreeKnownAssignedTransactionIds().	2010-05-13 11:15:38 +00:00
Tom Lane	f9ed327f76	Clean up some awkward, inaccurate, and inefficient processing around MaxStandbyDelay. Use the GUC units mechanism for the value, and choose more appropriate timestamp functions for performing tests with it. Make the ps_activity manipulation in ResolveRecoveryConflictWithVirtualXIDs have behavior similar to ps_activity code elsewhere, notably not updating the display when update_process_title is off and not truncating the display contents at an arbitrarily-chosen length. Improve the docs to be explicit about what MaxStandbyDelay actually measures, viz the difference between primary and standby servers' clocks, and the possible hazards if their clocks aren't in sync.	2010-05-02 02:10:33 +00:00
Tom Lane	f0488bd57c	Rename the parameter recovery_connections to hot_standby, to reduce possible confusion with streaming-replication settings. Also, change its default value to "off", because of concern about executing new and poorly-tested code during ordinary non-replicating operation. Per discussion. In passing do some minor editing of related documentation.	2010-04-29 21:36:19 +00:00
Tom Lane	77acab75df	Modify ShmemInitStruct and ShmemInitHash to throw errors internally, rather than returning NULL for some-but-not-all failures as they used to. Remove now-redundant tests for NULL from call sites. We had to do something about this because many call sites were failing to check for NULL; and changing it like this seems a lot more useful and mistake-proof than adding checks to the call sites without them.	2010-04-28 16:54:16 +00:00
Heikki Linnakangas	9b8a73326e	Introduce wal_level GUC to explicitly control if information needed for archival or hot standby should be WAL-logged, instead of deducing that from other options like archive_mode. This replaces recovery_connections GUC in the primary, where it now has no effect, but it's still used in the standby to enable/disable hot standby. Remove the WAL-logging of "unlogged operations", like creating an index without WAL-logging and fsyncing it at the end. Instead, we keep a copy of the wal_mode setting and the settings that affect how much shared memory a hot standby server needs to track master transactions (max_connections, max_prepared_xacts, max_locks_per_xact) in pg_control. Whenever the settings change, at server restart, write a WAL record noting the new settings and update pg_control. This allows us to notice the change in those settings in the standby at the right moment, they used to be included in checkpoint records, but that meant that a changed value was not reflected in the standby until the first checkpoint after the change. Bump PG_CONTROL_VERSION and XLOG_PAGE_MAGIC. Whack XLOG_PAGE_MAGIC back to the sequence it used to follow, before hot standby and subsequent patches changed it to 0x9003.	2010-04-28 16:10:43 +00:00
Tom Lane	2871b4618a	Replace the KnownAssignedXids hash table with a sorted-array data structure, and be more tense about the locking requirements for it, to improve performance in Hot Standby mode. In passing fix a few bugs and improve a number of comments in the existing HS code. Simon Riggs, with some editorialization by Tom	2010-04-28 00:09:05 +00:00
Robert Haas	33980a0640	Fix various instances of "the the". Two of these were pointed out by Erik Rijkers; the rest I found.	2010-04-23 23:21:44 +00:00
Simon Riggs	a2555571fb	Optimise btree delete processing when no active backends. Clarify comments, downgrade a message to DEBUG and remove some debug counters. Direct from ideas by Heikki Linnakangas.	2010-04-22 08:04:25 +00:00
Simon Riggs	0192abc4d7	Relax locking during GetCurrentVirtualXIDs(). Earlier improvements to handling of btree delete records mean that all snapshot conflicts on standby now have a valid, useful latestRemovedXid. Our earlier approach using LW_EXCLUSIVE was useful when we didn't always have a valid value, though is no longer useful or necessary. Asserts added to code path to prove and ensure this is the case. This will reduce contention and improve performance of larger Hot Standby servers.	2010-04-21 19:08:14 +00:00
Simon Riggs	7bc76d51fb	Check RecoveryInProgress() while holding ProcArrayLock during snapshots. This prevents a rare, yet possible race condition at the exact moment of transition from recovery to normal running.	2010-04-19 18:03:38 +00:00
Simon Riggs	21d6a6a128	Tune GetSnapshotData() during Hot Standby by avoiding loop through normal backends. Makes code clearer also, since we avoid various Assert()s. Performance of snapshots taken during recovery no longer depends upon number of read-only backends.	2010-04-18 18:06:07 +00:00
Simon Riggs	19c7a59b56	Change some debug ereports to elogs, as requested by translation team.	2010-04-06 10:50:57 +00:00
Peter Eisentraut	c248d17120	Message tuning	2010-03-21 00:17:59 +00:00
Tom Lane	f784f05e95	Clear error_context_stack and debug_query_string at the beginning of proc_exit, so that we won't try to attach any context printouts to messages that get emitted while exiting. Per report from Dennis Koegel, the context functions won't necessarily work after we've started shutting down the backend, and it seems possible that debug_query_string could be pointing at freed storage as well. The context information doesn't seem particularly relevant to such messages anyway, so there's little lost by suppressing it. Back-patch to all supported branches. I can only demonstrate a crash with log_disconnections messages back to 8.1, but the risk seems real in 8.0 and before anyway.	2010-03-20 00:58:09 +00:00
Heikki Linnakangas	e0f9e2b648	Fix bug in KnownAssignedXidsMany(). I saw this when looking at the assertion failure reported by Erik Rijkers, but this alone doesn't explain the failure.	2010-03-11 09:26:59 +00:00
Heikki Linnakangas	daaeac88aa	Fix comment which was apparently copy-pasted from another function.	2010-03-11 09:10:25 +00:00
Bruce Momjian	65e806cba1	pgindent run for 9.0	2010-02-26 02:01:40 +00:00
Tom Lane	e9a383303c	Adjust pg_fsync_writethrough so that it will set errno when failing on a platform that doesn't support this operation. The former coding would allow an unrelated errno to be reported, which would be quite misleading. Not sure if this has anything to do with the current buildfarm failures, but it's certainly bogus as-is.	2010-02-22 15:26:14 +00:00
Tom Lane	d1e027221d	Replace the pg_listener-based LISTEN/NOTIFY mechanism with an in-memory queue. In addition, add support for a "payload" string to be passed along with each notify event. This implementation should be significantly more efficient than the old one, and is also more compatible with Hot Standby usage. There is not yet any facility for HS slaves to receive notifications generated on the master, although such a thing is possible in future. Joachim Wieland, reviewed by Jeff Davis; also hacked on by me.	2010-02-16 22:34:57 +00:00
Greg Stark	f8c183a1ac	Speed up CREATE DATABASE by deferring the fsyncs until after copying all the data and using posix_fadvise to nudge the OS into flushing it earlier. This also hopefully makes CREATE DATABASE avoid spamming the cache. Tests show a big speedup on Linux at least on some filesystems. Idea and patch from Andres Freund.	2010-02-15 00:50:57 +00:00
Simon Riggs	8eccf7614b	Improvements to ps message of startup process during Hot Standby. Message is reset earlier and potential bug avoided. Andres Freund	2010-02-13 16:29:38 +00:00
Simon Riggs	b95a720a48	Re-enable max_standby_delay = -1 using deadlock detection on startup process. If startup waits on a buffer pin we send a request to all backends to cancel themselves if they are holding the buffer pin required and they are also waiting on a lock. If not, startup waits until max_standby_delay before cancelling any backend waiting for the requested buffer pin.	2010-02-13 01:32:20 +00:00
Simon Riggs	5cbf6dceea	Fix typo bug in Hot Standby from recent refactoring. Bug introduced into code recently patched by Andres Freund, so quickly fixed by him when bug report from Tatsuo Ishii arrived.	2010-02-11 19:35:22 +00:00
Tom Lane	cbe9d6beb4	Fix up rickety handling of relation-truncation interlocks. Move rd_targblock, rd_fsm_nblocks, and rd_vm_nblocks from relcache to the smgr relation entries, so that they will get reset to InvalidBlockNumber whenever an smgr-level flush happens. Because we now send smgr invalidation messages immediately (not at end of transaction) when a relation truncation occurs, this ensures that other backends will reset their values before they next access the relation. We no longer need the unreliable assumption that a VACUUM that's doing a truncation will hold its AccessExclusive lock until commit --- in fact, we can intentionally release that lock as soon as we've completed the truncation. This patch therefore reverts (most of) Alvaro's patch of 2009-11-10, as well as my marginal hacking on it yesterday. We can also get rid of assorted no-longer-needed relcache flushes, which are far more expensive than an smgr flush because they kill a lot more state. In passing this patch fixes smgr_redo's failure to perform visibility-map truncation, and cleans up some rather dubious assumptions in freespace.c and visibilitymap.c about when rd_fsm_nblocks and rd_vm_nblocks can be out of date.	2010-02-09 21:43:30 +00:00
Tom Lane	16e5859cd2	Allow free space map vacuuming to be interrupted.	2010-02-09 00:28:57 +00:00
Tom Lane	0a469c8769	Remove old-style VACUUM FULL (which was known for a little while as VACUUM FULL INPLACE), along with a boatload of subsidiary code and complexity. Per discussion, the use case for this method of vacuuming is no longer large enough to justify maintaining it; not to mention that we don't wish to invest the work that would be needed to make it play nicely with Hot Standby. Aside from the code directly related to old-style VACUUM FULL, this commit removes support for certain WAL record types that could only be generated within VACUUM FULL, redirect-pointer removal in heap_page_prune, and nontransactional generation of cache invalidation sinval messages (the last being the sticking point for Hot Standby). We still have to retain all code that copes with finding HEAP_MOVED_OFF and HEAP_MOVED_IN flag bits on existing tuples. This can't be removed as long as we want to support in-place update from pre-9.0 databases.	2010-02-08 04:33:55 +00:00
Tom Lane	70a2b05a59	Assorted cleanups in preparation for using a map file to support altering the relfilenode of currently-not-relocatable system catalogs. 1. Get rid of inval.c's dependency on relfilenode, by not having it emit smgr invalidations as a result of relcache flushes. Instead, smgr sinval messages are sent directly from smgr.c when an actual relation delete or truncate is done. This makes considerably more structural sense and allows elimination of a large number of useless smgr inval messages that were formerly sent even in cases where nothing was changing at the physical-relation level. Note that this reintroduces the concept of nontransactional inval messages, but that's okay --- because the messages are sent by smgr.c, they will be sent in Hot Standby slaves, just from a lower logical level than before. 2. Move setNewRelfilenode out of catalog/index.c, where it never logically belonged, into relcache.c; which is a somewhat debatable choice as well but better than before. (I considered catalog/storage.c, but that seemed too low level.) Rename to RelationSetNewRelfilenode. 3. Cosmetic cleanups of some other relfilenode manipulations.	2010-02-03 01:14:17 +00:00
Tom Lane	ab7c49c988	Fix assorted poorly-thought-out message strings: use %u not %d for printing OIDs, avoid random line breaks in strings somebody might grep for.	2010-02-02 22:01:53 +00:00
Simon Riggs	c85c941470	Detect early deadlock in Hot Standby when Startup is already waiting. First stage of required deadlock detection to allow re-enabling max_standby_delay setting of -1, which is now essential in the absence of improved relation-specific conflict resolution. Requested by Greg Stark et al.	2010-01-31 19:01:11 +00:00
Simon Riggs	29eedd3122	Adjust GetLockConflicts() so that it uses TopMemoryContext when executed InHotStandby. Cleaner solution than using malloc or palloc depending upon situation, as proposed by Tom.	2010-01-29 19:45:12 +00:00
Simon Riggs	76be0c81cc	Filter recovery conflicts based upon dboid from relfilenode of WAL records for heap and btree. Minor change, mostly API changes to pass through the required values. This is a simple change though also provides the refactoring required for further enhancements to conflict processing using the relOid. Changes only have effect during Hot Standby.	2010-01-29 17:10:05 +00:00
Simon Riggs	bcd8528f00	Use malloc() in GetLockConflicts() when called InHotStandby to avoid repeated palloc calls. Current code assumed this was already true, so this is a bug fix.	2010-01-28 10:05:37 +00:00
Simon Riggs	959ac58c04	In HS, Startup process sets SIGALRM when waiting for buffer pin. If woken by alarm we send SIGUSR1 to all backends requesting that they check to see if they are blocking Startup process. If so, they throw ERROR/FATAL as for other conflict resolutions. Deadlock stop gap removed. max_standby_delay = -1 option removed to prevent deadlock.	2010-01-23 16:37:12 +00:00
Simon Riggs	58565d78db	Better internal documentation of locking for Hot Standby conflict resolution. Discuss the reasons for the lock type we hold on ProcArrayLock while deriving the conflict list. Cover the idea of false positive conflicts and seemingly strange effects on snapshot derivation.	2010-01-21 00:53:58 +00:00
Tom Lane	e319e6799a	Fix bogus initialization of KnownAssignedXids shared memory state --- didn't work in EXEC_BACKEND case.	2010-01-16 17:17:26 +00:00
Simon Riggs	2edc31c439	Message mentions msec when it should be seconds, so use s instead of ms. Noticed by Andres Freund	2010-01-16 10:13:04 +00:00
Simon Riggs	a8ce974cdd	Teach standby conflict resolution to use SIGUSR1. Conflict reason is passed through directly to the backend, so we can take decisions about the effect of the conflict based upon the local state. No specific changes, as yet, though this prepares for later work. CancelVirtualTransaction() sends signals while holding ProcArrayLock. Introduce errdetail_abort() to give message detail explaining that the abort was caused by conflict processing. Remove CONFLICT_MODE states in favour of using PROCSIG_RECOVERY_CONFLICT states directly, for clarity.	2010-01-16 10:05:59 +00:00
Heikki Linnakangas	40f908bdcd	Introduce Streaming Replication. This includes two new kinds of postmaster processes, walsenders and walreceiver. Walreceiver is responsible for connecting to the primary server and streaming WAL to disk, while walsender runs in the primary server and streams WAL from disk to the client. Documentation still needs work, but the basics are there. We will probably pull the replication section to a new chapter later on, as well as the sections describing file-based replication. But let's do that as a separate patch, so that it's easier to see what has been added/changed. This patch also adds a new section to the chapter about FE/BE protocol, documenting the protocol used by walsender/walreceiver. Bump catalog version because of two new functions, pg_last_xlog_receive_location() and pg_last_xlog_replay_location(), for monitoring the progress of replication. Fujii Masao, with additional hacking by me	2010-01-15 09:19:10 +00:00
Simon Riggs	e99767bc28	First part of refactoring of code for ResolveRecoveryConflict. Purposes of this are to centralise the conflict code to allow further change, as well as to allow passing through the full reason for the conflict through to the conflicting backends. Backend state alters how we can handle different types of conflict so this is now required. As originally suggested by Heikki, no longer optional.	2010-01-14 11:08:02 +00:00

1179 Commits