postgres

mirror of https://github.com/postgres/postgres.git synced 2025-08-31 17:02:12 +03:00

Author	SHA1	Message	Date
Fujii Masao	65e8dbb186	Fix bug in clean shutdown of walsender that pg_receiving is connecting to. On clean shutdown, walsender waits for all WAL to be replicated to a standby, and exits. It determined whether that replication had been completed by checking whether its sent location had been equal to a standby's flush location. Unfortunately this condition never becomes true when the standby such as pg_receivexlog which always returns an invalid flush location is connecting to walsender, and then walsender waits forever. This commit changes walsender so that it just checks a standby's write location if a flush location is invalid. Back-patch to 9.1 where enough infrastructure for this exists.	2014-03-17 20:42:35 +09:00
Tom Lane	03f06ff383	Fix some more bugs in signal handlers and process shutdown logic. WalSndKill was doing things exactly backwards: it should first clear MyWalSnd (to stop signal handlers from touching MyWalSnd->latch), then disown the latch, and only then mark the WalSnd struct unused by clearing its pid field. Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve errno, which is surely a requirement for any signal handler. Per discussion of recent buildfarm failures. Back-patch as far as the relevant code exists.	2014-02-01 16:21:38 -05:00
Magnus Hagander	773e4d5e4d	Avoid including tablespaces inside PGDATA twice in base backups If a tablespace was crated inside PGDATA it was backed up both as part of the PGDATA backup and as the backup of the tablespace. Avoid this by skipping any directory inside PGDATA that contains one of the active tablespaces. Dimitri Fontaine and Magnus Hagander	2014-01-07 17:18:02 +01:00
Fujii Masao	3f145f6be3	Support clean switchover. In replication, when we shutdown the master, walsender tries to send all the outstanding WAL records to the standby, and then to exit. This basically means that all the WAL records are fully synced between two servers after the clean shutdown of the master. So, after promoting the standby to new master, we can restart the stopped master as new standby without the need for a fresh backup from new master. But there was one problem so far: though walsender tries to send all the outstanding WAL records, it doesn't wait for them to be replicated to the standby. Then, before receiving all the WAL records, walreceiver can detect the closure of connection and exit. We cannot guarantee that there is no missing WAL in the standby after clean shutdown of the master. In this case, backup from new master is required when restarting the stopped master as new standby. This patch fixes this problem. It just changes walsender so that it waits for all the outstanding WAL records to be replicated to the standby before closing the replication connection. Per discussion, this is a fix that needs to get backpatched rather than new feature. So, back-patch to 9.1 where enough infrastructure for this exists. Patch by me, reviewed by Andres Freund.	2013-06-26 02:20:11 +09:00
Heikki Linnakangas	2e4acef357	In base backup, only include our own tablespace version directory. If you have clusters of different versions pointing to the same tablespace location, we would incorrectly include all the data belonging to the other versions, too. Fixes bug #7986, reported by Sergey Burladyan.	2013-03-25 20:26:30 +02:00
Heikki Linnakangas	14aa55df29	Fix race condition if a file is removed while pg_basebackup is running. If a relation file was removed when the server-side counterpart of pg_basebackup was just about to open it to send it to the client, you'd get a "could not open file" error. Fix that. Backpatch to 9.1, this goes back to when pg_basebackup was introduced.	2012-12-21 15:33:38 +02:00
Heikki Linnakangas	38b38fb122	pg_stat_replication.sync_state was displayed incorrectly at page boundary. XLogRecPtrIsInvalid() only checks the xrecoff field, which is correct when checking if a WAL record could legally begin at the given position, but WAL sending can legally be paused at a page boundary, in which case xrecoff is 0. Use XLByteEQ(..., InvalidXLogRecPtr) instead, which checks that both xlogid and xrecoff are 0. 9.3 doesn't have this problem because XLogRecPtr is now a single 64-bit integer, so XLogRecPtrIsInvalid() does the right thing. Apply to 9.2, and 9.1 where pg_stat_replication view was introduced. Kyotaro HORIGUCHI, reviewed by Fujii Masao.	2012-11-23 19:14:41 +02:00
Tom Lane	dfa6eda5e4	Fix tar files emitted by pg_basebackup to be POSIX conformant. Back-patch portions of commit `05b555d12b`. There doesn't seem to be any reason not to fix pg_basebackup fully, but we can't change pg_dump's "magic" string without breaking older versions of pg_restore. Instead, just patch pg_restore to accept either version of the magic string, in hopes of avoiding compatibility problems when 9.3 comes out. I also fixed pg_dump to write the correct 2-block EOF marker, since that won't create a compatibility problem with pg_restore and it could help with some versions of tar. Brian Weaver and Tom Lane	2012-09-28 15:35:51 -04:00
Tom Lane	d066cc548d	Fix walsender processes to establish a SIGALRM handler. Walsenders must have working SIGALRM handling during InitPostgres, but they set the handler to SIG_IGN so that nothing would happen if a timeout was reached. This could result in two failure modes: * If a walsender participated in a deadlock during its authentication transaction, and was the last to wait in the deadly embrace, the deadlock would not get cleared automatically. This would require somebody to be trying to take out AccessExclusiveLock on multiple system catalogs, so it's not very probable. * If a client failed to respond to a walsender's authentication challenge, the intended disconnect after AuthenticationTimeout wouldn't happen, and the walsender would wait indefinitely for the client. For the moment, fix in back branches only, since this is fixed in a different way in the timeout-infrastructure patch that's awaiting application to HEAD. If we choose not to apply that, then we'll need to do this in HEAD as well.	2012-07-12 14:30:04 -04:00
Magnus Hagander	b8aca12d77	Always treat a standby returning an an invalid flush location as async This ensures that a standby such as pg_receivexlog will not be selected as sync standby - which would cause the master to block waiting for a location that could never happen. Fujii Masao	2012-07-04 15:24:17 +02:00
Magnus Hagander	580b94168e	Prevent non-streaming replication connections from being selected sync slave This prevents a pg_basebackup backup session that just does a base backup (no xlog involved at all) from becoming the synchronous slave and thus blocking all access while it runs. Also fixes the problem when a higher priority slave shows up it would become the sync standby before it has reached the STREAMING state, by making sure we can only switch to a walsender that's actually STREAMING. Fujii Masao	2012-06-11 15:17:52 +02:00
Tom Lane	eca0c389f1	Cast some printf arguments to avoid possibly-nonportable behavior. Per compiler warnings on buildfarm member black_firefly.	2012-03-23 20:18:08 -04:00
Heikki Linnakangas	04e9dc6e01	Fix bug where walsender goes into a busy loop if connection is terminated. The problem was that ResetLatch was not being called in the walsender loop if the connection was terminated, so WaitLatch never sleeps until the terminated connection is detected. In the master-branch, this was already fixed as a side-effect of some refactoring of the loop. This commit backports that refactoring to 9.1. 9.0 does not have this bug, because we didn't use latches back then. Fujii Masao	2012-03-21 17:43:53 +02:00
Tom Lane	85d85ff7ef	Fix corner cases in readlink() usage. Make sure all calls are protected by HAVE_READLINK, and get the buffer overflow tests right. Be a bit more paranoid about string length in _tarWriteHeader(), too.	2011-12-07 13:34:19 -05:00
Magnus Hagander	9c32da5caa	Avoid using readlink() on platforms that don't support it We don't have any such platforms now, but might in the future. Also, detect cases when a tablespace symlink points to a path that is longer than we can handle, and give a warning.	2011-12-07 12:09:59 +01:00
Heikki Linnakangas	8e8ac0894b	Fix overly-complicated usage of errcode_for_file_access(). No need to do "errcode(errcode_for_file_access())", just "errcode_for_file_access()" is enough. The extra errcode() call is useless but harmless, so there's no user-visible bug here. Nevertheless, backpatch to 9.1 where this code were added.	2011-10-22 20:22:12 +03:00
Tom Lane	d1d094e4cf	Simplify and improve ProcessStandbyHSFeedbackMessage logic. There's no need to clamp the standby's xmin to be greater than GetOldestXmin's result; if there were any such need this logic would be hopelessly inadequate anyway, because it fails to account for within-database versus cluster-wide values of GetOldestXmin. So get rid of that, and just rely on sanity-checking that the xmin is not wrapped around relative to the nextXid counter. Also, don't reset the walsender's xmin if the current feedback xmin is indeed out of range; that just creates more problems than we already had. Lastly, don't bother to take the ProcArrayLock; there's no need to do that to set xmin. Also improve the comments about this in GetOldestXmin itself.	2011-10-20 19:43:45 -04:00
Magnus Hagander	8c1501b292	Exclude postmaster.opts from base backups Noted by Fujii Masao	2011-10-18 15:47:24 +02:00
Magnus Hagander	5df22bba64	Ensure walsenders can be SIGTERMed while in non-walsender code In oder to exit on SIGTERM when in non-walsender code, such as do_pg_stop_backup(), we need to set the interrupt variables that are used there, and not just the walsender local ones.	2011-10-06 21:45:20 +02:00
Tom Lane	989f530d3f	Back-patch assorted latch-related fixes. Fix a whole bunch of signal handlers that had been hacked to do things that might change errno, without adding the necessary save/restore logic for errno. Also make some minor fixes in unix_latch.c, and clean up bizarre and unsafe scheme for disowning the process's latch. While at it, rename the PGPROC latch field to procLatch for consistency with 9.2. Issues noted while reviewing a patch by Peter Geoghegan.	2011-08-10 12:20:45 -04:00
Tom Lane	74d099494c	Measure WaitLatch's timeout parameter in milliseconds, not microseconds. The original definition had the problem that timeouts exceeding about 2100 seconds couldn't be specified on 32-bit machines. Milliseconds seem like sufficient resolution, and finer grain than that would be fantasy anyway on many platforms. Back-patch to 9.1 so that this aspect of the latch API won't change between 9.1 and later releases. Peter Geoghegan	2011-08-09 18:52:35 -04:00
Tom Lane	6760a4d402	Documentation improvement and minor code cleanups for the latch facility. Improve the documentation around weak-memory-ordering risks, and do a pass of general editorialization on the comments in the latch code. Make the Windows latch code more like the Unix latch code where feasible; in particular provide the same Assert checks in both implementations. Fix poorly-placed WaitLatch call in syncrep.c. This patch resolves, for the moment, concerns around weak-memory-ordering bugs in latch-related code: we have documented the restrictions and checked that existing calls meet them. In 9.2 I hope that we will install suitable memory barrier instructions in SetLatch/ResetLatch, so that their callers don't need to be quite so careful.	2011-08-09 15:30:51 -04:00
Tom Lane	af0eca1a80	Clean up ill-advised attempt to invent a private set of Node tags. Somebody thought it'd be cute to invent a set of Node tag numbers that were defined independently of, and indeed conflicting with, the main tag-number list. While this accidentally failed to fail so far, it would certainly lead to trouble as soon as anyone wanted to, say, apply copyObject to these node types. Clang was already complaining about the use of makeNode on these tags, and I think quite rightly so. Fix by pushing these node definitions into the mainstream, including putting replnodes.h where it belongs.	2011-08-06 14:53:59 -04:00
Peter Eisentraut	0a5b01a716	Message style improvements	2011-07-08 07:43:33 +03:00
Tom Lane	979d6c9cad	Add missing -I switch for VPATH builds. Per bug #6073 from Hartmut Raschick.	2011-06-22 13:20:12 -04:00
Peter Eisentraut	a0b5146b25	Message style and spelling improvements	2011-06-22 00:33:20 +03:00
Bruce Momjian	6560407c7d	Pgindent run before 9.1 beta2.	2011-06-09 14:32:50 -04:00
Bruce Momjian	5a71b64130	Lowercase status labels in pg_stat_replication view.	2011-04-29 22:20:43 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	2594cf0e8c	Revise the API for GUC variable assign hooks. The previous functions of assign hooks are now split between check hooks and assign hooks, where the former can fail but the latter shouldn't. Aside from being conceptually clearer, this approach exposes the "canonicalized" form of the variable value to guc.c without having to do an actual assignment. And that lets us fix the problem recently noted by Bernd Helmle that the auto-tune patch for wal_buffers resulted in bogus log messages about "parameter "wal_buffers" cannot be changed without restarting the server". There may be some speed advantage too, because this design lets hook functions avoid re-parsing variable values when restoring a previous state after a rollback (they can store a pre-parsed representation of the value instead). This patch also resolves a longstanding annoyance about custom error messages from variable assign hooks: they should modify, not appear separately from, guc.c's own message about "invalid parameter value".	2011-04-07 00:12:02 -04:00
Robert Haas	240067b3b0	Merge synchronous_replication setting into synchronous_commit. This means one less thing to configure when setting up synchronous replication, and also avoids some ambiguity around what the behavior should be when the settings of these variables conflict. Fujii Masao, with additional hacking by me.	2011-04-04 16:25:52 -04:00
Robert Haas	38b27792ea	Avoid possible hang during smart shutdown. If a smart shutdown occurs just as a child is starting up, and the child subsequently becomes a walsender, there is a race condition: the postmaster might count the exstant backends, determine that there is one normal backend, and wait for it to die off. Had the walsender transition already occurred before the postmaster counted, it would have proceeded with the shutdown. To fix this, have each child that transforms into a walsender kick the postmaster just after doing so, so that the state machine is certain to advance. Fujii Masao	2011-04-03 19:42:00 -04:00
Robert Haas	7fcc75dd26	Fix compiler warning.	2011-04-01 10:36:38 -04:00
Heikki Linnakangas	754baa21f7	Automatically terminate replication connections that are idle for more than replication_timeout (a new GUC) milliseconds. The TCP timeout is often too long, you want the master to notice a dead connection much sooner. People complained about that in 9.0 too, but with synchronous replication it's even more important to notice dead connections promptly. Fujii Masao and Heikki Linnakangas	2011-03-30 10:20:37 +03:00
Heikki Linnakangas	bc03c5937d	Adjust error message, now that we expect other message types than connection close at this point. Fix PQsetnonblocking() comment. Fujii Masao	2011-03-30 08:54:28 +03:00
Simon Riggs	92f4786fa9	Additional test for each commit in sync rep path to plug minute possibility of race condition that would effect performance only. Requested by Robert Haas. Re-arrange related comments.	2011-03-26 10:09:37 +00:00
Robert Haas	30f6136f28	Make walreceiver send a reply after receiving data but before flushing it. It originally worked this way, but was changed by commit `a8a8a3e096`, since which time it's been impossible for walreceiver to ever send a reply with write_location and flush_location set to different values.	2011-03-25 11:28:16 -04:00
Robert Haas	727589995a	Move synchronous_standbys_defined updates from WAL writer to BG writer. This is advantageous because the BG writer is alive until much later in the shutdown sequence than WAL writer; we want to make sure that it's possible to shut off synchronous replication during a smart shutdown, else it might not be possible to complete the shutdown at all. Per very reasonable gripes from Fujii Masao and Simon Riggs.	2011-03-18 21:43:45 -04:00
Robert Haas	7a37900443	Make synchronous replication query cancel/die messages more consistent. Per a gripe from Thom Brown about my previous commit in this area, commit `9a56dc3389`.	2011-03-18 10:20:22 -04:00
Robert Haas	02b1f84e7d	Remove bogus comment.	2011-03-17 14:43:57 -04:00
Robert Haas	9a56dc3389	Fix various possible problems with synchronous replication. 1. Don't ignore query cancel interrupts. Instead, if the user asks to cancel the query after we've already committed it, but before it's on the standby, just emit a warning and let the COMMIT finish. 2. Don't ignore die interrupts (pg_terminate_backend or fast shutdown). Instead, emit a warning message and close the connection without acknowledging the commit. Other backends will still see the effect of the commit, but there's no getting around that; it's too late to abort at this point, and ignoring die interrupts altogether doesn't seem like a good idea. 3. If synchronous_standby_names becomes empty, wake up all backends waiting for synchronous replication to complete. Without this, someone attempting to shut synchronous replication off could easily wedge the entire system instead. 4. Avoid depending on the assumption that if a walsender updates MyProc->syncRepState, we'll see the change even if we read it without holding the lock. The window for this appears to be quite narrow (and probably doesn't exist at all on machines with strong memory ordering) but protecting against it is practically free, so do that. 5. Remove useless state SYNC_REP_MUST_DISCONNECT, which isn't needed and doesn't actually do anything. There's still some further work needed here to make the behavior of fast shutdown plausible, but that looks complex, so I'm leaving it for a separate commit. Review by Fujii Masao.	2011-03-17 13:12:21 -04:00
Robert Haas	551c07d84a	Make error handling of synchronous_standby_names consistent. It's not a good idea to kill the postmaster just because someone muffs this, and it's not consistent with what we do for other, similar GUCs. Fujii Masao, with a bit more hacking by me	2011-03-10 16:24:52 -05:00
Robert Haas	2e019c8611	More synchronous replication typo fixes. Fujii Masao	2011-03-10 15:56:18 -05:00
Robert Haas	b8bb8dbf20	More synchronous replication tweaks. SyncRepRequested() must check not only the value of the synchronous_replication GUC but also whether max_wal_senders > 0. Otherwise, we might end up waiting for sync rep even when there's no possibility of a standby ever managing to connect. There are some existing cross-checks to prevent this, but they're not quite sufficient: the user can start the server with max_wal_senders=0, synchronous_standby_names='', and synchronous_replication=off and then subsequent make synchronous_standby_names not empty using pg_ctl reload, and then SET synchronous_standby=on, leading to an indefinite hang. Along the way, rename the global variable for the synchronous_replication GUC to match the name of the GUC itself, for clarity. Report by Fujii Masao, though I didn't use his patch.	2011-03-10 15:43:37 -05:00
Robert Haas	6436098795	Minor sync rep corrections. Fujii Masao, with a bit of additional wordsmithing by me.	2011-03-10 14:57:02 -05:00
Robert Haas	fcb99609b6	Replication README updates. Fujii Masao	2011-03-10 08:59:59 -05:00
Itagaki Takahiro	2d8de0a50b	Cleanup copyright years and file names in the header comments of some files.	2011-03-10 15:05:33 +09:00
Bruce Momjian	76fdee31c4	Mention gcc version in C comment.	2011-03-09 23:41:13 -05:00
Heikki Linnakangas	baabf05196	Silence compiler warning about undefined function when compiling without assertions.	2011-03-07 09:56:53 +02:00
Simon Riggs	cae4974e3d	Dynamic array required within pg_stat_replication.	2011-03-07 00:26:30 +00:00

1 2 3 4

151 Commits