1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-01 01:04:50 +03:00

34440 Commits

Author SHA1 Message Date
Simon Riggs
9dd9993370 Mark vacuum_defer_cleanup_age as PGC_POSTMASTER.
Following bug analysis of #7819 by Tom Lane
2013-02-02 18:50:42 +00:00
Alvaro Herrera
231dbb3c9b Fix typo in freeze_table_age implementation
The original code used freeze_min_age instead of freeze_table_age.  The
main consequence of this mistake is that lowering freeze_min_age would
cause full-table scans to occur much more frequently, which causes
serious issues because the number of writes required is much larger.
That feature (freeze_min_age) is supposed to affect only how soon tuples
are frozen; some pages should still be skipped due to the visibility
map.

Backpatch to 8.4, where the freeze_table_age feature was introduced.

Report and patch from Andres Freund
2013-02-01 12:10:10 -03:00
Bruce Momjian
1a7cd9f78a pg_upgrade docs: mention modification of postgresql.conf in new cluster
Mention it might be necessary to modify postgresql.conf in the new
cluster to match the old cluster.

Backpatch to 9.2.

Suggested by user.
2013-01-31 16:32:34 -05:00
Magnus Hagander
b0b8dff353 Properly zero-pad the day-of-year part of the win32 build number
This ensure the version number increases over time. The first three digits
in the version number is still set to the actual PostgreSQL version
number, but the last one is intended to be an ever increasing build number,
which previosly failed when it changed between 1, 2 and 3 digits long values.

Noted by Deepak
2013-01-31 15:07:04 +01:00
Tom Lane
e09825aa54 Fix plpgsql's reporting of plan-time errors in possibly-simple expressions.
exec_simple_check_plan and exec_eval_simple_expr attempted to call
GetCachedPlan directly.  This meant that if an error was thrown during
planning, the resulting context traceback would not include the line
normally contributed by _SPI_error_callback.  This is already inconsistent,
but just to be really odd, a re-execution of the very same expression
*would* show the additional context line, because we'd already have cached
the plan and marked the expression as non-simple.

The problem is easy to demonstrate in 9.2 and HEAD because planning of a
cached plan doesn't occur at all until GetCachedPlan is done.  In earlier
versions, it could only be an issue if initial planning had succeeded, then
a replan was forced (already somewhat improbable for a simple expression),
and the replan attempt failed.  Since the issue is mainly cosmetic in older
branches anyway, it doesn't seem worth the risk of trying to fix it there.
It is worth fixing in 9.2 since the instability of the context printout can
affect the results of GET STACKED DIAGNOSTICS, as per a recent discussion
on pgsql-novice.

To fix, introduce a SPI function that wraps GetCachedPlan while installing
the correct callback function.  Use this instead of calling GetCachedPlan
directly from plpgsql.

Also introduce a wrapper function for extracting a SPI plan's
CachedPlanSource list.  This lets us stop including spi_priv.h in
pl_exec.c, which was never a very good idea from a modularity standpoint.

In passing, fix a similar inconsistency that could occur in SPI_cursor_open,
which was also calling GetCachedPlan without setting up a context callback.
2013-01-30 20:02:33 -05:00
Tom Lane
e7105c5f98 Fix grammar for subscripting or field selection from a sub-SELECT result.
Such cases should work, but the grammar failed to accept them because of
our ancient precedence hacks to convince bison that extra parentheses
around a sub-SELECT in an expression are unambiguous.  (Formally, they
*are* ambiguous, but we don't especially care whether they're treated as
part of the sub-SELECT or part of the expression.  Bison cares, though.)
Fix by adding a redundant-looking production for this case.

This is a fine example of why fixing shift/reduce conflicts via
precedence declarations is more dangerous than it looks: you can easily
cause the parser to reject cases that should work.

This has been wrong since commit 3db4056e22b0c6b2adc92543baf8408d2894fe91
or maybe before, and apparently some people have been working around it
by inserting no-op casts.  That method introduces a dump/reload hazard,
as illustrated in bug #7838 from Jan Mate.  Hence, back-patch to all
active branches.
2013-01-30 14:16:23 -05:00
Alvaro Herrera
2d4e338793 DROP OWNED: don't try to drop tablespaces/databases
My "fix" for bugs #7578 and #6116 on DROP OWNED at fe3b5eb08a1 not only
misstated that it applied to REASSIGN OWNED (which it did not affect),
but it also failed to fix the problems fully, because I didn't test the
case of owned shared objects.  Thus I created a new bug, reported by
Thomas Kellerer as #7748, which would cause DROP OWNED to fail with a
not-for-user-consumption error message.  The code would attempt to drop
the database, which not only fails to work because the underlying code
does not support that, but is a pretty dangerous and undesirable thing
to be doing as well.

This patch fixes that bug by having DROP OWNED only attempt to process
shared objects when grants on them are found, ignoring ownership.

Backpatch to 8.3, which is as far as the previous bug was backpatched.
2013-01-28 18:59:05 -03:00
Michael Meskes
5ca31a758a Made ecpglib use translated messages.
Bug reported and fixed by Chen Huajun <chenhj@cn.fujitsu.com>.
2013-01-27 13:50:05 +01:00
Tom Lane
308711afc7 Fix plpython's handling of functions used as triggers on multiple tables.
plpython tried to use a single cache entry for a trigger function, but it
needs a separate cache entry for each table the trigger is applied to,
because there is table-dependent data in there.  This was done correctly
before 9.1, but commit 46211da1b84bc3537e799ee1126098e71c2428e8 broke it
by simplifying the lookup key from "function OID and triggered table OID"
to "function OID and is-trigger boolean".  Go back to using both OIDs
as the lookup key.  Per bug report from Sandro Santilli.

Andres Freund
2013-01-25 16:59:00 -05:00
Bruce Momjian
6d5c62ad99 doc: merge ecpg username/password example into C comment
Backpatch to 9.2

per Tom Lane
2013-01-25 13:46:38 -05:00
Bruce Momjian
dc33b0961e docs: In ecpg, clarify how username/password colon parameters are used
Backpatch to 9.2.

Patch from Alan B
2013-01-25 11:18:44 -05:00
Bruce Momjian
a33c0045a8 doc: improve wording of "foreign data server" in file-fdw docs
Backpatch to 9.2

Shigeru HANADA
2013-01-25 10:14:11 -05:00
Magnus Hagander
65fc0ab559 Make pg_dump exclude unlogged table data on hot standby slaves
Noted by Joe Van Dyk
2013-01-25 09:46:54 +01:00
Bruce Momjian
ab2d907a25 doc: correct sepgsql doc about permission checking of CASCADE
Backpatch to 9.2.

Patch from Kohei KaiGai
2013-01-24 21:21:50 -05:00
Tom Lane
ecfd9941ec Fix SPI documentation for new handling of ExecutorRun's count parameter.
Since 9.0, the count parameter has only limited the number of tuples
actually returned by the executor.  It doesn't affect the behavior of
INSERT/UPDATE/DELETE unless RETURNING is specified, because without
RETURNING, the ModifyTable plan node doesn't return control to execMain.c
for each tuple.  And we only check the limit at the top level.

While this behavioral change was unintentional at the time, discussion of
bug #6572 led us to the conclusion that we prefer the new behavior anyway,
and so we should just adjust the docs to match rather than change the code.
Accordingly, do that.  Back-patch as far as 9.0 so that the docs match the
code in each branch.
2013-01-24 18:34:04 -05:00
Andrew Dunstan
6c77ce328f Use correct output device for Windows prompts.
This ensures that mapping of non-ascii prompts
to the correct code page occurs.

Bug report and original patch from Alexander Law,
reviewed and reworked by Noah Misch.

Backpatch to all live branches.
2013-01-24 16:01:31 -05:00
Simon Riggs
b544ea1a41 Fix rare missing cancellations in Hot Standby.
The machinery around XLOG_HEAP2_CLEANUP_INFO failed
to correctly pass through the necessary information
on latestRemovedXid, avoiding cancellations in some
infrequent concurrent update/cleanup scenarios.

Backpatchable fix to 9.0

Detailed bug report and fix by Noah Misch,
backpatchable version by me.
2013-01-24 14:24:17 +00:00
Heikki Linnakangas
3cef2179e0 Also fix rotation of csvlog on Windows.
Backpatch to 9.2, like the previous fix.
2013-01-24 11:43:22 +02:00
Tom Lane
14ba9b11ea Fix failure to rotate postmaster log file for size reasons on Windows.
When we eliminated "unnecessary" wakeups of the syslogger process, we
broke size-based logfile rotation on Windows, because on that platform
data transfer is done in a separate thread.  While non-Windows platforms
would recheck the output file size after every log message, Windows only
did so when the control thread woke up for some other reason, which might
be quite infrequent.  Per bug #7814 from Tsunezumi.  Back-patch to 9.2
where the problem was introduced.

Jeff Janes
2013-01-23 22:08:14 -05:00
Kevin Grittner
a79ae0bc0d Fix performance problems with autovacuum truncation in busy workloads.
In situations where there are over 8MB of empty pages at the end of
a table, the truncation work for trailing empty pages takes longer
than deadlock_timeout, and there is frequent access to the table by
processes other than autovacuum, there was a problem with the
autovacuum worker process being canceled by the deadlock checking
code. The truncation work done by autovacuum up that point was
lost, and the attempt tried again by a later autovacuum worker. The
attempts could continue indefinitely without making progress,
consuming resources and blocking other processes for up to
deadlock_timeout each time.

This patch has the autovacuum worker checking whether it is
blocking any other thread at 20ms intervals. If such a condition
develops, the autovacuum worker will persist the work it has done
so far, release its lock on the table, and sleep in 50ms intervals
for up to 5 seconds, hoping to be able to re-acquire the lock and
try again. If it is unable to get the lock in that time, it moves
on and a worker will try to continue later from the point this one
left off.

While this patch doesn't change the rules about when and what to
truncate, it does cause the truncation to occur sooner, with less
blocking, and with the consumption of fewer resources when there is
contention for the table's lock.

The only user-visible change other than improved performance is
that the table size during truncation may change incrementally
instead of just once.

Backpatched to 9.0 from initial master commit at
b19e4250b45e91c9cbdd18d35ea6391ab5961c8d -- before that the
differences are too large to be clearly safe.

Jan Wieck
2013-01-23 13:39:28 -06:00
Tom Lane
9a3ddecdd9 Fix one-byte buffer overrun in PQprintTuples().
This bug goes back to the original Postgres95 sources.  Its significance
to modern PG versions is marginal, since we have not used PQprintTuples()
internally in a very long time, and it doesn't seem to have ever been
documented either.  Still, it *is* exposed to client apps, so somebody
out there might possibly be using it.

Xi Wang
2013-01-20 23:43:51 -05:00
Tom Lane
666569f1fd Fix error-checking typo in check_TSCurrentConfig().
The code failed to detect an out-of-memory failure.

Xi Wang
2013-01-20 23:10:00 -05:00
Peter Eisentraut
ff6deb2ec9 doc: Fix syntax of a URL
Leading white space before the "http:" is apparently treated as a
relative link at least by some browsers.
2013-01-20 19:37:53 -05:00
Magnus Hagander
437ecd1588 Clarify that streaming replication can be both async and sync
Josh Kupershmidt
2013-01-20 16:11:01 +01:00
Tom Lane
30a4853cfd Modernize string literal syntax in tutorial example.
Un-double the backslashes in the LIKE patterns, since
standard_conforming_strings is now the default.  Just to be sure, include
a command to set standard_conforming_strings to ON in the example.

Back-patch to 9.1, where standard_conforming_strings became the default.

Josh Kupershmidt, reviewed by Jeff Janes
2013-01-19 17:20:56 -05:00
Andrew Dunstan
e994d83d5b Make pgxs build executables with the right suffix.
Complaint and patch from Zoltán Böszörményi.

When cross-compiling, the native make doesn't know
about the Windows .exe suffix, so it only builds with
it when explicitly told to do so.

The native make will not see the link between the target
name and the built executable, and might this do unnecesary
work, but that's a bigger problem than this one, if in fact
we consider it a problem at all.

Back-patch to all live branches.
2013-01-19 14:54:29 -05:00
Tom Lane
d98547eb43 Protect against SnapshotNow race conditions in pg_tablespace scans.
Use of SnapshotNow is known to expose us to race conditions if the tuple(s)
being sought could be updated by concurrently-committing transactions.
CREATE DATABASE and DROP DATABASE are particularly exposed because they do
heavyweight filesystem operations during their scans of pg_tablespace,
so that the scans run for a very long time compared to most.  Furthermore,
the potential consequences of a missed or twice-visited row are nastier
than average:

* createdb() could fail with a bogus "file already exists" error, or
  silently fail to copy one or more tablespace's worth of files into the
  new database.

* remove_dbtablespaces() could miss one or more tablespaces, thus failing
  to free filesystem space for the dropped database.

* check_db_file_conflict() could likewise miss a tablespace, leading to an
  OID conflict that could result in data loss either immediately or in
  future operations.  (This seems of very low probability, though, since a
  duplicate database OID would be unlikely to start with.)

Hence, it seems worth fixing these three places to use MVCC snapshots, even
though this will someday be superseded by a generic solution to SnapshotNow
race conditions.

Back-patch to all active branches.

Stephen Frost and Tom Lane
2013-01-18 18:06:27 -05:00
Robert Haas
8d1fbf947d Unbreak lock conflict detection for Hot Standby.
This got broken in the original fast-path locking patch, because
I failed to account for the fact that Hot Standby startup process
might take a strong relation lock on a relation in a database to
which it is not bound, and confused MyDatabaseId with the database
ID of the relation being locked.

Report and diagnosis by Andres Freund.  Final form of patch by me.
2013-01-18 12:03:10 -05:00
Heikki Linnakangas
71c53d5285 On second thought, use an empty string instead of "none" when not connected.
"none" could mislead to think that you're connected a database with that
name. Also, it needs to be translated, which might be hard without some
context. So in back-branches, use empty string, so that the message is
(currently ""), which is at least unambiguous and doens't require
translation. In master, it's no problem to add translatable strings, so use
a different fix there.
2013-01-15 22:12:54 +02:00
Heikki Linnakangas
ee0ef989f3 Don't pass NULL to fprintf, if not currently connected to a database.
Backpatch all the way to 8.3. Fixes bug #7811, per report and diagnosis by
Meng Qingzhong.
2013-01-15 19:20:15 +02:00
Tom Lane
0471b013aa Reject out-of-range dates in to_date().
Dates outside the supported range could be entered, but would not print
reasonably, and operations such as conversion to timestamp wouldn't behave
sanely either.  Since this has the potential to result in undumpable table
data, it seems worth back-patching.

Hitoshi Harada
2013-01-14 15:20:02 -05:00
Tom Lane
7938d84a32 Add new timezone abbrevation "FET".
This seems to have been invented in 2011 to represent GMT+3, non daylight
savings rules, as now used in Europe/Kaliningrad and Europe/Minsk.
There are no conflicts so might as well add it to the Default list.
Per bug #7804 from Ruslan Izmaylov.
2013-01-14 14:46:09 -05:00
Andrew Dunstan
b72d5c55cc Extend and improve use of EXTRA_REGRESS_OPTS.
This is now used by ecpg tests, and not clobbered by pg_upgrade
tests. This change won't affect anything that doesn't set this
environment variable, but will enable the buildfarm to control
exactly what port regression test installs will be running on,
and thus to detect possible rogue postmasters more easily.

Backpatch to release 9.2 where EXTRA_REGRESS_OPTS was first used.
2013-01-12 08:24:38 -05:00
Tom Lane
3e9960e9d9 Revert ill-considered change of index-size fudge factor.
This partially reverts commit 21a39de5809cd3050a37d2554323cc1d0cbeed9d,
restoring the pre-9.2 cost estimates for index usage.  That change
introduced much too large a bias against larger indexes, as per reports
from Jeff Janes and others.  The whole thing needs a rewrite, which I've
done in HEAD, but the safest thing to do in 9.2 is just to undo this
multiplier change.
2013-01-11 13:08:19 -05:00
Magnus Hagander
0b3c54aac7 Properly install ecpg_compat and pgtypes libraries on msvc
JiangGuiqing
2013-01-09 17:31:04 +01:00
Tom Lane
fe35c0b6e5 Fix potential corruption of lock table in CREATE/DROP INDEX CONCURRENTLY.
If VirtualXactLock() has to wait for a transaction that holds its VXID lock
as a fast-path lock, it must first convert the fast-path lock to a regular
lock.  It failed to take the required "partition" lock on the main
shared-memory lock table while doing so.  This is the direct cause of the
assert failure in GetLockStatusData() recently observed in the buildfarm,
but more worryingly it could result in arbitrary corruption of the shared
lock table if some other process were concurrently engaged in modifying the
same partition of the lock table.  Fortunately, VirtualXactLock() is only
used by CREATE INDEX CONCURRENTLY and DROP INDEX CONCURRENTLY, so the
opportunities for failure are fewer than they might have been.

In passing, improve some comments and be a bit more consistent about
order of operations.
2013-01-08 18:26:03 -05:00
Tom Lane
a17da19ed9 Invent a "one-shot" variant of CachedPlans for better performance.
SPI_execute() and related functions create a CachedPlan, execute it once,
and immediately discard it, so that the functionality offered by
plancache.c is of no value in this code path.  And performance measurements
show that the extra data copying and invalidation checking done by
plancache.c slows down simple queries by 10% or more compared to 9.1.
However, enough of the SPI code is shared with functions that do need plan
caching that it seems impractical to bypass plancache.c altogether.
Instead, let's invent a variant version of cached plans that preserves
99% of the API but doesn't offer any of the actual functionality, nor the
overhead.  This puts SPI_execute() performance back on par, or maybe even
slightly better, than it was before.  This change should resolve recent
complaints of performance degradation from Dong Ye, Pavel Stehule, and
others.

By avoiding data copying, this change also reduces the amount of memory
needed to execute many-statement SPI_execute() strings, as for instance in
a recent complaint from Tomas Vondra.

An additional benefit of this change is that multi-statement SPI_execute()
query strings are now processed fully serially, that is we complete
execution of earlier statements before running parse analysis and planning
on following ones.  This eliminates a long-standing POLA violation, in that
DDL that affects the behavior of a later statement will now behave as
expected.

Back-patch to 9.2, since this was a performance regression compared to 9.1.
(In 9.2, place the added struct fields so as to avoid changing the offsets
of existing fields.)

Heikki Linnakangas and Tom Lane
2013-01-04 17:42:25 -05:00
Tom Lane
48e0b8a23e Prevent creation of postmaster's TCP socket during pg_upgrade testing.
On non-Windows machines, we use the Unix socket for connections to test
postmasters, so there is no need to create a TCP socket.  Furthermore,
doing so causes failures due to port conflicts if two builds are carried
out concurrently on one machine.  (If the builds are done in different
chroots, which is standard practice at least in Red Hat distros, there
is no risk of conflict on the Unix socket.)  Suppressing the TCP socket
by setting listen_addresses to empty has long been standard practice
for pg_regress, and pg_upgrade knows about this too ... but pg_upgrade's
test.sh didn't get the memo.

Back-patch to 9.2, and also sync the 9.2 version of the script with HEAD
as much as practical.
2013-01-03 18:34:57 -05:00
Heikki Linnakangas
b4c99c9af3 Tolerate timeline switches while "pg_basebackup -X fetch" is running.
If you take a base backup from a standby server with "pg_basebackup -X
fetch", and the timeline switches while the backup is being taken, the
backup used to fail with an error "requested WAL segment %s has already
been removed". This is because the server-side code that sends over the
required WAL files would not construct the WAL filename with the correct
timeline after a switch.

Fix that by using readdir() to scan pg_xlog for all the WAL segments in the
range, regardless of timeline.

Also, include all timeline history files in the backup, if taken with
"-X fetch". That fixes another related bug: If a timeline switch happened
just before the backup was initiated in a standby, the WAL segment
containing the initial checkpoint record contains WAL from the older
timeline too. Recovery will not accept that without a timeline history file
that lists the older timeline.

Backpatch to 9.2. Versions prior to that were not affected as you could not
take a base backup from a standby before 9.2.
2013-01-03 19:50:46 +02:00
Bruce Momjian
faf1b1bd71 Update copyrights for 2013
Fully update git head, and update back branches in ./COPYRIGHT and
legal.sgml files.
2013-01-01 17:15:00 -05:00
Heikki Linnakangas
7e1cf76524 Keep timeline history files restored from archive in pg_xlog.
The cascading standby patch in 9.2 changed the way WAL files are treated
when restored from the archive. Before, they were restored under a temporary
filename, and not kept in pg_xlog, but after the patch, they were copied
under pg_xlog. This is necessary for a cascading standby to find them, but
it also means that if the archive goes offline and a standby is restarted,
it can recover back to where it was using the files in pg_xlog. It also
means that if you take an offline backup from a standby server, it includes
all the required WAL files in pg_xlog.

However, the same change was not made to timeline history files, so if the
WAL segment containing the checkpoint record contains a timeline switch, you
will still get an error if you try to restart recovery without the archive,
or recover from an offline backup taken from the standby.

With this patch, timeline history files restored from archive are copied
into pg_xlog like WAL files are, so that pg_xlog contains all the files
required to recover. This is a corner-case pre-existing issue in 9.2, but
even more important in master where it's possible for a standby to follow a
timeline switch through streaming replication. To make that possible, the
timeline history files must be present in pg_xlog.
2012-12-30 14:29:54 +02:00
Peter Eisentraut
b573fc840b doc: Correct description of LDAP authentication
Parts of the description had claimed incorrect pg_hba.conf option names
for LDAP authentication.

Albe Laurenz
2012-12-29 23:01:01 -05:00
Tom Lane
7ec225d3d6 Fix some minor issues in view pretty-printing.
Code review for commit 2f582f76b1945929ff07116cd4639747ce9bb8a1: don't use
a static variable for what ought to be a deparse_context field, fix
non-multibyte-safe test for spaces, avoid useless and potentially O(N^2)
(though admittedly with a very small constant) calculations of wrap
positions when we aren't going to wrap.
2012-12-24 17:52:27 -05:00
Tom Lane
de4cf42af9 Prevent failure when RowExpr or XmlExpr is parse-analyzed twice.
transformExpr() is required to cope with already-transformed expression
trees, for various ugly-but-not-quite-worth-cleaning-up reasons.  However,
some of its newer subroutines hadn't gotten the memo.  This accounts for
bug #7763 from Norbert Buchmuller: transformRowExpr() was overwriting the
previously determined type of a RowExpr during CREATE TABLE LIKE INCLUDING
INDEXES.  Additional investigation showed that transformXmlExpr had the
same kind of problem, but all the other cases seem to be safe.

Andres Freund and Tom Lane
2012-12-23 14:07:31 -05:00
Tom Lane
97e1db771c Fix documentation typo.
"GetForeignTableColumnOptions" should be "GetForeignColumnOptions".
Noted by Metin Döşlü.
2012-12-22 15:01:45 -05:00
Heikki Linnakangas
52d469b4ac Fix race condition if a file is removed while pg_basebackup is running.
If a relation file was removed when the server-side counterpart of
pg_basebackup was just about to open it to send it to the client, you'd
get a "could not open file" error. Fix that.

Backpatch to 9.1, this goes back to when pg_basebackup was introduced.
2012-12-21 15:29:49 +02:00
Peter Eisentraut
3463dacc56 Fix grammatical mistake in error message 2012-12-20 23:38:15 -05:00
Tom Lane
bfe738d7f3 Fix pg_extension_config_dump() to handle update cases more sanely.
If pg_extension_config_dump() is executed again for a table already listed
in the extension's extconfig, the code was blindly making a new array entry.
This does not seem useful.  Fix it to replace the existing array entry
instead, so that it's possible for extension update scripts to alter the
filter conditions for configuration tables.

In addition, teach ALTER EXTENSION DROP TABLE to check for an extconfig
entry for the target table, and remove it if present.  This is not a 100%
solution because it's allowed for an extension update script to just
summarily DROP a member table, and that code path doesn't go through
ExecAlterExtensionContentsStmt.  We could probably make that case clean
things up if we had to, but it would involve sticking a very ugly wart
somewhere in the guts of dependency.c.  Since on the whole it seems quite
unlikely that extension updates would want to remove pre-existing
configuration tables, making the case possible with an explicit command
seems sufficient.

Per bug #7756 from Regina Obe.  Back-patch to 9.1 where extensions were
introduced.
2012-12-20 16:31:58 -05:00
Heikki Linnakangas
3881aedbcf Fix recycling of WAL segments after changing recovery target timeline.
After the recovery target timeline is changed, we would still recycle and
preallocate WAL segments on the old target timeline. Those WAL segments
created for the old timeline are a waste of space, although otherwise
harmless.

The problem is that when installing a recycled WAL segment as a future one,
ThisTimeLineID is used to construct the filename. ThisTimeLineID is
initialized in the checkpointer process to the recovery target timeline at
startup, but it was not updated when the startup process chooses a new
target timeline (recovery_target_timeline='latest'). To fix, always update
ThisTimeLineID before recycling WAL segments at a restartpoint.

This still leaves a small window where we might install WAL segments under
wrong timeline ID, if the target timeline is changed just as we're about to
start recycling. Also, when we're not on the target timeline yet, but still
replaying some older timeline, we'll install WAL segments to the newer
timeline anyway and they will still go wasted. We'll just live with the
waste in that situation.

Commit to 9.2 and 9.1. Older versions didn't change recovery target timeline
after startup, and for master, I'll commit a slightly different variant of
this.
2012-12-20 21:30:59 +02:00
Heikki Linnakangas
8e1e8278c3 Check if we've reached end-of-backup point also if no redo is required.
If you restored from a backup taken from a standby, and the last record in
the backup is the checkpoint record, ie. there is no redo required except
for the checkpoint record, we would fail to notice that we've reached the
end-of-backup point, and the database is consistent. The result was an
error "WAL ends before end of online backup". To fix, move the
have-we-reached-end-of-backup check into CheckRecoveryConsistency(), which
is already responsible for similar checks with minRecoveryPoint, and is
called in the right places.

Backpatch to 9.2, this check and bug did not exist before that.
2012-12-19 14:20:09 +02:00