1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00
Commit Graph

35041 Commits

Author SHA1 Message Date
7a4211ebd2 Report more information if pg_perm_setlocale() fails at startup.
We don't know why a few Windows users have seen this fail, but the
taciturnity of the error message certainly isn't helping debug it.
Let's at least find out which LC category isn't working.
2015-06-09 13:37:08 -04:00
18935145e7 Allow HotStandbyActiveInReplay() to be called in single user mode.
HotStandbyActiveInReplay, introduced in 061b079f, only allowed WAL
replay to happen in the startup process, missing the single user case.

This buglet is fairly harmless as it only causes problems when single
user mode in an assertion enabled build is used to replay a btree vacuum
record.

Backpatch to 9.2. 061b079f was backpatched further, but the assertion
was not.
2015-06-08 14:09:52 +02:00
3e69a73b98 Use a safer method for determining whether relcache init file is stale.
When we invalidate the relcache entry for a system catalog or index, we
must also delete the relcache "init file" if the init file contains a copy
of that rel's entry.  The old way of doing this relied on a specially
maintained list of the OIDs of relations present in the init file: we made
the list either when reading the file in, or when writing the file out.
The problem is that when writing the file out, we included only rels
present in our local relcache, which might have already suffered some
deletions due to relcache inval events.  In such cases we correctly decided
not to overwrite the real init file with incomplete data --- but we still
used the incomplete initFileRelationIds list for the rest of the current
session.  This could result in wrong decisions about whether the session's
own actions require deletion of the init file, potentially allowing an init
file created by some other concurrent session to be left around even though
it's been made stale.

Since we don't support changing the schema of a system catalog at runtime,
the only likely scenario in which this would cause a problem in the field
involves a "vacuum full" on a catalog concurrently with other activity, and
even then it's far from easy to provoke.  Remarkably, this has been broken
since 2002 (in commit 7863404417), but we had
never seen a reproducible test case until recently.  If it did happen in
the field, the symptoms would probably involve unexpected "cache lookup
failed" errors to begin with, then "could not open file" failures after the
next checkpoint, as all accesses to the affected catalog stopped working.
Recovery would require manually removing the stale "pg_internal.init" file.

To fix, get rid of the initFileRelationIds list, and instead consult
syscache.c's list of relations used in catalog caches to decide whether a
relation is included in the init file.  This should be a tad more efficient
anyway, since we're replacing linear search of a list with ~100 entries
with a binary search.  It's a bit ugly that the init file contents are now
so directly tied to the catalog caches, but in practice that won't make
much difference.

Back-patch to all supported branches.
2015-06-07 15:32:09 -04:00
04358dab21 Fix incorrect order of database-locking operations in InitPostgres().
We should set MyProc->databaseId after acquiring the per-database lock,
not beforehand.  The old way risked deadlock against processes trying to
copy or delete the target database, since they would first acquire the lock
and then wait for processes with matching databaseId to exit; that left a
window wherein an incoming process could set its databaseId and then block
on the lock, while the other process had the lock and waited in vain for
the incoming process to exit.

CountOtherDBBackends() would time out and fail after 5 seconds, so this
just resulted in an unexpected failure not a permanent lockup, but it's
still annoying when it happens.  A real-world example of a use-case is that
short-duration connections to a template database should not cause CREATE
DATABASE to fail.

Doing it in the other order should be fine since the contract has always
been that processes searching the ProcArray for a database ID must hold the
relevant per-database lock while searching.  Thus, this actually removes
the former race condition that required an assumption that storing to
MyProc->databaseId is atomic.

It's been like this for a long time, so back-patch to all active branches.
2015-06-05 13:22:27 -04:00
3f7928a289 Stamp 9.2.12. REL9_2_12 2015-06-01 15:10:35 -04:00
25e9e8984b Release notes for 9.4.3, 9.3.8, 9.2.12, 9.1.17, 9.0.21.
Also sneak entries for commits 97ff2a564 et al into the sections for
the previous releases in the relevant branches.  Those fixes did go out
in the previous releases, but missed getting documented.
2015-06-01 13:27:44 -04:00
77642a8197 Remove special cases for ETXTBSY from new fsync'ing logic.
The argument that this is a sufficiently-expected case to be silently
ignored seems pretty thin.  Andres had brought it up back when we were
still considering that most fsync failures should be hard errors, and it
probably would be legit not to fail hard for ETXTBSY --- but the same is
true for EROFS and other cases, which is why we gave up on hard failures.
ETXTBSY is surely not a normal case, so logging the failure seems fine
from here.
2015-05-29 15:11:36 -04:00
aa8377e64f Fix fsync-at-startup code to not treat errors as fatal.
Commit 2ce439f337 introduced a rather serious
regression, namely that if its scan of the data directory came across any
un-fsync-able files, it would fail and thereby prevent database startup.
Worse yet, symlinks to such files also caused the problem, which meant that
crash restart was guaranteed to fail on certain common installations such
as older Debian.

After discussion, we agreed that (1) failure to start is worse than any
consequence of not fsync'ing is likely to be, therefore treat all errors
in this code as nonfatal; (2) we should not chase symlinks other than
those that are expected to exist, namely pg_xlog/ and tablespace links
under pg_tblspc/.  The latter restriction avoids possibly fsync'ing a
much larger part of the filesystem than intended, if the user has left
random symlinks hanging about in the data directory.

This commit takes care of that and also does some code beautification,
mainly moving the relevant code into fd.c, which seems a much better place
for it than xlog.c, and making sure that the conditional compilation for
the pre_sync_fname pass has something to do with whether pg_flush_data
works.

I also relocated the call site in xlog.c down a few lines; it seems a
bit silly to be doing this before ValidateXLOGDirectoryStructure().

The similar logic in initdb.c ought to be made to match this, but that
change is noncritical and will be dealt with separately.

Back-patch to all active branches, like the prior commit.

Abhijit Menon-Sen and Tom Lane
2015-05-28 17:33:03 -04:00
f3c67aad43 Fix pg_get_functiondef() to print a function's LEAKPROOF property.
Seems to have been an oversight in the original leakproofness patch.
Per report and patch from Jeevan Chalke.

In passing, prettify some awkward leakproof-related code in AlterFunction.
2015-05-28 11:24:37 -04:00
44674d071f Fix portability issue in isolationtester grammar.
specparse.y and specscanner.l used "string" as a token name.  Now, bison
likes to define each token name as a macro for the token code it assigns,
which means those names are basically off-limits for any other use within
the grammar file or included headers.  So names as generic as "string" are
dangerous.  This is what was causing the recent failures on protosciurus:
some versions of Solaris' sys/kstat.h use "string" as a field name.
With late-model bison we don't see this problem because the token macros
aren't defined till later (that is why castoroides didn't show the problem
even though it's on the same machine).  But protosciurus uses bison 1.875
which defines the token macros up front.

This land mine has been there from day one; we'd have found it sooner
except that protosciurus wasn't trying to run the isolation tests till
recently.

To fix, rename the token to "string_literal" which is hopefully less
likely to collide with names used by system headers.  Back-patch to
all branches containing the isolation tests.
2015-05-27 19:14:40 -04:00
1b14571201 Remove configure check prohibiting threaded libpython on OpenBSD.
According to recent tests, this case now works fine, so there's no reason
to reject it anymore.  (Even if there are still some OpenBSD platforms
in the wild where it doesn't work, removing the check won't break any case
that worked before.)

We can actually remove the entire test that discovers whether libpython
is threaded, since without the OpenBSD case there's no need to know that
at all.

Per report from Davin Potts.  Back-patch to all active branches.
2015-05-26 22:14:59 -04:00
e1f1807c7a Rename pg_shdepend.c's typedef "objectType" to SharedDependencyObjectType.
The name objectType is widely used as a field name, and it's pure luck that
this conflict has not caused pgindent to go crazy before.  It messed up
pg_audit.c pretty good though.  Since pg_shdepend.c doesn't export this
typedef and only uses it in three places, changing that seems saner than
changing the field usages.

Back-patch because we're contemplating using the union of all branch
typedefs for future pgindent runs, so this won't fix anything if it
stays the same in back branches.
2015-05-24 13:03:45 -04:00
b78fbfe65f Back-patch libpq support for TLS versions beyond v1.
Since 7.3.2, libpq has been coded in such a way that the only SSL protocol
it would allow was TLS v1.  That approach is looking increasingly obsolete.
In commit 820f08cabd we fixed it to allow TLS >= v1, but did not
back-patch the change at the time, partly out of caution and partly because
the question was confused by a contemporary server-side change to reject
the now-obsolete SSL protocol v3.  9.4 has now been out long enough that
it seems safe to assume the change is OK; hence, back-patch into 9.0-9.3.

(I also chose to back-patch some relevant comments added by commit
326e1d73c4, but did *not* change the server behavior; hence, pre-9.4
servers will continue to allow SSL v3, even though no remotely modern
client will request it.)

Per gripe from Jan Bilek.
2015-05-21 20:41:55 -04:00
baf379bf22 Last-minute updates for release notes.
Revise description of CVE-2015-3166, in line with scaled-back patch.
Change release date.

Security: CVE-2015-3166
REL9_2_11
2015-05-19 18:33:58 -04:00
221f7a9494 Revert error-throwing wrappers for the printf family of functions.
This reverts commit 16304a0134, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb which is no longer needed.

Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic.  (It's a bit surprising that the same
does not happen on Linux.)  Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.

This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166.  However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism.  In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information.  So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.

We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf().  That seems to
be an unalloyed improvement.

Security: CVE-2015-3166
2015-05-19 18:17:42 -04:00
58b2cf033d Fix off-by-one error in Assertion.
The point of the assertion is to ensure that the arrays allocated in stack
are large enough, but the check was one item short.

This won't matter in practice because MaxIndexTuplesPerPage is an
overestimate, so you can't have that many items on a page in reality.
But let's be tidy.

Spotted by Anastasia Lubennikova. Backpatch to all supported versions, like
the patch that added the assertion.
2015-05-19 19:26:02 +03:00
97ff2a5645 Don't MultiXactIdIsRunning when in recovery
In 9.1 and earlier, it is possible for index_getnext() to try to examine
a heap buffer for possible HOT-prune when in recovery; this causes a
problem when a multixact is found in a tuple's Xmax, because
GetMultiXactIdMembers refuses to run when in recovery, raising an error:
	ERROR:  cannot GetMultiXactIdMembers() during recovery

This can be solved easily by having MultiXactIdIsRunning always return
false when in recovery, which is reasonable because a HOT standby cannot
acquire further tuple locks nor update/delete tuples.

(Note: it doesn't look like this specific code path has a problem in
9.2, because instead of doing HeapTupleSatisfiesUpdate directly,
heap_hot_search_buffer uses HeapTupleIsSurelyDead instead.  Still, there
may be other paths affected by the same bug, for instance in pgrowlocks,
and the multixact code hasn't changed; so apply the same fix
throughout.)

Apply this fix to 9.0 through 9.2.  In 9.3 the multixact code has been
changed completely and is no longer subject to this problem.

Per report from Marko Tiikkaja,
https://www.postgresql.org/message-id/54EB3283.2080305@joh.to
Analysis by Andres Freund
2015-05-18 17:44:21 -03:00
520fecfae5 Stamp 9.2.11. 2015-05-18 14:34:28 -04:00
b9215bc5d0 Fix error message in pre_sync_fname.
The old one didn't include %m anywhere, and required extra
translation.

Report by Peter Eisentraut. Fix by me. Review by Tom Lane.
2015-05-18 13:17:09 -04:00
f1946b134b Last-minute updates for release notes.
Add entries for security issues.

Security: CVE-2015-3165 through CVE-2015-3167
2015-05-18 12:09:03 -04:00
0ba2004312 pgcrypto: Report errant decryption as "Wrong key or corrupt data".
This has been the predominant outcome.  When the output of decrypting
with a wrong key coincidentally resembled an OpenPGP packet header,
pgcrypto could instead report "Corrupt data", "Not text data" or
"Unsupported compression algorithm".  The distinct "Corrupt data"
message added no value.  The latter two error messages misled when the
decrypted payload also exhibited fundamental integrity problems.  Worse,
error message variance in other systems has enabled cryptologic attacks;
see RFC 4880 section "14. Security Considerations".  Whether these
pgcrypto behaviors are likewise exploitable is unknown.

In passing, document that pgcrypto does not resist side-channel attacks.
Back-patch to 9.0 (all supported versions).

Security: CVE-2015-3167
2015-05-18 10:02:37 -04:00
01272d95a9 Check return values of sensitive system library calls.
PostgreSQL already checked the vast majority of these, missing this
handful that nearly cannot fail.  If putenv() failed with ENOMEM in
pg_GSS_recvauth(), authentication would proceed with the wrong keytab
file.  If strftime() returned zero in cache_locale_time(), using the
unspecified buffer contents could lead to information exposure or a
crash.  Back-patch to 9.0 (all supported versions).

Other unchecked calls to these functions, especially those in frontend
code, pose negligible security concern.  This patch does not address
them.  Nonetheless, it is always better to check return values whose
specification provides for indicating an error.

In passing, fix an off-by-one error in strftime_win32()'s invocation of
WideCharToMultiByte().  Upon retrieving a value of exactly MAX_L10N_DATA
bytes, strftime_win32() would overrun the caller's buffer by one byte.
MAX_L10N_DATA is chosen to exceed the length of every possible value, so
the vulnerable scenario probably does not arise.

Security: CVE-2015-3166
2015-05-18 10:02:37 -04:00
82b7393eb4 Add error-throwing wrappers for the printf family of functions.
All known standard library implementations of these functions can fail
with ENOMEM.  A caller neglecting to check for failure would experience
missing output, information exposure, or a crash.  Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers.  The wrappers do not return after an error, so their callers
need not check.  Back-patch to 9.0 (all supported versions).

Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.

Injecting the wrappers implicitly is a compromise between patch scope
and design goals.  I would prefer to edit each call site to name a
wrapper explicitly.  libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort().  All that would be painfully
invasive for a back-patched security fix, hence this compromise.

Security: CVE-2015-3166
2015-05-18 10:02:37 -04:00
1e6652aea5 Permit use of vsprintf() in PostgreSQL code.
The next commit needs it.  Back-patch to 9.0 (all supported versions).
2015-05-18 10:02:37 -04:00
439ff9b6b9 Prevent a double free by not reentering be_tls_close().
Reentering this function with the right timing caused a double free,
typically crashing the backend.  By synchronizing a disconnection with
the authentication timeout, an unauthenticated attacker could achieve
this somewhat consistently.  Call be_tls_close() solely from within
proc_exit_prepare().  Back-patch to 9.0 (all supported versions).

Benkocs Norbert Attila

Security: CVE-2015-3165
2015-05-18 10:02:37 -04:00
6f8b6abd97 Translation updates
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git
Source-Git-Hash: 7da40afeb612deb39a77da4071ea1d5800706863
2015-05-18 08:45:56 -04:00
a05b8cffb6 Fix typos 2015-05-17 22:22:44 -04:00
9b06451d08 Release notes for 9.4.2, 9.3.7, 9.2.11, 9.1.16, 9.0.20. 2015-05-17 15:54:20 -04:00
639bf57caf Fix docs typo
I don't think "respectfully" is what was meant here ...
2015-05-16 13:28:27 -04:00
affc04d165 pg_upgrade: force timeline 1 in the new cluster
Previously, this prevented promoted standby servers from being upgraded
because of a missing WAL history file.  (Timeline 1 doesn't need a
history file, and we don't copy WAL files anyway.)

Report by Christian Echerer(?), Alexey Klyukin

Backpatch through 9.0
2015-05-16 00:40:18 -04:00
2a55e71343 pg_upgrade: only allow template0 to be non-connectable
This patch causes pg_upgrade to error out during its check phase if:

(1) template0 is marked connectable
or
(2) any other database is marked non-connectable

This is done because, in the first case, pg_upgrade would fail because
the pg_dumpall --globals restore would fail, and in the second case, the
database would not be restored, leading to data loss.

Report by Matt Landry (1), Stephen Frost (2)

Backpatch through 9.0
2015-05-16 00:10:03 -04:00
2a63434f07 Update time zone data files to tzdata release 2015d.
DST law changes in Egypt, Mongolia, Palestine.
Historical corrections for Canada and Chile.
Revised zone abbreviation for America/Adak (HST/HDT not HAST/HADT).
2015-05-15 19:36:06 -04:00
cde3e743cd Docs: fix erroneous claim about max byte length of GB18030.
This encoding has characters up to 4 bytes long, not 2.
2015-05-14 14:59:00 -04:00
1a99d392c1 Fix RBM_ZERO_AND_LOCK mode to not acquire lock on local buffers.
Commit 81c45081 introduced a new RBM_ZERO_AND_LOCK mode to ReadBuffer, which
takes a lock on the buffer before zeroing it. However, you cannot take a
lock on a local buffer, and you got a segfault instead. The version of that
patch committed to master included a check for !isLocalBuf, and therefore
didn't crash, but oddly I missed that in the back-patched versions. This
patch adds that check to the back-branches too.

RBM_ZERO_AND_LOCK mode is only used during WAL replay, and in hash indexes.
WAL replay only deals with shared buffers, so the only way to trigger the
bug is with a temporary hash index.

Reported by Artem Ignatyev, analysis by Tom Lane.
2015-05-13 10:06:52 +03:00
46f9acd3ef Fix incorrect checking of deferred exclusion constraint after a HOT update.
If a row that potentially violates a deferred exclusion constraint is
HOT-updated later in the same transaction, the exclusion constraint would
be reported as violated when the check finally occurs, even if the row(s)
the new row originally conflicted with have since been removed.  This
happened because the wrong TID was passed to check_exclusion_constraint(),
causing the live HOT-updated row to be seen as a conflicting row rather
than recognized as the row-under-test.

Per bug #13148 from Evan Martin.  It's been broken since exclusion
constraints were invented, so back-patch to all supported branches.
2015-05-11 12:25:28 -04:00
21cb21de2e Recommend include_realm=1 in docs
As discussed, the default setting of include_realm=0 can be dangerous in
multi-realm environments because it is then impossible to differentiate
users with the same username but who are from two different realms.

Recommend include_realm=1 and note that the default setting may change
in a future version of PostgreSQL and therefore users may wish to
explicitly set include_realm to avoid issues while upgrading.
2015-05-08 19:40:09 -04:00
447e165817 Properly send SCM status updates when shutting down service on Windows
The Service Control Manager should be notified regularly during a shutdown
that takes a long time. Previously we would increaes the counter, but forgot
to actually send the notification to the system. The loop counter was also
incorrectly initalized in the event that the startup of the system took long
enough for it to increase, which could cause the shutdown process not to wait
as long as expected.

Krystian Bigaj, reviewed by Michael Paquier
2015-05-07 15:09:42 +02:00
bf85ba2a9c citext's regexp_matches() functions weren't documented, either. 2015-05-05 16:11:16 -04:00
d4070d10c2 Fix incorrect declaration of citext's regexp_matches() functions.
These functions should return SETOF TEXT[], like the core functions they
are wrappers for; but they were incorrectly declared as returning just
TEXT[].  This mistake had two results: first, if there was no match you got
a scalar null result, whereas what you should get is an empty set (zero
rows).  Second, the 'g' flag was effectively ignored, since you would get
only one result array even if there were multiple matches, as reported by
Jeff Certain.

While ignoring 'g' is a clear bug, the behavior for no matches might well
have been thought to be the intended behavior by people who hadn't compared
it carefully to the core regexp_matches() functions.  So we should tread
carefully about introducing this change in the back branches.  Still, it
clearly is a bug and so providing some fix is desirable.

After discussion, the conclusion was to introduce the change in a 1.1
version of the citext extension (as we would need to do anyway); 1.0 still
contains the incorrect behavior.  1.1 is the default and only available
version in HEAD, but it is optional in the back branches, where 1.0 remains
the default version.  People wishing to adopt the fix in back branches will
need to explicitly do ALTER EXTENSION citext UPDATE TO '1.1'.  (I also
provided a downgrade script in the back branches, so people could go back
to 1.0 if necessary.)

This should be called out as an incompatible change in the 9.5 release
notes, although we'll also document it in the next set of back-branch
release notes.  The notes should mention that any views or rules that use
citext's regexp_matches() functions will need to be dropped before
upgrading to 1.1, and then recreated again afterwards.

Back-patch to 9.1.  The bug goes all the way back to citext's introduction
in 8.4, but pre-9.1 there is no extension mechanism with which to manage
the change.  Given the lack of previous complaints it seems unnecessary to
change this behavior in 9.0, anyway.
2015-05-05 15:50:53 -04:00
53e1498c65 Fix some problems with patch to fsync the data directory.
pg_win32_is_junction() was a typo for pgwin32_is_junction().  open()
was used not only in a two-argument form, which breaks on Windows,
but also where BasicOpenFile() should have been used.

Per reports from Andrew Dunstan and David Rowley.
2015-05-05 09:22:51 -04:00
2bc3397168 Recursively fsync() the data directory after a crash.
Otherwise, if there's another crash, some writes from after the first
crash might make it to disk while writes from before the crash fail
to make it to disk.  This could lead to data corruption.

Back-patch to all supported versions.

Abhijit Menon-Sen, reviewed by Andres Freund and slightly revised
by me.
2015-05-04 12:41:53 -04:00
7c2ccb494c Build libecpg with -DFRONTEND in all supported versions.
Fix an oversight in commit 151e74719b by
back-patching commit 44c5d387ea to 9.0.
2015-04-26 17:20:19 -04:00
950f80dd54 Prevent improper reordering of antijoins vs. outer joins.
An outer join appearing within the RHS of an antijoin can't commute with
the antijoin, but somehow I missed teaching make_outerjoininfo() about
that.  In Teodor Sigaev's recent trouble report, this manifests as a
"could not find RelOptInfo for given relids" error within eqjoinsel();
but I think silently wrong query results are possible too, if the planner
misorders the joins and doesn't happen to trigger any internal consistency
checks.  It's broken as far back as we had antijoins, so back-patch to all
supported branches.
2015-04-25 16:44:27 -04:00
5eab0f1383 Build every ECPG library with -DFRONTEND.
Each of the libraries incorporates src/port files, which often check
FRONTEND.  Build systems disagreed on whether to build libpgtypes this
way.  Only libecpg incorporates files that rely on it today.  Back-patch
to 9.0 (all supported versions) to forestall surprises.
2015-04-24 19:29:27 -04:00
b743a8b313 Fix obsolete comment in set_rel_size().
The cross-reference to set_append_rel_pathlist() was obsoleted by
commit e2fa76d80b, which split what
had been set_rel_pathlist() and child routines into two sets of
functions.  But I (tgl) evidently missed updating this comment.

Back-patch to 9.2 to avoid unnecessary divergence among branches.

Amit Langote
2015-04-24 15:18:48 -04:00
d3f5d2892b Fix deadlock at startup, if max_prepared_transactions is too small.
When the startup process recovers transactions by scanning pg_twophase
directory, it should clear MyLockedGxact after it's done processing each
transaction. Like we do during normal operation, at PREPARE TRANSACTION.
Otherwise, if the startup process exits due to an error, it will try to
clear the locking_backend field of the last recovered transaction. That's
usually harmless, but if the error happens in MarkAsPreparing, while
holding TwoPhaseStateLock, the shmem-exit hook will try to acquire
TwoPhaseStateLock again, and deadlock with itself.

This fixes bug #13128 reported by Grant McAlister. The bug was introduced
by commit bb38fb0d, so backpatch to all supported versions like that
commit.
2015-04-23 21:36:50 +03:00
3d73a67ffc Fix typo in comment
SLRU_SEGMENTS_PER_PAGE -> SLRU_PAGES_PER_SEGMENT

I introduced this ancient typo in subtrans.c and later propagated it to
multixact.c.  I fixed the latter in f741300c, but only back to 9.3;
backpatch to all supported branches for consistency.
2015-04-14 12:12:18 -03:00
cc2939f442 Don't archive bogus recycled or preallocated files after timeline switch.
After a timeline switch, we would leave behind recycled WAL segments that
are in the future, but on the old timeline. After promotion, and after they
become old enough to be recycled again, we would notice that they don't have
a .ready or .done file, create a .ready file for them, and archive them.
That's bogus, because the files contain garbage, recycled from an older
timeline (or prealloced as zeros). We shouldn't archive such files.

This could happen when we're following a timeline switch during replay, or
when we switch to new timeline at end-of-recovery.

To fix, whenever we switch to a new timeline, scan the data directory for
WAL segments on the old timeline, but with a higher segment number, and
remove them. Those don't belong to our timeline history, and are most
likely bogus recycled or preallocated files. They could also be valid files
that we streamed from the primary ahead of time, but in any case, they're
not needed to recover to the new timeline.
2015-04-13 17:26:59 +03:00
3a9951d62e Remove duplicated words in comments.
David Rowley
2015-04-12 10:49:40 +03:00
9abf89828e Fix incorrect punctuation
Amit Langote
2015-04-09 13:36:16 +02:00