"8" was correct back when "disable" was the longest allowed value, but
since "verify-full" was added, it should be "12". Given the lack of
complaints, I wouldn't be surprised if nobody is actually using these
values ... but still, if they're in the API, they should be right.
Noticed while pursuing a different problem. It's been wrong for quite
a long time, so back-patch to all supported branches.
This should eliminate the risk of recursive entry to syslog(3), which
appears to be the cause of the hang reported in bug #9551 from James
Morton.
Arguably, the real problem here is auth.c's willingness to turn on
ImmediateInterruptOK while executing fairly wide swaths of backend code.
We may well need to work at narrowing the code ranges in which the
authentication_timeout interrupt is enabled. For the moment, though,
this is a cheap and reasonably noninvasive fix for a field-reported
failure; the other approach would be complex and not necessarily
bug-free itself.
Back-patch to all supported branches.
Use TransactionIdIsInProgress, then TransactionIdDidCommit, to distinguish
whether a NOTIFY message's originating transaction is in progress,
committed, or aborted. The previous coding could accept a message from a
transaction that was still in-progress according to the PGPROC array;
if the client were fast enough at starting a new transaction, it might fail
to see table rows added/updated by the message-sending transaction. Which
of course would usually be the point of receiving the message. We noted
this type of race condition long ago in tqual.c, but async.c overlooked it.
The race condition probably cannot occur unless there are multiple NOTIFY
senders in action, since an individual backend doesn't send NOTIFY signals
until well after it's done committing. But if two senders commit in close
succession, it's certainly possible that we could see the second sender's
message within the race condition window while responding to the signal
from the first one.
Per bug #9557 from Marko Tiikkaja. This patch is slightly more invasive
than what he proposed, since it removes the now-redundant
TransactionIdDidAbort call.
Back-patch to 9.0, where the current NOTIFY implementation was introduced.
We don't take a full-page image of the GIN metapage; instead, the WAL record
contains all the information required to reconstruct it from scratch. But
to avoid torn page hazards, we must re-initialize it from the WAL record
every time, even if it already has a greater LSN, similar to how normal full
page images are restored.
This was highly unlikely to cause any problems in practice, because the GIN
metapage is small. We rely on an update smaller than a 512 byte disk sector
to be atomic elsewhere, at least in pg_control. But better safe than sorry,
and this would be easy to overlook if more fields are added to the metapage
so that it's no longer small.
Reported by Noah Misch. Backpatch to all supported versions.
Commit 08146775acd8bfe0fcc509c71857abb928697171 changed do_copy() to
temporarily scribble on pset.cur_cmd_source. That was a mighty ugly bit of
code in any case, but in particular it broke handleCopyIn's ability to tell
whether it was reading from the current script source file (in which case
pset.lineno should be incremented for each line of COPY data), or from
someplace else (in which case it shouldn't). The former case still worked,
the latter not so much. The visible effect was that line numbers reported
for errors in a script file would be wrong if there were an earlier \copy
that was reading anything other than inline-in-the-script-file data.
To fix, introduce another pset field that holds the file do_copy wants the
COPY code to use. This is a little bit ugly, but less so than passing the
file down explicitly through several layers that aren't COPY-specific.
Extracted from a larger patch by Kumar Rajeev Rastogi; that patch also
changes printing of COPY command tags, which is not a bug fix and shouldn't
get back-patched. This particular idea was from a suggestion by Amit
Khandekar, if I'm reading the thread correctly.
Back-patch to 9.2 where the faulty code was introduced.
A fake relcache entry can "own" a SmgrRelation object, like a regular
relcache entry. But when it was free'd, the owner field in SmgrRelation
was not cleared, so it was left pointing to free'd memory.
Amazingly this apparently hasn't caused crashes in practice, or we would've
heard about it earlier. Andres found this with Valgrind.
Report and fix by Andres Freund, with minor modifications by me. Backpatch
to all supported versions.
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks
to ensure that the querytree we are about to deparse is up-to-date and
the schemas of the underlying relations aren't changing. Howwever, that
function thinks the query is about to be executed, so it acquires locks
that are stronger than necessary for the purpose of deparsing. Thus for
example, if pg_dump asks to deparse a rule that includes "INSERT INTO t",
we'd acquire RowExclusiveLock on t. That results in interference with
concurrent transactions that might for example ask for ShareLock on t.
Since pg_dump is documented as being purely read-only, this is unexpected.
(Worse, it used to actually be read-only; this behavior dates back only
to 8.1, cf commit ba4200246.)
Fix this by adding a parameter to AcquireRewriteLocks to tell it whether
we want the "real" execution locks or only AccessShareLock.
Report, diagnosis, and patch by Dean Rasheed. Back-patch to all supported
branches.
CheckRequiredParameterValues() should perform the checks if archive recovery
was requested, even if we are going to perform crash recovery first.
Reported by Kyotaro HORIGUCHI. Backpatch to 9.2, like the crash-then-archive
recovery mode.
When entering crash recovery followed by archive recovery, and the latest
checkpoint is a shutdown checkpoint, and there are no more WAL records to
replay before transitioning from crash to archive recovery, we would not
immediately allow read-only connections in hot standby mode even if we
could. That's because when starting from a shutdown checkpoint, we set
lastReplayedEndRecPtr incorrectly to the record before the checkpoint
record, instead of the checkpoint record itself. We don't run the redo
routine of the shutdown checkpoint record, but starting recovery from it
goes through the same motions, so it should be considered as replayed.
Reported by Kyotaro HORIGUCHI. All versions with hot standby are affected,
so backpatch to 9.0.
The regex code didn't have any provision for query cancel; which is
unsurprising given its non-Postgres origin, but still problematic since
some operations can take a long time. Introduce a callback function to
check for a pending query cancel or session termination request, and
call it in a couple of strategic spots where we can make the regex code
exit with an error indicator.
If we ever actually split out the regex code as a standalone library,
some additional work will be needed to let the cancel callback function
be specified externally to the library. But that's straightforward
(certainly so by comparison to putting the locale-dependent character
classification logic on a similar arms-length basis), and there seems
no need to do it right now.
A bigger issue is that there may be more places than these two where
we need to check for cancels. We can always add more checks later,
now that the infrastructure is in place.
Since there are known examples of not-terribly-long regexes that can
lock up a backend for a long time, back-patch to all supported branches.
I have hopes of fixing the known performance problems later, but adding
query cancel ability seems like a good idea even if they were all fixed.
If there are lots of uncommitted tuples at the end of the index range,
get_actual_variable_range() ends up fetching each one and doing an MVCC
visibility check on it, until it finally hits a visible tuple. This is
bad enough in isolation, considering that we don't need an exact answer
only an approximate one. But because the tuples are not yet committed,
each visibility check does a TransactionIdIsInProgress() test, which
involves scanning the ProcArray. When multiple sessions do this
concurrently, the ensuing contention results in horrid performance loss.
20X overall throughput loss on not-too-complicated queries is easy to
demonstrate in the back branches (though someone's made it noticeably
less bad in HEAD).
We can dodge the problem fairly effectively by using SnapshotDirty rather
than a normal MVCC snapshot. This will cause the index probe to take
uncommitted tuples as good, so that we incur only one tuple fetch and test
even if there are many such tuples. The extent to which this degrades the
estimate is debatable: it's possible the result is actually a more accurate
prediction than before, if the endmost tuple has become committed by the
time we actually execute the query being planned. In any case, it's not
very likely that it makes the estimate a lot worse.
SnapshotDirty will still reject tuples that are known committed dead, so
we won't give bogus answers if an invalid outlier has been deleted but not
yet vacuumed from the index. (Because btrees know how to mark such tuples
dead in the index, we shouldn't have a big performance problem in the case
that there are many of them at the end of the range.) This consideration
motivates not using SnapshotAny, which was also considered as a fix.
Note: the back branches were using SnapshotNow instead of an MVCC snapshot,
but the problem and solution are the same.
Per performance complaints from Bartlomiej Romanski, Josh Berkus, and
others. Back-patch to 9.0, where the issue was introduced (by commit
40608e7f949fb7e4025c0ddd5be01939adc79eec).
The SQL standard says that OVERLAPS should have a two-element row
constructor on each side. The original coding of OVERLAPS support in
our grammar attempted to extend that by allowing a single-element row
constructor, which it internally duplicated ... or tried to, anyway.
But that code has certainly not worked since our List infrastructure was
rewritten in 2004, and I'm none too sure it worked before that. As it
stands, it ends up building a List that includes itself, leading to
assorted undesirable behaviors later in the parser.
Even if it worked as intended, it'd be a bit evil because of the
possibility of duplicate evaluation of a volatile function that the user
had written only once. Given the lack of documentation, test cases, or
complaints, let's just get rid of the idea and only support the standard
syntax.
While we're at it, improve the error cursor positioning for the
wrong-number-of-arguments errors, and inline the makeOverlaps() function
since it's only called in one place anyway.
Per bug #9227 from Joshua Yanovski. Initial patch by Joshua Yanovski,
extended a bit by me.
The ASLR in Windows 8/Windows 2012 can break PostgreSQL's shared memory. It
doesn't fail every time (which is explained by the Random part in ASLR), but
can fail with errors abut failing to reserve shared memory region.
MauMau, reviewed by Craig Ringer
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit. We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue. Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.
Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.
In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass. The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode"). This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.
Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement. Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest. In passing, add check_stack_depth() to prevent stack overflow
in related functions.
Back-patch to 8.4 (all supported versions). The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.
Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.
Security: CVE-2014-0064
Many server functions use the MAXDATELEN constant to size a buffer for
parsing or displaying a datetime value. It was much too small for the
longest possible interval output and slightly too small for certain
valid timestamp input, particularly input with a long timezone name.
The long input was rejected needlessly; the long output caused
interval_out() to overrun its buffer. ECPG's pgtypes library has a copy
of the vulnerable functions, which bore the same vulnerabilities along
with some of its own. In contrast to the server, certain long inputs
caused stack overflow rather than failing cleanly. Back-patch to 8.4
(all supported versions).
Reported by Daniel Schüssler, reviewed by Tom Lane.
Security: CVE-2014-0063
If the name lookups come to different conclusions due to concurrent
activity, we might perform some parts of the DDL on a different table
than other parts. At least in the case of CREATE INDEX, this can be
used to cause the permissions checks to be performed against a
different table than the index creation, allowing for a privilege
escalation attack.
This changes the calling convention for DefineIndex, CreateTrigger,
transformIndexStmt, transformAlterTableStmt, CheckIndexCompatible
(in 9.2 and newer), and AlterTable (in 9.1 and older). In addition,
CheckRelationOwnership is removed in 9.2 and newer and the calling
convention is changed in older branches. A field has also been added
to the Constraint node (FkConstraint in 8.4). Third-party code calling
these functions or using the Constraint node will require updating.
Report by Andres Freund. Patch by Robert Haas and Andres Freund,
reviewed by Tom Lane.
Security: CVE-2014-0062
The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly. Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve. Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.
Andres Freund, reviewed by Tom Lane and Noah Misch.
Security: CVE-2014-0061
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role. Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members. Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function. The restriction on
SECURITY DEFINER is not security-critical. However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see. Back-patch to 8.4 (all supported versions).
The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions. An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.
The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor. Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.
Reviewed by Andres Freund and Tom Lane. Reported, independently, by
Jonas Sundman and Noah Misch.
Security: CVE-2014-0060
DST law changes in Jordan; historical changes in Cuba.
Also, remove the zones Asia/Riyadh87, Asia/Riyadh88, and Asia/Riyadh89.
Per the upstream announcement:
The files solar87, solar88, and solar89 are no longer distributed.
They were a negative experiment -- that is, a demonstration that
tz data can represent solar time only with some difficulty and error.
Their presence in the distribution caused confusion, as Riyadh
civil time was generally not solar time in those years.
Adjust handleCopyOut() to stop trying to write data once it's failed
one time. For typical cases such as out-of-disk-space or broken-pipe,
additional attempts aren't going to do anything but waste time, and
in any case clean truncation of the output seems like a better behavior
than randomly dropping blocks in the middle.
Also remove dubious (and misleadingly documented) attempt to force our way
out of COPY_OUT state if libpq didn't do that. If we did have a situation
like that, it'd be a bug in libpq and would be better fixed there, IMO.
We can hope that commit fa4440f51628d692f077d54b8313aea31af087ea took care
of any such problems, anyway.
Also fix longstanding bug in handleCopyIn(): PQputCopyEnd() only supports
a non-null errormsg parameter in protocol version 3, and will actively
fail if one is passed in version 2. This would've made our attempts
to get out of COPY_IN state after a failure into infinite loops when
talking to pre-7.4 servers.
Back-patch the COPY_OUT state change business back to 9.2 where it was
introduced, and the other two fixes into all supported branches.
We used the length of the input string, not the de-escaped string, as
the trigger for NAMEDATALEN truncation. AFAICS this would only result
in sometimes printing a phony truncation warning; but it's just luck
that there was no worse problem, since we were violating the API spec
for truncate_identifier(). Per bug #9204 from Joshua Yanovski.
This has been wrong since the Unicode-identifier support was added,
so back-patch to all supported branches.
In pqSendSome, if the connection is already closed at entry, discard any
queued output data before returning. There is no possibility of ever
sending the data, and anyway this corresponds to what we'd do if we'd
detected a hard error while trying to send(). This avoids possible
indefinite bloat of the output buffer if the application keeps trying
to send data (or even just keeps trying to do PQputCopyEnd, as psql
indeed will).
Because PQputCopyEnd won't transition out of PGASYNC_COPY_IN state
until it's successfully queued the COPY END message, and pqPutMsgEnd
doesn't distinguish a queuing failure from a pqSendSome failure,
this omission allowed an infinite loop in psql if the connection closure
occurred when we had at least 8K queued to send. It might be worth
refactoring so that we can make that distinction, but for the moment
the other changes made here seem to offer adequate defenses.
To guard against other variants of this scenario, do not allow
PQgetResult to return a PGRES_COPY_XXX result if the connection is
already known dead. Make sure it returns PGRES_FATAL_ERROR instead.
Per report from Stephen Frost. Back-patch to all active branches.
In a database that's not yet reached consistency, it's possible that some
segments of a relation are not full-size but are not the last ones either.
Because of the way smgrnblocks() works, asking for a new page with P_NEW
will fill in the last not-full-size segment --- and if that makes it full
size, the apparent EOF of the relation will increase by more than one page,
so that the next P_NEW request will yield a page past the next consecutive
one. This breaks the relation-extension logic in XLogReadBufferExtended,
possibly allowing a page update to be applied to some page far past where
it was intended to go. This appears to be the explanation for reports of
table bloat on replication slaves compared to their masters, and probably
explains some corrupted-slave reports as well.
Fix the loop to check the page number it actually got, rather than merely
Assert()'ing that dead reckoning got it to the desired place. AFAICT,
there are no other places that make assumptions about exactly which page
they'll get from P_NEW.
Problem identified by Greg Stark, though this is not the same as his
proposed patch.
It's been like this for a long time, so back-patch to all supported
branches.
If an error occurs in the foreground (backup) process of pg_basebackup,
and we exit in a controlled way, the background process (streaming
xlog process) would stay around and keep streaming.
Providing this information as plain text was doubtless worth the trouble
ten years ago, but it seems likely that hardly anyone reads it in this
format anymore. And the effort required to maintain these files (in the
form of extra-complex markup rules in the relevant parts of the SGML
documentation) is significant. So, let's stop doing that and rely solely
on the other documentation formats.
Per discussion, the plain-text INSTALL instructions might still be worth
their keep, so we continue to generate that file.
Rather than remove HISTORY and src/test/regress/README from distribution
tarballs entirely, replace them with simple stub files that tell the reader
where to find the relevant documentation. This is mainly to avoid possibly
breaking packaging recipes that expect these files to exist.
Back-patch to all supported branches, because simplifying the markup
requirements for release notes won't help much unless we do it in all
branches.
When using verbose mode for pg_basebackup, in tar format sent to
stdout, we'd print an unitialized buffer as the filename.
Reported by Pontus Lundkvist
Given a composite-type parameter named x, "$1.*" worked fine, but "x.*"
not so much. This has been broken since named parameter references were
added in commit 9bff0780cf5be2193a5bad0d3df2dbe143085264, so patch back
to 9.2. Per bug #9085 from Hardy Falk.
In p_isdigit and other character class test functions generated by the
p_iswhat macro, the code path for non-C locales with multibyte encodings
contained a bogus pointer cast that would accidentally fail to malfunction
if types wchar_t and wint_t have the same width. Apparently that is true
on most platforms, but not on recent Cygwin releases. Remove the cast,
as it seems completely unnecessary (I think it arose from a false analogy
to the need to cast to unsigned char when dealing with the <ctype.h>
functions). Per bug #8970 from Marco Atzeri.
In the same functions, the code path for C locale with a multibyte encoding
simply ANDed each wide character with 0xFF before passing it to the
corresponding <ctype.h> function. This could result in false positive
answers for some non-ASCII characters, so use a range test instead.
Noted by me while investigating Marco's complaint.
Also, remove some useless though not actually buggy maskings and casts
in the hand-coded p_isalnum and p_isalpha functions, which evidently
got tested a bit more carefully than the macro-generated functions.
WalSndKill was doing things exactly backwards: it should first clear
MyWalSnd (to stop signal handlers from touching MyWalSnd->latch),
then disown the latch, and only then mark the WalSnd struct unused by
clearing its pid field.
Also, WalRcvSigUsr1Handler and worker_spi_sighup failed to preserve
errno, which is surely a requirement for any signal handler.
Per discussion of recent buildfarm failures. Back-patch as far
as the relevant code exists.
The preferred method is to use "cc -shared", and this allows binaries
to be rebased if required, unlike dllwrap.
Backpatch to 9.0 where we have buildfarm coverage.
There are still some issues with Cygwin, especially modern Cygwin, but
this helps us get closer to good support.
Marco Atzeri.
This has long been done by the MSVC build system, and has caused
confusion in the past when programs like psql have failed to start
because they can't find the DLL. If it's in the same directory as it now
will be they will find it.
Backpatch to all live branches.
Evidence from buildfarm member crake suggests that the new test_shm_mq
module is routinely crashing the server due to the arrival of a SIGUSR1
after the shared memory segment has been unmapped. Although processes
using the new dynamic background worker facilities are more likely to
receive a SIGUSR1 around this time, the problem is also possible on older
branches, so I'm back-patching the parts of this change that apply to
older branches as far as they apply.
It's already generally the case that code checks whether these pointers
are NULL before deferencing them, so the important thing is mostly to
make sure that they do get set to NULL before they become invalid. But
in master, there's one case in procsignal_sigusr1_handler that lacks a
NULL guard, so add that.
Patch by me; review by Tom Lane.
Various places were supposing that errno could be expected to hold still
within an ereport() nest or similar contexts. This isn't true necessarily,
though in some cases it accidentally failed to fail depending on how the
compiler chanced to order the subexpressions. This class of thinko
explains recent reports of odd failures on clang-built versions, typically
missing or inappropriate HINT fields in messages.
Problem identified by Christian Kruse, who also submitted the patch this
commit is based on. (I fixed a few issues in his patch and found a couple
of additional places with the same disease.)
Back-patch as appropriate to all supported branches.
In the platform that doesn't support Unix-domain socket, when
neither host nor hostaddr are specified, the default host
'localhost' is used to connect to the server and PQhost() must
return that, but it didn't. This patch fixes PQhost() so that
it returns the default host in that case.
Also this patch fixes PQhost() so that it doesn't return
Unix-domain socket directory path in the platform that doesn't
support Unix-domain socket.
Back-patch to all supported versions.
A while back, 2c92edad48796119c83d7dbe6c33425d1924626d allowed
type_func_name_keywords to be used in more places, including role
identifiers. Unfortunately, that commit missed out on cases where
name_list was used for lists-of-roles, eg: for DROP ROLE. This
resulted in the unfortunate situation that you could CREATE a role
with a type_func_name_keywords-allowed identifier, but not DROP it
(directly- ALTER could be used to rename it to something which
could be DROP'd).
This extends allowing type_func_name_keywords to places where role
lists can be used.
Back-patch to 9.0, as 2c92edad48796119c83d7dbe6c33425d1924626d was.
All these constructs generate parse trees consisting of a Const and
a run-time type coercion (perhaps a FuncExpr or a CoerceViaIO). Modify
the raw parse output so that we end up with the original token's location
attached to the type coercion node while the Const has location -1;
before, it was the other way around. This makes no difference in terms
of what exprLocation() will say about the parse tree as a whole, so it
should not have any user-visible impact. The point of changing it is that
we do not want contrib/pg_stat_statements to treat these constructs as
replaceable constants. It will do the right thing if the Const has
location -1 rather than a valid location.
This is a pretty ugly hack, but then this code is ugly already; we should
someday replace this translation with special-purpose parse node(s) that
would allow ruleutils.c to reconstruct the original query text.
(See also commit 5d3fcc4c2e137417ef470d604fee5e452b22f6a7, which also
hacked location assignment rules for the benefit of pg_stat_statements.)
Back-patch to 9.2 where pg_stat_statements grew the ability to recognize
replaceable constants.
Kyotaro Horiguchi
We've always allowed CREATE TABLE to create tables in the database's default
tablespace without checking for CREATE permissions on that tablespace.
Unfortunately, the original implementation of ALTER TABLE ... SET TABLESPACE
didn't pick up on that exception.
This changes ALTER TABLE ... SET TABLESPACE to allow the database's default
tablespace without checking for CREATE rights on that tablespace, just as
CREATE TABLE works today. Users could always do this through a series of
commands (CREATE TABLE ... AS SELECT * FROM ...; DROP TABLE ...; etc), so
let's fix the oversight in SET TABLESPACE's original implementation.