1
0
mirror of https://github.com/postgres/postgres.git synced 2025-08-28 18:48:04 +03:00
Commit Graph

4532 Commits

Author SHA1 Message Date
Tom Lane
2e3cc39556 Guard against bad "dscale" values in numeric_recv().
We were not checking to see if the supplied dscale was valid for the given
digit array when receiving binary-format numeric values.  While dscale can
validly be more than the number of nonzero fractional digits, it shouldn't
be less; that case causes fractional digits to be hidden on display even
though they're there and participate in arithmetic.

Bug #12053 from Tommaso Sala indicates that there's at least one broken
client library out there that sometimes supplies an incorrect dscale value,
leading to strange behavior.  This suggests that simply throwing an error
might not be the best response; it would lead to failures in applications
that might seem to be working fine today.  What seems the least risky fix
is to truncate away any digits that would be hidden by dscale.  This
preserves the existing behavior in terms of what will be printed for the
transmitted value, while preventing subsequent arithmetic from producing
results inconsistent with that.

In passing, throw a specific error for the case of dscale being outside
the range that will fit into a numeric's header.  Before you got "value
overflows numeric format", which is a bit misleading.

Back-patch to all supported branches.
2014-12-01 15:25:15 -05:00
Heikki Linnakangas
4d1c738d10 Fix two bugs in tsquery @> operator.
1. The comparison for matching terms used only the CRC to decide if there's
a match. Two different terms with the same CRC gave a match.

2. It assumed that if the second operand has more terms than the first, it's
never a match. That assumption is bogus, because there can be duplicate
terms in either operand.

Rewrite the implementation in a way that doesn't have those bugs.

Backpatch to all supported versions.
2014-10-27 10:51:36 +02:00
Tom Lane
2784b68b32 Support timezone abbreviations that sometimes change.
Up to now, PG has assumed that any given timezone abbreviation (such as
"EDT") represents a constant GMT offset in the usage of any particular
region; we had a way to configure what that offset was, but not for it
to be changeable over time.  But, as with most things horological, this
view of the world is too simplistic: there are numerous regions that have
at one time or another switched to a different GMT offset but kept using
the same timezone abbreviation.  Almost the entire Russian Federation did
that a few years ago, and later this month they're going to do it again.
And there are similar examples all over the world.

To cope with this, invent the notion of a "dynamic timezone abbreviation",
which is one that is referenced to a particular underlying timezone
(as defined in the IANA timezone database) and means whatever it currently
means in that zone.  For zones that use or have used daylight-savings time,
the standard and DST abbreviations continue to have the property that you
can specify standard or DST time and get that time offset whether or not
DST was theoretically in effect at the time.  However, the abbreviations
mean what they meant at the time in question (or most recently before that
time) rather than being absolutely fixed.

The standard abbreviation-list files have been changed to use this behavior
for abbreviations that have actually varied in meaning since 1970.  The
old simple-numeric definitions are kept for abbreviations that have not
changed, since they are a bit faster to resolve.

While this is clearly a new feature, it seems necessary to back-patch it
into all active branches, because otherwise use of Russian zone
abbreviations is going to become even more problematic than it already was.
This change supersedes the changes in commit 513d06ded et al to modify the
fixed meanings of the Russian abbreviations; since we've not shipped that
yet, this will avoid an undesirably incompatible (not to mention incorrect)
change in behavior for timestamps between 2011 and 2014.

This patch makes some cosmetic changes in ecpglib to keep its usage of
datetime lookup tables as similar as possible to the backend code, but
doesn't do anything about the increasingly obsolete set of timezone
abbreviation definitions that are hard-wired into ecpglib.  Whatever we
do about that will likely not be appropriate material for back-patching.
Also, a potential free() of a garbage pointer after an out-of-memory
failure in ecpglib has been fixed.

This patch also fixes pre-existing bugs in DetermineTimeZoneOffset() that
caused it to produce unexpected results near a timezone transition, if
both the "before" and "after" states are marked as standard time.  We'd
only ever thought about or tested transitions between standard and DST
time, but that's not what's happening when a zone simply redefines their
base GMT offset.

In passing, update the SGML documentation to refer to the Olson/zoneinfo/
zic timezone database as the "IANA" database, since it's now being
maintained under the auspices of IANA.
2014-10-16 15:22:23 -04:00
Tom Lane
745723c9ef Update time zone data files to tzdata release 2014h.
Most zones in the Russian Federation are subtracting one or two hours
as of 2014-10-26.  Update the meanings of the abbreviations IRKT, KRAT,
MAGT, MSK, NOVT, OMST, SAKT, VLAT, YAKT, YEKT to match.

The IANA timezone database has adopted abbreviations of the form AxST/AxDT
for all Australian time zones, reflecting what they believe to be current
majority practice Down Under.  These names do not conflict with usage
elsewhere (other than ACST for Acre Summer Time, which has been in disuse
since 1994).  Accordingly, adopt these names into our "Default" timezone
abbreviation set.  The "Australia" abbreviation set now contains only
CST,EAST,EST,SAST,SAT,WST, all of which are thought to be mostly historical
usage.  Note that SAST has also been changed to be South Africa Standard
Time in the "Default" abbreviation set.

Add zone abbreviations SRET (Asia/Srednekolymsk) and XJT (Asia/Urumqi),
and use WSST/WSDT for western Samoa.

Also a DST law change in the Turks & Caicos Islands (America/Grand_Turk),
and numerous corrections for historical time zone data.
2014-10-04 14:18:39 -04:00
Tom Lane
cf5c20b063 Fix power_var_int() for large integer exponents.
The code for raising a NUMERIC value to an integer power wasn't very
careful about large powers.  It got an outright wrong answer for an
exponent of INT_MIN, due to failure to consider overflow of the Abs(exp)
operation; which is fixable by using an unsigned rather than signed
exponent value after that point.  Also, even though the number of
iterations of the power-computation loop is pretty limited, it's easy for
the repeated squarings to result in ridiculously enormous intermediate
values, which can take unreasonable amounts of time/memory to process,
or even overflow the internal "weight" field and so produce a wrong answer.
We can forestall misbehaviors of that sort by bailing out as soon as the
weight value exceeds what will fit in int16, since then the final answer
must overflow (if exp > 0) or underflow (if exp < 0) the packed numeric
format.

Per off-list report from Pavel Stehule.  Back-patch to all supported
branches.
2014-09-11 23:31:03 -04:00
Andres Freund
6bf6e528af Fix longstanding bug in HeapTupleSatisfiesVacuum().
HeapTupleSatisfiesVacuum() didn't properly discern between
DELETE_IN_PROGRESS and INSERT_IN_PROGRESS for rows that have been
inserted in the current transaction and deleted in a aborted
subtransaction of the current backend. At the very least that caused
problems for CLUSTER and CREATE INDEX in transactions that had
aborting subtransactions producing rows, leading to warnings like:
WARNING:  concurrent delete in progress within table "..."
possibly in an endless, uninterruptible, loop.

Instead of treating *InProgress xmins the same as *IsCurrent ones,
treat them as being distinct like the other visibility routines. As
implemented this separatation can cause a behaviour change for rows
that have been inserted and deleted in another, still running,
transaction. HTSV will now return INSERT_IN_PROGRESS instead of
DELETE_IN_PROGRESS for those. That's both, more in line with the other
visibility routines and arguably more correct. The latter because a
INSERT_IN_PROGRESS will make callers look at/wait for xmin, instead of
xmax.
The only current caller where that's possibly worse than the old
behaviour is heap_prune_chain() which now won't mark the page as
prunable if a row has concurrently been inserted and deleted. That's
harmless enough.

As a cautionary measure also insert a interrupt check before the gotos
in IndexBuildHeapScan() that lead to the uninterruptible loop. There
are other possible causes, like a row that several sessions try to
update and all fail, for repeated loops and the cost of doing so in
the retry case is low.

As this bug goes back all the way to the introduction of
subtransactions in 573a71a5da backpatch to all supported releases.

Reported-By: Sandro Santilli
2014-06-04 23:26:30 +02:00
Heikki Linnakangas
f47e4ce6ce Handle duplicate XIDs in txid_snapshot.
The proc array can contain duplicate XIDs, when a transaction is just being
prepared for two-phase commit. To cope, remove any duplicates in
txid_current_snapshot(). Also ignore duplicates in the input functions, so
that if e.g. you have an old pg_dump file that already contains duplicates,
it will be accepted.

Report and fix by Jan Wieck. Backpatch to all supported versions.
2014-05-15 18:31:08 +03:00
Bruce Momjian
2616a5d300 Remove tabs after spaces in C comments
This was not changed in HEAD, but will be done later as part of a
pgindent run.  Future pgindent runs will also do this.

Report by Tom Lane

Backpatch through all supported branches, but not HEAD
2014-05-06 11:26:26 -04:00
Tom Lane
db1fdc945d Fix failure to detoast fields in composite elements of structured types.
If we have an array of records stored on disk, the individual record fields
cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are
only prepared to deal with TOAST pointers appearing in top-level fields of
a stored row.  The same applies for ranges over composite types, nested
composites, etc.  However, the existing code only took care of expanding
sub-field TOAST pointers for the case of nested composites, not for other
structured types containing composites.  For example, given a command such
as

UPDATE tab SET arraycol = ARRAY[(ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field
is long enough to be toasted out-of-line then the TOAST pointer would be
inserted as-is into the array column.  If the source record for x is later
deleted, the array field value would become a dangling pointer, leading
to errors along the line of "missing chunk number 0 for toast value ..."
when the value is referenced.  A reproducible test case for this was
provided by Jan Pecek, but it seems likely that some of the "missing chunk
number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to
produce a self-contained Datum value if the Datum is of composite type.
Seen in this light, the problem is not just confined to arrays and ranges,
but could also affect some other places where detoasting is done in that
way, for example form_index_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute()
along with PG_DETOAST_DATUM() when the array element type is composite,
but this was messy and imposed extra cache lookup costs whether or not any
TOAST pointers were present, indeed sometimes when the array element type
isn't even composite (since sometimes it takes a typcache lookup to find
that out).  The idea of extending that approach to all the places that
currently use PG_DETOAST_DATUM() wasn't attractive at all.

This patch instead solves the problem by decreeing that composite Datum
values must not contain any out-of-line TOAST pointers in the first place;
that is, we expand out-of-line fields at the point of constructing a
composite Datum, not at the point where we're about to insert it into a
larger tuple.  This rule is applied only to true composite Datums, not
to tuples that are being passed around the system as tuples, so it's not
as invasive as it might sound at first.  With this approach, the amount
of code that has to be touched for a full solution is greatly reduced,
and added cache lookup costs are avoided except when there actually is
a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference
a TOAST pointer that will never actually be used by the query, imposing a
rather large cost that wasn't there before.  On the other side of the coin,
if the field value is used multiple times then we'll come out ahead by
avoiding repeat detoastings.  Experimentation suggests that common SQL
coding patterns are unaffected either way, though.  Applications that are
very negatively affected could be advised to modify their code to not fetch
columns they won't be using.

In future, we might consider reverting this solution in favor of detoasting
only at the point where data is about to be stored to disk, using some
method that can drill down into multiple levels of nested structured types.
That will require defining new APIs for structured types, though, so it
doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function
call; this means that any third-party code using that macro will not get
protection against creating TOAST-pointer-containing Datums until it's
recompiled.  The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER().
It seems likely that this is not a big problem in practice: most of the
tuple-returning functions in core and contrib produce outputs that could
not possibly be toasted anyway, and the same probably holds for third-party
extensions.

This bug has existed since TOAST was invented, so back-patch to all
supported branches.
2014-05-01 15:19:17 -04:00
Tom Lane
3897ee9b1f Check for interrupts and stack overflow during rule/view dumps.
Since ruleutils.c recurses, it could be driven to stack overflow by
deeply nested constructs.  Very large queries might also take long
enough to deparse that a check for interrupts seems like a good idea.
Stick appropriate tests into a couple of key places.

Noted by Greg Stark.  Back-patch to all supported branches.
2014-04-30 13:46:22 -04:00
Tom Lane
cb11f4d8d5 Fix processing of PGC_BACKEND GUC parameters on Windows.
EXEC_BACKEND builds (i.e., Windows) failed to absorb values of PGC_BACKEND
parameters if they'd been changed post-startup via the config file.  This
for example prevented log_connections from working if it were turned on
post-startup.  The mechanism for handling this case has always been a bit
of a kluge, and it wasn't revisited when we implemented EXEC_BACKEND.
While in a normal forking environment new backends will inherit the
postmaster's value of such settings, EXEC_BACKEND backends have to read
the settings from the CONFIG_EXEC_PARAMS file, and they were mistakenly
rejecting them.  So this case has always been broken in the Windows port;
so back-patch to all supported branches.

Amit Kapila
2014-04-05 12:41:34 -04:00
Tom Lane
2b52069011 Fix bugs in manipulation of PgBackendStatus.st_clienthostname.
Initialization of this field was not being done according to the
st_changecount protocol (it has to be done within the changecount increment
range, not outside).  And the test to see if the value should be reported
as null was wrong.  Noted while perusing uses of Port.remote_hostname.

This was wrong from the introduction of this code (commit 4a25bc145),
so back-patch to 9.1.
2014-04-01 21:30:18 -04:00
Tom Lane
f16ca97550 Prevent interrupts while reporting non-ERROR elog messages.
This should eliminate the risk of recursive entry to syslog(3), which
appears to be the cause of the hang reported in bug #9551 from James
Morton.

Arguably, the real problem here is auth.c's willingness to turn on
ImmediateInterruptOK while executing fairly wide swaths of backend code.
We may well need to work at narrowing the code ranges in which the
authentication_timeout interrupt is enabled.  For the moment, though,
this is a cheap and reasonably noninvasive fix for a field-reported
failure; the other approach would be complex and not necessarily
bug-free itself.

Back-patch to all supported branches.
2014-03-13 20:59:51 -04:00
Heikki Linnakangas
c7a186e392 Avoid memcpy() with same source and destination address.
The behavior of that is undefined, although unlikely to lead to problems in
practice.

Found by running regression tests with Valgrind.
2014-03-07 13:29:47 +02:00
Tom Lane
f3e3f6c5ba Avoid getting more than AccessShareLock when deparsing a query.
In make_ruledef and get_query_def, we have long used AcquireRewriteLocks
to ensure that the querytree we are about to deparse is up-to-date and
the schemas of the underlying relations aren't changing.  Howwever, that
function thinks the query is about to be executed, so it acquires locks
that are stronger than necessary for the purpose of deparsing.  Thus for
example, if pg_dump asks to deparse a rule that includes "INSERT INTO t",
we'd acquire RowExclusiveLock on t.  That results in interference with
concurrent transactions that might for example ask for ShareLock on t.
Since pg_dump is documented as being purely read-only, this is unexpected.
(Worse, it used to actually be read-only; this behavior dates back only
to 8.1, cf commit ba4200246.)

Fix this by adding a parameter to AcquireRewriteLocks to tell it whether
we want the "real" execution locks or only AccessShareLock.

Report, diagnosis, and patch by Dean Rasheed.  Back-patch to all supported
branches.
2014-03-06 19:31:16 -05:00
Tom Lane
a9eb4924a5 Allow regex operations to be terminated early by query cancel requests.
The regex code didn't have any provision for query cancel; which is
unsurprising given its non-Postgres origin, but still problematic since
some operations can take a long time.  Introduce a callback function to
check for a pending query cancel or session termination request, and
call it in a couple of strategic spots where we can make the regex code
exit with an error indicator.

If we ever actually split out the regex code as a standalone library,
some additional work will be needed to let the cancel callback function
be specified externally to the library.  But that's straightforward
(certainly so by comparison to putting the locale-dependent character
classification logic on a similar arms-length basis), and there seems
no need to do it right now.

A bigger issue is that there may be more places than these two where
we need to check for cancels.  We can always add more checks later,
now that the infrastructure is in place.

Since there are known examples of not-terribly-long regexes that can
lock up a backend for a long time, back-patch to all supported branches.
I have hopes of fixing the known performance problems later, but adding
query cancel ability seems like a good idea even if they were all fixed.
2014-03-01 15:21:07 -05:00
Tom Lane
3e2db4c80b Use SnapshotDirty rather than an active snapshot to probe index endpoints.
If there are lots of uncommitted tuples at the end of the index range,
get_actual_variable_range() ends up fetching each one and doing an MVCC
visibility check on it, until it finally hits a visible tuple.  This is
bad enough in isolation, considering that we don't need an exact answer
only an approximate one.  But because the tuples are not yet committed,
each visibility check does a TransactionIdIsInProgress() test, which
involves scanning the ProcArray.  When multiple sessions do this
concurrently, the ensuing contention results in horrid performance loss.
20X overall throughput loss on not-too-complicated queries is easy to
demonstrate in the back branches (though someone's made it noticeably
less bad in HEAD).

We can dodge the problem fairly effectively by using SnapshotDirty rather
than a normal MVCC snapshot.  This will cause the index probe to take
uncommitted tuples as good, so that we incur only one tuple fetch and test
even if there are many such tuples.  The extent to which this degrades the
estimate is debatable: it's possible the result is actually a more accurate
prediction than before, if the endmost tuple has become committed by the
time we actually execute the query being planned.  In any case, it's not
very likely that it makes the estimate a lot worse.

SnapshotDirty will still reject tuples that are known committed dead, so
we won't give bogus answers if an invalid outlier has been deleted but not
yet vacuumed from the index.  (Because btrees know how to mark such tuples
dead in the index, we shouldn't have a big performance problem in the case
that there are many of them at the end of the range.)  This consideration
motivates not using SnapshotAny, which was also considered as a fix.

Note: the back branches were using SnapshotNow instead of an MVCC snapshot,
but the problem and solution are the same.

Per performance complaints from Bartlomiej Romanski, Josh Berkus, and
others.  Back-patch to 9.0, where the issue was introduced (by commit
40608e7f94).
2014-02-25 16:04:16 -05:00
Tom Lane
4741e31600 Prevent potential overruns of fixed-size buffers.
Coverity identified a number of places in which it couldn't prove that a
string being copied into a fixed-size buffer would fit.  We believe that
most, perhaps all of these are in fact safe, or are copying data that is
coming from a trusted source so that any overrun is not really a security
issue.  Nonetheless it seems prudent to forestall any risk by using
strlcpy() and similar functions.

Fixes by Peter Eisentraut and Jozef Mlich based on Coverity reports.

In addition, fix a potential null-pointer-dereference crash in
contrib/chkpass.  The crypt(3) function is defined to return NULL on
failure, but chkpass.c didn't check for that before using the result.
The main practical case in which this could be an issue is if libc is
configured to refuse to execute unapproved hashing algorithms (e.g.,
"FIPS mode").  This ideally should've been a separate commit, but
since it touches code adjacent to one of the buffer overrun changes,
I included it in this commit to avoid last-minute merge issues.
This issue was reported by Honza Horak.

Security: CVE-2014-0065 for buffer overruns, CVE-2014-0066 for crypt()
2014-02-17 11:20:31 -05:00
Noah Misch
0b7026d964 Predict integer overflow to avoid buffer overruns.
Several functions, mostly type input functions, calculated an allocation
size such that the calculation wrapped to a small positive value when
arguments implied a sufficiently-large requirement.  Writes past the end
of the inadvertent small allocation followed shortly thereafter.
Coverity identified the path_in() vulnerability; code inspection led to
the rest.  In passing, add check_stack_depth() to prevent stack overflow
in related functions.

Back-patch to 8.4 (all supported versions).  The non-comment hstore
changes touch code that did not exist in 8.4, so that part stops at 9.0.

Noah Misch and Heikki Linnakangas, reviewed by Tom Lane.

Security: CVE-2014-0064
2014-02-17 09:33:37 -05:00
Noah Misch
23b5a85e60 Prevent privilege escalation in explicit calls to PL validators.
The primary role of PL validators is to be called implicitly during
CREATE FUNCTION, but they are also normal functions that a user can call
explicitly.  Add a permissions check to each validator to ensure that a
user cannot use explicit validator calls to achieve things he could not
otherwise achieve.  Back-patch to 8.4 (all supported versions).
Non-core procedural language extensions ought to make the same two-line
change to their own validators.

Andres Freund, reviewed by Tom Lane and Noah Misch.

Security: CVE-2014-0061
2014-02-17 09:33:36 -05:00
Noah Misch
5d320a16ca Shore up ADMIN OPTION restrictions.
Granting a role without ADMIN OPTION is supposed to prevent the grantee
from adding or removing members from the granted role.  Issuing SET ROLE
before the GRANT bypassed that, because the role itself had an implicit
right to add or remove members.  Plug that hole by recognizing that
implicit right only when the session user matches the current role.
Additionally, do not recognize it during a security-restricted operation
or during execution of a SECURITY DEFINER function.  The restriction on
SECURITY DEFINER is not security-critical.  However, it seems best for a
user testing his own SECURITY DEFINER function to see the same behavior
others will see.  Back-patch to 8.4 (all supported versions).

The SQL standards do not conflate roles and users as PostgreSQL does;
only SQL roles have members, and only SQL users initiate sessions.  An
application using PostgreSQL users and roles as SQL users and roles will
never attempt to grant membership in the role that is the session user,
so the implicit right to add or remove members will never arise.

The security impact was mostly that a role member could revoke access
from others, contrary to the wishes of his own grantor.  Unapproved role
member additions are less notable, because the member can still largely
achieve that by creating a view or a SECURITY DEFINER function.

Reviewed by Andres Freund and Tom Lane.  Reported, independently, by
Jonas Sundman and Noah Misch.

Security: CVE-2014-0060
2014-02-17 09:33:33 -05:00
Tom Lane
3f721588ae Fix possible crashes due to using elog/ereport too early in startup.
Per reports from Andres Freund and Luke Campbell, a server failure during
set_pglocale_pgservice results in a segfault rather than a useful error
message, because the infrastructure needed to use ereport hasn't been
initialized; specifically, MemoryContextInit hasn't been called.
One known cause of this is starting the server in a directory it
doesn't have permission to read.

We could try to prevent set_pglocale_pgservice from using anything that
depends on palloc or elog, but that would be messy, and the odds of future
breakage seem high.  Moreover there are other things being called in main.c
that look likely to use palloc or elog too --- perhaps those things
shouldn't be there, but they are there today.  The best solution seems to
be to move the call of MemoryContextInit to very early in the backend's
real main() function.  I've verified that an elog or ereport occurring
immediately after that is now capable of sending something useful to
stderr.

I also added code to elog.c to print something intelligible rather than
just crashing if MemoryContextInit hasn't created the ErrorContext.
This could happen if MemoryContextInit itself fails (due to malloc
failure), and provides some future-proofing against someone trying to
sneak in new code even earlier in server startup.

Back-patch to all supported branches.  Since we've only heard reports of
this type of failure recently, it may be that some recent change has made
it more likely to see a crash of this kind; but it sure looks like it's
broken all the way back.
2014-01-11 16:35:37 -05:00
Kevin Grittner
1f069d21d3 Fix misplaced right paren bugs in pgstatfuncs.c.
The bug would only show up if the C sockaddr structure contained
zero in the first byte for a valid address; otherwise it would
fail to fail, which is probably why it went unnoticed for so long.

Patch submitted by Joel Jacobson after seeing an article by Andrey
Karpov in which he reports finding this through static code
analysis using PVS-Studio.  While I was at it I moved a definition
of a local variable referenced in the buggy code to a more local
context.

Backpatch to all supported branches.
2013-12-27 15:41:18 -06:00
Peter Eisentraut
e1f7173ea1 Avoid potential buffer overflow crash
A pointer to a C string was treated as a pointer to a "name" datum and
passed to SPI_execute_plan().  This pointer would then end up being
passed through datumCopy(), which would try to copy the entire 64 bytes
of name data, thus running past the end of the C string.  Fix by
converting the string to a proper name structure.

Found by LLVM AddressSanitizer.
2013-11-23 07:31:07 -05:00
Tom Lane
8cfd4c6a17 Be more robust when strerror() doesn't give a useful result.
Back-patch commits 8e68816cc2 and
8dace66e07 into the stable branches.
Buildfarm testing revealed no great portability surprises, and it
seems useful to have this robustness improvement in all branches.
2013-11-07 16:33:28 -05:00
Tom Lane
172ba45780 Fix some odd behaviors when using a SQL-style simple GMT offset timezone.
Formerly, when using a SQL-spec timezone setting with a fixed GMT offset
(called a "brute force" timezone in the code), the session_timezone
variable was not updated to match the nominal timezone; rather, all code
was expected to ignore session_timezone if HasCTZSet was true.  This is
of course obviously fragile, though a search of the code finds only
timeofday() failing to honor the rule.  A bigger problem was that
DetermineTimeZoneOffset() supposed that if its pg_tz parameter was
pointer-equal to session_timezone, then HasCTZSet should override the
parameter.  This would cause datetime input containing an explicit zone
name to be treated as referencing the brute-force zone instead, if the
zone name happened to match the session timezone that had prevailed
before installing the brute-force zone setting (as reported in bug #8572).
The same malady could affect AT TIME ZONE operators.

To fix, set up session_timezone so that it matches the brute-force zone
specification, which we can do using the POSIX timezone definition syntax
"<abbrev>offset", and get rid of the bogus lookaside check in
DetermineTimeZoneOffset().  Aside from fixing the erroneous behavior in
datetime parsing and AT TIME ZONE, this will cause the timeofday() function
to print its result in the user-requested time zone rather than some
previously-set zone.  It might also affect results in third-party
extensions, if there are any that make use of session_timezone without
considering HasCTZSet, but in all cases the new behavior should be saner
than before.

Back-patch to all supported branches.
2013-11-01 12:13:29 -04:00
Tom Lane
6084c0705a Make sure float4in/float8in accept all standard spellings of "infinity".
The C99 and POSIX standards require strtod() to accept all these spellings
(case-insensitively): "inf", "+inf", "-inf", "infinity", "+infinity",
"-infinity".  However, pre-C99 systems might accept only some or none of
these, and apparently Windows still doesn't accept "inf".  To avoid
surprising cross-platform behavioral differences, manually check for each
of these spellings if strtod() fails.  We were previously handling just
"infinity" and "-infinity" that way, but since C99 is most of the world
now, it seems likely that applications are expecting all these spellings
to work.

Per bug #8355 from Basil Peace.  It turns out this fix won't actually
resolve his problem, because Python isn't being this careful; but that
doesn't mean we shouldn't be.
2013-08-03 12:40:45 -04:00
Alvaro Herrera
c928dc9dfb Fix old visibility bug in HeapTupleSatisfiesDirty
If a tuple is locked but not updated by a concurrent transaction,
HeapTupleSatisfiesDirty would return that transaction's Xid in xmax,
causing callers to wait on it, when it is not necessary (in fact, if the
other transaction had used a multixact instead of a plain Xid to mark
the tuple, HeapTupleSatisfiesDirty would have behave differently and
*not* returned the Xmax).

This bug was introduced in commit 3f7fbf85dc, dated December 1998,
so it's almost 15 years old now.  However, it's hard to see this
misbehave, because before we had NOWAIT the only consequence of this is
that transactions would wait for slightly more time than necessary; so
it's not surprising that this hasn't been reported yet.

Craig Ringer and Andres Freund
2013-08-02 17:07:08 -04:00
Tom Lane
1346f40b58 Fix regexp_matches() handling of zero-length matches.
We'd find the same match twice if it was of zero length and not immediately
adjacent to the previous match.  replace_text_regexp() got similar cases
right, so adjust this search logic to match that.  Note that even though
the regexp_split_to_xxx() functions share this code, they did not display
equivalent misbehavior, because the second match would be considered
degenerate and ignored.

Jeevan Chalke, with some cosmetic changes by me.
2013-07-31 11:31:33 -04:00
Tom Lane
13f11c8a8c Fix booltestsel() for case where we have NULL stats but not MCV stats.
In a boolean column that contains mostly nulls, ANALYZE might not find
enough non-null values to populate the most-common-values stats,
but it would still create a pg_statistic entry with stanullfrac set.
The logic in booltestsel() for this situation did the wrong thing for
"col IS NOT TRUE" and "col IS NOT FALSE" tests, forgetting that null
values would satisfy these tests (so that the true selectivity would
be close to one, not close to zero).  Per bug #8274.

Fix by Andrew Gierth, some comment-smithing by me.
2013-07-24 00:44:46 -04:00
Tom Lane
c1f51ed296 Change post-rewriter representation of dropped columns in joinaliasvars.
It's possible to drop a column from an input table of a JOIN clause in a
view, if that column is nowhere actually referenced in the view.  But it
will still be there in the JOIN clause's joinaliasvars list.  We used to
replace such entries with NULL Const nodes, which is handy for generation
of RowExpr expansion of a whole-row reference to the view.  The trouble
with that is that it can't be distinguished from the situation after
subquery pull-up of a constant subquery output expression below the JOIN.
Instead, replace such joinaliasvars with null pointers (empty expression
trees), which can't be confused with pulled-up expressions.  expandRTE()
still emits the old convention, though, for convenience of RowExpr
generation and to reduce the risk of breaking extension code.

In HEAD and 9.3, this patch also fixes a problem with some new code in
ruleutils.c that was failing to cope with implicitly-casted joinaliasvars
entries, as per recent report from Feike Steenbergen.  That oversight was
because of an inadequate description of the data structure in parsenodes.h,
which I've now corrected.  There were some pre-existing oversights of the
same ilk elsewhere, which I believe are now all fixed.
2013-07-23 16:23:11 -04:00
Stephen Frost
3cb7a393e8 Ensure 64bit arithmetic when calculating tapeSpace
In tuplesort.c:inittapes(), we calculate tapeSpace by first figuring
out how many 'tapes' we can use (maxTapes) and then multiplying the
result by the tape buffer overhead for each.  Unfortunately, when
we are on a system with an 8-byte long, we allow work_mem to be
larger than 2GB and that allows maxTapes to be large enough that the
32bit arithmetic can overflow when multiplied against the buffer
overhead.

When this overflow happens, we end up adding the overflow to the
amount of space available, causing the amount of memory allocated to
be larger than work_mem.

Note that to reach this point, you have to set work mem to at least
24GB and be sorting a set which is at least that size.  Given that a
user who can set work_mem to 24GB could also set it even higher, if
they were looking to run the system out of memory, this isn't
considered a security issue.

This overflow risk was found by the Coverity scanner.

Back-patch to all supported branches, as this issue has existed
since before 8.4.
2013-07-14 16:43:47 -04:00
Tom Lane
e46753e9b9 Fix cache flush hazard in cache_record_field_properties().
We need to increment the refcount on the composite type's cached tuple
descriptor while we do lookups of its column types.  Otherwise a cache
flush could occur and release the tuple descriptor before we're done with
it.  This fails reliably with -DCLOBBER_CACHE_ALWAYS, but the odds of a
failure in a production build seem rather low (since the pfree'd descriptor
typically wouldn't get scribbled on immediately).  That may explain the
lack of any previous reports.  Buildfarm issue noted by Christian Ullrich.

Back-patch to 9.1 where the bogus code was added.
2013-06-11 17:26:53 -04:00
Tom Lane
0c2c0f82c2 Guard against input_rows == 0 in estimate_num_groups().
This case doesn't normally happen, because the planner usually clamps
all row estimates to at least one row; but I found that it can arise
when dealing with relations excluded by constraints.  Without a defense,
estimate_num_groups() can return zero, which leads to divisions by zero
inside the planner as well as assertion failures in the executor.

An alternative fix would be to change set_dummy_rel_pathlist() to make
the size estimate for a dummy relation 1 row instead of 0, but that seemed
pretty ugly; and probably someday we'll want to drop the convention that
the minimum rowcount estimate is 1 row.

Back-patch to 8.4, as the problem can be demonstrated that far back.
2013-05-10 17:15:43 -04:00
Tom Lane
c13ed70a4f Fix longstanding race condition in plancache.c.
When creating or manipulating a cached plan for a transaction control
command (particularly ROLLBACK), we must not perform any catalog accesses,
since we might be in an aborted transaction.  However, plancache.c busily
saved or examined the search_path for every cached plan.  If we were
unlucky enough to do this at a moment where the path's expansion into
schema OIDs wasn't already cached, we'd do some catalog accesses; and with
some more bad luck such as an ill-timed signal arrival, that could lead to
crashes or Assert failures, as exhibited in bug #8095 from Nachiket Vaidya.
Fortunately, there's no real need to consider the search path for such
commands, so we can just skip the relevant steps when the subject statement
is a TransactionStmt.  This is somewhat related to bug #5269, though the
failure happens during initial cached-plan creation rather than
revalidation.

This bug has been there since the plan cache was invented, so back-patch
to all supported branches.
2013-04-20 16:59:31 -04:00
Tom Lane
ddf177228f Fix insecure parsing of server command-line switches.
An oversight in commit e710b65c1c allowed
database names beginning with "-" to be treated as though they were secure
command-line switches; and this switch processing occurs before client
authentication, so that even an unprivileged remote attacker could exploit
the bug, needing only connectivity to the postmaster's port.  Assorted
exploits for this are possible, some requiring a valid database login,
some not.  The worst known problem is that the "-r" switch can be invoked
to redirect the process's stderr output, so that subsequent error messages
will be appended to any file the server can write.  This can for example be
used to corrupt the server's configuration files, so that it will fail when
next restarted.  Complete destruction of database tables is also possible.

Fix by keeping the database name extracted from a startup packet fully
separate from command-line switches, as had already been done with the
user name field.

The Postgres project thanks Mitsumasa Kondo for discovering this bug,
Kyotaro Horiguchi for drafting the fix, and Noah Misch for recognizing
the full extent of the danger.

Security: CVE-2013-1899
2013-04-01 14:01:04 -04:00
Tom Lane
b403f4107b Make REPLICATION privilege checks test current user not authenticated user.
The pg_start_backup() and pg_stop_backup() functions checked the privileges
of the initially-authenticated user rather than the current user, which is
wrong.  For example, a user-defined index function could successfully call
these functions when executed by ANALYZE within autovacuum.  This could
allow an attacker with valid but low-privilege database access to interfere
with creation of routine backups.  Reported and fixed by Noah Misch.

Security: CVE-2013-1901
2013-04-01 13:09:35 -04:00
Tom Lane
81e2255fc7 Fix to_char() to use ASCII-only case-folding rules where appropriate.
formatting.c used locale-dependent case folding rules in some code paths
where the result isn't supposed to be locale-dependent, for example
to_char(timestamp, 'DAY').  Since the source data is always just ASCII
in these cases, that usually didn't matter ... but it does matter in
Turkish locales, which have unusual treatment of "i" and "I".  To confuse
matters even more, the misbehavior was only visible in UTF8 encoding,
because in single-byte encodings we used pg_toupper/pg_tolower which
don't have locale-specific behavior for ASCII characters.  Fix by providing
intentionally ASCII-only case-folding functions and using these where
appropriate.  Per bug #7913 from Adnan Dursun.  Back-patch to all active
branches, since it's been like this for a long time.
2013-03-05 13:02:38 -05:00
Tom Lane
3a77936602 Fix overflow check in tm2timestamp (this time for sure).
I fixed this code back in commit 841b4a2d5, but didn't think carefully
enough about the behavior near zero, which meant it improperly rejected
1999-12-31 24:00:00.  Per report from Magnus Hagander.
2013-03-04 15:14:00 -05:00
Tom Lane
1881634820 Prevent execution of enum_recv() from SQL.
This function was misdeclared to take cstring when it should take internal.
This at least allows crashing the server, and in principle an attacker
might be able to use the function to examine the contents of server memory.

The correct fix is to adjust the system catalog contents (and fix the
regression tests that should have caught this but failed to).  However,
asking users to correct the catalog contents in existing installations
is a pain, so as a band-aid fix for the back branches, install a check
in enum_recv() to make it throw error if called with a cstring argument.
We will later revert this in HEAD in favor of correcting the catalogs.

Our thanks to Sumit Soni (via Secunia SVCRP) for reporting this issue.

Security: CVE-2013-0255
2013-02-04 16:25:15 -05:00
Simon Riggs
2113574be8 Reset vacuum_defer_cleanup_age to PGC_SIGHUP.
Revert commit 84725aa5ef
2013-02-04 16:42:12 +00:00
Simon Riggs
ac148a2e93 Mark vacuum_defer_cleanup_age as PGC_POSTMASTER.
Following bug analysis of #7819 by Tom Lane
2013-02-02 18:51:21 +00:00
Tom Lane
4a6232cce3 Fix error-checking typo in check_TSCurrentConfig().
The code failed to detect an out-of-memory failure.

Xi Wang
2013-01-20 23:10:18 -05:00
Tom Lane
d4c78c1811 Reject out-of-range dates in to_date().
Dates outside the supported range could be entered, but would not print
reasonably, and operations such as conversion to timestamp wouldn't behave
sanely either.  Since this has the potential to result in undumpable table
data, it seems worth back-patching.

Hitoshi Harada
2013-01-14 15:20:15 -05:00
Tom Lane
ed98b48bf4 Fix failure to ignore leftover temp tables after a server crash.
During crash recovery, we remove disk files belonging to temporary tables,
but the system catalog entries for such tables are intentionally not
cleaned up right away.  Instead, the first backend that uses a temp schema
is expected to clean out any leftover objects therein.  This approach
requires that we be careful to ignore leftover temp tables (since any
actual access attempt would fail), *even if their BackendId matches our
session*, if we have not yet established use of the session's corresponding
temp schema.  That worked fine in the past, but was broken by commit
debcec7dc3 which incorrectly removed the
rd_islocaltemp relcache flag.  Put it back, and undo various changes
that substituted tests like "rel->rd_backend == MyBackendId" for use
of a state-aware flag.  Per trouble report from Heikki Linnakangas.

Back-patch to 9.1 where the erroneous change was made.  In the back
branches, be careful to add rd_islocaltemp in a spot in the struct that
was alignment padding before, so as not to break existing add-on code.
2012-12-17 20:15:45 -05:00
Tom Lane
9d39e94999 Fix filling of postmaster.pid in bootstrap/standalone mode.
We failed to ever fill the sixth line (LISTEN_ADDR), which caused the
attempt to fill the seventh line (SHMEM_KEY) to fail, so that the shared
memory key never got added to the file in standalone mode.  This has been
broken since we added more content to our lock files in 9.1.

To fix, tweak the logic in CreateLockFile to add an empty LISTEN_ADDR
line in standalone mode.  This is a tad grotty, but since that function
already knows almost everything there is to know about the contents of
lock files, it doesn't seem that it's any better to hack it elsewhere.

It's not clear how significant this bug really is, since a standalone
backend should never have any children and thus it seems not critical
to be able to check the nattch count of the shmem segment externally.
But I'm going to back-patch the fix anyway.

This problem had escaped notice because of an ancient (and in hindsight
pretty dubious) decision to suppress LOG-level messages by default in
standalone mode; so that the elog(LOG) complaint in AddToDataDirLockFile
that should have warned of the problem didn't do anything.  Fixing that
is material for a separate patch though.
2012-12-16 15:02:07 -05:00
Tom Lane
f0fc1d4c89 Add defenses against integer overflow in dynahash numbuckets calculations.
The dynahash code requires the number of buckets in a hash table to fit
in an int; but since we calculate the desired hash table size dynamically,
there are various scenarios where we might calculate too large a value.
The resulting overflow can lead to infinite loops, division-by-zero
crashes, etc.  I (tgl) had previously installed some defenses against that
in commit 299d171652, but that covered only one
call path.  Moreover it worked by limiting the request size to work_mem,
but in a 64-bit machine it's possible to set work_mem high enough that the
problem appears anyway.  So let's fix the problem at the root by installing
limits in the dynahash.c functions themselves.

Trouble report and patch by Jeff Davis.
2012-12-11 22:09:20 -05:00
Tom Lane
1da5bef317 Fix assorted bugs in CREATE INDEX CONCURRENTLY.
This patch changes CREATE INDEX CONCURRENTLY so that the pg_index
flag changes it makes without exclusive lock on the index are made via
heap_inplace_update() rather than a normal transactional update.  The
latter is not very safe because moving the pg_index tuple could result in
concurrent SnapshotNow scans finding it twice or not at all, thus possibly
resulting in index corruption.

In addition, fix various places in the code that ought to check to make
sure that the indexes they are manipulating are valid and/or ready as
appropriate.  These represent bugs that have existed since 8.2, since
a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid
index behind, and we ought not try to do anything that might fail with
such an index.

Also fix RelationReloadIndexInfo to ensure it copies all the pg_index
columns that are allowed to change after initial creation.  Previously we
could have been left with stale values of some fields in an index relcache
entry.  It's not clear whether this actually had any user-visible
consequences, but it's at least a bug waiting to happen.

This is a subset of a patch already applied in 9.2 and HEAD.  Back-patch
into all earlier supported branches.

Tom Lane and Andres Freund
2012-11-29 14:51:46 -05:00
Tom Lane
4387cc9ab4 Improve handling of INT_MIN / -1 and related cases.
Some platforms throw an exception for this division, rather than returning
a necessarily-overflowed result.  Since we were testing for overflow after
the fact, an exception isn't nice.  We can avoid the problem by treating
division by -1 as negation.

Add some regression tests so that we'll find out if any compilers try to
optimize away the overflow check conditions.

Back-patch of commit 1f7cb5c309.

Per discussion with Xi Wang, though this is different from the patch he
submitted.
2012-11-19 21:21:40 -05:00
Tom Lane
fe838e5074 Limit values of archive_timeout, post_auth_delay, auth_delay.milliseconds.
The previous definitions of these GUC variables allowed them to range
up to INT_MAX, but in point of fact the underlying code would suffer
overflows or other errors with large values.  Reduce the maximum values
to something that won't misbehave.  There's no apparent value in working
harder than this, since very large delays aren't sensible for any of
these.  (Note: the risk with archive_timeout is that if we're late
checking the state, the timestamp difference it's being compared to
might overflow.  So we need some amount of slop; the choice of INT_MAX/2
is arbitrary.)

Per followup investigation of bug #7670.  Although this isn't a very
significant fix, might as well back-patch.
2012-11-18 17:15:16 -05:00