calculating a page's initial free space was fine, and should not have been
"improved" by letting PageGetHeapFreeSpace do it. VACUUM FULL is going to
reclaim LP_DEAD line pointers later, so there is no need for a guard
against the page being too full of line pointers, and having one risks
rejecting pages that are perfectly good move destinations.
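For illustration only (a schematic sketch, not the committed diff; the helper
name is made up): when sizing a candidate move destination, the plain
free-space figure is the one we want, precisely because PageGetHeapFreeSpace's
line-pointer guard is the check we do not want applied here.

    #include "postgres.h"
    #include "storage/bufpage.h"

    /*
     * Hypothetical helper, not the actual vacuum.c code: report a page's
     * usable space for VACUUM FULL's purposes.  PageGetHeapFreeSpace()
     * would return zero for a page that already has "too many" line
     * pointers, but VACUUM FULL reclaims LP_DEAD pointers anyway, so that
     * guard only disqualifies perfectly good destinations here.
     */
    static Size
    initial_free_space(Page page)
    {
        return PageGetFreeSpace(page);   /* not PageGetHeapFreeSpace(page) */
    }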
This also exposed a second bug, which is that the empty_end_pages logic
assumed that any page with no live tuples would get entered into the
fraged_pages list automatically (by virtue of having more free space than
the threshold in the do_frag calculation). This assumption certainly
seems risky when a low fillfactor has been chosen, and even without
tunable fillfactor I think it could conceivably fail on a page with many
unused line pointers. So fix the code to force do_frag true when notup
is true, and patch this part of the fix all the way back.
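Schematically, with hypothetical names, the rule being enforced is just:

    #include <stdbool.h>
    #include <stddef.h>

    /*
     * Hypothetical helper mirroring the rule above: a page with no live
     * tuples must always be a candidate for the fraged_pages list, no
     * matter how the free-space threshold test comes out.
     */
    static bool
    page_is_frag_candidate(size_t free_space, size_t frag_threshold, bool notup)
    {
        return notup || (free_space >= frag_threshold);
    }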
Per report from Tomas Szepe.
we need to be able to swallow NOTICE messages, and potentially also
ParameterStatus messages (although the latter would be a bit weird),
without exiting COPY OUT state. Fix it, and adjust the protocol documentation
to emphasize the need for this. Per off-list report from Alexander Galler.
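For illustration, here is how the situation looks from the client side with
libpq (the table name and empty connection string are placeholders, not part
of the fix): notices arriving mid-COPY go to the notice receiver, and the
PQgetCopyData() loop simply keeps going until the copy ends.

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    notice_receiver(void *arg, const PGresult *res)
    {
        (void) arg;
        fprintf(stderr, "NOTICE during COPY: %s", PQresultErrorMessage(res));
    }

    int
    main(void)
    {
        PGconn     *conn = PQconnectdb("");     /* connection defaults assumed */
        PGresult   *res;
        char       *buf;
        int         len;

        if (PQstatus(conn) != CONNECTION_OK)
        {
            fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
            PQfinish(conn);
            return 1;
        }

        PQsetNoticeReceiver(conn, notice_receiver, NULL);

        res = PQexec(conn, "COPY my_table TO STDOUT");  /* placeholder table */
        if (PQresultStatus(res) != PGRES_COPY_OUT)
        {
            fprintf(stderr, "COPY failed: %s", PQerrorMessage(conn));
            PQclear(res);
            PQfinish(conn);
            return 1;
        }
        PQclear(res);

        while ((len = PQgetCopyData(conn, &buf, 0)) > 0)
        {
            fwrite(buf, 1, len, stdout);
            PQfreemem(buf);
        }
        /* len == -1: copy finished; len == -2: error, see PQerrorMessage() */

        PQclear(PQgetResult(conn));             /* absorb final command status */
        PQfinish(conn);
        return 0;
    }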
constant ORDER/GROUP BY entries properly:
http://archives.postgresql.org/pgsql-hackers/2001-04/msg00457.php
The original solution to that was in fact no good, as demonstrated by
today's report from Martin Pitt:
http://archives.postgresql.org/pgsql-bugs/2008-01/msg00027.php
We can't use the column-number-reference format for a constant that is
a resjunk targetlist entry, a case that was unfortunately not thought of
in the original discussion. What we can do instead (which did not work
at the time, but does work in 7.3 and up) is to emit the constant with
explicit ::typename decoration, even if it otherwise wouldn't need it.
This is sufficient to keep the parser from thinking it's a column number
reference, and indeed is probably what the user must have done to get
such a thing into the querytree in the first place.
failed to cover all the ways in which a connection can be initiated in dblink.
Plug the remaining holes. Also, disallow transient connections in functions
for which that feature makes no sense (because they are only sensible as
part of a sequence of operations on the same connection). Joe Conway
Security: CVE-2007-6601
and CLUSTER) execute as the table owner rather than the calling user, using
the same privilege-switching mechanism already used for SECURITY DEFINER
functions. The purpose of this change is to ensure that user-defined
functions used in index definitions cannot acquire the privileges of a
superuser account that is performing routine maintenance. While a function
used in an index is supposed to be IMMUTABLE and thus not able to do anything
very interesting, there are several easy ways around that restriction; and
even if we could plug them all, there would remain a risk of reading sensitive
information and broadcasting it through a covert channel such as CPU usage.
To prevent bypassing this security measure, execution of SET SESSION
AUTHORIZATION and SET ROLE is now forbidden within a SECURITY DEFINER context.
Thanks to Itagaki Takahiro for reporting this vulnerability.
Security: CVE-2007-6600
are shared with Tcl, since it's their code to begin with, and the patches
have been copied from Tcl 8.5.0. Problems:
CVE-2007-4769: Inadequate check on the range of backref numbers allows
crash due to out-of-bounds read.
CVE-2007-4772: Infinite loop in regex optimizer for pattern '($|^)*'.
CVE-2007-6067: Very slow optimizer cleanup for regex with a large NFA
representation, as well as crash if we encounter an out-of-memory condition
during NFA construction.
Part of the response to CVE-2007-6067 is to put a limit on the number of
states in the NFA representation of a regex. This seems needed even though
the within-the-code problems have been corrected, since otherwise the code
could try to use very large amounts of memory for a suitably-crafted regex,
leading to potential DOS by driving the system into swap, activating a kernel
OOM killer, etc.
Although there are certainly plenty of ways to drive the system into effective
DOS with poorly-written SQL queries, these problems seem worth treating as
security issues because many applications might accept regex search patterns
from untrustworthy sources.
Thanks to Will Drewry of Google for reporting these problems. Patches by Will
Drewry and Tom Lane.
Security: CVE-2007-4769, CVE-2007-4772, CVE-2007-6067
The zero-point case is sensible so far as the data structure is concerned,
so maybe we ought to allow it sometime; but right now the textual input
routines for these types don't allow it, and it seems that not all the
functions for the types are prepared to cope.
Report and patch by Merlin Moncure.
whole table instead, to ensure that it goes away when the table is dropped.
Per bug #3723 from Sam Mason.
Backpatch as far as 7.4; AFAICT 7.3 does not have the issue, because it doesn't
have general-purpose expression indexes and so there must be at least one
column referenced by an index.
simplification gets detoasted before it is incorporated into a Const node.
Otherwise, if an immutable function were to return a TOAST pointer (an
unlikely case, but it can be made to happen), we would end up with a plan
that depends on the continued existence of the out-of-line toast datum.
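A sketch of the idea (simplified, and the helper name is made up): any
varlena result gets flattened into a plain in-memory value before it is
wrapped in a Const.

    #include "postgres.h"
    #include "fmgr.h"

    /*
     * Hypothetical helper: make sure a pass-by-reference result is a plain
     * in-memory varlena, not a TOAST pointer, before building a Const from it.
     */
    static Datum
    flatten_const_value(Datum value, bool isnull, int16 typlen)
    {
        if (!isnull && typlen == -1)            /* varlena: may be toasted */
            value = PointerGetDatum(PG_DETOAST_DATUM_COPY(value));
        return value;
    }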
eval_const_expressions simplifies this to just "WHERE false", but we have
already done pull_up_IN_clauses so the IN join will be done, or at least
planned, anyway. The trouble case comes when the sub-SELECT is itself a join
and we decide to implement the IN by unique-ifying the sub-SELECT outputs:
with no remaining reference to the output Vars in WHERE, we won't have
propagated the Vars up to the upper join point, leading to "variable not found
in subplan target lists" error. Fix by adding an extra scan of in_info_list
and forcing all Vars mentioned therein to be propagated up to the IN join
point. Per bug report from Miroslav Sulc.
no-longer-needed pages at the end of a table. We thought we could throw away
pages containing HEAPTUPLE_DEAD tuples; but this is not so, because such
tuples very likely have index entries pointing at them, and we wouldn't have
removed the index entries. The problem only emerges in a somewhat unlikely
race condition: the dead tuples have to have been inserted by a transaction
that later aborted, and this has to have happened between VACUUM's initial
scan of the page and the later recheck for emptiness in count_nondeletable_pages.
But that timespan will include an index-cleaning pass, so it's not all that
hard to hit. This seems to explain a couple of previously unsolved bug
reports.
remote sessions, instead of erroring out in the middle of the operation.
This is a backpatch, to all branches where it is relevant, of a fix previously
applied to CLUSTER in HEAD and 8.2.
read from the temp file didn't match the file length reported by ftello(),
the wrong variable's value was printed, and so the message made no sense.
Clean up a couple other coding infelicities while at it.
relcache entry after having heap_close'd it. This could lead to misbehavior
if a relcache flush wiped out the cache entry meanwhile. In 8.2 there is a
very real risk of CREATE INDEX CONCURRENTLY using the wrong relid for locking
and waiting purposes. I think the bug is only cosmetic in 8.0 and 8.1,
because their transgression is limited to using RelationGetRelationName(rel)
in an ereport message immediately after heap_close, and there's no way (except
with special debugging options) for a cache flush to occur in that interval.
Not quite sure that it's cosmetic in 7.4, but seems best to patch anyway.
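The safe pattern, in sketch form (a hypothetical helper, not the committed
diff): grab whatever you still need out of the Relation before closing it.

    #include "postgres.h"
    #include "access/heapam.h"
    #include "utils/rel.h"

    /*
     * Hypothetical example: capture the relid before heap_close(); after
     * the close, a relcache flush may invalidate "rel" at any time.
     */
    static Oid
    close_and_return_relid(Relation rel)
    {
        Oid     relid = RelationGetRelid(rel);

        heap_close(rel, NoLock);        /* NoLock: keep the lock we hold */

        /* use "relid" from here on; do not dereference "rel" again */
        return relid;
    }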
Found by trying to run the regression tests with CLOBBER_CACHE_ALWAYS enabled.
Maybe we should try to do that on a regular basis --- it's awfully slow,
but perhaps some fast buildfarm machine could do it once in a while.
padded encryption scheme. Formerly it would try to access res[(unsigned) -1],
which resulted in core dumps on 64-bit machines, and was certainly trouble
waiting to happen on 32-bit machines (though in at least the known case
it was harmless because that byte would be overwritten after return).
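The bug class, in generic form (not pgcrypto's actual code): validate the pad
byte before using it to index backwards into the buffer, so that garbage input
can never produce an index like buf[(unsigned) -1].

    #include <stddef.h>

    /* Generic PKCS-style pad check; returns -1 for bad padding. */
    static int
    strip_padding(const unsigned char *buf, size_t len, size_t block_size,
                  size_t *data_len)
    {
        size_t  pad;

        if (len == 0 || block_size == 0 || len % block_size != 0)
            return -1;                  /* not whole blocks: reject */

        pad = buf[len - 1];
        if (pad == 0 || pad > block_size || pad > len)
            return -1;                  /* bad pad byte: reject, don't index */

        *data_len = len - pad;
        return 0;
    }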
Per report from Ken Colson; fix by Marko Kreen.
byte after the last full byte of the bit array, regardless of whether that
byte was part of the valid data or not. Found by buildfarm testing.
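The boundary rule, stated generically (simple arithmetic, not the backend
code itself):

    #include <stddef.h>

    /* Number of bytes that actually belong to an nbits-long bit array. */
    static size_t
    bit_array_bytes(size_t nbits)
    {
        return (nbits + 7) / 8;
    }

    /* The byte at index nbits / 8 is valid data only when the last byte
     * is partially used. */
    static int
    has_partial_last_byte(size_t nbits)
    {
        return (nbits % 8) != 0;
    }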
Thanks to Stefan Kaltenbrunner for nailing down the cause.
hash table is allocated in a child context of the agg node's memory
context, MemoryContextReset() will reset but *not* delete the child
context. Since ExecReScanAgg() proceeds to build a new hash table
from scratch (in a new sub-context), this results in leaking the
header for the previous memory context. Therefore, use
MemoryContextResetAndDeleteChildren() instead.
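In sketch form (hypothetical context name; the 8.x-era memory-context API is
assumed):

    #include "postgres.h"
    #include "utils/memutils.h"

    /*
     * Hypothetical rescan helper: since the hash table is rebuilt in a
     * brand-new child context, the old child contexts have to be deleted,
     * not merely reset, or their headers accumulate across rescans.
     */
    static void
    reset_for_rescan(MemoryContext node_context)
    {
        MemoryContextResetAndDeleteChildren(node_context);
    }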
Credit: My colleague Sailesh Krishnamurthy at Truviso for isolating
the cause of the leak.
clauses in which one side or the other references both sides of the join
cannot be removed as redundant, because that expression won't have been
constrained below the join. Per report from Sergey Burladyan.
log_min_error_statement is active and there is some problem in logging the
current query string; for example, that it's too long to include in the log
message without running out of memory. This problem has existed since the
log_min_error_statement feature was introduced. No doubt the reason it
wasn't detected long ago is that 8.2 is the first release that defaults
log_min_error_statement to less than PANIC level.
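The shape of the guard, generically (a hypothetical helper, not the actual
elog.c change):

    #include <stdbool.h>
    #include <stdio.h>

    static bool reporting_statement = false;

    /*
     * Hypothetical sketch: if attaching the statement text to a log entry
     * itself fails and re-enters the error path, drop the statement rather
     * than recursing without bound.
     */
    static void
    log_with_statement(const char *message, const char *statement)
    {
        if (reporting_statement)
        {
            fprintf(stderr, "%s\n", message);   /* skip the statement this time */
            return;
        }
        reporting_statement = true;
        fprintf(stderr, "%s\nSTATEMENT: %s\n", message, statement);
        reporting_statement = false;
    }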
Per report from Bill Moran.
when handed an invalidly-encoded pattern. The previous coding could get
into an infinite loop if pg_mb2wchar_with_len() returned a zero-length
string after we'd tested for nonempty pattern; which is exactly what it
will do if the string consists only of an incomplete multibyte character.
This led to either an out-of-memory error or a backend crash depending
on platform. Per report from Wiktor Wodecki.
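The defensive check, in outline (hypothetical helper name; simplified from
the pattern-matching code):

    #include "postgres.h"
    #include "mb/pg_wchar.h"

    /*
     * Hypothetical check: after conversion, re-test the length instead of
     * trusting the earlier byte-length test, since a lone incomplete
     * multibyte character converts to zero pg_wchars.
     */
    static bool
    converted_pattern_is_usable(const char *pattern, int byte_len, pg_wchar *buf)
    {
        int     wlen;

        if (byte_len <= 0)
            return false;
        wlen = pg_mb2wchar_with_len(pattern, buf, byte_len);
        return wlen > 0;
    }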
been broken since forever, but was not noticed because people seldom look
at raw parse trees. AFAIK, no impact on users except that debug_print_parse
might fail; but patch it all the way back anyway. Per report from Jeff Ross.
to prevent possible escalation of privilege. Provide new SECURITY
DEFINER functions with old behavior, but initially REVOKE ALL
from public for these functions. Per list discussion and design
proposed by Tom Lane.
This is a Linux kernel bug that apparently exists in every extant kernel
version: sometimes shmctl() will fail with EIDRM when EINVAL is correct.
We were assuming that EIDRM indicates a possible conflict with pre-existing
backends, and refusing to start the postmaster when this happens. Fortunately,
there does not seem to be any case where Linux can legitimately return EIDRM
(it doesn't track shmem segments in a way that would allow that), so we can
get away with just assuming that EIDRM means EINVAL on this platform.
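In outline (a sketch of the check, with a made-up helper name):

    #include <errno.h>
    #include <stdbool.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    /*
     * Hypothetical probe: does a SysV shared memory segment still exist?
     * On Linux, EIDRM has to be treated like EINVAL, since the kernel can
     * return it in cases where EINVAL is the correct answer.
     */
    static bool
    segment_exists(int shmid)
    {
        struct shmid_ds buf;

        if (shmctl(shmid, IPC_STAT, &buf) < 0)
        {
            if (errno == EINVAL)
                return false;           /* segment is gone */
    #ifdef __linux__
            if (errno == EIDRM)
                return false;           /* Linux quirk: treat as EINVAL */
    #endif
            /* EACCES etc.: it exists, we just can't inspect it */
        }
        return true;
    }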
Per reports from Michael Fuhr and Jon Lapham --- it's a bit surprising
we have not seen more reports, actually.