we should check that the function code returns the claimed result datatype
every time we parse the function for execution. Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0555
when collapsing of JOIN trees is stopped by join_collapse_limit. For instance
a list of 11 LEFT JOINs with limit 8 now produces something like
((1 2 3 4 5 6 7 8) 9 10 11 12)
instead of
(((1 2 3 4 5 6 7 8) (9)) 10 11 12)
The latter structure is really only required for a FULL JOIN.
Noted while studying an example from Shane Ambler.
hash joins with the estimated-larger relation on the inside. There are
several cases where doing that makes perfect sense, and in cases where it
doesn't, the regular cost computation really ought to be able to figure that
out. Make some marginal tweaks in said computation to try to get results
approximating reality a bit better. Per an example from Shane Ambler.
Also, fix an oversight in the original patch to add seq_page_cost: the costs
of spilling a hash join to disk should be scaled by seq_page_cost.
are all in new-in-8.2 logic associated with indexability of ScalarArrayOpExpr
(IN-clauses) or amortization of indexscan costs across repeated indexscans
on the inside of a nestloop. In particular:
Fix some logic errors in the estimation for multiple scans induced by a
ScalarArrayOpExpr indexqual.
Include a small cost component in bitmap index scans to reflect the costs of
manipulating the bitmap itself; this is mainly to prevent a bitmap scan from
appearing to have the same cost as a plain indexscan for fetching a single
tuple.
Also add a per-index-scan-startup CPU cost component; while prior releases
were clearly too pessimistic about the cost of repeated indexscans, the
original 8.2 coding allowed the cost of an indexscan to effectively go to zero
if repeated often enough, which is overly optimistic.
Pay some attention to index correlation when estimating costs for a nestloop
inner indexscan: this is significant when the plan fetches multiple heap
tuples per iteration, since high correlation means those tuples are probably
on the same or adjacent heap pages.
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created. The normal heuristic to avoid joins with no joinclause has to be
overridden in that case. Problem is new in 8.2; before that we forced the
outer join order anyway. Per example from Teodor.
rearrangeable outer joins and the WHERE clause is non-strict and mentions
only nullable-side relations. New bug in 8.2, caused by new logic to allow
rearranging outer joins. Per bug #2807 from Ross Cohen; thanks to Jeff
Davis for producing a usable test case.
a sublink's test expression have the correct vartypmod, rather than defaulting
to -1. There's at least one place where this is important because we're
expecting these Vars to be exactly equal() to those appearing in the subplan
itself. This is a pretty klugy solution --- it would likely be cleaner to
change Param nodes to include a typmod field --- but we can't do that in the
already-released 8.2 branch.
Per bug report from Hubert Fongarnand.
accurately: we have to distinguish the effects of the join's own ON
clauses from the effects of pushed-down clauses. Failing to do so
was a quick hack long ago, but it's time to be smarter. Per example
from Thomas H.
node of a SubLink or SubPlan testexpr field. Bug resulted from replacing
the old lefthand/exprs list fields with a simple expression field, and not
remembering that expression_tree_walker is coded to save a few cycles by
recursing directly to self on list fields (on the assumption the walker
isn't interested in List nodes per se). On non-list fields it must of
course call the walker. Possibly that hack isn't worth the risk of more
such bugs, but I'll leave it be for now. Per bug report from James Robinson.
outer joins. Originally it was only looking for overlap of the righthand
side of a left join, but we have to do it on the lefthand side too.
Per example from Jean-Pierre Pelletier.
the SQL spec, viz IS NULL is true if all the row's fields are null, IS NOT
NULL is true if all the row's fields are not null. The former coding got
this right for a limited number of cases with IS NULL (ie, those where it
could disassemble a ROW constructor at parse time), but was entirely wrong
for IS NOT NULL. Per report from Teodor.
I desisted from changing the behavior for arrays, since on closer inspection
it's not clear that there's any support for that in the SQL spec. This
probably needs more consideration.
tables in the query compete for cache space, not just the one we are
currently costing an indexscan for. This seems more realistic, and it
definitely will help in examples recently exhibited by Stefan
Kaltenbrunner. To get the total size of all the tables involved, we must
tweak the handling of 'append relations' a bit --- formerly we looked up
information about the child tables on-the-fly during set_append_rel_pathlist,
but it needs to be done before we start doing any cost estimation, so
push it into the add_base_rels_to_query scan.
to a relation on the nullable side of an outer join. I had removed
this during the outer join planning rewrite a few months ago ... I think
I intended to put it somewhere else, but forgot ...
that has parameters is always planned afresh for each Bind command,
treating the parameter values as constants in the planner. This removes
the performance penalty formerly often paid for using out-of-line
parameters --- with this definition, the planner can do constant folding,
LIKE optimization, etc. After a suggestion by Andrew@supernews.
trivial if it contains either Vars referencing the corresponding subplan
columns, or Consts equaling the corresponding subplan columns. This
lets the planner eliminate the SubqueryScan in some cases generated by
generate_setop_tlist().
functions in its targetlist, to avoid introducing multiple evaluations
of volatile functions that textually appear only once. This is a
slightly tighter version of Jaime Casanova's recent patch.
mergejoin possibility where the inner rel was less well sorted than
the outer (ie, it matches some but not all of the merge clauses that
can work with the outer), if the inner path in question is also the
overall cheapest path for its rel. This is an old bug, but I'm not
sure it's worth back-patching, because it's such a corner case.
Noted while investigating a test case from Peter Hardman.
subquery's pathkey is a RelabelType applied to something that appears
in the subquery's output; for example where the subquery returns a
varchar Var and the sort order is shown as that Var coerced to text.
This comes up because varchar doesn't have its own sort operator.
Per example from Peter Hardman.
merely a matter of fixing the error check, since the underlying Portal
infrastructure already handles it. This in turn allows these statements
to be used in some existing plpgsql and plperl contexts, such as a
plpgsql FOR loop. Also, do some marginal code cleanup in places that
were being sloppy about distinguishing SELECT from SELECT INTO.
plpgsql support to come later. Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.
Jonah Harris and Tom Lane
same data type and same typmod, we show that typmod as the output
typmod, rather than generic -1. This responds to several complaints
over the past few years about UNIONs unexpectedly dropping length or
precision info.
list, when some of the child rels have been excluded by constraint
exclusion. This doesn't save a huge amount of time but it'll save some,
and it makes the EXPLAIN output look saner. We already did the
equivalent thing in set_append_rel_pathlist(), but not here.
contradictory WHERE-clauses applied to a relation. This makes the
GUC variable constraint_exclusion rather inappropriately named,
but I've refrained for the moment from renaming it.
Per example from Martin Lesser.
This doesn't matter too much for ordinary NOTs, since prepqual.c does
its best to get rid of those, but it helps with IS NOT TRUE clauses
which the rule rewriter likes to insert. Per example from Martin Lesser.
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
Joe Conway and Tom Lane.
(table or index) before trying to open its relcache entry. This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load. Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.
Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
the opportunity to treat COUNT(*) as a zero-argument aggregate instead
of the old hack that equated it to COUNT(1); this is materially cleaner
(no more weird ANYOID cases) and ought to be at least a tiny bit faster.
Original patch by Sergey Koposov; review, documentation, simple regression
tests, pg_dump and psql support by moi.
eliminate unnecessary code, force initdb because stored rules change
(limit nodes are now supposed to be int8 not int4 expressions).
Update comments and error messages, which still all said 'integer'.
effects in a nestloop inner indexscan, I had only dealt with plain index
scans and the index portion of bitmap scans. But there will be cache
benefits for the heap accesses of bitmap scans too, so fix
cost_bitmap_heap_scan() to account for that.
ScalarArrayOpExpr index quals: we were estimating the right total
number of rows returned, but treating the index-access part of the
cost as if a single scan were fetching that many consecutive index
tuples. Actually we should treat it as a multiple indexscan, and
if there are enough of 'em the Mackert-Lohman discount should kick in.
clauses containing no variables and no volatile functions. Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false. Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause. In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism. When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point. This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree. Per gripe from Phil Frost.
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
choose_bitmap_and(). It was way too fuzzy --- per comment, it was meant to be
1% relative difference, but was actually coded as 0.01 absolute difference,
thus causing selectivities of say 0.001 and 0.000000000001 to be treated as
equal. I believe this thinko explains Maxim Boguk's recent complaint. While
we could change it to a relative test coded like compare_fuzzy_path_costs(),
there's a bigger problem here, which is that any fuzziness at all renders the
comparison function non-transitive, which could confuse qsort() to the point
of delivering completely wrong results. So forget the whole thing and just
do an exact comparison.
that the Mackert-Lohmann formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
cost_nonsequential_access() is really totally inappropriate for its only
remaining use, namely estimating I/O costs in cost_sort(). The routine
was designed on the assumption that disk caching might eliminate the need
for some re-reads on a random basis, but there's nothing very random in
that sense about sort's access pattern --- it'll always be picking up the
oldest outputs. If we had a good fix on the effective cache size we
might consider charging zero for I/O unless the sort temp file size
exceeds it, but that's probably putting much too much faith in the
parameter. Instead just drop the logic in favor of a fixed compromise
between seq_page_cost and random_page_cost per page of sort I/O.