useless substructure for its RangeTblEntry nodes. (I chose to keep using the
same struct node type and just zero out the link fields for unneeded info,
rather than making a separate ExecRangeTblEntry type --- it seemed too
fragile to have two different rangetable representations.)
Along the way, put subplans into a list in the toplevel PlannedStmt node,
and have SubPlan nodes refer to them by list index instead of direct pointers.
Vadim wanted to do that years ago, but I never understood what he was on about
until now. It makes things a *whole* lot more robust, because we can stop
worrying about duplicate processing of subplans during expression tree
traversals. That's been a constant source of bugs, and it's finally gone.
There are some consequent simplifications yet to be made, like not using
a separate EState for subplans in the executor, but I'll tackle that later.
this code was last gone over, there wasn't really any alternative to
globals because we didn't have the PlannerInfo struct being passed all
through the planner code. Now that we do, we can restructure things
to avoid non-reentrancy. I'm fooling with this because otherwise I'd
have had to add another global variable for the planned compact
range table list.
forces a particular relation nonnullable, then we can say that the OR does.
This is worth a little extra trouble since it may allow reduction of
outer joins to plain joins.
considered when it is necessary to do so because of a join-order restriction
(that is, an outer-join or IN-subselect construct). The former coding was a
bit ad-hoc and inconsistent, and it missed some cases, as exposed by Mario
Weilguni's recent bug report. His specific problem was that an IN could be
turned into a "clauseless" join due to constant-propagation removing the IN's
joinclause, and if the IN's subselect involved more than one relation and
there was more than one such IN linking to the same upper relation, then the
only valid join orders involve "bushy" plans but we would fail to consider the
specific paths needed to get there. (See the example case added to the join
regression test.) On examining the code I wonder if there weren't some other
problem cases too; in particular it seems that GEQO was defending against a
different set of corner cases than the main planner was. There was also an
efficiency problem, in that when we did realize we needed a clauseless join
because of an IN, we'd consider clauseless joins against every other relation
whether this was sensible or not. It seems a better design is to use the
outer-join and in-clause lists as a backup heuristic, just as the rule of
joining only where there are joinclauses is a heuristic: we'll join two
relations if they have a usable joinclause *or* this might be necessary to
satisfy an outer-join or IN-clause join order restriction. I refactored the
code to have just one place considering this instead of three, and made sure
that it covered all the cases that any of them had been considering.
Backpatch as far as 8.1 (which has only the IN-clause form of the disease).
By rights 8.0 and 7.4 should have the bug too, but they accidentally fail
to fail, because the joininfo structure used in those releases preserves some
memory of there having once been a joinclause between the inner and outer
sides of an IN, and so it leads the code in the right direction anyway.
I'll be conservative and not touch them.
that aren't turned into true joins). Since this is the last missing bit of
infrastructure, go ahead and fill out the hash integer_ops and float_ops
opfamilies with cross-type operators. The operator family project is now
DONE ... er, except for documentation ...
we should check that the function code returns the claimed result datatype
every time we parse the function for execution. Formerly, for simple
scalar result types we assumed the creation-time check was sufficient, but
this fails if the function selects from a table that's been redefined since
then, and even more obviously fails if check_function_bodies had been OFF.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0555
Standard English uses "may", "can", and "might" in different ways:
may - permission, "You may borrow my rake."
can - ability, "I can lift that log."
might - possibility, "It might rain today."
Unfortunately, in conversational English, their use is often mixed, as
in, "You may use this variable to do X", when in fact, "can" is a better
choice. Similarly, "It may crash" is better stated, "It might crash".
which I had removed in the first cut of the EquivalenceClass rewrite to
simplify that patch a little. But it's still important --- in a four-way
join problem mergejoinscansel() was eating about 40% of the planning time
according to gprof. Also, improve the EquivalenceClass code to re-use
join RestrictInfos rather than generating fresh ones for each join
considered. This saves some memory space but more importantly improves
the effectiveness of caching planning info in RestrictInfos.
columns procost and prorows, to allow simple user adjustment of the estimated
cost of a function call, as well as control of the estimated number of rows
returned by a set-returning function. We might eventually wish to extend this
to allow function-specific estimation routines, but there seems to be
consensus that we should try a simple constant estimate first. In particular
this provides a relatively simple way to control the order in which different
WHERE clauses are applied in a plan node, which is a Good Thing in view of the
fact that the recent EquivalenceClass planner rewrite made that much less
predictable than before.
provide just a boolean 'amcanorder', instead of fields that specify the
sort operator strategy numbers. We have decided to require ordering-capable
AMs to use btree-compatible strategy numbers, so the old fields are
overkill (and indeed misleading about what's allowed).
representation of equivalence classes of variables. This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
which comparison operators to use for plan nodes involving tuple comparison
(Agg, Group, Unique, SetOp). Formerly the executor looked up the default
equality operator for the datatype, which was really pretty shaky, since it's
possible that the data being fed to the node is sorted according to some
nondefault operator class that could have an incompatible idea of equality.
The planner knows what it has sorted by and therefore can provide the right
equality operator to use. Also, this change moves a couple of catalog lookups
out of the executor and into the planner, which should help startup time for
pre-planned queries by some small amount. Modify the planner to remove some
other cavalier assumptions about always being able to use the default
operators. Also add "nulls first/last" info to the Plan node for a mergejoin
--- neither the executor nor the planner can cope yet, but at least the API is
in place.
per-column options for btree indexes. The planner's support for this is still
pretty rudimentary; it does not yet know how to plan mergejoins with
nondefault ordering options. The documentation is pretty rudimentary, too.
I'll work on improving that stuff later.
Note incompatible change from prior behavior: ORDER BY ... USING will now be
rejected if the operator is not a less-than or greater-than member of some
btree opclass. This prevents less-than-sane behavior if an operator that
doesn't actually define a proper sort ordering is selected.
the XmlExpr code in various lists, use a representation that has some hope
of reverse-listing correctly (though it's still a de-escaping function
shy of correctness), generally try to make it look more like Postgres
coding conventions.
cases. Operator classes now exist within "operator families". While most
families are equivalent to a single class, related classes can be grouped
into one family to represent the fact that they are semantically compatible.
Cross-type operators are now naturally adjunct parts of a family, without
having to wedge them into a particular opclass as we had done originally.
This commit restructures the catalogs and cleans up enough of the fallout so
that everything still works at least as well as before, but most of the work
needed to actually improve the planner's behavior will come later. Also,
there are not yet CREATE/DROP/ALTER OPERATOR FAMILY commands; the only way
to create a new family right now is to allow CREATE OPERATOR CLASS to make
one by default. I owe some more documentation work, too. But that can all
be done in smaller pieces once this infrastructure is in place.
operator strategy numbers, ie, GiST and GIN. This is almost cosmetic
enough to not need a catversion bump, but since the opr_sanity regression
test has to change in sync with the catalog entry, I figured I'd better
do one.
joinclause doesn't use any outer-side vars) requires a "bushy" plan to be
created. The normal heuristic to avoid joins with no joinclause has to be
overridden in that case. Problem is new in 8.2; before that we forced the
outer join order anyway. Per example from Teodor.
node of a SubLink or SubPlan testexpr field. Bug resulted from replacing
the old lefthand/exprs list fields with a simple expression field, and not
remembering that expression_tree_walker is coded to save a few cycles by
recursing directly to self on list fields (on the assumption the walker
isn't interested in List nodes per se). On non-list fields it must of
course call the walker. Possibly that hack isn't worth the risk of more
such bugs, but I'll leave it be for now. Per bug report from James Robinson.
the SQL spec, viz IS NULL is true if all the row's fields are null, IS NOT
NULL is true if all the row's fields are not null. The former coding got
this right for a limited number of cases with IS NULL (ie, those where it
could disassemble a ROW constructor at parse time), but was entirely wrong
for IS NOT NULL. Per report from Teodor.
I desisted from changing the behavior for arrays, since on closer inspection
it's not clear that there's any support for that in the SQL spec. This
probably needs more consideration.
tables in the query compete for cache space, not just the one we are
currently costing an indexscan for. This seems more realistic, and it
definitely will help in examples recently exhibited by Stefan
Kaltenbrunner. To get the total size of all the tables involved, we must
tweak the handling of 'append relations' a bit --- formerly we looked up
information about the child tables on-the-fly during set_append_rel_pathlist,
but it needs to be done before we start doing any cost estimation, so
push it into the add_base_rels_to_query scan.
that has parameters is always planned afresh for each Bind command,
treating the parameter values as constants in the planner. This removes
the performance penalty formerly often paid for using out-of-line
parameters --- with this definition, the planner can do constant folding,
LIKE optimization, etc. After a suggestion by Andrew@supernews.
merely a matter of fixing the error check, since the underlying Portal
infrastructure already handles it. This in turn allows these statements
to be used in some existing plpgsql and plperl contexts, such as a
plpgsql FOR loop. Also, do some marginal code cleanup in places that
were being sloppy about distinguishing SELECT from SELECT INTO.
plpgsql support to come later. Along the way, convert execMain's
SELECT INTO support into a DestReceiver, in order to eliminate some ugly
special cases.
Jonah Harris and Tom Lane
same data type and same typmod, we show that typmod as the output
typmod, rather than generic -1. This responds to several complaints
over the past few years about UNIONs unexpectedly dropping length or
precision info.
contradictory WHERE-clauses applied to a relation. This makes the
GUC variable constraint_exclusion rather inappropriately named,
but I've refrained for the moment from renaming it.
Per example from Martin Lesser.
This doesn't matter too much for ordinary NOTs, since prepqual.c does
its best to get rid of those, but it helps with IS NOT TRUE clauses
which the rule rewriter likes to insert. Per example from Martin Lesser.
(e.g. "INSERT ... VALUES (...), (...), ...") and elsewhere as allowed
by the spec. (e.g. similar to a FROM clause subselect). initdb required.
Joe Conway and Tom Lane.
(table or index) before trying to open its relcache entry. This fixes
race conditions in which someone else commits a change to the relation's
catalog entries while we are in process of doing relcache load. Problems
of that ilk have been reported sporadically for years, but it was not
really practical to fix until recently --- for instance, the recent
addition of WAL-log support for in-place updates helped.
Along the way, remove pg_am.amconcurrent: all AMs are now expected to support
concurrent update.
the opportunity to treat COUNT(*) as a zero-argument aggregate instead
of the old hack that equated it to COUNT(1); this is materially cleaner
(no more weird ANYOID cases) and ought to be at least a tiny bit faster.
Original patch by Sergey Koposov; review, documentation, simple regression
tests, pg_dump and psql support by moi.
effects in a nestloop inner indexscan, I had only dealt with plain index
scans and the index portion of bitmap scans. But there will be cache
benefits for the heap accesses of bitmap scans too, so fix
cost_bitmap_heap_scan() to account for that.
clauses containing no variables and no volatile functions. Such a clause
can be used as a one-time qual in a gating Result plan node, to suppress
plan execution entirely when it is false. Even when the clause is true,
putting it in a gating node wins by avoiding repeated evaluation of the
clause. In previous PG releases, query_planner() would do this for
pseudoconstant clauses appearing at the top level of the jointree, but
there was no ability to generate a gating Result deeper in the plan tree.
To fix it, get rid of the special case in query_planner(), and instead
process pseudoconstant clauses through the normal RestrictInfo qual
distribution mechanism. When a pseudoconstant clause is found attached to
a path node in create_plan(), pull it out and generate a gating Result at
that point. This requires special-casing pseudoconstants in selectivity
estimation and cost_qual_eval, but on the whole it's pretty clean.
It probably even makes the planner a bit faster than before for the normal
case of no pseudoconstants, since removing pull_constant_clauses saves one
useless traversal of the qual tree. Per gripe from Phil Frost.
by creating a reference-count mechanism, similar to what we did a long time
ago for catcache entries. The back branches have an ugly solution involving
lots of extra copies, but this way is more efficient. Reference counting is
only applied to tupdescs that are actually in caches --- there seems no need
to use it for tupdescs that are generated in the executor, since they'll go
away during plan shutdown by virtue of being in the per-query memory context.
Neil Conway and Tom Lane
that the Mackert-Lohmann formula applies across all the repetitions of the
nestloop, not just each scan independently. We use the M-L formula to
estimate the number of pages fetched from the index as well as from the table;
that isn't what it was designed for, but it seems reasonably applicable
anyway. This makes large numbers of repetitions look much cheaper than
before, which accords with many reports we've received of overestimation
of the cost of a nestloop. Also, change the index access cost model to
charge random_page_cost per index leaf page touched, while explicitly
not counting anything for access to metapage or upper tree pages. This
may all need tweaking after we get some field experience, but in simple
tests it seems to be giving saner results than before. The main thing
is to get the infrastructure in place to let cost_index() and amcostestimate
functions take repeated scans into account at all. Per my recent proposal.
Note: this patch changes pg_proc.h, but I did not force initdb because
the changes are basically cosmetic --- the system does not look into
pg_proc to decide how to call an index amcostestimate function, and
there's no way to call such a function from SQL at all.
not named ones, and replace linear searches of the list with array indexing.
The named-parameter support has been dead code for many years anyway,
and recent profiling suggests that the searching was costing a noticeable
amount of performance for complex queries.
output, ie, no OR immediately below an OR. Otherwise we get Asserts or
wrong answers for cases such as
select * from tenk1 a, tenk1 b
where (a.ten = b.ten and (a.unique1 = 100 or a.unique1 = 101))
or (a.hundred = b.hundred and a.unique1 = 42);
Per report from Rafael Martinez Guerrero.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
would basically punt in all cases for 'foo <> ALL (array)', which resulted
in a performance regression for NOT IN compared to what we were doing in
8.1 and before. Per report from Pavel Stehule.
relations: fix the executor so that we can have an Append plan on the
inside of a nestloop and still pass down outer index keys to index scans
within the Append, then generate such plans as if they were regular
inner indexscans. This avoids the need to evaluate the outer relation
multiple times.
... in fact, it will be applied now in any query whatsoever. I'm still
a bit concerned about the cycles that might be expended in failed proof
attempts, but given that CE is turned off by default, it's the user's
choice whether to expend those cycles or not. (Possibly we should
change the simple bool constraint_exclusion parameter to something
more fine-grained?)