rather than a Query node; this allows set_plan_references to recurse
into subplans correctly. Fixes core dump on full outer joins in
subplans. Also, invoke preprocess_expression on function RTEs'
function expressions. This seems to fix the planner's problems with
outer-level Vars in function RTEs.
returns-set boolean field in Func and Oper nodes. This allows cleaner,
more reliable tests for expressions returning sets in the planner and
parser. For example, a WHERE clause returning a set is now detected
and complained of in the parser, not only at runtime.
lists to join RTEs, attach a list of Vars and COALESCE expressions that will
replace the join's alias variables during planning. This simplifies
flatten_join_alias_vars while still making it easy to fix up varno references
when transforming the query tree. Add regression test cases for interactions
of subqueries with outer joins.
now has an RTE of its own, and references to its outputs now are Vars
referencing the JOIN RTE, rather than CASE-expressions. This allows
reverse-listing in ruleutils.c to use the correct alias easily, rather
than painfully reverse-engineering the alias namespace as it used to do.
Also, nested FULL JOINs work correctly, because the result of the inner
joins are simple Vars that the planner can cope with. This fixes a bug
reported a couple times now, notably by Tatsuo on 18-Nov-01. The alias
Vars are expanded into COALESCE expressions where needed at the very end
of planning, rather than during parsing.
Also, beginnings of support for showing plan qualifier expressions in
EXPLAIN. There are probably still cases that need work.
initdb forced due to change of stored-rule representation.
set-returning functions in its target list. This ensures that we
won't rewrite the query in a way that places set-returning functions
into quals (WHERE clauses). Cf. bug reports from Joe Conway.
from Philip Warner. Side effect of change is that GROUP BY expressions
will not be re-evaluated at multiple plan levels anymore, whereas this
sometimes happened with old code.
clause being added to a particular restriction-clause list is redundant
with those already in the list. This avoids useless work at runtime,
and (perhaps more importantly) keeps the selectivity estimation routines
from generating too-small estimates of numbers of output rows.
Also some minor improvements in OPTIMIZER_DEBUG displays.
of costsize.c routines to pass Query root, so that costsize can figure
more things out by itself and not be so dependent on its callers to tell
it everything it needs to know. Use selectivity of hash or merge clause
to estimate number of tuples processed internally in these joins
(this is more useful than it would've been before, since eqjoinsel is
somewhat more accurate than before).
create_index_paths are not immediately discarded, but are available for
subsequent planner work. This allows avoiding redundant syscache lookups
in several places. Change interface to operator selectivity estimation
procedures to allow faster and more flexible estimation.
Initdb forced due to change of pg_proc entries for selectivity functions!
a separate statement (though it can still be invoked as part of VACUUM, too).
pg_statistic redesigned to be more flexible about what statistics are
stored. ANALYZE now collects a list of several of the most common values,
not just one, plus a histogram (not just the min and max values). Random
sampling is used to make the process reasonably fast even on very large
tables. The number of values and histogram bins collected is now
user-settable via an ALTER TABLE command.
There is more still to do; the new stats are not being used everywhere
they could be in the planner. But the remaining changes for this project
should be localized, and the behavior is already better than before.
A not-very-related change is that sorting now makes use of btree comparison
routines if it can find one, rather than invoking '<' twice.
join. This is needed to avoid improper evaluation of expressions that
should be nulled out, as in Victor Wagner's bug report of 4/27/01.
Pretty ugly solution, but no time to do anything better for 7.1.1.
try to push restrictions on the view down into the view subquery,
so that they can become indexscan quals or what-have-you rather than
being applied at the top level of the subquery. 7.0 and before were
able to do this, though in a much klugier way, and I'd hate to have
anyone complaining that 7.1 is stupider than 7.0 ...
comparison does not consider paths different when they differ only in
uninteresting aspects of sort order. (We had a special case of this
consideration for indexscans already, but generalize it to apply to
ordered join paths too.) Be stricter about what is a canonical pathkey
to allow faster pathkey comparison. Cache canonical pathkeys and
dispersion stats for left and right sides of a RestrictInfo's clause,
to avoid repeated computation. Total speedup will depend on number of
tables in a query, but I see about 4x speedup of planning phase for
a sample seven-table query.
work where we can (given that the executor only handles it at top level)
and generate an error where we can't. Note that while the parser has
been allowing views to say SELECT FOR UPDATE for a few weeks now, that
hasn't actually worked until just now.
joins, and clean things up a good deal at the same time. Append plan node
no longer hacks on rangetable at runtime --- instead, all child tables are
given their own RT entries during planning. Concept of multiple target
tables pushed up into execMain, replacing bug-prone implementation within
nodeAppend. Planner now supports generating Append plans for inheritance
sets either at the top of the plan (the old way) or at the bottom. Expanding
at the bottom is appropriate for tables used as sources, since they may
appear inside an outer join; but we must still expand at the top when the
target of an UPDATE or DELETE is an inheritance set, because we actually need
a different targetlist and junkfilter for each target table in that case.
Fortunately a target table can't be inside an outer join... Bizarre mutual
recursion between union_planner and prepunion.c is gone --- in fact,
union_planner doesn't really have much to do with union queries anymore,
so I renamed it grouping_planner.
ExecutorRun. This allows LIMIT to work in a view. Also, LIMIT in a
cursor declaration will behave in a reasonable fashion, whereas before
it was overridden by the FETCH count.
SQL92 semantics, including support for ALL option. All three can be used
in subqueries and views. DISTINCT and ORDER BY work now in views, too.
This rewrite fixes many problems with cross-datatype UNIONs and INSERT/SELECT
where the SELECT yields different datatypes than the INSERT needs. I did
that by making UNION subqueries and SELECT in INSERT be treated like
subselects-in-FROM, thereby allowing an extra level of targetlist where the
datatype conversions can be inserted safely.
INITDB NEEDED!
(Don't forget that an alias is required.) Views reimplemented as expanding
to subselect-in-FROM. Grouping, aggregates, DISTINCT in views actually
work now (he says optimistically). No UNION support in subselects/views
yet, but I have some ideas about that. Rule-related permissions checking
moved out of rewriter and into executor.
INITDB REQUIRED!
pg_proc.c (where it's actually used). Fix it to correctly handle tlists
that contain resjunk target items, and improve error messages. This
addresses bug reported by Krupnikov 6-July-00.
from Param nodes, per discussion a few days ago on pghackers. Add new
expression node type FieldSelect that implements the functionality where
it's actually needed. Clean up some other unused fields in Func nodes
as well.
NOTE: initdb forced due to change in stored expression trees for rules.
materialized tupleset is small enough) instead of a temporary relation.
This was something I was thinking of doing anyway for performance, and Jan
says he needs it for TOAST because he doesn't want to cope with toasting
noname relations. With this change, the 'noname table' support in heap.c
is dead code, and I have accordingly removed it. Also clean up 'noname'
plan handling in planner --- nonames are either sort or materialize plans,
and it seems less confusing to handle them separately under those names.
to simplify constant expressions and expand SubLink nodes into SubPlans
is done in a separate routine subquery_planner() that calls union_planner().
We formerly did most of this work in query_planner(), but that's the
wrong place because it may never see the real targetlist. Splitting
union_planner into two routines also allows us to avoid redundant work
when union_planner is invoked recursively for UNION and inheritance
cases. Upshot is that it is now possible to do something like
select float8(count(*)) / (select count(*) from int4_tbl) from int4_tbl
group by f1;
which has never worked before.
portion of the query result that will be retrieved. As far as I could
tell, the consensus was that we should let the planner do the best it
can with a LIMIT query, and require the user to add ORDER BY if he
wants consistent results from different LIMIT values.
accesses versus sequential accesses, a (very crude) estimate of the
effects of caching on random page accesses, and cost to evaluate WHERE-
clause expressions. Export critical parameters for this model as SET
variables. Also, create SET variables for the planner's enable flags
(enable_seqscan, enable_indexscan, etc) so that these can be controlled
more conveniently than via PGOPTIONS.
Planner now estimates both startup cost (cost before retrieving
first tuple) and total cost of each path, so it can optimize queries
with LIMIT on a reasonable basis by interpolating between these costs.
Same facility is a win for EXISTS(...) subqueries and some other cases.
Redesign pathkey representation to achieve a major speedup in planning
(I saw as much as 5X on a 10-way join); also minor changes in planner
to reduce memory consumption by recycling discarded Path nodes and
not constructing unnecessary lists.
Minor cleanups to display more-plausible costs in some cases in
EXPLAIN output.
Initdb forced by change in interface to index cost estimation
functions.
SELECT DISTINCT ON (expr [, expr ...]) targetlist ...
and there is a check to make sure that the user didn't specify an ORDER BY
that's incompatible with the DISTINCT operation.
Reimplement nodeUnique and nodeGroup to use the proper datatype-specific
equality function for each column being compared --- they used to do
bitwise comparisons or convert the data to text strings and strcmp().
(To add insult to injury, they'd look up the conversion functions once
for each tuple...) Parse/plan representation of DISTINCT is now a list
of SortClause nodes.
initdb forced by querytree change...
subselects can only appear on the righthand side of a binary operator.
That's still true for quantified predicates like x = ANY (SELECT ...),
but a subselect that delivers a single result can now appear anywhere
in an expression. This is implemented by changing EXPR_SUBLINK sublinks
to represent just the (SELECT ...) expression, without any 'left hand
side' or combining operator --- so they're now more like EXISTS_SUBLINK.
To handle the case of '(x, y, z) = (SELECT ...)', I added a new sublink
type MULTIEXPR_SUBLINK, which acts just like EXPR_SUBLINK used to.
But the grammar will only generate one for a multiple-left-hand-side
row expression.
mentioned in FROM but not elsewhere in the query: such tables should be
joined over anyway. Aside from being more standards-compliant, this allows
removal of some very ugly hacks for COUNT(*) processing. Also, allow
HAVING clause without aggregate functions, since SQL does. Clean up
CREATE RULE statement-list syntax the same way Bruce just fixed the
main stmtmulti production.
CAUTION: addition of a field to RangeTblEntry nodes breaks stored rules;
you will have to initdb if you have any rules.
Frankpitt, plus some improvements from yours truly. The simplifier depends
on the proiscachable field of pg_proc to tell it whether a function is
safe to pre-evaluate --- things like nextval() are not, for example.
Update pg_proc.h to contain reasonable cacheability information; as of
6.5.* hardly any functions were marked cacheable. I may have erred too
far in the other direction; see recent mail to pghackers for more info.
This update does not force an initdb, exactly, but you won't see much
benefit from the simplifier until you do one.