that the inputs to a given operator can be recursively simplified to
constants, it was evaluating the operator using the op's *original*
(unsimplified) arg list, so that any subexpressions had to be evaluated
again. A constant subexpression at depth N got evaluated N times.
Probably not very important in practical situations, but it made us look
real slow in MySQL's 'crashme' test...
subplan: do it if subplan has subplans itself, and always do it if the
subplan is an indexscan. (I originally set it to materialize an indexscan
only if the indexqual is fairly selective, but I dunno what I was
thinking ... an unselective indexscan is still expensive ...)
portion of the query result that will be retrieved. As far as I could
tell, the consensus was that we should let the planner do the best it
can with a LIMIT query, and require the user to add ORDER BY if he
wants consistent results from different LIMIT values.
represent the result of a binary-compatible type coercion. At runtime
it just evaluates its argument --- but during type resolution, exprType
will pick up the output type of the RelabelType node instead of the type
of the argument. This solves some longstanding problems with dropped
type coercions, an example being 'select now()::abstime::int4' which
used to produce date-formatted output, not an integer, because the
coercion to int4 was dropped on the floor.
selectivity functions and make the r-tree operators use them. The
estimation functions themselves are just stubs, unfortunately, but
perhaps someday someone will make them compute realistic estimates.
Change pg_am so that the optimizer can reliably tell the difference
between ordered and unordered indexes --- before it would think that
an r-tree index can be scanned in '<<' order, which is not right AFAIK.
Repair broken negator links for network_sup and related ops.
Initdb forced. This might be my last initdb force for 7.0 ... hope so
anyway ...
accesses versus sequential accesses, a (very crude) estimate of the
effects of caching on random page accesses, and cost to evaluate WHERE-
clause expressions. Export critical parameters for this model as SET
variables. Also, create SET variables for the planner's enable flags
(enable_seqscan, enable_indexscan, etc) so that these can be controlled
more conveniently than via PGOPTIONS.
Planner now estimates both startup cost (cost before retrieving
first tuple) and total cost of each path, so it can optimize queries
with LIMIT on a reasonable basis by interpolating between these costs.
Same facility is a win for EXISTS(...) subqueries and some other cases.
Redesign pathkey representation to achieve a major speedup in planning
(I saw as much as 5X on a 10-way join); also minor changes in planner
to reduce memory consumption by recycling discarded Path nodes and
not constructing unnecessary lists.
Minor cleanups to display more-plausible costs in some cases in
EXPLAIN output.
Initdb forced by change in interface to index cost estimation
functions.
SELECT a FROM t1 tx (a);
Allow join syntax, including queries like
SELECT * FROM t1 NATURAL JOIN t2;
Update RTE structure to hold column aliases in an Attr structure.
fields in JoinPaths --- turns out that we do need that after all :-(.
Also, rearrange planner so that only one RelOptInfo is created for a
particular set of joined base relations, no matter how many different
subsets of relations it can be created from. This saves memory and
processing time compared to the old method of making a bunch of RelOptInfos
and then removing the duplicates. Clean up the jointree iteration logic;
not sure if it's better, but I sure find it more readable and plausible
now, particularly for the case of 'bushy plans'.
nonoverlap_sets() and is_subset() to list.c, where they should have lived
to begin with, and rename to nonoverlap_setsi and is_subseti since they
only work on integer lists.
extracting from an AND subclause just those opclauses that are relevant
for a particular index. For example, we can now consider using an index
on x to process WHERE (x = 1 AND y = 2) OR (x = 2 AND y = 4) OR ...
2-Oct-98 or TODO.detail/cnfify) to decide whether we want to reduce
WHERE clause to CNF form, DNF form, or neither. This is a HUGE win.
The heuristic conditions could probably still use a little tweaking to
make sure we don't pick CNF when DNF would be better, or vice versa,
but the risk of exponential explosion in cnfify() is gone. I was able
to run ten-thousand-AND-subclause queries through the planner in a
reasonable amount of time.
SELECT DISTINCT ON (expr [, expr ...]) targetlist ...
and there is a check to make sure that the user didn't specify an ORDER BY
that's incompatible with the DISTINCT operation.
Reimplement nodeUnique and nodeGroup to use the proper datatype-specific
equality function for each column being compared --- they used to do
bitwise comparisons or convert the data to text strings and strcmp().
(To add insult to injury, they'd look up the conversion functions once
for each tuple...) Parse/plan representation of DISTINCT is now a list
of SortClause nodes.
initdb forced by querytree change...
(ie, WHERE x > lowbound AND x < highbound). It's not very bright yet
but it does something useful. Also, rename intltsel/intgtsel to
scalarltsel/scalargtsel to reflect usage better. Extend convert_to_scalar
to do something a little bit useful with string data types. Still need
to make it do something with date/time datatypes, but I'll wait for
Thomas's datetime unification dust to settle first. Eventually the
routine ought not have any type-specific knowledge at all; it ought to
be calling a type-dependent routine found via a pg_type column; but
that's a task for another day.
pghackers discussion of 5-Jan-2000. The amopselect and amopnpages
estimators are gone, and in their place is a per-AM amcostestimate
procedure (linked to from pg_am, not pg_amop).
Make all system indexes unique.
Make all cache loads use system indexes.
Rename *rel to *relid in inheritance tables.
Rename cache names to be clearer.
returns a list of RelOptInfos, eliminating the need for static state
in index_info. That static state was a direct cause of coredumps; if
anything decided to elog(ERROR) partway through an index_info search of
pg_index, the next query would try to close a scan pointer that was
pointing at no-longer-valid memory. Another example of the reasons to
avoid static state variables...
subselects can only appear on the righthand side of a binary operator.
That's still true for quantified predicates like x = ANY (SELECT ...),
but a subselect that delivers a single result can now appear anywhere
in an expression. This is implemented by changing EXPR_SUBLINK sublinks
to represent just the (SELECT ...) expression, without any 'left hand
side' or combining operator --- so they're now more like EXISTS_SUBLINK.
To handle the case of '(x, y, z) = (SELECT ...)', I added a new sublink
type MULTIEXPR_SUBLINK, which acts just like EXPR_SUBLINK used to.
But the grammar will only generate one for a multiple-left-hand-side
row expression.
In particular, don't bother to look up type information for attributes
where we're not actually going to use it, and avoid copying entire tlist
structure when it's not necessary.
mentioned in FROM but not elsewhere in the query: such tables should be
joined over anyway. Aside from being more standards-compliant, this allows
removal of some very ugly hacks for COUNT(*) processing. Also, allow
HAVING clause without aggregate functions, since SQL does. Clean up
CREATE RULE statement-list syntax the same way Bruce just fixed the
main stmtmulti production.
CAUTION: addition of a field to RangeTblEntry nodes breaks stored rules;
you will have to initdb if you have any rules.
in the Expr nodes they produce. This fixes a few cases of errors like
'typeidTypeRelid: Invalid type - oid = 0' caused by calling parser-related
routines on expression trees that have already been processed by planner-
related routines.
Frankpitt, plus some improvements from yours truly. The simplifier depends
on the proiscachable field of pg_proc to tell it whether a function is
safe to pre-evaluate --- things like nextval() are not, for example.
Update pg_proc.h to contain reasonable cacheability information; as of
6.5.* hardly any functions were marked cacheable. I may have erred too
far in the other direction; see recent mail to pghackers for more info.
This update does not force an initdb, exactly, but you won't see much
benefit from the simplifier until you do one.
additional argument specifying the kind of lock to acquire/release (or
'NoLock' to do no lock processing). Ensure that all relations are locked
with some appropriate lock level before being examined --- this ensures
that relevant shared-inval messages have been processed and should prevent
problems caused by concurrent VACUUM. Fix several bugs having to do with
mismatched increment/decrement of relation ref count and mismatched
heap_open/close (which amounts to the same thing). A bogus ref count on
a relation doesn't matter much *unless* a SI Inval message happens to
arrive at the wrong time, which is probably why we got away with this
sloppiness for so long. Repair missing grab of AccessExclusiveLock in
DROP TABLE, ALTER/RENAME TABLE, etc, as noted by Hiroshi.
Recommend 'make clean all' after pulling this update; I modified the
Relation struct layout slightly.
Will post further discussion to pghackers list shortly.
conditions. There are some pretty bogus heuristics in prepqual.c that
try to decide whether to output CNF or DNF format; they need to be replaced,
likely. Right now the code is probably too willing to choose DNF form,
which might hurt performance in some cases that used to work OK.
But at least we have a foundation to build on.
in or_normalize, remove detection of duplicate subexpressions (since it's
highly unlikely to be worth the amount of time it takes), and introduce
a dnfify() entry point so that unintelligible backwards logic in UNION
processing can be eliminated. This is just an intermediate step ---
next thing is to look at not forcing the qual into CNF form when it would
be better off in DNF form.
was rejecting negative attnums as bogus, which of course they are not.
Add code to get_attdisbursion to produce a useful value for OID attribute,
since VACUUM does not store stats for system attributes.
Also, repair bug that's been in eqjoinsel for a long time: it was taking
the max of the two columns' disbursions, whereas it should use the min.