mirror of https://github.com/postgres/postgres.git synced 2025-06-22 02:52:08 +03:00

Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables.  This is an extensive
rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
  that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
  named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
  instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
Tom Lane
2007-01-20 20:45:41 +00:00
parent 2b7334d487
commit f41803bb39
35 changed files with 3882 additions and 2719 deletions

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.37 2006/12/23 00:43:08 tgl Exp $ --> <!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.38 2007/01/20 20:45:38 tgl Exp $ -->
<sect1 id="xoper"> <sect1 id="xoper">
<title>User-Defined Operators</title> <title>User-Defined Operators</title>
@ -337,7 +337,7 @@ table1.column1 OP table2.column2
join will never compare them at all, implicitly assuming that the join will never compare them at all, implicitly assuming that the
result of the join operator must be false. So it never makes sense result of the join operator must be false. So it never makes sense
to specify <literal>HASHES</literal> for operators that do not represent to specify <literal>HASHES</literal> for operators that do not represent
equality. some form of equality.
</para> </para>
<para> <para>
@ -347,7 +347,7 @@ table1.column1 OP table2.column2
exist yet. But attempts to use the operator in hash joins will fail exist yet. But attempts to use the operator in hash joins will fail
at run time if no such operator family exists. The system needs the at run time if no such operator family exists. The system needs the
operator family to find the data-type-specific hash function for the operator family to find the data-type-specific hash function for the
operator's input data type. Of course, you must also supply a suitable operator's input data type. Of course, you must also create a suitable
hash function before you can create the operator family. hash function before you can create the operator family.
</para> </para>
@ -382,8 +382,9 @@ table1.column1 OP table2.column2
false, never null, for any two nonnull inputs. If this rule is false, never null, for any two nonnull inputs. If this rule is
not followed, hash-optimization of <literal>IN</> operations may not followed, hash-optimization of <literal>IN</> operations may
generate wrong results. (Specifically, <literal>IN</> might return generate wrong results. (Specifically, <literal>IN</> might return
false where the correct answer according to the standard would be null; or it might false where the correct answer according to the standard would be null;
yield an error complaining that it wasn't prepared for a null result.) or it might yield an error complaining that it wasn't prepared for a
null result.)
</para> </para>
</note> </note>
@ -407,19 +408,18 @@ table1.column1 OP table2.column2
that can only succeed for pairs of values that fall at the that can only succeed for pairs of values that fall at the
<quote>same place</> <quote>same place</>
in the sort order. In practice this means that the join operator must in the sort order. In practice this means that the join operator must
behave like equality. But unlike hash join, where the left and right behave like equality. But it is possible to merge-join two
data types had better be the same (or at least bitwise equivalent),
it is possible to merge-join two
distinct data types so long as they are logically compatible. For distinct data types so long as they are logically compatible. For
example, the <type>smallint</type>-versus-<type>integer</type> equality operator example, the <type>smallint</type>-versus-<type>integer</type>
is merge-joinable. equality operator is merge-joinable.
We only need sorting operators that will bring both data types into a We only need sorting operators that will bring both data types into a
logically compatible sequence. logically compatible sequence.
</para> </para>
<para> <para>
To be marked <literal>MERGES</literal>, the join operator must appear To be marked <literal>MERGES</literal>, the join operator must appear
in a btree index operator family. This is not enforced when you create as an equality member of a btree index operator family.
This is not enforced when you create
the operator, since of course the referencing operator family couldn't the operator, since of course the referencing operator family couldn't
exist yet. But the operator will not actually be used for merge joins exist yet. But the operator will not actually be used for merge joins
unless a matching operator family can be found. The unless a matching operator family can be found. The
@ -428,30 +428,14 @@ table1.column1 OP table2.column2
</para> </para>
<para> <para>
There are additional restrictions on operators that you mark A merge-joinable operator must have a commutator (itself if the two
merge-joinable. These restrictions are not currently checked by operand data types are the same, or a related equality operator
<command>CREATE OPERATOR</command>, but errors may occur when if they are different) that appears in the same operator family.
the operator is used if any are not true: If this is not the case, planner errors may occur when the operator
is used. Also, it is a good idea (but not strictly required) for
<itemizedlist> a btree operator family that supports multiple datatypes to provide
<listitem> equality operators for every combination of the datatypes; this
<para> allows better optimization.
A merge-joinable equality operator must have a merge-joinable
commutator (itself if the two operand data types are the same, or a related
equality operator if they are different).
</para>
</listitem>
<listitem>
<para>
If there is a merge-joinable operator relating any two data types
A and B, and another merge-joinable operator relating B to any
third data type C, then A and C must also have a merge-joinable
operator; in other words, having a merge-joinable operator must
be transitive.
</para>
</listitem>
</itemizedlist>
</para> </para>
<note> <note>

@ -15,7 +15,7 @@
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.361 2007/01/10 18:06:02 tgl Exp $ * $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.362 2007/01/20 20:45:38 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -1284,16 +1284,18 @@ _copyFromExpr(FromExpr *from)
*/ */
/* /*
* _copyPathKeyItem * _copyPathKey
*/ */
static PathKeyItem * static PathKey *
_copyPathKeyItem(PathKeyItem *from) _copyPathKey(PathKey *from)
{ {
PathKeyItem *newnode = makeNode(PathKeyItem); PathKey *newnode = makeNode(PathKey);
COPY_NODE_FIELD(key); /* EquivalenceClasses are never moved, so just shallow-copy the pointer */
COPY_SCALAR_FIELD(sortop); COPY_SCALAR_FIELD(pk_eclass);
COPY_SCALAR_FIELD(nulls_first); COPY_SCALAR_FIELD(pk_opfamily);
COPY_SCALAR_FIELD(pk_strategy);
COPY_SCALAR_FIELD(pk_nulls_first);
return newnode; return newnode;
} }
@ -1316,21 +1318,15 @@ _copyRestrictInfo(RestrictInfo *from)
COPY_BITMAPSET_FIELD(left_relids); COPY_BITMAPSET_FIELD(left_relids);
COPY_BITMAPSET_FIELD(right_relids); COPY_BITMAPSET_FIELD(right_relids);
COPY_NODE_FIELD(orclause); COPY_NODE_FIELD(orclause);
/* EquivalenceClasses are never copied, so shallow-copy the pointers */
COPY_SCALAR_FIELD(parent_ec);
COPY_SCALAR_FIELD(eval_cost); COPY_SCALAR_FIELD(eval_cost);
COPY_SCALAR_FIELD(this_selec); COPY_SCALAR_FIELD(this_selec);
COPY_SCALAR_FIELD(mergejoinoperator); COPY_NODE_FIELD(mergeopfamilies);
COPY_SCALAR_FIELD(left_sortop); /* EquivalenceClasses are never copied, so shallow-copy the pointers */
COPY_SCALAR_FIELD(right_sortop); COPY_SCALAR_FIELD(left_ec);
COPY_SCALAR_FIELD(mergeopfamily); COPY_SCALAR_FIELD(right_ec);
COPY_SCALAR_FIELD(outer_is_left);
/*
* Do not copy pathkeys, since they'd not be canonical in a copied query
*/
newnode->left_pathkey = NIL;
newnode->right_pathkey = NIL;
COPY_SCALAR_FIELD(left_mergescansel);
COPY_SCALAR_FIELD(right_mergescansel);
COPY_SCALAR_FIELD(hashjoinoperator); COPY_SCALAR_FIELD(hashjoinoperator);
COPY_SCALAR_FIELD(left_bucketsize); COPY_SCALAR_FIELD(left_bucketsize);
COPY_SCALAR_FIELD(right_bucketsize); COPY_SCALAR_FIELD(right_bucketsize);
@ -3033,8 +3029,8 @@ copyObject(void *from)
/* /*
* RELATION NODES * RELATION NODES
*/ */
case T_PathKeyItem: case T_PathKey:
retval = _copyPathKeyItem(from); retval = _copyPathKey(from);
break; break;
case T_RestrictInfo: case T_RestrictInfo:
retval = _copyRestrictInfo(from); retval = _copyRestrictInfo(from);

@ -18,7 +18,7 @@
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.295 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.296 2007/01/20 20:45:38 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -596,11 +596,27 @@ _equalFromExpr(FromExpr *a, FromExpr *b)
*/ */
static bool static bool
_equalPathKeyItem(PathKeyItem *a, PathKeyItem *b) _equalPathKey(PathKey *a, PathKey *b)
{ {
COMPARE_NODE_FIELD(key); /*
COMPARE_SCALAR_FIELD(sortop); * This is normally used on non-canonicalized PathKeys, so must chase
COMPARE_SCALAR_FIELD(nulls_first); * up to the topmost merged EquivalenceClass and see if those are the
* same (by pointer equality).
*/
EquivalenceClass *a_eclass;
EquivalenceClass *b_eclass;
a_eclass = a->pk_eclass;
while (a_eclass->ec_merged)
a_eclass = a_eclass->ec_merged;
b_eclass = b->pk_eclass;
while (b_eclass->ec_merged)
b_eclass = b_eclass->ec_merged;
if (a_eclass != b_eclass)
return false;
COMPARE_SCALAR_FIELD(pk_opfamily);
COMPARE_SCALAR_FIELD(pk_strategy);
COMPARE_SCALAR_FIELD(pk_nulls_first);
return true; return true;
} }
@ -2016,8 +2032,8 @@ equal(void *a, void *b)
/* /*
* RELATION NODES * RELATION NODES
*/ */
case T_PathKeyItem: case T_PathKey:
retval = _equalPathKeyItem(a, b); retval = _equalPathKey(a, b);
break; break;
case T_RestrictInfo: case T_RestrictInfo:
retval = _equalRestrictInfo(a, b); retval = _equalRestrictInfo(a, b);

@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.293 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.294 2007/01/20 20:45:38 tgl Exp $
* *
* NOTES * NOTES
* Every node type that can appear in stored rules' parsetrees *must* * Every node type that can appear in stored rules' parsetrees *must*
@ -1196,29 +1196,11 @@ _outNestPath(StringInfo str, NestPath *node)
static void static void
_outMergePath(StringInfo str, MergePath *node) _outMergePath(StringInfo str, MergePath *node)
{ {
int numCols;
int i;
WRITE_NODE_TYPE("MERGEPATH"); WRITE_NODE_TYPE("MERGEPATH");
_outJoinPathInfo(str, (JoinPath *) node); _outJoinPathInfo(str, (JoinPath *) node);
WRITE_NODE_FIELD(path_mergeclauses); WRITE_NODE_FIELD(path_mergeclauses);
numCols = list_length(node->path_mergeclauses);
appendStringInfo(str, " :path_mergeFamilies");
for (i = 0; i < numCols; i++)
appendStringInfo(str, " %u", node->path_mergeFamilies[i]);
appendStringInfo(str, " :path_mergeStrategies");
for (i = 0; i < numCols; i++)
appendStringInfo(str, " %d", node->path_mergeStrategies[i]);
appendStringInfo(str, " :path_mergeNullsFirst");
for (i = 0; i < numCols; i++)
appendStringInfo(str, " %d", (int) node->path_mergeNullsFirst[i]);
WRITE_NODE_FIELD(outersortkeys); WRITE_NODE_FIELD(outersortkeys);
WRITE_NODE_FIELD(innersortkeys); WRITE_NODE_FIELD(innersortkeys);
} }
@ -1241,7 +1223,8 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
/* NB: this isn't a complete set of fields */ /* NB: this isn't a complete set of fields */
WRITE_NODE_FIELD(parse); WRITE_NODE_FIELD(parse);
WRITE_NODE_FIELD(join_rel_list); WRITE_NODE_FIELD(join_rel_list);
WRITE_NODE_FIELD(equi_key_list); WRITE_NODE_FIELD(eq_classes);
WRITE_NODE_FIELD(canon_pathkeys);
WRITE_NODE_FIELD(left_join_clauses); WRITE_NODE_FIELD(left_join_clauses);
WRITE_NODE_FIELD(right_join_clauses); WRITE_NODE_FIELD(right_join_clauses);
WRITE_NODE_FIELD(full_join_clauses); WRITE_NODE_FIELD(full_join_clauses);
@ -1284,6 +1267,7 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
WRITE_NODE_FIELD(subplan); WRITE_NODE_FIELD(subplan);
WRITE_NODE_FIELD(baserestrictinfo); WRITE_NODE_FIELD(baserestrictinfo);
WRITE_NODE_FIELD(joininfo); WRITE_NODE_FIELD(joininfo);
WRITE_BOOL_FIELD(has_eclass_joins);
WRITE_BITMAPSET_FIELD(index_outer_relids); WRITE_BITMAPSET_FIELD(index_outer_relids);
WRITE_NODE_FIELD(index_inner_paths); WRITE_NODE_FIELD(index_inner_paths);
} }
@ -1306,13 +1290,48 @@ _outIndexOptInfo(StringInfo str, IndexOptInfo *node)
} }
static void static void
_outPathKeyItem(StringInfo str, PathKeyItem *node) _outEquivalenceClass(StringInfo str, EquivalenceClass *node)
{ {
WRITE_NODE_TYPE("PATHKEYITEM"); /*
* To simplify reading, we just chase up to the topmost merged EC and
* print that, without bothering to show the merge-ees separately.
*/
while (node->ec_merged)
node = node->ec_merged;
WRITE_NODE_FIELD(key); WRITE_NODE_TYPE("EQUIVALENCECLASS");
WRITE_OID_FIELD(sortop);
WRITE_BOOL_FIELD(nulls_first); WRITE_NODE_FIELD(ec_opfamilies);
WRITE_NODE_FIELD(ec_members);
WRITE_NODE_FIELD(ec_sources);
WRITE_BITMAPSET_FIELD(ec_relids);
WRITE_BOOL_FIELD(ec_has_const);
WRITE_BOOL_FIELD(ec_has_volatile);
WRITE_BOOL_FIELD(ec_below_outer_join);
WRITE_BOOL_FIELD(ec_broken);
}
static void
_outEquivalenceMember(StringInfo str, EquivalenceMember *node)
{
WRITE_NODE_TYPE("EQUIVALENCEMEMBER");
WRITE_NODE_FIELD(em_expr);
WRITE_BITMAPSET_FIELD(em_relids);
WRITE_BOOL_FIELD(em_is_const);
WRITE_BOOL_FIELD(em_is_child);
WRITE_OID_FIELD(em_datatype);
}
static void
_outPathKey(StringInfo str, PathKey *node)
{
WRITE_NODE_TYPE("PATHKEY");
WRITE_NODE_FIELD(pk_eclass);
WRITE_OID_FIELD(pk_opfamily);
WRITE_INT_FIELD(pk_strategy);
WRITE_BOOL_FIELD(pk_nulls_first);
} }
static void static void
@ -1331,12 +1350,11 @@ _outRestrictInfo(StringInfo str, RestrictInfo *node)
WRITE_BITMAPSET_FIELD(left_relids); WRITE_BITMAPSET_FIELD(left_relids);
WRITE_BITMAPSET_FIELD(right_relids); WRITE_BITMAPSET_FIELD(right_relids);
WRITE_NODE_FIELD(orclause); WRITE_NODE_FIELD(orclause);
WRITE_OID_FIELD(mergejoinoperator); WRITE_NODE_FIELD(parent_ec);
WRITE_OID_FIELD(left_sortop); WRITE_NODE_FIELD(mergeopfamilies);
WRITE_OID_FIELD(right_sortop); WRITE_NODE_FIELD(left_ec);
WRITE_OID_FIELD(mergeopfamily); WRITE_NODE_FIELD(right_ec);
WRITE_NODE_FIELD(left_pathkey); WRITE_BOOL_FIELD(outer_is_left);
WRITE_NODE_FIELD(right_pathkey);
WRITE_OID_FIELD(hashjoinoperator); WRITE_OID_FIELD(hashjoinoperator);
} }
@ -2163,8 +2181,14 @@ _outNode(StringInfo str, void *obj)
case T_IndexOptInfo: case T_IndexOptInfo:
_outIndexOptInfo(str, obj); _outIndexOptInfo(str, obj);
break; break;
case T_PathKeyItem: case T_EquivalenceClass:
_outPathKeyItem(str, obj); _outEquivalenceClass(str, obj);
break;
case T_EquivalenceMember:
_outEquivalenceMember(str, obj);
break;
case T_PathKey:
_outPathKey(str, obj);
break; break;
case T_RestrictInfo: case T_RestrictInfo:
_outRestrictInfo(str, obj); _outRestrictInfo(str, obj);

@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/nodes/print.c,v 1.82 2007/01/05 22:19:30 momjian Exp $ * $PostgreSQL: pgsql/src/backend/nodes/print.c,v 1.83 2007/01/20 20:45:38 tgl Exp $
* *
* HISTORY * HISTORY
* AUTHOR DATE MAJOR EVENT * AUTHOR DATE MAJOR EVENT
@ -404,7 +404,7 @@ print_expr(Node *expr, List *rtable)
/* /*
* print_pathkeys - * print_pathkeys -
* pathkeys list of list of PathKeyItems * pathkeys list of PathKeys
*/ */
void void
print_pathkeys(List *pathkeys, List *rtable) print_pathkeys(List *pathkeys, List *rtable)
@ -414,17 +414,26 @@ print_pathkeys(List *pathkeys, List *rtable)
printf("("); printf("(");
foreach(i, pathkeys) foreach(i, pathkeys)
{ {
List *pathkey = (List *) lfirst(i); PathKey *pathkey = (PathKey *) lfirst(i);
EquivalenceClass *eclass;
ListCell *k; ListCell *k;
bool first = true;
eclass = pathkey->pk_eclass;
/* chase up, in case pathkey is non-canonical */
while (eclass->ec_merged)
eclass = eclass->ec_merged;
printf("("); printf("(");
foreach(k, pathkey) foreach(k, eclass->ec_members)
{ {
PathKeyItem *item = (PathKeyItem *) lfirst(k); EquivalenceMember *mem = (EquivalenceMember *) lfirst(k);
print_expr(item->key, rtable); if (first)
if (lnext(k)) first = false;
else
printf(", "); printf(", ");
print_expr((Node *) mem->em_expr, rtable);
} }
printf(")"); printf(")");
if (lnext(i)) if (lnext(i))

@ -90,21 +90,19 @@ have a list of relations to join. However, FULL OUTER JOIN clauses are
never flattened, and other kinds of JOIN might not be either, if the never flattened, and other kinds of JOIN might not be either, if the
flattening process is stopped by join_collapse_limit or from_collapse_limit flattening process is stopped by join_collapse_limit or from_collapse_limit
restrictions. Therefore, we end up with a planning problem that contains restrictions. Therefore, we end up with a planning problem that contains
both lists of relations to be joined in any order, and JOIN nodes that lists of relations to be joined in any order, where any individual item
force a particular join order. For each un-flattened JOIN node, we join might be a sub-list that has to be joined together before we can consider
exactly that pair of relations (after recursively planning their inputs, joining it to its siblings. We process these sub-problems recursively,
if the inputs aren't single base relations). We generate a Path for each bottom up. Note that the join list structure constrains the possible join
feasible join method, and select the cheapest Path. Note that the JOIN orders, but it doesn't constrain the join implementation method at each
clause structure determines the join Path structure, but it doesn't join (nestloop, merge, hash), nor does it say which rel is considered outer
constrain the join implementation method at each join (nestloop, merge, or inner at each join. We consider all these possibilities in building
hash), nor does it say which rel is considered outer or inner at each Paths. We generate a Path for each feasible join method, and select the
join. We consider all these possibilities in building Paths. cheapest Path.
3) At the top level of the FROM clause we will have a list of relations For each planning problem, therefore, we will have a list of relations
that are either base rels or joinrels constructed per un-flattened JOIN that are either base rels or joinrels constructed per sub-join-lists.
directives. (This is also the situation, recursively, when we can flatten We can join these rels together in any order the planner sees fit.
sub-joins underneath an un-flattenable JOIN into a list of relations to
join.) We can join these rels together in any order the planner sees fit.
The standard (non-GEQO) planner does this as follows: The standard (non-GEQO) planner does this as follows:
Consider joining each RelOptInfo to each other RelOptInfo specified in its Consider joining each RelOptInfo to each other RelOptInfo specified in its
@ -114,17 +112,17 @@ choice but to generate a clauseless Cartesian-product join; so we consider
joining that rel to each other available rel. But in the presence of join joining that rel to each other available rel. But in the presence of join
clauses we will only consider joins that use available join clauses.) clauses we will only consider joins that use available join clauses.)
If we only had two relations in the FROM list, we are done: we just pick If we only had two relations in the list, we are done: we just pick
the cheapest path for the join RelOptInfo. If we had more than two, we now the cheapest path for the join RelOptInfo. If we had more than two, we now
need to consider ways of joining join RelOptInfos to each other to make need to consider ways of joining join RelOptInfos to each other to make
join RelOptInfos that represent more than two FROM items. join RelOptInfos that represent more than two list items.
The join tree is constructed using a "dynamic programming" algorithm: The join tree is constructed using a "dynamic programming" algorithm:
in the first pass (already described) we consider ways to create join rels in the first pass (already described) we consider ways to create join rels
representing exactly two FROM items. The second pass considers ways representing exactly two list items. The second pass considers ways
to make join rels that represent exactly three FROM items; the next pass, to make join rels that represent exactly three list items; the next pass,
four items, etc. The last pass considers how to make the final join four items, etc. The last pass considers how to make the final join
relation that includes all FROM items --- obviously there can be only one relation that includes all list items --- obviously there can be only one
join rel at this top level, whereas there can be more than one join rel join rel at this top level, whereas there can be more than one join rel
at lower levels. At each level we use joins that follow available join at lower levels. At each level we use joins that follow available join
clauses, if possible, just as described for the first level. clauses, if possible, just as described for the first level.
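
The level-by-level construction described above can be sketched roughly as follows. This is only an illustration, not the planner's code: list items are bits in a mask, `adj[i]` is a made-up adjacency mask saying which other items share a join clause with item `i`, and `build_join_levels` is a hypothetical name. The sketch ignores costs, Paths, and the clauseless Cartesian-product fallback; it only records which join rels can be formed at each level by following join clauses.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ITEMS 8				/* assumption: small enough for a bitmask table */

/* Number of list items present in a set (bits set in the mask). */
static int
count_bits(unsigned int x)
{
	int			n = 0;

	while (x)
	{
		x &= x - 1;
		n++;
	}
	return n;
}

/* True if some item in "left" has a join clause to some item in "right". */
static bool
clause_connects(const unsigned int *adj, int nitems,
				unsigned int left, unsigned int right)
{
	for (int i = 0; i < nitems; i++)
	{
		if ((left & (1u << i)) && (adj[i] & right))
			return true;
	}
	return false;
}

/*
 * Fill rels_per_level[k] with the number of join rels covering exactly k
 * items that can be built from two smaller, already-built rels connected
 * by a join clause.  rels_per_level must have room for nitems + 1 entries.
 */
static void
build_join_levels(const unsigned int *adj, int nitems, int *rels_per_level)
{
	bool		built[1u << MAX_ITEMS] = {false};
	unsigned int full;

	assert(nitems >= 1 && nitems <= MAX_ITEMS);
	full = (1u << nitems) - 1;

	for (int level = 0; level <= nitems; level++)
		rels_per_level[level] = 0;

	/* Level 1: every single list item is available. */
	for (int i = 0; i < nitems; i++)
	{
		built[1u << i] = true;
		rels_per_level[1]++;
	}

	/* Level k only needs rels from lower levels, which are already decided. */
	for (int level = 2; level <= nitems; level++)
	{
		for (unsigned int set = 1; set <= full; set++)
		{
			if (count_bits(set) != level)
				continue;

			/* Try every split of "set" into two nonempty pieces. */
			for (unsigned int sub = (set - 1) & set; sub != 0;
				 sub = (sub - 1) & set)
			{
				unsigned int rest = set & ~sub;

				if (built[sub] && built[rest] &&
					clause_connects(adj, nitems, sub, rest))
				{
					built[set] = true;
					rels_per_level[level]++;
					break;
				}
			}
		}
	}
}
```

With four items chained by clauses 1=2, 2=3, and 3=4, this finds three two-item rels, two three-item rels, and a single rel at the top level, matching the observation that only one join rel exists at the final level.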
@ -155,7 +153,7 @@ For example:
{1 2 3 4} {1 2 3 4}
We consider left-handed plans (the outer rel of an upper join is a joinrel, We consider left-handed plans (the outer rel of an upper join is a joinrel,
but the inner is always a single FROM item); right-handed plans (outer rel but the inner is always a single list item); right-handed plans (outer rel
is always a single item); and bushy plans (both inner and outer can be is always a single item); and bushy plans (both inner and outer can be
joins themselves). For example, when building {1 2 3 4} we consider joins themselves). For example, when building {1 2 3 4} we consider
joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and
@ -336,7 +334,9 @@ RelOptInfo - a relation or joined relations
MergePath - merge joins MergePath - merge joins
HashPath - hash joins HashPath - hash joins
PathKeys - a data structure representing the ordering of a path EquivalenceClass - a data structure representing a set of values known equal
PathKey - a data structure representing the sort ordering of a path
The optimizer spends a good deal of its time worrying about the ordering The optimizer spends a good deal of its time worrying about the ordering
of the tuples returned by a path. The reason this is useful is that by of the tuples returned by a path. The reason this is useful is that by
@ -363,213 +363,250 @@ without sorting, since it can pick from any of the paths retained for its
inputs. inputs.
EquivalenceClasses
------------------
During the deconstruct_jointree() scan of the query's qual clauses, we look
for mergejoinable equality clauses A = B whose applicability is not delayed
by an outer join; these are called "equivalence clauses". When we find
one, we create an EquivalenceClass containing the expressions A and B to
record this knowledge. If we later find another equivalence clause B = C,
we add C to the existing EquivalenceClass for {A B}; this may require
merging two existing EquivalenceClasses. At the end of the scan, we have
sets of values that are known all transitively equal to each other. We can
therefore use a comparison of any pair of the values as a restriction or
join clause (when these values are available at the scan or join, of
course); furthermore, we need test only one such comparison, not all of
them. Therefore, equivalence clauses are removed from the standard qual
distribution process. Instead, when preparing a restriction or join clause
list, we examine each EquivalenceClass to see if it can contribute a
clause, and if so we select an appropriate pair of values to compare. For
example, if we are trying to join A's relation to C's, we can generate the
clause A = C, even though this appeared nowhere explicitly in the original
query. This may allow us to explore join paths that otherwise would have
been rejected as requiring Cartesian-product joins.
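
The merging behavior described above can be sketched with a small union-find structure. This is a simplified illustration, not the planner's data structure: expressions are plain integer ids, and the `EcSet` type and `ecset_*` functions are hypothetical names. The `merged_into` link plays a role loosely analogous to the `ec_merged` pointer that the code in this commit chases to reach the topmost class.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_EXPRS 16			/* assumption: fixed small number of expressions */

/*
 * merged_into[i] == -1 means expression i is the representative of its
 * class; otherwise it names the class this one was merged into.
 */
typedef struct EcSet
{
	int			merged_into[MAX_EXPRS];
} EcSet;

/* Start with every expression in its own single-member class. */
static void
ecset_init(EcSet *s)
{
	for (int i = 0; i < MAX_EXPRS; i++)
		s->merged_into[i] = -1;
}

/* Chase merge links up to the topmost class. */
static int
ecset_top(const EcSet *s, int expr)
{
	assert(expr >= 0 && expr < MAX_EXPRS);
	while (s->merged_into[expr] >= 0)
		expr = s->merged_into[expr];
	return expr;
}

/* Record an equivalence clause "a = b", merging two classes if needed. */
static void
ecset_add_clause(EcSet *s, int a, int b)
{
	int			top_a = ecset_top(s, a);
	int			top_b = ecset_top(s, b);

	if (top_a != top_b)
		s->merged_into[top_b] = top_a;
}

/* Are a and b known equal, possibly only by transitivity? */
static bool
ecset_known_equal(const EcSet *s, int a, int b)
{
	return ecset_top(s, a) == ecset_top(s, b);
}
```

After recording A = B and B = C, `ecset_known_equal` reports A and C as equal, which is the fact that lets a clause A = C be generated even though the query never wrote it.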
Sometimes an EquivalenceClass may contain a pseudo-constant expression
(i.e., one not containing Vars or Aggs of the current query level, nor
volatile functions). In this case we do not follow the policy of
dynamically generating join clauses: instead, we dynamically generate
restriction clauses "var = const" wherever one of the variable members of
the class can first be computed. For example, if we have A = B and B = 42,
we effectively generate the restriction clauses A = 42 and B = 42, and then
we need not bother with explicitly testing the join clause A = B when the
relations are joined. In effect, all the class members can be tested at
relation-scan level and there's never a need for join tests.
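
As a rough illustration of the "var = const" policy, the sketch below takes the members of one class and writes out the restriction clauses that would stand in for the join tests. The `Member` type and `generate_const_restrictions` function are hypothetical and work on text only; the planner itself builds clause nodes, not strings.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical, text-only stand-in for a class member. */
typedef struct Member
{
	const char *text;			/* printable form of the expression */
	bool		is_const;		/* true for a pseudo-constant member */
} Member;

/*
 * Write "var = const" for every non-constant member into buf, separated
 * by "; ".  Returns the number of clauses written; returns 0 and leaves
 * buf empty if the class has no constant member.
 */
static int
generate_const_restrictions(const Member *members, int n,
							char *buf, size_t buflen)
{
	const char *constant = NULL;
	size_t		used = 0;
	int			count = 0;

	if (buflen == 0)
		return 0;
	buf[0] = '\0';

	/* Find the first constant member, if any. */
	for (int i = 0; i < n; i++)
	{
		if (members[i].is_const)
		{
			constant = members[i].text;
			break;
		}
	}
	if (constant == NULL)
		return 0;

	/* Equate each variable member to that constant. */
	for (int i = 0; i < n; i++)
	{
		int			w;

		if (members[i].is_const)
			continue;
		w = snprintf(buf + used, buflen - used, "%s%s = %s",
					 count > 0 ? "; " : "", members[i].text, constant);
		if (w < 0 || (size_t) w >= buflen - used)
		{
			buf[used] = '\0';	/* drop the partial clause */
			break;
		}
		used += (size_t) w;
		count++;
	}
	return count;
}
```

For the class {A B 42} this yields the two restrictions A = 42 and B = 42, and no A = B join clause is needed.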
The precise technical interpretation of an EquivalenceClass is that it
asserts that at any plan node where more than one of its member values
can be computed, output rows in which the values are not all equal may
be discarded without affecting the query result. (We require all levels
of the plan to enforce EquivalenceClasses, hence a join need not recheck
equality of values that were computable by one of its children.) For an
ordinary EquivalenceClass that is "valid everywhere", we can further infer
that the values are all non-null, because all mergejoinable operators are
strict. However, we also allow equivalence clauses that appear below the
nullable side of an outer join to form EquivalenceClasses; for these
classes, the interpretation is that either all the values are equal, or
all (except pseudo-constants) have gone to null. (This requires a
limitation that non-constant members be strict, else they might not go
to null when the other members do.) Consider for example
SELECT *
FROM a LEFT JOIN
(SELECT * FROM b JOIN c ON b.y = c.z WHERE b.y = 10) ss
ON a.x = ss.y
WHERE a.x = 42;
We can form the below-outer-join EquivalenceClass {b.y c.z 10} and thereby
apply c.z = 10 while scanning c. (The reason we disallow outerjoin-delayed
clauses from forming EquivalenceClasses is exactly that we want to be able
to push any derived clauses as far down as possible.) But once above the
outer join it's no longer necessarily the case that b.y = 10, and thus we
cannot use such EquivalenceClasses to conclude that sorting is unnecessary
(see discussion of PathKeys below).
In this example, notice also that a.x = ss.y (really a.x = b.y) is not an
equivalence clause because its applicability to b is delayed by the outer
join; thus we do not try to insert b.y into the equivalence class {a.x 42}.
But since we see that a.x has been equated to 42 above the outer join, we
are able to form a below-outer-join class {b.y 42}; this restriction can be
added because no b/c row not having b.y = 42 can contribute to the result
of the outer join, and so we need not compute such rows. Now this class
will get merged with {b.y c.z 10}, leading to the contradiction 10 = 42,
which lets the planner deduce that the b/c join need not be computed at all
because none of its rows can contribute to the outer join. (This gets
implemented as a gating Result filter, since more usually the potential
contradiction involves Param values rather than just Consts, and thus has
to be checked at runtime.)
To aid in determining the sort ordering(s) that can work with a mergejoin,
we mark each mergejoinable clause with the EquivalenceClasses of its left
and right inputs. For an equivalence clause, these are of course the same
EquivalenceClass. For a non-equivalence mergejoinable clause (such as an
outer-join qualification), we generate two separate EquivalenceClasses for
the left and right inputs. This may result in creating single-item
equivalence "classes", though of course these are still subject to merging
if other equivalence clauses are later found to bear on the same
expressions.
Another way that we may form a single-item EquivalenceClass is in creation
of a PathKey to represent a desired sort order (see below). This is a bit
different from the above cases because such an EquivalenceClass might
contain an aggregate function or volatile expression. (A clause containing
a volatile function will never be considered mergejoinable, even if its top
operator is mergejoinable, so there is no way for a volatile expression to
get into EquivalenceClasses otherwise. Aggregates are disallowed in WHERE
altogether, so will never be found in a mergejoinable clause.) This is just
a convenience to maintain a uniform PathKey representation: such an
EquivalenceClass will never be merged with any other.
An EquivalenceClass also contains a list of btree opfamily OIDs, which
determines what the equalities it represents actually "mean". All the
equivalence clauses that contribute to an EquivalenceClass must have
equality operators that belong to the same set of opfamilies. (Note: most
of the time, a particular equality operator belongs to only one family, but
it's possible that it belongs to more than one. We keep track of all the
families to ensure that we can make use of an index belonging to any one of
the families for mergejoin purposes.)
PathKeys PathKeys
-------- --------
The PathKeys data structure represents what is known about the sort order The PathKeys data structure represents what is known about the sort order
of a particular Path. of the tuples generated by a particular Path. A path's pathkeys field is a
list of PathKey nodes, where the n'th item represents the n'th sort key of
the result. Each PathKey contains these fields:
Path.pathkeys is a List of Lists of PathKeyItem nodes that represent * a reference to an EquivalenceClass
the sort order of the result generated by the Path. The n'th sublist * a btree opfamily OID (must match one of those in the EC)
represents the n'th sort key of the result. * a sort direction (ascending or descending)
* a nulls-first-or-last flag
The EquivalenceClass represents the value being sorted on. Since the
various members of an EquivalenceClass are known equal according to the
opfamily, we can consider a path sorted by any one of them to be sorted by
any other too; this is what justifies referencing the whole
EquivalenceClass rather than just one member of it.
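
A minimal sketch of the four fields and of comparing two PathKeys is shown below. The field names follow those used in this commit (`pk_eclass`, `pk_opfamily`, `pk_strategy`, `pk_nulls_first`, `ec_merged`), but the struct definitions are cut-down stand-ins rather than the real node types, and `topmost_eclass` and `pathkey_equal` are hypothetical helper names.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;		/* stand-in for the real Oid typedef */

/* Cut-down stand-in: only the merge link is modeled. */
typedef struct EquivalenceClass
{
	struct EquivalenceClass *ec_merged; /* set if merged into another class */
} EquivalenceClass;

typedef struct PathKey
{
	EquivalenceClass *pk_eclass;	/* the value being sorted on */
	Oid			pk_opfamily;		/* btree opfamily defining the ordering */
	int			pk_strategy;		/* sort direction, as a strategy number */
	bool		pk_nulls_first;		/* do NULLs sort before normal values? */
} PathKey;

/* Follow merge links to the class that absorbed this one. */
static EquivalenceClass *
topmost_eclass(EquivalenceClass *ec)
{
	while (ec->ec_merged != NULL)
		ec = ec->ec_merged;
	return ec;
}

/*
 * Two PathKeys describe the same sort key if they reference the same
 * (topmost) EquivalenceClass and agree on opfamily, direction, and
 * NULLs placement.
 */
static bool
pathkey_equal(const PathKey *a, const PathKey *b)
{
	if (topmost_eclass(a->pk_eclass) != topmost_eclass(b->pk_eclass))
		return false;
	return a->pk_opfamily == b->pk_opfamily &&
		a->pk_strategy == b->pk_strategy &&
		a->pk_nulls_first == b->pk_nulls_first;
}
```

Because the comparison goes through the class rather than a single expression, a key built for one member of a class matches a key built for any other member, as long as the ordering details agree.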
In single/base relation RelOptInfo's, the Paths represent various ways In single/base relation RelOptInfo's, the Paths represent various ways
of scanning the relation and the resulting ordering of the tuples. of scanning the relation and the resulting ordering of the tuples.
Sequential scan Paths have NIL pathkeys, indicating no known ordering. Sequential scan Paths have NIL pathkeys, indicating no known ordering.
Index scans have Path.pathkeys that represent the chosen index's ordering, Index scans have Path.pathkeys that represent the chosen index's ordering,
if any. A single-key index would create a pathkey with a single sublist, if any. A single-key index would create a single-PathKey list, while a
e.g. ( (tab1.indexkey1/sortop1) ). A multi-key index generates a sublist multi-column index generates a list with one element per index column.
per key, e.g. ( (tab1.indexkey1/sortop1) (tab1.indexkey2/sortop2) ) which (Actually, since an index can be scanned either forward or backward, there
shows major sort by indexkey1 (ordering by sortop1) and minor sort by are two possible sort orders and two possible PathKey lists it can
indexkey2 with sortop2. generate.)
Note that a multi-pass indexscan (OR clause scan) has NIL pathkeys since Note that a bitmap scan or multi-pass indexscan (OR clause scan) has NIL
we can say nothing about the overall order of its result. Also, an pathkeys since we can say nothing about the overall order of its result.
indexscan on an unordered type of index generates NIL pathkeys. However, Also, an indexscan on an unordered type of index generates NIL pathkeys.
we can always create a pathkey by doing an explicit sort. The pathkeys However, we can always create a pathkey by doing an explicit sort. The
for a Sort plan's output just represent the sort key fields and the pathkeys for a Sort plan's output just represent the sort key fields and
ordering operators used. the ordering operators used.
Things get more interesting when we consider joins. Suppose we do a Things get more interesting when we consider joins. Suppose we do a
mergejoin between A and B using the mergeclause A.X = B.Y. The output mergejoin between A and B using the mergeclause A.X = B.Y. The output
of the mergejoin is sorted by X --- but it is also sorted by Y. We of the mergejoin is sorted by X --- but it is also sorted by Y. Again,
represent this fact by listing both keys in a single pathkey sublist: this can be represented by a PathKey referencing an EquivalenceClass
( (A.X/xsortop B.Y/ysortop) ). This pathkey asserts that the major containing both X and Y.
sort order of the Path can be taken to be *either* A.X or B.Y.
They are equal, so they are both primary sort keys. By doing this,
we allow future joins to use either var as a pre-sorted key, so upper
Mergejoins may be able to avoid having to re-sort the Path. This is
why pathkeys is a List of Lists.
We keep a sortop associated with each PathKeyItem because cross-data-type With a little further thought, it becomes apparent that nestloop joins
mergejoins are possible; for example int4 = int8 is mergejoinable. can also produce sorted output. For example, if we do a nestloop join
In this case we need to remember that the left var is ordered by int4lt between outer relation A and inner relation B, then any pathkeys relevant
while the right var is ordered by int8lt. So the different members of to A are still valid for the join result: we have not altered the order of
each sublist could have different sortops. the tuples from A. Even more interesting, if there was an equivalence clause
A.X=B.Y, and A.X was a pathkey for the outer relation A, then we can assert
Note that while the order of the top list is meaningful (primary vs. that B.Y is a pathkey for the join result; X was ordered before and still
secondary sort key), the order of each sublist is arbitrary. Each sublist is, and the joined values of Y are equal to the joined values of X, so Y
should be regarded as a set of equivalent keys, with no significance
to the list order.
With a little further thought, it becomes apparent that pathkeys for
joins need not only come from mergejoins. For example, if we do a
nestloop join between outer relation A and inner relation B, then any
pathkeys relevant to A are still valid for the join result: we have
not altered the order of the tuples from A. Even more interesting,
if there was a mergeclause (more formally, an "equijoin clause") A.X=B.Y,
and A.X was a pathkey for the outer relation A, then we can assert that
B.Y is a pathkey for the join result; X was ordered before and still is,
and the joined values of Y are equal to the joined values of X, so Y
must now be ordered too. This is true even though we used neither an must now be ordered too. This is true even though we used neither an
explicit sort nor a mergejoin on Y. explicit sort nor a mergejoin on Y. (Note: hash joins cannot be counted
on to preserve the order of their outer relation, because the executor
might decide to "batch" the join, so we always set pathkeys to NIL for
a hashjoin path.) Exception: a RIGHT or FULL join doesn't preserve the
ordering of its outer relation, because it might insert nulls at random
points in the ordering.
More generally, whenever we have an equijoin clause A.X = B.Y and a In general, we can justify using EquivalenceClasses as the basis for
pathkey A.X, we can add B.Y to that pathkey if B is part of the joined pathkeys because, whenever we scan a relation containing multiple
relation the pathkey is for, *no matter how we formed the join*. It works EquivalenceClass members or join two relations each containing
as long as the clause has been applied at some point while forming the EquivalenceClass members, we apply restriction or join clauses derived from
join relation. (In the current implementation, we always apply qual the EquivalenceClass. This guarantees that any two values listed in the
clauses as soon as possible, ie, as far down in the plan tree as possible. EquivalenceClass are in fact equal in all tuples emitted by the scan or
So we can treat the pathkeys as equivalent everywhere. The exception is join, and therefore that if the tuples are sorted by one of the values,
when the relations A and B are joined inside the nullable side of an they can be considered sorted by any other as well. It does not matter
OUTER JOIN and the equijoin clause comes from above the OUTER JOIN. In this whether the test clause is used as a mergeclause, or merely enforced
case we cannot apply the qual as soon as A and B are joined, so we do not after-the-fact as a qpqual filter.
consider the pathkeys to be equivalent. This could be improved if we wanted
to go to the trouble of making pathkey equivalence be context-dependent,
but that seems much more complex than it's worth.)
In short, then: when producing the pathkeys for a merge or nestloop join, Note that there is no particular difficulty in labeling a path's sort
we can keep all of the keys of the outer path, since the ordering of the order with a PathKey referencing an EquivalenceClass that contains
outer path will be preserved in the result. Furthermore, we can add to variables not yet joined into the path's output. We can simply ignore
each pathkey sublist any inner vars that are equijoined to any of the such entries as not being relevant (yet). This makes it possible to
outer vars in the sublist; this works regardless of whether we are use the same EquivalenceClasses throughout the join planning process.
implementing the join using that equijoin clause as a mergeclause, In fact, by being careful not to generate multiple identical PathKey
or merely enforcing the clause after-the-fact as a qpqual filter. objects, we can reduce comparison of EquivalenceClasses and PathKeys
to simple pointer comparison, which is a huge savings because add_path
Although Hashjoins also work only with equijoin operators, it is *not* has to make a large number of PathKey comparisons in deciding whether
safe to consider the output of a Hashjoin to be sorted in any particular competing Paths are equivalently sorted.
order --- not even the outer path's order. This is true because the
executor might have to split the join into multiple batches. Therefore
a Hashjoin is always given NIL pathkeys. (Also, we need to use only
mergejoinable operators when deducing which inner vars are now sorted,
because a mergejoin operator tells us which left- and right-datatype
sortops can be considered equivalent, whereas a hashjoin operator
doesn't imply anything about sort order.)
Pathkeys are also useful to represent an ordering that we wish to achieve, Pathkeys are also useful to represent an ordering that we wish to achieve,
since they are easily compared to the pathkeys of a potential candidate since they are easily compared to the pathkeys of a potential candidate
path. So, SortClause lists are turned into pathkeys lists for use inside path. So, SortClause lists are turned into pathkeys lists for use inside
the optimizer. the optimizer.
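
The idea of never building two identical PathKey objects, so that comparison reduces to pointer equality, can be sketched as follows. This is a minimal standalone model under stated assumptions: the CanonPathKeys table, its fixed capacity, the pk_descending flag, and both function names are hypothetical and chosen for the example, not taken from the planner source.

```c
#include <assert.h>
#include <stdbool.h>

typedef unsigned int Oid;       /* stand-in for the backend's Oid type */

/* Only the identity (address) of an EC matters for this sketch. */
typedef struct EquivalenceClass
{
    int         ec_id;
} EquivalenceClass;

typedef struct PathKey
{
    const EquivalenceClass *pk_eclass;
    Oid         pk_opfamily;
    bool        pk_descending;  /* hypothetical direction flag */
    bool        pk_nulls_first;
} PathKey;

#define MAX_CANON_PATHKEYS 64

/* Hypothetical per-query table holding the one copy of each PathKey. */
typedef struct CanonPathKeys
{
    PathKey     keys[MAX_CANON_PATHKEYS];
    int         nkeys;
} CanonPathKeys;

/*
 * Return the unique PathKey for this combination of fields, creating it
 * on first request.  Two calls with equal arguments yield the same pointer.
 */
static const PathKey *
get_canonical_pathkey(CanonPathKeys *canon, const EquivalenceClass *ec,
                      Oid opfamily, bool descending, bool nulls_first)
{
    PathKey    *pk;

    for (int i = 0; i < canon->nkeys; i++)
    {
        pk = &canon->keys[i];
        if (pk->pk_eclass == ec &&
            pk->pk_opfamily == opfamily &&
            pk->pk_descending == descending &&
            pk->pk_nulls_first == nulls_first)
            return pk;
    }

    assert(canon->nkeys < MAX_CANON_PATHKEYS);
    pk = &canon->keys[canon->nkeys++];
    pk->pk_eclass = ec;
    pk->pk_opfamily = opfamily;
    pk->pk_descending = descending;
    pk->pk_nulls_first = nulls_first;
    return pk;
}

/* Two canonical pathkey lists are equal iff they hold the same pointers. */
static bool
canonical_pathkeys_equal(const PathKey *const *a, int na,
                         const PathKey *const *b, int nb)
{
    if (na != nb)
        return false;
    for (int i = 0; i < na; i++)
    {
        if (a[i] != b[i])
            return false;
    }
    return true;
}
```

The lookup cost is paid once when a PathKey is created; every later comparison touches only pointers, which matters when many candidate paths have to be compared.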
OK, now for how it *really* works: Because we have to generate pathkeys lists from the sort clauses before
we've finished EquivalenceClass merging, we cannot use the pointer-equality
We did implement pathkeys just as described above, and found that the method of comparing PathKeys in the earliest stages of the planning
planner spent a huge amount of time comparing pathkeys, because the process. Instead, we generate "non canonical" PathKeys that reference
representation of pathkeys as unordered lists made it expensive to decide single-element EquivalenceClasses that might get merged later. After we
whether two were equal or not. So, we've modified the representation complete EquivalenceClass merging, we replace these with "canonical"
as described next. PathKeys that reference only fully-merged classes, and after that we make
sure we don't generate more than one copy of each "canonical" PathKey.
If we scan the WHERE clause for equijoin clauses (mergejoinable clauses) Then it is safe to use pointer comparison on canonical PathKeys.
during planner startup, we can construct lists of equivalent pathkey items
for the query. There could be more than two items per equivalence set;
for example, WHERE A.X = B.Y AND B.Y = C.Z AND D.R = E.S creates the
equivalence sets { A.X B.Y C.Z } and { D.R E.S } (plus associated sortops).
Any pathkey item that belongs to an equivalence set implies that all the
other items in its set apply to the relation too, or at least all the ones
that are for fields present in the relation. (Some of the items in the
set might be for as-yet-unjoined relations.) Furthermore, any multi-item
pathkey sublist that appears at any stage of planning the query *must* be
a subset of one or another of these equivalence sets; there's no way we'd
have put two items in the same pathkey sublist unless they were equijoined
in WHERE.
Now suppose that we allow a pathkey sublist to contain pathkey items for
vars that are not yet part of the pathkey's relation. This introduces
no logical difficulty, because such items can easily be seen to be
irrelevant; we just mandate that they be ignored. But having allowed
this, we can declare (by fiat) that any multiple-item pathkey sublist
must be "equal()" to the appropriate equivalence set. In effect,
whenever we make a pathkey sublist that mentions any var appearing in an
equivalence set, we instantly add all the other vars equivalenced to it,
whether they appear yet in the pathkey's relation or not. And we also
mandate that the pathkey sublist appear in the same order as the
equivalence set it comes from.
In fact, we can go even further, and say that the canonical representation
of a pathkey sublist is a pointer directly to the relevant equivalence set,
which is kept in a list of pathkey equivalence sets for the query. Then
pathkey sublist comparison reduces to pointer-equality checking! To do this
we also have to add single-element pathkey sublists to the query's list of
equivalence sets, but that's a small price to pay.
By the way, it's OK and even useful for us to build equivalence sets
that mention multiple vars from the same relation. For example, if
we have WHERE A.X = A.Y and we are scanning A using an index on X,
we can legitimately conclude that the path is sorted by Y as well;
and this could be handy if Y is the variable used in other join clauses
or ORDER BY. So, any WHERE clause with a mergejoinable operator can
contribute to an equivalence set, even if it's not a join clause.
As sketched so far, equijoin operators allow us to conclude that
A.X = B.Y and B.Y = C.Z together imply A.X = C.Z, even when different
datatypes are involved. What is not immediately obvious is that to use
the "canonical pathkey" representation, we *must* make this deduction.
An example (from a real bug in Postgres 7.0) is a mergejoin for a query
like
SELECT * FROM t1, t2 WHERE t1.f2 = t2.f3 AND t1.f1 = t2.f3;
The canonical-pathkey mechanism is able to deduce that t1.f1 = t1.f2
(ie, both appear in the same canonical pathkey set). If we sort t1
and then apply a mergejoin, we *must* filter the t1 tuples using the
implied qualification f1 = f2, because otherwise the output of the sort
will be ordered by f1 or f2 (whichever we sort on) but not both. The
merge will then fail since (depending on which qual clause it applies
first) it's expecting either ORDER BY f1,f2 or ORDER BY f2,f1, but the
actual output of the sort has neither of these orderings. The best fix
for this is to generate all the implied equality constraints for each
equijoin set and add these clauses to the query's qualification list.
In other words, we *explicitly* deduce f1 = f2 and add this to the WHERE
clause. The constraint will be applied as a qpqual to the output of the
scan on t1, resulting in sort output that is indeed ordered by both vars.
This approach provides more information to the selectivity estimation
code than it would otherwise have, and reduces the number of tuples
processed in join stages, so it's a win to make these deductions even
if we weren't forced to.
When we generate implied equality constraints, we may find ourselves
adding redundant clauses to specific relations. For example, consider
SELECT * FROM t1, t2, t3 WHERE t1.a = t2.b AND t2.b = t3.c;
We will generate the implied clause t1.a = t3.c and add it to the tree.
This is good since it allows us to consider joining t1 and t3 directly,
which we otherwise wouldn't do. But when we reach the stage of joining
all three relations, we will have redundant join clauses --- eg, if we
join t1 and t2 first, then the path that joins (t1 t2) to t3 will have
both t2.b = t3.c and t1.a = t3.c as restriction clauses. This is bad;
not only is evaluation of the extra clause useless work at runtime,
but the selectivity estimator routines will underestimate the number
of tuples produced since they won't know that the two clauses are
perfectly redundant. We fix this by detecting and removing redundant
clauses as the restriction clause list is built for each join. (We
can't do it sooner, since which clauses are redundant will vary depending
on the join order.)
Yet another implication of all this is that mergejoinable operators
must form closed equivalence sets. For example, if "int2 = int4"
and "int4 = int8" are both marked mergejoinable, then there had better
be a mergejoinable "int2 = int8" operator as well. Otherwise, when
we're given WHERE int2var = int4var AND int4var = int8var, we'll fail
while trying to create a representation of the implied clause
int2var = int8var.
An additional refinement we can make is to insist that canonical pathkey An additional refinement we can make is to insist that canonical pathkey
lists (sort orderings) do not mention the same pathkey set more than once. lists (sort orderings) do not mention the same EquivalenceClass more than
For example, a pathkey list ((A) (B) (A)) is redundant --- the second once. For example, in all these cases the second sort column is redundant,
occurrence of (A) does not change the ordering, since the data must already because it cannot distinguish values that are the same according to the
be sorted by A. Although a user probably wouldn't write ORDER BY A,B,A first sort column:
directly, such redundancies are more probable once equijoin equivalences SELECT ... ORDER BY x, x
have been considered. Also, the system is likely to generate redundant SELECT ... ORDER BY x, x DESC
pathkey lists when computing the sort ordering needed for a mergejoin. By SELECT ... WHERE x = y ORDER BY x, y
eliminating the redundancy, we save time and improve planning, since the Although a user probably wouldn't write "ORDER BY x,x" directly, such
planner will more easily recognize equivalent orderings as being equivalent. redundancies are more probable once equivalence classes have been
considered. Also, the system may generate redundant pathkey lists when
computing the sort ordering needed for a mergejoin. By eliminating the
redundancy, we save time and improve planning, since the planner will more
easily recognize equivalent orderings as being equivalent.
Another interesting property is that if the underlying EquivalenceClass
contains a constant and is not below an outer join, then the pathkey is
completely redundant and need not be sorted by at all! Every row must
contain the same constant value, so there's no need to sort. (If the EC is
below an outer join, we still have to sort, since some of the rows might
have gone to null and others not. In this case we must be careful to pick
a non-const member to sort by. The assumption that all the non-const
members go to null at the same plan level is critical here, else they might
not produce the same sort order.) This might seem pointless because users
are unlikely to write "... WHERE x = 42 ORDER BY x", but it allows us to
recognize when particular index columns are irrelevant to the sort order:
if we have "... WHERE x = 42 ORDER BY y", scanning an index on (x,y)
produces correctly ordered data without a sort step. We used to have very
ugly ad-hoc code to recognize that in limited contexts, but discarding
constant ECs from pathkeys makes it happen cleanly and automatically.
You might object that a below-outer-join EquivalenceClass doesn't always
represent the same values at every level of the join tree, and so using
it to uniquely identify a sort order is dubious. This is true, but we
can avoid dealing with the fact explicitly because we always consider that
an outer join destroys any ordering of its nullable inputs. Thus, even
if a path was sorted by {a.x} below an outer join, we'll re-sort if that
sort ordering was important; and so using the same PathKey for both sort
orderings doesn't create any real problem.
Though Bob Devine <bob.devine@worldnet.att.net> was not involved in the Though Bob Devine <bob.devine@worldnet.att.net> was not involved in the
coding of our optimizer, he is available to field questions about coding of our optimizer, he is available to field questions about


@ -4,7 +4,7 @@
# Makefile for optimizer/path # Makefile for optimizer/path
# #
# IDENTIFICATION # IDENTIFICATION
# $PostgreSQL: pgsql/src/backend/optimizer/path/Makefile,v 1.17 2007/01/20 17:16:11 petere Exp $ # $PostgreSQL: pgsql/src/backend/optimizer/path/Makefile,v 1.18 2007/01/20 20:45:38 tgl Exp $
# #
#------------------------------------------------------------------------- #-------------------------------------------------------------------------
@ -12,7 +12,7 @@ subdir = src/backend/optimizer/path
top_builddir = ../../../.. top_builddir = ../../../..
include $(top_builddir)/src/Makefile.global include $(top_builddir)/src/Makefile.global
OBJS = allpaths.o clausesel.o costsize.o indxpath.o \ OBJS = allpaths.o clausesel.o costsize.o equivclass.o indxpath.o \
joinpath.o joinrels.o orindxpath.o pathkeys.o tidpath.o joinpath.o joinrels.o orindxpath.o pathkeys.o tidpath.o
all: SUBSYS.o all: SUBSYS.o


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.156 2007/01/09 02:14:12 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.157 2007/01/20 20:45:38 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -325,6 +325,16 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
adjust_appendrel_attrs((Node *) rel->joininfo, adjust_appendrel_attrs((Node *) rel->joininfo,
appinfo); appinfo);
/*
* We have to make child entries in the EquivalenceClass data
* structures as well.
*/
if (rel->has_eclass_joins)
{
add_child_rel_equivalences(root, appinfo, rel, childrel);
childrel->has_eclass_joins = true;
}
/* /*
* Copy the parent's attr_needed data as well, with appropriate * Copy the parent's attr_needed data as well, with appropriate
* adjustment of relids and attribute numbers. * adjustment of relids and attribute numbers.


@ -54,7 +54,7 @@
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.174 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.175 2007/01/20 20:45:38 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -1258,8 +1258,6 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
Path *outer_path = path->jpath.outerjoinpath; Path *outer_path = path->jpath.outerjoinpath;
Path *inner_path = path->jpath.innerjoinpath; Path *inner_path = path->jpath.innerjoinpath;
List *mergeclauses = path->path_mergeclauses; List *mergeclauses = path->path_mergeclauses;
Oid *mergeFamilies = path->path_mergeFamilies;
int *mergeStrategies = path->path_mergeStrategies;
List *outersortkeys = path->outersortkeys; List *outersortkeys = path->outersortkeys;
List *innersortkeys = path->innersortkeys; List *innersortkeys = path->innersortkeys;
Cost startup_cost = 0; Cost startup_cost = 0;
@ -1268,7 +1266,6 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
Selectivity merge_selec; Selectivity merge_selec;
QualCost merge_qual_cost; QualCost merge_qual_cost;
QualCost qp_qual_cost; QualCost qp_qual_cost;
RestrictInfo *firstclause;
double outer_path_rows = PATH_ROWS(outer_path); double outer_path_rows = PATH_ROWS(outer_path);
double inner_path_rows = PATH_ROWS(inner_path); double inner_path_rows = PATH_ROWS(inner_path);
double outer_rows, double outer_rows,
@ -1347,32 +1344,47 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
* inputs that will actually need to be scanned. We use only the first * inputs that will actually need to be scanned. We use only the first
* (most significant) merge clause for this purpose. * (most significant) merge clause for this purpose.
* *
* Since this calculation is somewhat expensive, and will be the same for * XXX mergejoinscansel is a bit expensive, can we cache its results?
* all mergejoin paths associated with the merge clause, we cache the
* results in the RestrictInfo node. XXX that won't work anymore once
* we support multiple possible orderings!
*/ */
if (mergeclauses && path->jpath.jointype != JOIN_FULL) if (mergeclauses && path->jpath.jointype != JOIN_FULL)
{ {
firstclause = (RestrictInfo *) linitial(mergeclauses); RestrictInfo *firstclause = (RestrictInfo *) linitial(mergeclauses);
if (firstclause->left_mergescansel < 0) /* not computed yet? */ List *opathkeys;
mergejoinscansel(root, (Node *) firstclause->clause, List *ipathkeys;
mergeFamilies[0], PathKey *opathkey;
mergeStrategies[0], PathKey *ipathkey;
&firstclause->left_mergescansel, Selectivity leftscansel,
&firstclause->right_mergescansel); rightscansel;
if (bms_is_subset(firstclause->left_relids, outer_path->parent->relids)) /* Get the input pathkeys to determine the sort-order details */
opathkeys = outersortkeys ? outersortkeys : outer_path->pathkeys;
ipathkeys = innersortkeys ? innersortkeys : inner_path->pathkeys;
Assert(opathkeys);
Assert(ipathkeys);
opathkey = (PathKey *) linitial(opathkeys);
ipathkey = (PathKey *) linitial(ipathkeys);
/* debugging check */
if (opathkey->pk_opfamily != ipathkey->pk_opfamily ||
opathkey->pk_strategy != ipathkey->pk_strategy ||
opathkey->pk_nulls_first != ipathkey->pk_nulls_first)
elog(ERROR, "left and right pathkeys do not match in mergejoin");
mergejoinscansel(root, (Node *) firstclause->clause,
opathkey->pk_opfamily, opathkey->pk_strategy,
&leftscansel, &rightscansel);
if (bms_is_subset(firstclause->left_relids,
outer_path->parent->relids))
{ {
/* left side of clause is outer */ /* left side of clause is outer */
outerscansel = firstclause->left_mergescansel; outerscansel = leftscansel;
innerscansel = firstclause->right_mergescansel; innerscansel = rightscansel;
} }
else else
{ {
/* left side of clause is inner */ /* left side of clause is inner */
outerscansel = firstclause->right_mergescansel; outerscansel = rightscansel;
innerscansel = firstclause->left_mergescansel; innerscansel = leftscansel;
} }
if (path->jpath.jointype == JOIN_LEFT) if (path->jpath.jointype == JOIN_LEFT)
outerscansel = 1.0; outerscansel = 1.0;

File diff suppressed because it is too large


@ -9,7 +9,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.215 2007/01/09 02:14:12 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.216 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -32,7 +32,6 @@
#include "optimizer/var.h" #include "optimizer/var.h"
#include "utils/builtins.h" #include "utils/builtins.h"
#include "utils/lsyscache.h" #include "utils/lsyscache.h"
#include "utils/memutils.h"
#include "utils/pg_locale.h" #include "utils/pg_locale.h"
#include "utils/selfuncs.h" #include "utils/selfuncs.h"
@ -72,21 +71,11 @@ static bool match_rowcompare_to_indexcol(IndexOptInfo *index,
Oid opfamily, Oid opfamily,
RowCompareExpr *clause, RowCompareExpr *clause,
Relids outer_relids); Relids outer_relids);
static Relids indexable_outerrelids(RelOptInfo *rel); static Relids indexable_outerrelids(PlannerInfo *root, RelOptInfo *rel);
static bool matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel, static bool matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel,
Relids outer_relids); Relids outer_relids);
static List *find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel, static List *find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
Relids outer_relids, bool isouterjoin); Relids outer_relids, bool isouterjoin);
static ScanDirection match_variant_ordering(PlannerInfo *root,
IndexOptInfo *index,
List *restrictclauses);
static List *identify_ignorable_ordering_cols(PlannerInfo *root,
IndexOptInfo *index,
List *restrictclauses);
static bool match_index_to_query_keys(PlannerInfo *root,
IndexOptInfo *index,
ScanDirection indexscandir,
List *ignorables);
static bool match_boolean_index_clause(Node *clause, int indexcol, static bool match_boolean_index_clause(Node *clause, int indexcol,
IndexOptInfo *index); IndexOptInfo *index);
static bool match_special_index_operator(Expr *clause, Oid opfamily, static bool match_special_index_operator(Expr *clause, Oid opfamily,
@ -157,7 +146,7 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
* participate in such join clauses. We'll use this set later to * participate in such join clauses. We'll use this set later to
* recognize outer rels that are equivalent for joining purposes. * recognize outer rels that are equivalent for joining purposes.
*/ */
rel->index_outer_relids = indexable_outerrelids(rel); rel->index_outer_relids = indexable_outerrelids(root, rel);
/* /*
* Find all the index paths that are directly usable for this relation * Find all the index paths that are directly usable for this relation
@ -351,8 +340,7 @@ find_usable_indexes(PlannerInfo *root, RelOptInfo *rel,
if (index_is_ordered && istoplevel && outer_rel == NULL) if (index_is_ordered && istoplevel && outer_rel == NULL)
{ {
index_pathkeys = build_index_pathkeys(root, index, index_pathkeys = build_index_pathkeys(root, index,
ForwardScanDirection, ForwardScanDirection);
true);
useful_pathkeys = truncate_useless_pathkeys(root, rel, useful_pathkeys = truncate_useless_pathkeys(root, rel,
index_pathkeys); index_pathkeys);
} }
@ -378,23 +366,21 @@ find_usable_indexes(PlannerInfo *root, RelOptInfo *rel,
} }
/* /*
* 4. If the index is ordered, and there is a requested query ordering * 4. If the index is ordered, a backwards scan might be
* that we failed to match, consider variant ways of achieving the * interesting. Again, this is only interesting at top level.
* ordering. Again, this is only interesting at top level.
*/ */
if (index_is_ordered && istoplevel && outer_rel == NULL && if (index_is_ordered && istoplevel && outer_rel == NULL)
root->query_pathkeys != NIL &&
pathkeys_useful_for_ordering(root, useful_pathkeys) == 0)
{ {
ScanDirection scandir; index_pathkeys = build_index_pathkeys(root, index,
BackwardScanDirection);
scandir = match_variant_ordering(root, index, restrictclauses); useful_pathkeys = truncate_useless_pathkeys(root, rel,
if (!ScanDirectionIsNoMovement(scandir)) index_pathkeys);
if (useful_pathkeys != NIL)
{ {
ipath = create_index_path(root, index, ipath = create_index_path(root, index,
restrictclauses, restrictclauses,
root->query_pathkeys, useful_pathkeys,
scandir, BackwardScanDirection,
outer_rel); outer_rel);
result = lappend(result, ipath); result = lappend(result, ipath);
} }
@ -1207,19 +1193,6 @@ check_partial_indexes(PlannerInfo *root, RelOptInfo *rel)
List *restrictinfo_list = rel->baserestrictinfo; List *restrictinfo_list = rel->baserestrictinfo;
ListCell *ilist; ListCell *ilist;
/*
* Note: if Postgres tried to optimize queries by forming equivalence
* classes over equi-joined attributes (i.e., if it recognized that a
* qualification such as "where a.b=c.d and a.b=5" could make use of an
* index on c.d), then we could use that equivalence class info here with
* joininfo lists to do more complete tests for the usability of a partial
* index. For now, the test only uses restriction clauses (those in
* baserestrictinfo). --Nels, Dec '92
*
* XXX as of 7.1, equivalence class info *is* available. Consider
* improving this code as foreseen by Nels.
*/
foreach(ilist, rel->indexlist) foreach(ilist, rel->indexlist)
{ {
IndexOptInfo *index = (IndexOptInfo *) lfirst(ilist); IndexOptInfo *index = (IndexOptInfo *) lfirst(ilist);
@ -1242,18 +1215,19 @@ check_partial_indexes(PlannerInfo *root, RelOptInfo *rel)
* for the specified table. Returns a set of relids. * for the specified table. Returns a set of relids.
*/ */
static Relids static Relids
indexable_outerrelids(RelOptInfo *rel) indexable_outerrelids(PlannerInfo *root, RelOptInfo *rel)
{ {
Relids outer_relids = NULL; Relids outer_relids = NULL;
ListCell *l; bool is_child_rel = (rel->reloptkind == RELOPT_OTHER_MEMBER_REL);
ListCell *lc1;
/* /*
* Examine each joinclause in the joininfo list to see if it matches any * Examine each joinclause in the joininfo list to see if it matches any
* key of any index. If so, add the clause's other rels to the result. * key of any index. If so, add the clause's other rels to the result.
*/ */
foreach(l, rel->joininfo) foreach(lc1, rel->joininfo)
{ {
RestrictInfo *joininfo = (RestrictInfo *) lfirst(l); RestrictInfo *joininfo = (RestrictInfo *) lfirst(lc1);
Relids other_rels; Relids other_rels;
other_rels = bms_difference(joininfo->required_relids, rel->relids); other_rels = bms_difference(joininfo->required_relids, rel->relids);
@ -1263,6 +1237,71 @@ indexable_outerrelids(RelOptInfo *rel)
bms_free(other_rels); bms_free(other_rels);
} }
/*
* We also have to look through the query's EquivalenceClasses to see
* if any of them could generate indexable join conditions for this rel.
*/
if (rel->has_eclass_joins)
{
foreach(lc1, root->eq_classes)
{
EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1);
Relids other_rels = NULL;
bool found_index = false;
ListCell *lc2;
/*
* Won't generate joinclauses if const or single-member (the latter
* test covers the volatile case too)
*/
if (cur_ec->ec_has_const || list_length(cur_ec->ec_members) <= 1)
continue;
/*
* Note we don't test ec_broken; if we did, we'd need a separate
* code path to look through ec_sources. Checking the members
* anyway is OK as a possibly-overoptimistic heuristic.
*/
/*
* No point in searching if rel not mentioned in eclass (but we
* can't tell that for a child rel).
*/
if (!is_child_rel &&
!bms_is_subset(rel->relids, cur_ec->ec_relids))
continue;
/*
* Scan members, looking for both an index match and join
* candidates
*/
foreach(lc2, cur_ec->ec_members)
{
EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
/* Join candidate? */
if (!cur_em->em_is_child &&
!bms_overlap(cur_em->em_relids, rel->relids))
{
other_rels = bms_add_members(other_rels,
cur_em->em_relids);
continue;
}
/* Check for index match (only need one) */
if (!found_index &&
bms_equal(cur_em->em_relids, rel->relids) &&
eclass_matches_any_index(cur_ec, cur_em, rel))
found_index = true;
}
if (found_index)
outer_relids = bms_join(outer_relids, other_rels);
else
bms_free(other_rels);
}
}
return outer_relids; return outer_relids;
} }
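The EquivalenceClass scan added to indexable_outerrelids() above can be modelled in isolation. The following is a minimal sketch under stated assumptions, not PostgreSQL code: relid sets are plain 32-bit masks rather than Bitmapsets, the names `RelMask`, `ToyMember`, `ToyClass` and `toy_outer_relids` are hypothetical, and the index match is reduced to a precomputed flag. Child-rel handling (is_child_rel, em_is_child) and the ec_relids shortcut are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a relid set is a bitmask, one bit per base rel. */
typedef uint32_t RelMask;

typedef struct ToyMember
{
	RelMask		relids;			/* rels referenced by this member */
	bool		matches_index;	/* assumed result of an index-column match */
} ToyMember;

typedef struct ToyClass
{
	bool		has_const;		/* class contains a constant */
	const ToyMember *members;
	size_t		nmembers;
} ToyClass;

/*
 * Collect the other rels that could supply an indexable join condition
 * for 'rel', looking only at the toy equivalence classes.
 */
RelMask
toy_outer_relids(RelMask rel, const ToyClass *classes, size_t nclasses)
{
	RelMask		outer = 0;

	for (size_t i = 0; i < nclasses; i++)
	{
		const ToyClass *ec = &classes[i];
		RelMask		other = 0;
		bool		found_index = false;

		/* Const or single-member classes generate no join clauses */
		if (ec->has_const || ec->nmembers <= 1)
			continue;

		for (size_t j = 0; j < ec->nmembers; j++)
		{
			const ToyMember *em = &ec->members[j];

			/* Join candidate: member does not mention rel at all */
			if ((em->relids & rel) == 0)
			{
				other |= em->relids;
				continue;
			}
			/* Index match: member belongs to exactly this rel */
			if (!found_index && em->relids == rel && em->matches_index)
				found_index = true;
		}

		/* Candidates only count if some member of rel is indexable */
		if (found_index)
			outer |= other;
	}
	return outer;
}
```

With members on rels 0x1 (indexable), 0x2 and 0x4 in one class, calling `toy_outer_relids(0x1, ...)` yields 0x6; if the 0x1 member is not indexable, the class contributes nothing.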
@ -1339,6 +1378,42 @@ matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel, Relids outer_relids)
return false; return false;
} }
/*
* eclass_matches_any_index
* Workhorse for indexable_outerrelids: see if an EquivalenceClass member
* can be matched to any index column of the given rel.
*
* This is also exported for use by find_eclass_clauses_for_index_join.
*/
bool
eclass_matches_any_index(EquivalenceClass *ec, EquivalenceMember *em,
RelOptInfo *rel)
{
ListCell *l;
foreach(l, rel->indexlist)
{
IndexOptInfo *index = (IndexOptInfo *) lfirst(l);
int indexcol = 0;
Oid *families = index->opfamily;
do
{
Oid curFamily = families[0];
if (list_member_oid(ec->ec_opfamilies, curFamily) &&
match_index_to_operand((Node *) em->em_expr, indexcol, index))
return true;
indexcol++;
families++;
} while (!DoneMatchingIndexKeys(families));
}
return false;
}
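The column loop in eclass_matches_any_index() walks a zero-terminated opfamily array in step with the index column number. A rough standalone sketch of that pattern follows. Everything in it is an assumption for illustration: `ToyOid`, `toy_member_oid` and `toy_match_index_column` are hypothetical names, the expression match is reduced to comparing integer column ids, and the function returns the matching column number (or -1) instead of a bool so the result is easier to inspect. As with the do/while above, at least one index column is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int ToyOid;	/* hypothetical stand-in for Oid; 0 = invalid */

/* Return true if 'id' appears among the n entries of 'list'. */
bool
toy_member_oid(const ToyOid *list, size_t n, ToyOid id)
{
	for (size_t i = 0; i < n; i++)
	{
		if (list[i] == id)
			return true;
	}
	return false;
}

/*
 * index_families: per-column opfamily ids, terminated by a 0 entry.
 * index_cols:     per-column ids standing in for the indexed expressions.
 * Returns the first index column whose opfamily is one of the class's
 * opfamilies and whose column id equals member_col, or -1 if none.
 */
int
toy_match_index_column(const ToyOid *index_families, const int *index_cols,
					   const ToyOid *ec_families, size_t n_ec_families,
					   int member_col)
{
	int			indexcol = 0;

	do
	{
		if (toy_member_oid(ec_families, n_ec_families,
						   index_families[indexcol]) &&
			index_cols[indexcol] == member_col)
			return indexcol;
		indexcol++;
	} while (index_families[indexcol] != 0);

	return -1;
}
```

Both conditions must hold for the same column: a compatible opfamily on a different column, or the right column under an unrelated opfamily, does not count as a match.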
/* /*
* best_inner_indexscan * best_inner_indexscan
* Finds the best available inner indexscan for a nestloop join * Finds the best available inner indexscan for a nestloop join
@ -1393,12 +1468,12 @@ best_inner_indexscan(PlannerInfo *root, RelOptInfo *rel,
return NULL; return NULL;
/* /*
* Otherwise, we have to do path selection in the memory context of the * Otherwise, we have to do path selection in the main planning context,
* given rel, so that any created path can be safely attached to the rel's * so that any created path can be safely attached to the rel's cache of
* cache of best inner paths. (This is not currently an issue for normal * best inner paths. (This is not currently an issue for normal planning,
* planning, but it is an issue for GEQO planning.) * but it is an issue for GEQO planning.)
*/ */
oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel)); oldcontext = MemoryContextSwitchTo(root->planner_cxt);
/* /*
* Intersect the given outer relids with index_outer_relids to find the * Intersect the given outer relids with index_outer_relids to find the
@ -1539,7 +1614,12 @@ find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
Relids join_relids; Relids join_relids;
ListCell *l; ListCell *l;
/* Look for joinclauses that are usable with given outer_relids */ /*
* Look for joinclauses that are usable with given outer_relids. Note
* we'll take anything that's applicable to the join whether it has
* anything to do with an index or not; since we're only building a list,
* it's not worth filtering more finely here.
*/
join_relids = bms_union(rel->relids, outer_relids); join_relids = bms_union(rel->relids, outer_relids);
foreach(l, rel->joininfo) foreach(l, rel->joininfo)
@ -1557,276 +1637,27 @@ find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
bms_free(join_relids); bms_free(join_relids);
/* if no join clause was matched then forget it, per comments above */ /*
* Also check to see if any EquivalenceClasses can produce a relevant
* joinclause. Since all such clauses are effectively pushed-down,
* this doesn't apply to outer joins.
*/
if (!isouterjoin && rel->has_eclass_joins)
clause_list = list_concat(clause_list,
find_eclass_clauses_for_index_join(root,
rel,
outer_relids));
/* If no join clause was matched then forget it, per comments above */
if (clause_list == NIL) if (clause_list == NIL)
return NIL; return NIL;
/* /* We can also use any plain restriction clauses for the rel */
* We can also use any plain restriction clauses for the rel. We put
* these at the front of the clause list for the convenience of
* remove_redundant_join_clauses, which can never remove non-join clauses
* and hence won't be able to get rid of a non-join clause if it appears
* after a join clause it is redundant with.
*/
clause_list = list_concat(list_copy(rel->baserestrictinfo), clause_list); clause_list = list_concat(list_copy(rel->baserestrictinfo), clause_list);
/*
* We may now have clauses that are known redundant. Get rid of 'em.
*/
if (list_length(clause_list) > 1)
{
clause_list = remove_redundant_join_clauses(root,
clause_list,
isouterjoin);
}
return clause_list; return clause_list;
} }
/****************************************************************************
* ---- ROUTINES TO HANDLE PATHKEYS ----
****************************************************************************/
/*
* match_variant_ordering
* Try to match an index's ordering to the query's requested ordering
*
* This is used when the index is ordered but a naive comparison fails to
* match its ordering (pathkeys) to root->query_pathkeys. It may be that
* we need to scan the index backwards. Also, a less naive comparison can
* help for both forward and backward indexscans. Columns of the index
* that have an equality restriction clause can be ignored in the match;
* that is, an index on (x,y) can be considered to match the ordering of
* ... WHERE x = 42 ORDER BY y;
*
* Note: it would be possible to similarly ignore useless ORDER BY items;
* that is, an index on just y could be considered to match the ordering of
* ... WHERE x = 42 ORDER BY x, y;
* But proving that this is safe would require finding a btree opfamily
* containing both the = operator and the < or > operator in the ORDER BY
* item. That's significantly more expensive than what we do here, since
* we'd have to look at restriction clauses unrelated to the current index
* and search for opfamilies without any hint from the index. The practical
* use-cases seem to be mostly covered by ignoring index columns, so that's
* all we do for now.
*
* Inputs:
* 'index' is the index of interest.
* 'restrictclauses' is the list of sublists of restriction clauses
* matching the columns of the index (NIL if none)
*
* If able to match the requested query pathkeys, returns either
* ForwardScanDirection or BackwardScanDirection to indicate the proper index
* scan direction. If no match, returns NoMovementScanDirection.
*/
static ScanDirection
match_variant_ordering(PlannerInfo *root,
IndexOptInfo *index,
List *restrictclauses)
{
List *ignorables;
/*
* Forget the whole thing if not a btree index; our check for ignorable
* columns assumes we are dealing with btree opfamilies. (It'd be possible
* to factor out just the try for backwards indexscan, but considering
* that we presently have no orderable indexes except btrees anyway, it's
* hardly worth contorting this code for that case.)
*
* Note: if you remove this, you probably need to put in a check on
* amoptionalkey to prevent possible clauseless scan on an index that
* won't cope.
*/
if (index->relam != BTREE_AM_OID)
return NoMovementScanDirection;
/*
* Figure out which index columns can be optionally ignored because they
* have an equality constraint. This is the same set for either forward
* or backward scan, so we do it just once.
*/
ignorables = identify_ignorable_ordering_cols(root, index,
restrictclauses);
/*
* Try to match to forward scan, then backward scan. However, we can skip
* the forward-scan case if there are no ignorable columns, because
* find_usable_indexes() would have found the match already.
*/
if (ignorables &&
match_index_to_query_keys(root, index, ForwardScanDirection,
ignorables))
return ForwardScanDirection;
if (match_index_to_query_keys(root, index, BackwardScanDirection,
ignorables))
return BackwardScanDirection;
return NoMovementScanDirection;
}
/*
* identify_ignorable_ordering_cols
* Determine which index columns can be ignored for ordering purposes
*
* Returns an integer List of column numbers (1-based) of ignorable
* columns. The ignorable columns are those that have equality constraints
* against pseudoconstants.
*/
static List *
identify_ignorable_ordering_cols(PlannerInfo *root,
IndexOptInfo *index,
List *restrictclauses)
{
List *result = NIL;
int indexcol = 0; /* note this is 0-based */
ListCell *l;
/* restrictclauses is either NIL or has a sublist per column */
foreach(l, restrictclauses)
{
List *sublist = (List *) lfirst(l);
Oid opfamily = index->opfamily[indexcol];
ListCell *l2;
foreach(l2, sublist)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(l2);
OpExpr *clause = (OpExpr *) rinfo->clause;
Oid clause_op;
int op_strategy;
bool varonleft;
bool ispc;
/* First check for boolean-index cases. */
if (IsBooleanOpfamily(opfamily))
{
if (match_boolean_index_clause((Node *) clause, indexcol,
index))
{
/*
* The clause means either col = TRUE or col = FALSE; we
* do not care which, it's an equality constraint either
* way.
*/
result = lappend_int(result, indexcol + 1);
break;
}
}
/* Otherwise, ignore if not a binary opclause */
if (!is_opclause(clause) || list_length(clause->args) != 2)
continue;
/* Determine left/right sides and check the operator */
clause_op = clause->opno;
if (match_index_to_operand(linitial(clause->args), indexcol,
index))
{
/* clause_op is correct */
varonleft = true;
}
else
{
Assert(match_index_to_operand(lsecond(clause->args), indexcol,
index));
/* Must flip operator to get the opfamily member */
clause_op = get_commutator(clause_op);
varonleft = false;
}
if (!OidIsValid(clause_op))
continue; /* ignore non match, per next comment */
op_strategy = get_op_opfamily_strategy(clause_op, opfamily);
/*
* You might expect to see Assert(op_strategy != 0) here, but you
* won't: the clause might contain a special indexable operator
* rather than an ordinary opfamily member. Currently none of the
* special operators are very likely to expand to an equality
* operator; we do not bother to check, but just assume no match.
*/
if (op_strategy != BTEqualStrategyNumber)
continue;
/* Now check that other side is pseudoconstant */
if (varonleft)
ispc = is_pseudo_constant_clause_relids(lsecond(clause->args),
rinfo->right_relids);
else
ispc = is_pseudo_constant_clause_relids(linitial(clause->args),
rinfo->left_relids);
if (ispc)
{
result = lappend_int(result, indexcol + 1);
break;
}
}
indexcol++;
}
return result;
}
/*
* match_index_to_query_keys
* Check a single scan direction for "intelligent" match to query keys
*
* 'index' is the index of interest.
* 'indexscandir' is the scan direction to consider
* 'ignorables' is an integer list of indexes of ignorable index columns
*
* Returns TRUE on successful match (ie, the query_pathkeys can be considered
* to match this index).
*/
static bool
match_index_to_query_keys(PlannerInfo *root,
IndexOptInfo *index,
ScanDirection indexscandir,
List *ignorables)
{
List *index_pathkeys;
ListCell *index_cell;
int index_col;
ListCell *r;
/* Get the pathkeys that exactly describe the index */
index_pathkeys = build_index_pathkeys(root, index, indexscandir, false);
/*
* Can we match to the query's requested pathkeys? The inner loop skips
* over ignorable index columns while trying to match.
*/
index_cell = list_head(index_pathkeys);
index_col = 0;
foreach(r, root->query_pathkeys)
{
List *rsubkey = (List *) lfirst(r);
for (;;)
{
List *isubkey;
if (index_cell == NULL)
return false;
isubkey = (List *) lfirst(index_cell);
index_cell = lnext(index_cell);
index_col++; /* index_col is now 1-based */
/*
* Since we are dealing with canonicalized pathkeys, pointer
* comparison is sufficient to determine a match.
*/
if (rsubkey == isubkey)
break; /* matched current query pathkey */
if (!list_member_int(ignorables, index_col))
return false; /* definite failure to match */
/* otherwise loop around and try to match to next index col */
}
}
return true;
}
/**************************************************************************** /****************************************************************************
* ---- PATH CREATION UTILITIES ---- * ---- PATH CREATION UTILITIES ----


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/joinpath.c,v 1.110 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/path/joinpath.c,v 1.111 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -16,7 +16,6 @@
#include <math.h> #include <math.h>
#include "access/skey.h"
#include "optimizer/cost.h" #include "optimizer/cost.h"
#include "optimizer/pathnode.h" #include "optimizer/pathnode.h"
#include "optimizer/paths.h" #include "optimizer/paths.h"
@ -40,10 +39,6 @@ static List *select_mergejoin_clauses(RelOptInfo *joinrel,
RelOptInfo *innerrel, RelOptInfo *innerrel,
List *restrictlist, List *restrictlist,
JoinType jointype); JoinType jointype);
static void build_mergejoin_strat_arrays(List *mergeclauses,
Oid **mergefamilies,
int **mergestrategies,
bool **mergenullsfirst);
/* /*
@ -205,9 +200,9 @@ sort_inner_and_outer(PlannerInfo *root,
* *
* Actually, it's not quite true that every mergeclause ordering will * Actually, it's not quite true that every mergeclause ordering will
* generate a different path order, because some of the clauses may be * generate a different path order, because some of the clauses may be
* redundant. Therefore, what we do is convert the mergeclause list to a * partially redundant (refer to the same EquivalenceClasses). Therefore,
* list of canonical pathkeys, and then consider different orderings of * what we do is convert the mergeclause list to a list of canonical
* the pathkeys. * pathkeys, and then consider different orderings of the pathkeys.
* *
* Generating a path for *every* permutation of the pathkeys doesn't seem * Generating a path for *every* permutation of the pathkeys doesn't seem
* like a winning strategy; the cost in planning time is too high. For * like a winning strategy; the cost in planning time is too high. For
@ -216,76 +211,59 @@ sort_inner_and_outer(PlannerInfo *root,
* mergejoin without re-sorting against any other possible mergejoin * mergejoin without re-sorting against any other possible mergejoin
* partner path. But if we've not guessed the right ordering of secondary * partner path. But if we've not guessed the right ordering of secondary
* keys, we may end up evaluating clauses as qpquals when they could have * keys, we may end up evaluating clauses as qpquals when they could have
* been done as mergeclauses. We need to figure out a better way. (Two * been done as mergeclauses. (In practice, it's rare that there's more
* possible approaches: look at all the relevant index relations to * than two or three mergeclauses, so expending a huge amount of thought
* suggest plausible sort orders, or make just one output path and somehow * on that is probably not worth it.)
* mark it as having a sort-order that can be rearranged freely.) *
* The pathkey order returned by select_outer_pathkeys_for_merge() has
* some heuristics behind it (see that function), so be sure to try it
* exactly as-is as well as making variants.
*/ */
all_pathkeys = make_pathkeys_for_mergeclauses(root, all_pathkeys = select_outer_pathkeys_for_merge(root,
mergeclause_list, mergeclause_list,
outerrel); joinrel);
foreach(l, all_pathkeys) foreach(l, all_pathkeys)
{ {
List *front_pathkey = (List *) lfirst(l); List *front_pathkey = (List *) lfirst(l);
List *cur_pathkeys;
List *cur_mergeclauses; List *cur_mergeclauses;
Oid *mergefamilies;
int *mergestrategies;
bool *mergenullsfirst;
List *outerkeys; List *outerkeys;
List *innerkeys; List *innerkeys;
List *merge_pathkeys; List *merge_pathkeys;
/* Make a pathkey list with this guy first. */ /* Make a pathkey list with this guy first */
if (l != list_head(all_pathkeys)) if (l != list_head(all_pathkeys))
cur_pathkeys = lcons(front_pathkey, outerkeys = lcons(front_pathkey,
list_delete_ptr(list_copy(all_pathkeys), list_delete_ptr(list_copy(all_pathkeys),
front_pathkey)); front_pathkey));
else else
cur_pathkeys = all_pathkeys; /* no work at first one... */ outerkeys = all_pathkeys; /* no work at first one... */
/* /* Sort the mergeclauses into the corresponding ordering */
* Select mergeclause(s) that match this sort ordering. If we had
* redundant merge clauses then we will get a subset of the original
* clause list. There had better be some match, however...
*/
cur_mergeclauses = find_mergeclauses_for_pathkeys(root, cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
cur_pathkeys, outerkeys,
true,
mergeclause_list); mergeclause_list);
Assert(cur_mergeclauses != NIL);
/* Forget it if can't use all the clauses in right/full join */ /* Should have used them all... */
if (useallclauses && Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
list_length(cur_mergeclauses) != list_length(mergeclause_list))
continue; /* Build sort pathkeys for the inner side */
innerkeys = make_inner_pathkeys_for_merge(root,
cur_mergeclauses,
outerkeys);
/* Build pathkeys representing output sort order */
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerkeys);
/* /*
* Build sort pathkeys for both sides. * And now we can make the path.
* *
* Note: it's possible that the cheapest paths will already be sorted * Note: it's possible that the cheapest paths will already be sorted
* properly. create_mergejoin_path will detect that case and suppress * properly. create_mergejoin_path will detect that case and suppress
* an explicit sort step, so we needn't do so here. * an explicit sort step, so we needn't do so here.
*/ */
outerkeys = make_pathkeys_for_mergeclauses(root,
cur_mergeclauses,
outerrel);
innerkeys = make_pathkeys_for_mergeclauses(root,
cur_mergeclauses,
innerrel);
/* Build pathkeys representing output sort order. */
merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
outerkeys);
/* Build opfamily info for execution */
build_mergejoin_strat_arrays(cur_mergeclauses,
&mergefamilies,
&mergestrategies,
&mergenullsfirst);
/*
* And now we can make the path.
*/
add_path(joinrel, (Path *) add_path(joinrel, (Path *)
create_mergejoin_path(root, create_mergejoin_path(root,
joinrel, joinrel,
@ -295,9 +273,6 @@ sort_inner_and_outer(PlannerInfo *root,
restrictlist, restrictlist,
merge_pathkeys, merge_pathkeys,
cur_mergeclauses, cur_mergeclauses,
mergefamilies,
mergestrategies,
mergenullsfirst,
outerkeys, outerkeys,
innerkeys)); innerkeys));
} }
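The loop above tries one ordering per pathkey (each key moved to the front in turn) rather than every permutation. A small sketch of that "this guy first" step, using an int array in place of a pathkey List; `toy_order_with_front` is a hypothetical helper, not a planner function:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Write into 'out' the ordering that puts keys[front] first and keeps the
 * remaining keys in their original relative order.  'out' must have room
 * for n entries.
 */
void
toy_order_with_front(const int *keys, size_t n, size_t front, int *out)
{
	size_t		k = 0;

	out[k++] = keys[front];
	for (size_t i = 0; i < n; i++)
	{
		if (i != front)
			out[k++] = keys[i];
	}
}
```

For keys {1, 2, 3} this produces {1, 2, 3}, {2, 1, 3} and {3, 1, 2} as `front` runs over 0..2, so n candidate orderings are considered instead of n!.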
@ -427,9 +402,6 @@ match_unsorted_outer(PlannerInfo *root,
Path *outerpath = (Path *) lfirst(l); Path *outerpath = (Path *) lfirst(l);
List *merge_pathkeys; List *merge_pathkeys;
List *mergeclauses; List *mergeclauses;
Oid *mergefamilies;
int *mergestrategies;
bool *mergenullsfirst;
List *innersortkeys; List *innersortkeys;
List *trialsortkeys; List *trialsortkeys;
Path *cheapest_startup_inner; Path *cheapest_startup_inner;
@ -510,6 +482,7 @@ match_unsorted_outer(PlannerInfo *root,
/* Look for useful mergeclauses (if any) */ /* Look for useful mergeclauses (if any) */
mergeclauses = find_mergeclauses_for_pathkeys(root, mergeclauses = find_mergeclauses_for_pathkeys(root,
outerpath->pathkeys, outerpath->pathkeys,
true,
mergeclause_list); mergeclause_list);
/* /*
@ -532,15 +505,9 @@ match_unsorted_outer(PlannerInfo *root,
continue; continue;
/* Compute the required ordering of the inner path */ /* Compute the required ordering of the inner path */
innersortkeys = make_pathkeys_for_mergeclauses(root, innersortkeys = make_inner_pathkeys_for_merge(root,
mergeclauses, mergeclauses,
innerrel); outerpath->pathkeys);
/* Build opfamily info for execution */
build_mergejoin_strat_arrays(mergeclauses,
&mergefamilies,
&mergestrategies,
&mergenullsfirst);
/* /*
* Generate a mergejoin on the basis of sorting the cheapest inner. * Generate a mergejoin on the basis of sorting the cheapest inner.
@ -557,9 +524,6 @@ match_unsorted_outer(PlannerInfo *root,
restrictlist, restrictlist,
merge_pathkeys, merge_pathkeys,
mergeclauses, mergeclauses,
mergefamilies,
mergestrategies,
mergenullsfirst,
NIL, NIL,
innersortkeys)); innersortkeys));
@ -613,18 +577,12 @@ match_unsorted_outer(PlannerInfo *root,
newclauses = newclauses =
find_mergeclauses_for_pathkeys(root, find_mergeclauses_for_pathkeys(root,
trialsortkeys, trialsortkeys,
false,
mergeclauses); mergeclauses);
Assert(newclauses != NIL); Assert(newclauses != NIL);
} }
else else
newclauses = mergeclauses; newclauses = mergeclauses;
/* Build opfamily info for execution */
build_mergejoin_strat_arrays(newclauses,
&mergefamilies,
&mergestrategies,
&mergenullsfirst);
add_path(joinrel, (Path *) add_path(joinrel, (Path *)
create_mergejoin_path(root, create_mergejoin_path(root,
joinrel, joinrel,
@ -634,9 +592,6 @@ match_unsorted_outer(PlannerInfo *root,
restrictlist, restrictlist,
merge_pathkeys, merge_pathkeys,
newclauses, newclauses,
mergefamilies,
mergestrategies,
mergenullsfirst,
NIL, NIL,
NIL)); NIL));
cheapest_total_inner = innerpath; cheapest_total_inner = innerpath;
@ -666,19 +621,13 @@ match_unsorted_outer(PlannerInfo *root,
newclauses = newclauses =
find_mergeclauses_for_pathkeys(root, find_mergeclauses_for_pathkeys(root,
trialsortkeys, trialsortkeys,
false,
mergeclauses); mergeclauses);
Assert(newclauses != NIL); Assert(newclauses != NIL);
} }
else else
newclauses = mergeclauses; newclauses = mergeclauses;
} }
/* Build opfamily info for execution */
build_mergejoin_strat_arrays(newclauses,
&mergefamilies,
&mergestrategies,
&mergenullsfirst);
add_path(joinrel, (Path *) add_path(joinrel, (Path *)
create_mergejoin_path(root, create_mergejoin_path(root,
joinrel, joinrel,
@ -688,9 +637,6 @@ match_unsorted_outer(PlannerInfo *root,
restrictlist, restrictlist,
merge_pathkeys, merge_pathkeys,
newclauses, newclauses,
mergefamilies,
mergestrategies,
mergenullsfirst,
NIL, NIL,
NIL)); NIL));
} }
@ -909,6 +855,10 @@ best_appendrel_indexscan(PlannerInfo *root, RelOptInfo *rel,
* Select mergejoin clauses that are usable for a particular join. * Select mergejoin clauses that are usable for a particular join.
* Returns a list of RestrictInfo nodes for those clauses. * Returns a list of RestrictInfo nodes for those clauses.
* *
* We also mark each selected RestrictInfo to show which side is currently
* being considered as outer. These are transient markings that are only
* good for the duration of the current add_paths_to_joinrel() call!
*
* We examine each restrictinfo clause known for the join to see * We examine each restrictinfo clause known for the join to see
* if it is mergejoinable and involves vars from the two sub-relations * if it is mergejoinable and involves vars from the two sub-relations
* currently of interest. * currently of interest.
@ -939,7 +889,7 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
continue; continue;
if (!restrictinfo->can_join || if (!restrictinfo->can_join ||
restrictinfo->mergejoinoperator == InvalidOid) restrictinfo->mergeopfamilies == NIL)
{ {
have_nonmergeable_joinclause = true; have_nonmergeable_joinclause = true;
continue; /* not mergejoinable */ continue; /* not mergejoinable */
@ -954,11 +904,13 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
bms_is_subset(restrictinfo->right_relids, innerrel->relids)) bms_is_subset(restrictinfo->right_relids, innerrel->relids))
{ {
/* righthand side is inner */ /* righthand side is inner */
restrictinfo->outer_is_left = true;
} }
else if (bms_is_subset(restrictinfo->left_relids, innerrel->relids) && else if (bms_is_subset(restrictinfo->left_relids, innerrel->relids) &&
bms_is_subset(restrictinfo->right_relids, outerrel->relids)) bms_is_subset(restrictinfo->right_relids, outerrel->relids))
{ {
/* lefthand side is inner */ /* lefthand side is inner */
restrictinfo->outer_is_left = false;
} }
else else
{ {
@ -966,7 +918,7 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
continue; /* no good for these input relations */ continue; /* no good for these input relations */
} }
result_list = lcons(restrictinfo, result_list); result_list = lappend(result_list, restrictinfo);
} }
/* /*
@ -995,46 +947,3 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
return result_list; return result_list;
} }
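The outer_is_left marking in select_mergejoin_clauses() comes down to two subset tests per clause. A minimal sketch, assuming relid sets are 32-bit masks instead of Bitmapsets; `RelMask`, `toy_is_subset` and `toy_classify_clause` are hypothetical names, and the three-way return value stands in for "set outer_is_left = true", "set it false", or "skip the clause":

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t RelMask;		/* hypothetical stand-in for Relids */

/* True if every bit set in 'a' is also set in 'b'. */
bool
toy_is_subset(RelMask a, RelMask b)
{
	return (a & ~b) == 0;
}

/*
 * Returns 1 if the clause's left side belongs to the outer rel and its
 * right side to the inner rel, 0 for the reverse arrangement, and -1 if
 * the clause cannot be used for this pair of input relations.
 */
int
toy_classify_clause(RelMask left, RelMask right,
					RelMask outer, RelMask inner)
{
	if (toy_is_subset(left, outer) && toy_is_subset(right, inner))
		return 1;				/* righthand side is inner */
	if (toy_is_subset(left, inner) && toy_is_subset(right, outer))
		return 0;				/* lefthand side is inner */
	return -1;					/* no good for these input relations */
}
```

Because the result depends on which input is currently treated as outer, the same clause can classify as 1 for one join order and 0 for the other, which is why the comment above calls the markings transient.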
/*
* Temporary hack to build opfamily and strategy info needed for mergejoin
* by the executor. We need to rethink the planner's handling of merge
* planning so that it can deal with multiple possible merge orders, but
* that's not done yet.
*/
static void
build_mergejoin_strat_arrays(List *mergeclauses,
Oid **mergefamilies,
int **mergestrategies,
bool **mergenullsfirst)
{
int nClauses = list_length(mergeclauses);
int i;
ListCell *l;
*mergefamilies = (Oid *) palloc(nClauses * sizeof(Oid));
*mergestrategies = (int *) palloc(nClauses * sizeof(int));
*mergenullsfirst = (bool *) palloc(nClauses * sizeof(bool));
i = 0;
foreach(l, mergeclauses)
{
RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
/*
* We do not need to worry about whether the mergeclause will be
* commuted at runtime --- it's the same opfamily either way.
*/
(*mergefamilies)[i] = restrictinfo->mergeopfamily;
/*
* For the moment, strategy must always be LessThan --- see
* hack version of get_op_mergejoin_info
*/
(*mergestrategies)[i] = BTLessStrategyNumber;
/* And we only allow NULLS LAST, too */
(*mergenullsfirst)[i] = false;
i++;
}
}


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.83 2007/01/05 22:19:31 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.84 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -72,7 +72,7 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
other_rels = list_head(joinrels[1]); /* consider all initial other_rels = list_head(joinrels[1]); /* consider all initial
* rels */ * rels */
if (old_rel->joininfo != NIL) if (old_rel->joininfo != NIL || old_rel->has_eclass_joins)
{ {
/* /*
* Note that if all available join clauses for this rel require * Note that if all available join clauses for this rel require
@ -152,7 +152,8 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
* outer joins --- then we might have to force a bushy outer * outer joins --- then we might have to force a bushy outer
* join. See have_relevant_joinclause(). * join. See have_relevant_joinclause().
*/ */
if (old_rel->joininfo == NIL && root->oj_info_list == NIL) if (old_rel->joininfo == NIL && !old_rel->has_eclass_joins &&
root->oj_info_list == NIL)
continue; continue;
if (k == other_level) if (k == other_level)
@ -251,8 +252,7 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
/* /*
* make_rels_by_clause_joins * make_rels_by_clause_joins
* Build joins between the given relation 'old_rel' and other relations * Build joins between the given relation 'old_rel' and other relations
* that are mentioned within old_rel's joininfo list (i.e., relations * that participate in join clauses that 'old_rel' also participates in.
* that participate in join clauses that 'old_rel' also participates in).
* The join rel nodes are returned in a list. * The join rel nodes are returned in a list.
* *
* 'old_rel' is the relation entry for the relation to be joined * 'old_rel' is the relation entry for the relation to be joined

File diff suppressed because it is too large

@ -10,7 +10,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.221 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.222 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -121,8 +121,6 @@ static MergeJoin *make_mergejoin(List *tlist,
JoinType jointype); JoinType jointype);
static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols, static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
AttrNumber *sortColIdx, Oid *sortOperators, bool *nullsFirst); AttrNumber *sortColIdx, Oid *sortOperators, bool *nullsFirst);
static Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
List *pathkeys);
/* /*
@ -1425,23 +1423,21 @@ create_nestloop_plan(PlannerInfo *root,
* that have to be checked as qpquals at the join node. * that have to be checked as qpquals at the join node.
* *
* We can also remove any join clauses that are redundant with those * We can also remove any join clauses that are redundant with those
* being used in the index scan; prior redundancy checks will not have * being used in the index scan; this check is needed because
* caught this case because the join clauses would never have been put * find_eclass_clauses_for_index_join() may emit different clauses
* in the same joininfo list. * than generate_join_implied_equalities() did.
* *
* We can skip this if the index path is an ordinary indexpath and not * We can skip this if the index path is an ordinary indexpath and not
* a special innerjoin path. * a special innerjoin path, since it then wouldn't be using any join
* clauses.
*/ */
IndexPath *innerpath = (IndexPath *) best_path->innerjoinpath; IndexPath *innerpath = (IndexPath *) best_path->innerjoinpath;
if (innerpath->isjoininner) if (innerpath->isjoininner)
{
joinrestrictclauses = joinrestrictclauses =
select_nonredundant_join_clauses(root, select_nonredundant_join_clauses(root,
joinrestrictclauses, joinrestrictclauses,
innerpath->indexclauses, innerpath->indexclauses);
IS_OUTER_JOIN(best_path->jointype));
}
} }
else if (IsA(best_path->innerjoinpath, BitmapHeapPath)) else if (IsA(best_path->innerjoinpath, BitmapHeapPath))
{ {
@ -1471,8 +1467,7 @@ create_nestloop_plan(PlannerInfo *root,
joinrestrictclauses = joinrestrictclauses =
select_nonredundant_join_clauses(root, select_nonredundant_join_clauses(root,
joinrestrictclauses, joinrestrictclauses,
bitmapclauses, bitmapclauses);
IS_OUTER_JOIN(best_path->jointype));
} }
} }
@ -1516,7 +1511,21 @@ create_mergejoin_plan(PlannerInfo *root,
List *joinclauses; List *joinclauses;
List *otherclauses; List *otherclauses;
List *mergeclauses; List *mergeclauses;
List *outerpathkeys;
List *innerpathkeys;
int nClauses;
Oid *mergefamilies;
int *mergestrategies;
bool *mergenullsfirst;
MergeJoin *join_plan; MergeJoin *join_plan;
int i;
EquivalenceClass *lastoeclass;
EquivalenceClass *lastieclass;
PathKey *opathkey;
PathKey *ipathkey;
ListCell *lc;
ListCell *lop;
ListCell *lip;
/* Get the join qual clauses (in plain expression form) */ /* Get the join qual clauses (in plain expression form) */
/* Any pseudoconstant clauses are ignored here */ /* Any pseudoconstant clauses are ignored here */
@ -1542,7 +1551,8 @@ create_mergejoin_plan(PlannerInfo *root,
/* /*
* Rearrange mergeclauses, if needed, so that the outer variable is always * Rearrange mergeclauses, if needed, so that the outer variable is always
* on the left. * on the left; mark the mergeclause restrictinfos with correct
* outer_is_left status.
*/ */
mergeclauses = get_switched_clauses(best_path->path_mergeclauses, mergeclauses = get_switched_clauses(best_path->path_mergeclauses,
best_path->jpath.outerjoinpath->parent->relids); best_path->jpath.outerjoinpath->parent->relids);
@ -1564,7 +1574,10 @@ create_mergejoin_plan(PlannerInfo *root,
make_sort_from_pathkeys(root, make_sort_from_pathkeys(root,
outer_plan, outer_plan,
best_path->outersortkeys); best_path->outersortkeys);
outerpathkeys = best_path->outersortkeys;
} }
else
outerpathkeys = best_path->jpath.outerjoinpath->pathkeys;
if (best_path->innersortkeys) if (best_path->innersortkeys)
{ {
@ -1573,7 +1586,86 @@ create_mergejoin_plan(PlannerInfo *root,
make_sort_from_pathkeys(root, make_sort_from_pathkeys(root,
inner_plan, inner_plan,
best_path->innersortkeys); best_path->innersortkeys);
innerpathkeys = best_path->innersortkeys;
} }
else
innerpathkeys = best_path->jpath.innerjoinpath->pathkeys;
/*
* Compute the opfamily/strategy/nullsfirst arrays needed by the executor.
* The information is in the pathkeys for the two inputs, but we need to
* be careful about the possibility of mergeclauses sharing a pathkey
* (compare find_mergeclauses_for_pathkeys()).
*/
nClauses = list_length(mergeclauses);
Assert(nClauses == list_length(best_path->path_mergeclauses));
mergefamilies = (Oid *) palloc(nClauses * sizeof(Oid));
mergestrategies = (int *) palloc(nClauses * sizeof(int));
mergenullsfirst = (bool *) palloc(nClauses * sizeof(bool));
lastoeclass = NULL;
lastieclass = NULL;
opathkey = NULL;
ipathkey = NULL;
lop = list_head(outerpathkeys);
lip = list_head(innerpathkeys);
i = 0;
foreach(lc, best_path->path_mergeclauses)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
EquivalenceClass *oeclass;
EquivalenceClass *ieclass;
/* fetch outer/inner eclass from mergeclause */
Assert(IsA(rinfo, RestrictInfo));
if (rinfo->outer_is_left)
{
oeclass = rinfo->left_ec;
ieclass = rinfo->right_ec;
}
else
{
oeclass = rinfo->right_ec;
ieclass = rinfo->left_ec;
}
Assert(oeclass != NULL);
Assert(ieclass != NULL);
/* should match current or next pathkeys */
/* we check this carefully for debugging reasons */
if (oeclass != lastoeclass)
{
if (!lop)
elog(ERROR, "too few pathkeys for mergeclauses");
opathkey = (PathKey *) lfirst(lop);
lop = lnext(lop);
lastoeclass = opathkey->pk_eclass;
if (oeclass != lastoeclass)
elog(ERROR, "outer pathkeys do not match mergeclause");
}
if (ieclass != lastieclass)
{
if (!lip)
elog(ERROR, "too few pathkeys for mergeclauses");
ipathkey = (PathKey *) lfirst(lip);
lip = lnext(lip);
lastieclass = ipathkey->pk_eclass;
if (ieclass != lastieclass)
elog(ERROR, "inner pathkeys do not match mergeclause");
}
/* pathkeys should match each other too (more debugging) */
if (opathkey->pk_opfamily != ipathkey->pk_opfamily ||
opathkey->pk_strategy != ipathkey->pk_strategy ||
opathkey->pk_nulls_first != ipathkey->pk_nulls_first)
elog(ERROR, "left and right pathkeys do not match in mergejoin");
/* OK, save info for executor */
mergefamilies[i] = opathkey->pk_opfamily;
mergestrategies[i] = opathkey->pk_strategy;
mergenullsfirst[i] = opathkey->pk_nulls_first;
i++;
}
/* /*
* Now we can build the mergejoin node. * Now we can build the mergejoin node.
@ -1582,9 +1674,9 @@ create_mergejoin_plan(PlannerInfo *root,
joinclauses, joinclauses,
otherclauses, otherclauses,
mergeclauses, mergeclauses,
best_path->path_mergeFamilies, mergefamilies,
best_path->path_mergeStrategies, mergestrategies,
best_path->path_mergeNullsFirst, mergenullsfirst,
outer_plan, outer_plan,
inner_plan, inner_plan,
best_path->jpath.jointype); best_path->jpath.jointype);
@ -1921,8 +2013,9 @@ fix_indexqual_operand(Node *node, IndexOptInfo *index, Oid *opfamily)
* Given a list of merge or hash joinclauses (as RestrictInfo nodes), * Given a list of merge or hash joinclauses (as RestrictInfo nodes),
* extract the bare clauses, and rearrange the elements within the * extract the bare clauses, and rearrange the elements within the
* clauses, if needed, so the outer join variable is on the left and * clauses, if needed, so the outer join variable is on the left and
* the inner is on the right. The original data structure is not touched; * the inner is on the right. The original clause data structure is not
* a modified list is returned. * touched; a modified list is returned. We do, however, set the transient
* outer_is_left field in each RestrictInfo to show which side was which.
*/ */
static List * static List *
get_switched_clauses(List *clauses, Relids outerrelids) get_switched_clauses(List *clauses, Relids outerrelids)
@ -1953,9 +2046,14 @@ get_switched_clauses(List *clauses, Relids outerrelids)
/* Commute it --- note this modifies the temp node in-place. */ /* Commute it --- note this modifies the temp node in-place. */
CommuteOpExpr(temp); CommuteOpExpr(temp);
t_list = lappend(t_list, temp); t_list = lappend(t_list, temp);
restrictinfo->outer_is_left = false;
} }
else else
{
Assert(bms_is_subset(restrictinfo->left_relids, outerrelids));
t_list = lappend(t_list, clause); t_list = lappend(t_list, clause);
restrictinfo->outer_is_left = true;
}
} }
return t_list; return t_list;
} }
@ -2490,7 +2588,7 @@ add_sort_column(AttrNumber colIdx, Oid sortOp, bool nulls_first,
* If the input plan type isn't one that can do projections, this means * If the input plan type isn't one that can do projections, this means
* adding a Result node just to do the projection. * adding a Result node just to do the projection.
*/ */
static Sort * Sort *
make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys) make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
{ {
List *tlist = lefttree->targetlist; List *tlist = lefttree->targetlist;
@ -2512,41 +2610,55 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
foreach(i, pathkeys) foreach(i, pathkeys)
{ {
List *keysublist = (List *) lfirst(i); PathKey *pathkey = (PathKey *) lfirst(i);
PathKeyItem *pathkey = NULL;
TargetEntry *tle = NULL; TargetEntry *tle = NULL;
Oid pk_datatype = InvalidOid;
Oid sortop;
ListCell *j; ListCell *j;
/* /*
* We can sort by any one of the sort key items listed in this * We can sort by any non-constant expression listed in the pathkey's
* sublist. For now, we take the first one that corresponds to an * EquivalenceClass. For now, we take the first one that corresponds
* available Var in the tlist. If there isn't any, use the first one * to an available Var in the tlist. If there isn't any, use the first
* that is an expression in the input's vars. * one that is an expression in the input's vars. (The non-const
* restriction only matters if the EC is below_outer_join; but if it
* isn't, it won't contain consts anyway, else we'd have discarded
* the pathkey as redundant.)
* *
* XXX if we have a choice, is there any way of figuring out which * XXX if we have a choice, is there any way of figuring out which
* might be cheapest to execute? (For example, int4lt is likely much * might be cheapest to execute? (For example, int4lt is likely much
* cheaper to execute than numericlt, but both might appear in the * cheaper to execute than numericlt, but both might appear in the
* same pathkey sublist...) Not clear that we ever will have a choice * same equivalence class...) Not clear that we ever will have an
* in practice, so it may not matter. * interesting choice in practice, so it may not matter.
*/ */
foreach(j, keysublist) foreach(j, pathkey->pk_eclass->ec_members)
{ {
pathkey = (PathKeyItem *) lfirst(j); EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
Assert(IsA(pathkey, PathKeyItem));
tle = tlist_member(pathkey->key, tlist); if (em->em_is_const || em->em_is_child)
continue;
tle = tlist_member((Node *) em->em_expr, tlist);
if (tle) if (tle)
break; {
pk_datatype = em->em_datatype;
break; /* found expr already in tlist */
}
} }
if (!tle) if (!tle)
{ {
/* No matching Var; look for a computable expression */ /* No matching Var; look for a computable expression */
foreach(j, keysublist) Expr *sortexpr = NULL;
foreach(j, pathkey->pk_eclass->ec_members)
{ {
EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
List *exprvars; List *exprvars;
ListCell *k; ListCell *k;
pathkey = (PathKeyItem *) lfirst(j); if (em->em_is_const || em->em_is_child)
exprvars = pull_var_clause(pathkey->key, false); continue;
sortexpr = em->em_expr;
exprvars = pull_var_clause((Node *) sortexpr, false);
foreach(k, exprvars) foreach(k, exprvars)
{ {
if (!tlist_member(lfirst(k), tlist)) if (!tlist_member(lfirst(k), tlist))
@ -2554,8 +2666,11 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
} }
list_free(exprvars); list_free(exprvars);
if (!k) if (!k)
{
pk_datatype = em->em_datatype;
break; /* found usable expression */ break; /* found usable expression */
} }
}
if (!j) if (!j)
elog(ERROR, "could not find pathkey item to sort"); elog(ERROR, "could not find pathkey item to sort");
@ -2571,7 +2686,7 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
/* /*
* Add resjunk entry to input's tlist * Add resjunk entry to input's tlist
*/ */
tle = makeTargetEntry((Expr *) pathkey->key, tle = makeTargetEntry(sortexpr,
list_length(tlist) + 1, list_length(tlist) + 1,
NULL, NULL,
true); true);
@ -2579,14 +2694,28 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
lefttree->targetlist = tlist; /* just in case NIL before */ lefttree->targetlist = tlist; /* just in case NIL before */
} }
/*
* Look up the correct sort operator from the PathKey's slightly
* abstracted representation.
*/
sortop = get_opfamily_member(pathkey->pk_opfamily,
pk_datatype,
pk_datatype,
pathkey->pk_strategy);
if (!OidIsValid(sortop)) /* should not happen */
elog(ERROR, "could not find member %d(%u,%u) of opfamily %u",
pathkey->pk_strategy, pk_datatype, pk_datatype,
pathkey->pk_opfamily);
/* /*
* The column might already be selected as a sort key, if the pathkeys * The column might already be selected as a sort key, if the pathkeys
* contain duplicate entries. (This can happen in scenarios where * contain duplicate entries. (This can happen in scenarios where
* multiple mergejoinable clauses mention the same var, for example.) * multiple mergejoinable clauses mention the same var, for example.)
* So enter it only once in the sort arrays. * So enter it only once in the sort arrays.
*/ */
numsortkeys = add_sort_column(tle->resno, pathkey->sortop, numsortkeys = add_sort_column(tle->resno,
pathkey->nulls_first, sortop,
pathkey->pk_nulls_first,
numsortkeys, numsortkeys,
sortColIdx, sortOperators, nullsFirst); sortColIdx, sortOperators, nullsFirst);
} }
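The mergeclause loop added to create_mergejoin_plan() above walks the clause list and the pathkey list in step, advancing to the next pathkey only when a clause's equivalence class differs from the previous one, so that consecutive clauses may share a pathkey. The following is a minimal sketch of that walk for one side of the join only. It is not the planner's code: equivalence classes are reduced to plain integer ids, and `sketch_match_clauses` with its return codes is a hypothetical name introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical simplification: each mergeclause and each pathkey is
 * represented only by an integer id naming its equivalence class.
 *
 * For every clause i, store in out_idx[i] the index of the pathkey it uses.
 * Returns 0 on success, -1 if the pathkeys run out ("too few pathkeys"),
 * and -2 if the next pathkey does not match the clause's class.
 */
static int
sketch_match_clauses(const int *clause_ec, size_t nclauses,
                     const int *pathkey_ec, size_t npathkeys,
                     size_t *out_idx)
{
    size_t next = 0;        /* index of the next unconsumed pathkey */
    int    have_last = 0;   /* has any pathkey been consumed yet? */
    int    last_ec = 0;     /* class of the most recently consumed pathkey */

    for (size_t i = 0; i < nclauses; i++)
    {
        /* Advance only when this clause does not share the current pathkey */
        if (!have_last || clause_ec[i] != last_ec)
        {
            if (next >= npathkeys)
                return -1;
            last_ec = pathkey_ec[next++];
            have_last = 1;
            if (clause_ec[i] != last_ec)
                return -2;
        }
        out_idx[i] = next - 1;
    }
    return 0;
}
```

In the real function the same walk is done for the outer and inner pathkey lists at once, and the opfamily, strategy and nulls-first values of the matched pathkey are what get saved for the executor.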


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/initsplan.c,v 1.127 2007/01/08 16:47:30 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/plan/initsplan.c,v 1.128 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -37,8 +37,6 @@ int from_collapse_limit;
int join_collapse_limit; int join_collapse_limit;
static void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode, static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
bool below_outer_join, Relids *qualscope); bool below_outer_join, Relids *qualscope);
static OuterJoinInfo *make_outerjoininfo(PlannerInfo *root, static OuterJoinInfo *make_outerjoininfo(PlannerInfo *root,
@ -51,8 +49,7 @@ static void distribute_qual_to_rels(PlannerInfo *root, Node *clause,
Relids qualscope, Relids qualscope,
Relids ojscope, Relids ojscope,
Relids outerjoin_nonnullable); Relids outerjoin_nonnullable);
static bool qual_is_redundant(PlannerInfo *root, RestrictInfo *restrictinfo, static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p);
List *restrictlist);
static void check_mergejoinable(RestrictInfo *restrictinfo); static void check_mergejoinable(RestrictInfo *restrictinfo);
static void check_hashjoinable(RestrictInfo *restrictinfo); static void check_hashjoinable(RestrictInfo *restrictinfo);
@ -144,7 +141,7 @@ build_base_rel_tlists(PlannerInfo *root, List *final_tlist)
* as being needed for the indicated join (or for final output if * as being needed for the indicated join (or for final output if
* where_needed includes "relation 0"). * where_needed includes "relation 0").
*/ */
static void void
add_vars_to_targetlist(PlannerInfo *root, List *vars, Relids where_needed) add_vars_to_targetlist(PlannerInfo *root, List *vars, Relids where_needed)
{ {
ListCell *temp; ListCell *temp;
@ -590,17 +587,17 @@ make_outerjoininfo(PlannerInfo *root,
* Add clause information to either the baserestrictinfo or joininfo list * Add clause information to either the baserestrictinfo or joininfo list
* (depending on whether the clause is a join) of each base relation * (depending on whether the clause is a join) of each base relation
* mentioned in the clause. A RestrictInfo node is created and added to * mentioned in the clause. A RestrictInfo node is created and added to
* the appropriate list for each rel. Also, if the clause uses a * the appropriate list for each rel. Alternatively, if the clause uses a
* mergejoinable operator and is not delayed by outer-join rules, enter * mergejoinable operator and is not delayed by outer-join rules, enter
* the left- and right-side expressions into the query's lists of * the left- and right-side expressions into the query's list of
* equijoined vars. * EquivalenceClasses.
* *
* 'clause': the qual clause to be distributed * 'clause': the qual clause to be distributed
* 'is_pushed_down': if TRUE, force the clause to be marked 'is_pushed_down' * 'is_pushed_down': if TRUE, force the clause to be marked 'is_pushed_down'
* (this indicates the clause came from a FromExpr, not a JoinExpr) * (this indicates the clause came from a FromExpr, not a JoinExpr)
* 'is_deduced': TRUE if the qual came from implied-equality deduction * 'is_deduced': TRUE if the qual came from implied-equality deduction
* 'below_outer_join': TRUE if the qual is from a JOIN/ON that is below the * 'below_outer_join': TRUE if the qual is from a JOIN/ON that is below the
* nullable side of a higher-level outer join. * nullable side of a higher-level outer join
* 'qualscope': set of baserels the qual's syntactic scope covers * 'qualscope': set of baserels the qual's syntactic scope covers
* 'ojscope': NULL if not an outer-join qual, else the minimum set of baserels * 'ojscope': NULL if not an outer-join qual, else the minimum set of baserels
* needed to form this join * needed to form this join
@ -625,11 +622,9 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
Relids relids; Relids relids;
bool outerjoin_delayed; bool outerjoin_delayed;
bool pseudoconstant = false; bool pseudoconstant = false;
bool maybe_equijoin; bool maybe_equivalence;
bool maybe_outer_join; bool maybe_outer_join;
RestrictInfo *restrictinfo; RestrictInfo *restrictinfo;
RelOptInfo *rel;
List *vars;
/* /*
* Retrieve all relids mentioned within the clause. * Retrieve all relids mentioned within the clause.
@ -705,108 +700,57 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
if (is_deduced) if (is_deduced)
{ {
/* /*
* If the qual came from implied-equality deduction, we always * If the qual came from implied-equality deduction, it should
* evaluate the qual at its natural semantic level. It is the * not be outerjoin-delayed, else deducer blew it. But we can't
* responsibility of the deducer not to create any quals that should * check this because the ojinfo list may now contain OJs above
* be delayed by outer-join rules. * where the qual belongs.
*/ */
Assert(bms_equal(relids, qualscope));
Assert(!ojscope); Assert(!ojscope);
Assert(!pseudoconstant);
/* Needn't feed it back for more deductions */
outerjoin_delayed = false; outerjoin_delayed = false;
maybe_equijoin = false; /* Don't feed it back for more deductions */
maybe_equivalence = false;
maybe_outer_join = false; maybe_outer_join = false;
} }
else if (bms_overlap(relids, outerjoin_nonnullable)) else if (bms_overlap(relids, outerjoin_nonnullable))
{ {
/* /*
* The qual is attached to an outer join and mentions (some of the) * The qual is attached to an outer join and mentions (some of the)
* rels on the nonnullable side. Force the qual to be evaluated * rels on the nonnullable side.
* exactly at the level of joining corresponding to the outer join. We
* cannot let it get pushed down into the nonnullable side, since then
* we'd produce no output rows, rather than the intended single
* null-extended row, for any nonnullable-side rows failing the qual.
* *
* Note: an outer-join qual that mentions only nullable-side rels can * Note: an outer-join qual that mentions only nullable-side rels can
* be pushed down into the nullable side without changing the join * be pushed down into the nullable side without changing the join
* result, so we treat it the same as an ordinary inner-join qual, * result, so we treat it almost the same as an ordinary inner-join
* except for not setting maybe_equijoin (see below). * qual (see below).
*
* We can't use such a clause to deduce equivalence (the left and right
* sides might be unequal above the join because one of them has gone
* to NULL) ... but we might be able to use it for more limited
* deductions, if there are no lower outer joins that delay its
* application. If so, consider adding it to the lists of set-aside
* clauses.
*/
maybe_equivalence = false;
maybe_outer_join = !check_outerjoin_delay(root, &relids);
/*
* Now force the qual to be evaluated exactly at the level of joining
* corresponding to the outer join. We cannot let it get pushed down
* into the nonnullable side, since then we'd produce no output rows,
* rather than the intended single null-extended row, for any
* nonnullable-side rows failing the qual.
*
* (Do this step after calling check_outerjoin_delay, because that
* trashes relids.)
*/ */
Assert(ojscope); Assert(ojscope);
relids = ojscope; relids = ojscope;
outerjoin_delayed = true; outerjoin_delayed = true;
Assert(!pseudoconstant); Assert(!pseudoconstant);
/*
* We can't use such a clause to deduce equijoin (the left and right
* sides might be unequal above the join because one of them has gone
* to NULL) ... but we might be able to use it for more limited
* purposes. Note: for the current uses of deductions from an
* outer-join clause, it seems safe to make the deductions even when
* the clause is below a higher-level outer join; so we do not check
* below_outer_join here.
*/
maybe_equijoin = false;
maybe_outer_join = true;
} }
else else
{ {
/* /* Normal qual clause; check to see if must be delayed by outer join */
* For a non-outer-join qual, we can evaluate the qual as soon as (1) outerjoin_delayed = check_outerjoin_delay(root, &relids);
* we have all the rels it mentions, and (2) we are at or above any
* outer joins that can null any of these rels and are below the
* syntactic location of the given qual. We must enforce (2) because
* pushing down such a clause below the OJ might cause the OJ to emit
* null-extended rows that should not have been formed, or that should
* have been rejected by the clause. (This is only an issue for
* non-strict quals, since if we can prove a qual mentioning only
* nullable rels is strict, we'd have reduced the outer join to an
* inner join in reduce_outer_joins().)
*
* To enforce (2), scan the oj_info_list and merge the required-relid
* sets of any such OJs into the clause's own reference list. At the
* time we are called, the oj_info_list contains only outer joins
* below this qual. We have to repeat the scan until no new relids
* get added; this ensures that the qual is suitably delayed regardless
* of the order in which OJs get executed. As an example, if we have
* one OJ with LHS=A, RHS=B, and one with LHS=B, RHS=C, it is implied
* that these can be done in either order; if the B/C join is done
* first then the join to A can null C, so a qual actually mentioning
* only C cannot be applied below the join to A.
*/
bool found_some;
outerjoin_delayed = false;
do {
ListCell *l;
found_some = false;
foreach(l, root->oj_info_list)
{
OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
/* do we have any nullable rels of this OJ? */
if (bms_overlap(relids, ojinfo->min_righthand) ||
(ojinfo->is_full_join &&
bms_overlap(relids, ojinfo->min_lefthand)))
{
/* yes; do we have all its rels? */
if (!bms_is_subset(ojinfo->min_lefthand, relids) ||
!bms_is_subset(ojinfo->min_righthand, relids))
{
/* no, so add them in */
relids = bms_add_members(relids,
ojinfo->min_lefthand);
relids = bms_add_members(relids,
ojinfo->min_righthand);
outerjoin_delayed = true;
/* we'll need another iteration */
found_some = true;
}
}
}
} while (found_some);
if (outerjoin_delayed) if (outerjoin_delayed)
{ {
@ -816,26 +760,27 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
* Because application of the qual will be delayed by outer join, * Because application of the qual will be delayed by outer join,
* we mustn't assume its vars are equal everywhere. * we mustn't assume its vars are equal everywhere.
*/ */
maybe_equijoin = false; maybe_equivalence = false;
} }
else else
{ {
/* /*
* Qual is not delayed by any lower outer-join restriction. If it * Qual is not delayed by any lower outer-join restriction, so
* is not itself below or within an outer join, we can consider it * we can consider feeding it to the equivalence machinery.
* "valid everywhere", so consider feeding it to the equijoin * However, if it's itself within an outer-join clause, treat it
* machinery. (If it is within an outer join, we can't consider * as though it appeared below that outer join (note that we can
* it "valid everywhere": once the contained variables have gone * only get here when the clause references only nullable-side
* to NULL, we'd be asserting things like NULL = NULL, which is * rels).
* not true.)
*/ */
if (!below_outer_join && outerjoin_nonnullable == NULL) maybe_equivalence = true;
maybe_equijoin = true; if (outerjoin_nonnullable != NULL)
else below_outer_join = true;
maybe_equijoin = false;
} }
/* Since it doesn't mention the LHS, it's certainly not an OJ clause */ /*
* Since it doesn't mention the LHS, it's certainly not useful as a
* set-aside OJ clause, even if it's in an OJ.
*/
maybe_outer_join = false; maybe_outer_join = false;
} }
@ -860,118 +805,65 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
relids); relids);
/* /*
* Figure out where to attach it. * If it's a join clause (either naturally, or because delayed by
*/ * outer-join rules), add vars used in the clause to targetlists of
switch (bms_membership(relids)) * their relations, so that they will be emitted by the plan nodes that
{ * scan those relations (else they won't be available at the join node!).
case BMS_SINGLETON:
/*
* There is only one relation participating in 'clause', so
* 'clause' is a restriction clause for that relation.
*/
rel = find_base_rel(root, bms_singleton_member(relids));
/*
* Check for a "mergejoinable" clause even though it's not a join
* clause. This is so that we can recognize that "a.x = a.y"
* makes x and y eligible to be considered equal, even when they
* belong to the same rel. Without this, we would not recognize
* that "a.x = a.y AND a.x = b.z AND a.y = c.q" allows us to
* consider z and q equal after their rels are joined.
*/
check_mergejoinable(restrictinfo);
/*
* If the clause was deduced from implied equality, check to see
* whether it is redundant with restriction clauses we already
* have for this rel. Note we cannot apply this check to
* user-written clauses, since we haven't found the canonical
* pathkey sets yet while processing user clauses. (NB: no
* comparable check is done in the join-clause case; redundancy
* will be detected when the join clause is moved into a join
* rel's restriction list.)
*/
if (!is_deduced ||
!qual_is_redundant(root, restrictinfo,
rel->baserestrictinfo))
{
/* Add clause to rel's restriction list */
rel->baserestrictinfo = lappend(rel->baserestrictinfo,
restrictinfo);
}
break;
case BMS_MULTIPLE:
/*
* 'clause' is a join clause, since there is more than one rel in
* the relid set.
*/
/*
* Check for hash or mergejoinable operators.
* *
* We don't bother setting the hashjoin info if we're not going to * Note: if the clause gets absorbed into an EquivalenceClass then this
* need it. We do want to know about mergejoinable ops in all * may be unnecessary, but for now we have to do it to cover the case
* cases, however, because we use mergejoinable ops for other * where the EC becomes ec_broken and we end up reinserting the original
* purposes such as detecting redundant clauses. * clauses into the plan.
*/ */
check_mergejoinable(restrictinfo); if (bms_membership(relids) == BMS_MULTIPLE)
if (enable_hashjoin) {
check_hashjoinable(restrictinfo); List *vars = pull_var_clause(clause, false);
/*
* Add clause to the join lists of all the relevant relations.
*/
add_join_clause_to_rels(root, restrictinfo, relids);
/*
* Add vars used in the join clause to targetlists of their
* relations, so that they will be emitted by the plan nodes that
* scan those relations (else they won't be available at the join
* node!).
*/
vars = pull_var_clause(clause, false);
add_vars_to_targetlist(root, vars, relids); add_vars_to_targetlist(root, vars, relids);
list_free(vars); list_free(vars);
break;
default:
/*
* 'clause' references no rels, and therefore we have no place to
* attach it. Shouldn't get here if callers are working properly.
*/
elog(ERROR, "cannot cope with variable-free clause");
break;
} }
/* /*
* If the clause has a mergejoinable operator, we may be able to deduce * We check "mergejoinability" of every clause, not only join clauses,
* more things from it under the principle of transitivity. * because we want to know about equivalences between vars of the same
* relation, or between vars and consts.
*/
check_mergejoinable(restrictinfo);
/*
* If it is a true equivalence clause, send it to the EquivalenceClass
* machinery. We do *not* attach it directly to any restriction or join
* lists. The EC code will propagate it to the appropriate places later.
* *
* If it is not an outer-join qualification nor bubbled up due to an outer * If the clause has a mergejoinable operator and is not outerjoin-delayed,
* join, then the two sides represent equivalent PathKeyItems for path * yet isn't an equivalence because it is an outer-join clause, the EC
* keys: any path that is sorted by one side will also be sorted by the * code may yet be able to do something with it. We add it to appropriate
* other (as soon as the two rels are joined, that is). Pass such clauses * lists for further consideration later. Specifically:
* to add_equijoined_keys.
* *
* If it is a left or right outer-join qualification that relates the two * If it is a left or right outer-join qualification that relates the
* sides of the outer join (no funny business like leftvar1 = leftvar2 + * two sides of the outer join (no funny business like leftvar1 =
* rightvar), we add it to root->left_join_clauses or * leftvar2 + rightvar), we add it to root->left_join_clauses or
* root->right_join_clauses according to which side the nonnullable * root->right_join_clauses according to which side the nonnullable
* variable appears on. * variable appears on.
* *
* If it is a full outer-join qualification, we add it to * If it is a full outer-join qualification, we add it to
* root->full_join_clauses. (Ideally we'd discard cases that aren't * root->full_join_clauses. (Ideally we'd discard cases that aren't
* leftvar = rightvar, as we do for left/right joins, but this routine * leftvar = rightvar, as we do for left/right joins, but this routine
* doesn't have the info needed to do that; and the current usage of the * doesn't have the info needed to do that; and the current usage of
* full_join_clauses list doesn't require that, so it's not currently * the full_join_clauses list doesn't require that, so it's not
* worth complicating this routine's API to make it possible.) * currently worth complicating this routine's API to make it possible.)
*
* If none of the above hold, pass it off to
* distribute_restrictinfo_to_rels().
*/ */
if (restrictinfo->mergejoinoperator != InvalidOid) if (restrictinfo->mergeopfamilies)
{ {
if (maybe_equijoin) if (maybe_equivalence)
add_equijoined_keys(root, restrictinfo); {
if (process_equivalence(root, restrictinfo, below_outer_join))
return;
/* EC rejected it, so pass to distribute_restrictinfo_to_rels */
}
else if (maybe_outer_join && restrictinfo->can_join) else if (maybe_outer_join && restrictinfo->can_join)
{ {
if (bms_is_subset(restrictinfo->left_relids, if (bms_is_subset(restrictinfo->left_relids,
@ -982,8 +874,9 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
/* we have outervar = innervar */ /* we have outervar = innervar */
root->left_join_clauses = lappend(root->left_join_clauses, root->left_join_clauses = lappend(root->left_join_clauses,
restrictinfo); restrictinfo);
return;
} }
else if (bms_is_subset(restrictinfo->right_relids, if (bms_is_subset(restrictinfo->right_relids,
outerjoin_nonnullable) && outerjoin_nonnullable) &&
!bms_overlap(restrictinfo->left_relids, !bms_overlap(restrictinfo->left_relids,
outerjoin_nonnullable)) outerjoin_nonnullable))
@ -991,166 +884,213 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
/* we have innervar = outervar */ /* we have innervar = outervar */
root->right_join_clauses = lappend(root->right_join_clauses, root->right_join_clauses = lappend(root->right_join_clauses,
restrictinfo); restrictinfo);
return;
} }
else if (bms_equal(outerjoin_nonnullable, qualscope)) if (bms_equal(outerjoin_nonnullable, qualscope))
{ {
/* FULL JOIN (above tests cannot match in this case) */ /* FULL JOIN (above tests cannot match in this case) */
root->full_join_clauses = lappend(root->full_join_clauses, root->full_join_clauses = lappend(root->full_join_clauses,
restrictinfo); restrictinfo);
return;
} }
} }
} }
/* No EC special case applies, so push it into the clause lists */
distribute_restrictinfo_to_rels(root, restrictinfo);
}
/*
* check_outerjoin_delay
* Detect whether a qual referencing the given relids must be delayed
* in application due to the presence of a lower outer join.
*
* If so, add relids to *relids_p to reflect the lowest safe level for
* evaluating the qual, and return TRUE.
*
* For a non-outer-join qual, we can evaluate the qual as soon as (1) we have
* all the rels it mentions, and (2) we are at or above any outer joins that
* can null any of these rels and are below the syntactic location of the
* given qual. We must enforce (2) because pushing down such a clause below
* the OJ might cause the OJ to emit null-extended rows that should not have
* been formed, or that should have been rejected by the clause. (This is
* only an issue for non-strict quals, since if we can prove a qual mentioning
* only nullable rels is strict, we'd have reduced the outer join to an inner
* join in reduce_outer_joins().)
*
* To enforce (2), scan the oj_info_list and merge the required-relid sets of
* any such OJs into the clause's own reference list. At the time we are
* called, the oj_info_list contains only outer joins below this qual. We
* have to repeat the scan until no new relids get added; this ensures that
* the qual is suitably delayed regardless of the order in which OJs get
* executed. As an example, if we have one OJ with LHS=A, RHS=B, and one with
* LHS=B, RHS=C, it is implied that these can be done in either order; if the
* B/C join is done first then the join to A can null C, so a qual actually
* mentioning only C cannot be applied below the join to A.
*
* For an outer-join qual, this isn't going to determine where we place the
* qual, but we need to determine outerjoin_delayed anyway so we can decide
* whether the qual is potentially useful for equivalence deductions.
*/
static bool
check_outerjoin_delay(PlannerInfo *root, Relids *relids_p)
{
    Relids      relids = *relids_p;
    bool        outerjoin_delayed;
    bool        found_some;

    outerjoin_delayed = false;
    do {
        ListCell   *l;

        found_some = false;
        foreach(l, root->oj_info_list)
        {
            OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);

            /* do we reference any nullable rels of this OJ? */
            if (bms_overlap(relids, ojinfo->min_righthand) ||
                (ojinfo->is_full_join &&
                 bms_overlap(relids, ojinfo->min_lefthand)))
            {
                /* yes; have we included all its rels in relids? */
                if (!bms_is_subset(ojinfo->min_lefthand, relids) ||
                    !bms_is_subset(ojinfo->min_righthand, relids))
                {
                    /* no, so add them in */
                    relids = bms_add_members(relids, ojinfo->min_lefthand);
                    relids = bms_add_members(relids, ojinfo->min_righthand);
                    outerjoin_delayed = true;
                    /* we'll need another iteration */
                    found_some = true;
                }
            }
        }
    } while (found_some);

    *relids_p = relids;
    return outerjoin_delayed;
}
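The repeat-until-stable scan in check_outerjoin_delay() can be tried outside the planner. What follows is a minimal sketch, not PostgreSQL code: it assumes every relid set fits in a 32-bit mask, and ToyOuterJoin, toy_check_outerjoin_delay() and the REL_* bits are hypothetical stand-ins for OuterJoinInfo and the Bitmapset routines. The helper at the end replays the example from the comment above (one OJ with LHS=A, RHS=B and one with LHS=B, RHS=C) for a qual that mentions only C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins: one bit per base relation. */
#define REL_A (UINT32_C(1) << 0)
#define REL_B (UINT32_C(1) << 1)
#define REL_C (UINT32_C(1) << 2)

/* Toy replacement for OuterJoinInfo, with bitmasks instead of Relids. */
typedef struct ToyOuterJoin
{
    uint32_t    min_lefthand;
    uint32_t    min_righthand;
    bool        is_full_join;
} ToyOuterJoin;

/*
 * Grow *relids_p until it contains both sides of every outer join whose
 * nullable side it touches.  Returns true if anything had to be added.
 */
bool
toy_check_outerjoin_delay(const ToyOuterJoin *ojs, size_t noj,
                          uint32_t *relids_p)
{
    uint32_t    relids;
    bool        delayed = false;
    bool        found_some;

    assert(relids_p != NULL);
    relids = *relids_p;
    do
    {
        found_some = false;
        for (size_t i = 0; i < noj; i++)
        {
            uint32_t    all = ojs[i].min_lefthand | ojs[i].min_righthand;
            uint32_t    nullable = ojs[i].min_righthand;

            /* a full join can null its left side as well */
            if (ojs[i].is_full_join)
                nullable |= ojs[i].min_lefthand;

            /* touches a nullable rel, but not all of the OJ's rels yet? */
            if ((relids & nullable) != 0 && (relids & all) != all)
            {
                relids |= all;
                delayed = true;
                found_some = true;  /* another pass is needed */
            }
        }
    } while (found_some);

    *relids_p = relids;
    return delayed;
}

/* Replays the A/B and B/C example for a qual mentioning only C. */
uint32_t
toy_example_required_relids(void)
{
    const ToyOuterJoin ojs[] = {
        {REL_A, REL_B, false},
        {REL_B, REL_C, false},
    };
    uint32_t    relids = REL_C;

    (void) toy_check_outerjoin_delay(ojs, 2, &relids);
    return relids;
}
```

In this toy setup the first pass pulls in B through the B/C join, and only the second pass notices that B is nullable by the A/B join. A single scan would therefore stop too early, which is why the loop repeats until nothing changes.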
/*
* distribute_restrictinfo_to_rels
* Push a completed RestrictInfo into the proper restriction or join
* clause list(s).
*
* This is the last step of distribute_qual_to_rels() for ordinary qual
* clauses. Clauses that are interesting for equivalence-class processing
* are diverted to the EC machinery, but may ultimately get fed back here.
*/
void
distribute_restrictinfo_to_rels(PlannerInfo *root,
                                RestrictInfo *restrictinfo)
{
    Relids      relids = restrictinfo->required_relids;
    RelOptInfo *rel;

    switch (bms_membership(relids))
    {
        case BMS_SINGLETON:

            /*
             * There is only one relation participating in the clause, so
             * it is a restriction clause for that relation.
             */
            rel = find_base_rel(root, bms_singleton_member(relids));

            /* Add clause to rel's restriction list */
            rel->baserestrictinfo = lappend(rel->baserestrictinfo,
                                            restrictinfo);
            break;
        case BMS_MULTIPLE:

            /*
             * The clause is a join clause, since there is more than one rel
             * in its relid set.
             */

            /*
             * Check for hashjoinable operators.  (We don't bother setting
             * the hashjoin info if we're not going to need it.)
             */
            if (enable_hashjoin)
                check_hashjoinable(restrictinfo);

            /*
             * Add clause to the join lists of all the relevant relations.
             */
            add_join_clause_to_rels(root, restrictinfo, relids);
            break;
        default:

            /*
             * clause references no rels, and therefore we have no place to
             * attach it.  Shouldn't get here if callers are working properly.
             */
            elog(ERROR, "cannot cope with variable-free clause");
            break;
    }
} }
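The three-way dispatch in distribute_restrictinfo_to_rels() depends only on how many relations the clause requires. As a rough illustration, assuming relid sets are plain 32-bit masks and using hypothetical names (ToyMembership, ToyDestination, toy_route_clause) that do not exist in PostgreSQL, the routing decision might be sketched as:

```c
#include <stdint.h>

/* Hypothetical analogue of BMS_Membership for a 32-bit relid mask. */
typedef enum ToyMembership
{
    TOY_EMPTY_SET,
    TOY_SINGLETON,
    TOY_MULTIPLE
} ToyMembership;

/* Where a clause would end up; names are illustrative only. */
typedef enum ToyDestination
{
    TOY_DEST_ERROR,         /* variable-free clause: no place to attach it */
    TOY_DEST_BASERESTRICT,  /* restriction clause of a single base rel */
    TOY_DEST_JOININFO       /* join clause, added to each rel's join list */
} ToyDestination;

ToyMembership
toy_membership(uint32_t relids)
{
    if (relids == 0)
        return TOY_EMPTY_SET;
    /* clearing the lowest set bit leaves zero iff exactly one bit was set */
    if ((relids & (relids - 1)) == 0)
        return TOY_SINGLETON;
    return TOY_MULTIPLE;
}

ToyDestination
toy_route_clause(uint32_t required_relids)
{
    switch (toy_membership(required_relids))
    {
        case TOY_SINGLETON:
            return TOY_DEST_BASERESTRICT;
        case TOY_MULTIPLE:
            return TOY_DEST_JOININFO;
        default:
            return TOY_DEST_ERROR;
    }
}
```

The sketch leaves out the hashjoinability check that the real function performs for join clauses when enable_hashjoin is set.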
/* /*
* process_implied_equality * process_implied_equality
* Check to see whether we already have a restrictinfo item that says * Create a restrictinfo item that says "item1 op item2", and push it
* item1 = item2, and create one if not; or if delete_it is true, * into the appropriate lists. (In practice opno is always a btree
* remove any such restrictinfo item. * equality operator.)
* *
* This processing is a consequence of transitivity of mergejoin equality: * "qualscope" is the nominal syntactic level to impute to the restrictinfo.
* if we have mergejoinable clauses A = B and B = C, we can deduce A = C * This must contain at least all the rels used in the expressions, but it
* (where = is an appropriate mergejoinable operator). See path/pathkeys.c * is used only to set the qual application level when both exprs are
* for more details. * variable-free. Otherwise the qual is applied at the lowest join level
* that provides all its variables.
*
* "both_const" indicates whether both items are known pseudo-constant;
* in this case it is worth applying eval_const_expressions() in case we
* can produce constant TRUE or constant FALSE. (Otherwise it's not,
* because the expressions went through eval_const_expressions already.)
*
* This is currently used only when an EquivalenceClass is found to
* contain pseudoconstants. See path/pathkeys.c for more details.
*/ */
void void
process_implied_equality(PlannerInfo *root, process_implied_equality(PlannerInfo *root,
Node *item1, Node *item2, Oid opno,
Oid sortop1, Oid sortop2, Expr *item1,
Relids item1_relids, Relids item2_relids, Expr *item2,
bool delete_it) Relids qualscope,
bool below_outer_join,
bool both_const)
{ {
Relids relids;
BMS_Membership membership;
RelOptInfo *rel1;
List *restrictlist;
ListCell *itm;
Oid ltype,
rtype;
Operator eq_operator;
Form_pg_operator pgopform;
Expr *clause; Expr *clause;
/* Get set of relids referenced in the two expressions */
relids = bms_union(item1_relids, item2_relids);
membership = bms_membership(relids);
/* /*
* generate_implied_equalities() shouldn't call me on two constants. * Build the new clause. Copy to ensure it shares no substructure with
* original (this is necessary in case there are subselects in there...)
*/ */
Assert(membership != BMS_EMPTY_SET); clause = make_opclause(opno,
/*
* If the exprs involve a single rel, we need to look at that rel's
* baserestrictinfo list. If multiple rels, we can scan the joininfo list
* of any of 'em.
*/
if (membership == BMS_SINGLETON)
{
rel1 = find_base_rel(root, bms_singleton_member(relids));
restrictlist = rel1->baserestrictinfo;
}
else
{
Relids other_rels;
int first_rel;
/* Copy relids, find and remove one member */
other_rels = bms_copy(relids);
first_rel = bms_first_member(other_rels);
bms_free(other_rels);
rel1 = find_base_rel(root, first_rel);
restrictlist = rel1->joininfo;
}
/*
* Scan to see if equality is already known. If so, we're done in the add
* case, and done after removing it in the delete case.
*/
foreach(itm, restrictlist)
{
RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(itm);
Node *left,
*right;
if (restrictinfo->mergejoinoperator == InvalidOid)
continue; /* ignore non-mergejoinable clauses */
/* We now know the restrictinfo clause is a binary opclause */
left = get_leftop(restrictinfo->clause);
right = get_rightop(restrictinfo->clause);
if ((equal(item1, left) && equal(item2, right)) ||
(equal(item2, left) && equal(item1, right)))
{
/* found a matching clause */
if (delete_it)
{
if (membership == BMS_SINGLETON)
{
/* delete it from local restrictinfo list */
rel1->baserestrictinfo = list_delete_ptr(rel1->baserestrictinfo,
restrictinfo);
}
else
{
/* let joininfo.c do it */
remove_join_clause_from_rels(root, restrictinfo, relids);
}
}
return; /* done */
}
}
/* Didn't find it. Done if deletion requested */
if (delete_it)
return;
/*
* This equality is new information, so construct a clause representing it
* to add to the query data structures.
*/
ltype = exprType(item1);
rtype = exprType(item2);
eq_operator = compatible_oper(NULL, list_make1(makeString("=")),
ltype, rtype,
true, -1);
if (!HeapTupleIsValid(eq_operator))
{
/*
* Would it be safe to just not add the equality to the query if we
* have no suitable equality operator for the combination of
* datatypes? NO, because sortkey selection may screw up anyway.
*/
ereport(ERROR,
(errcode(ERRCODE_UNDEFINED_FUNCTION),
errmsg("could not identify an equality operator for types %s and %s",
format_type_be(ltype), format_type_be(rtype))));
}
pgopform = (Form_pg_operator) GETSTRUCT(eq_operator);
/*
* Let's just make sure this appears to be a compatible operator.
*
* XXX needs work
*/
if (pgopform->oprresult != BOOLOID)
ereport(ERROR,
(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
errmsg("equality operator for types %s and %s should be merge-joinable, but isn't",
format_type_be(ltype), format_type_be(rtype))));
/*
* Now we can build the new clause. Copy to ensure it shares no
* substructure with original (this is necessary in case there are
* subselects in there...)
*/
clause = make_opclause(oprid(eq_operator), /* opno */
BOOLOID, /* opresulttype */ BOOLOID, /* opresulttype */
false, /* opretset */ false, /* opretset */
(Expr *) copyObject(item1), (Expr *) copyObject(item1),
(Expr *) copyObject(item2)); (Expr *) copyObject(item2));
ReleaseSysCache(eq_operator); /* If both constant, try to reduce to a boolean constant. */
if (both_const)
{
clause = (Expr *) eval_const_expressions((Node *) clause);
/* If we produced const TRUE, just drop the clause */
if (clause && IsA(clause, Const))
{
Const *cclause = (Const *) clause;
Assert(cclause->consttype == BOOLOID);
if (!cclause->constisnull && DatumGetBool(cclause->constvalue))
return;
}
}
/* Make a copy of qualscope to avoid problems if source EC changes */
qualscope = bms_copy(qualscope);
/* /*
* Push the new clause into all the appropriate restrictinfo lists. * Push the new clause into all the appropriate restrictinfo lists.
@ -1159,119 +1099,53 @@ process_implied_equality(PlannerInfo *root,
* taken for an original JOIN/ON clause. * taken for an original JOIN/ON clause.
*/ */
distribute_qual_to_rels(root, (Node *) clause, distribute_qual_to_rels(root, (Node *) clause,
true, true, false, relids, NULL, NULL); true, true, below_outer_join,
qualscope, NULL, NULL);
} }
/* /*
* qual_is_redundant * build_implied_join_equality --- build a RestrictInfo for a derived equality
* Detect whether an implied-equality qual that turns out to be a
* restriction clause for a single base relation is redundant with
* already-known restriction clauses for that rel. This occurs with,
* for example,
* SELECT * FROM tab WHERE f1 = f2 AND f2 = f3;
* We need to suppress the redundant condition to avoid computing
* too-small selectivity, not to mention wasting time at execution.
* *
* Note: quals of the form "var = const" are never considered redundant, * This overlaps the functionality of process_implied_equality(), but we
* only those of the form "var = var". This is needed because when we * must return the RestrictInfo, not push it into the joininfo tree.
* have constants in an implied-equality set, we use a different strategy
* that suppresses all "var = var" deductions. We must therefore keep
* all the "var = const" quals.
*/ */
static bool RestrictInfo *
qual_is_redundant(PlannerInfo *root, build_implied_join_equality(Oid opno,
RestrictInfo *restrictinfo, Expr *item1,
List *restrictlist) Expr *item2,
Relids qualscope)
{ {
Node *newleft; RestrictInfo *restrictinfo;
Node *newright; Expr *clause;
List *oldquals;
ListCell *olditem;
List *equalexprs;
bool someadded;
/* Never redundant unless vars appear on both sides */
if (bms_is_empty(restrictinfo->left_relids) ||
bms_is_empty(restrictinfo->right_relids))
return false;
newleft = get_leftop(restrictinfo->clause);
newright = get_rightop(restrictinfo->clause);
/* /*
* Set cached pathkeys. NB: it is okay to do this now because this * Build the new clause. Copy to ensure it shares no substructure with
* routine is only invoked while we are generating implied equalities. * original (this is necessary in case there are subselects in there...)
* Therefore, the equi_key_list is already complete and so we can
* correctly determine canonical pathkeys.
*/ */
cache_mergeclause_pathkeys(root, restrictinfo); clause = make_opclause(opno,
/* If different, say "not redundant" (should never happen) */ BOOLOID, /* opresulttype */
if (restrictinfo->left_pathkey != restrictinfo->right_pathkey) false, /* opretset */
return false; (Expr *) copyObject(item1),
(Expr *) copyObject(item2));
/* Make a copy of qualscope to avoid problems if source EC changes */
qualscope = bms_copy(qualscope);
/* /*
* Scan existing quals to find those referencing same pathkeys. Usually * Build the RestrictInfo node itself.
* there will be few, if any, so build a list of just the interesting
* ones.
*/ */
oldquals = NIL; restrictinfo = make_restrictinfo(clause,
foreach(olditem, restrictlist) true, /* is_pushed_down */
{ false, /* outerjoin_delayed */
RestrictInfo *oldrinfo = (RestrictInfo *) lfirst(olditem); false, /* pseudoconstant */
qualscope);
if (oldrinfo->mergejoinoperator != InvalidOid) /* Set mergejoinability info always, and hashjoinability if enabled */
{ check_mergejoinable(restrictinfo);
cache_mergeclause_pathkeys(root, oldrinfo); if (enable_hashjoin)
if (restrictinfo->left_pathkey == oldrinfo->left_pathkey && check_hashjoinable(restrictinfo);
restrictinfo->right_pathkey == oldrinfo->right_pathkey)
oldquals = lcons(oldrinfo, oldquals);
}
}
if (oldquals == NIL)
return false;
/* return restrictinfo;
* Now, we want to develop a list of exprs that are known equal to the
* left side of the new qual. We traverse the old-quals list repeatedly
* to transitively expand the exprs list. If at any point we find we can
* reach the right-side expr of the new qual, we are done. We give up
* when we can't expand the equalexprs list any more.
*/
equalexprs = list_make1(newleft);
do
{
someadded = false;
/* cannot use foreach here because of possible list_delete */
olditem = list_head(oldquals);
while (olditem)
{
RestrictInfo *oldrinfo = (RestrictInfo *) lfirst(olditem);
Node *oldleft = get_leftop(oldrinfo->clause);
Node *oldright = get_rightop(oldrinfo->clause);
Node *newguy = NULL;
/* must advance olditem before list_delete possibly pfree's it */
olditem = lnext(olditem);
if (list_member(equalexprs, oldleft))
newguy = oldright;
else if (list_member(equalexprs, oldright))
newguy = oldleft;
else
continue;
if (equal(newguy, newright))
return true; /* we proved new clause is redundant */
equalexprs = lcons(newguy, equalexprs);
someadded = true;
/*
* Remove this qual from list, since we don't need it anymore.
*/
oldquals = list_delete_ptr(oldquals, oldrinfo);
}
} while (someadded);
return false; /* it's not redundant */
} }
@ -1294,10 +1168,7 @@ static void
check_mergejoinable(RestrictInfo *restrictinfo) check_mergejoinable(RestrictInfo *restrictinfo)
{ {
Expr *clause = restrictinfo->clause; Expr *clause = restrictinfo->clause;
Oid opno, Oid opno;
leftOp,
rightOp;
Oid opfamily;
if (restrictinfo->pseudoconstant) if (restrictinfo->pseudoconstant)
return; return;
@ -1310,16 +1181,13 @@ check_mergejoinable(RestrictInfo *restrictinfo)
if (op_mergejoinable(opno) && if (op_mergejoinable(opno) &&
!contain_volatile_functions((Node *) clause)) !contain_volatile_functions((Node *) clause))
{ restrictinfo->mergeopfamilies = get_mergejoin_opfamilies(opno);
/* XXX for the moment, continue to force use of particular sortops */
if (get_op_mergejoin_info(opno, &leftOp, &rightOp, &opfamily)) /*
{ * Note: op_mergejoinable is just a hint; if we fail to find the
restrictinfo->mergejoinoperator = opno; * operator in any btree opfamilies, mergeopfamilies remains NIL
restrictinfo->left_sortop = leftOp; * and so the clause is not treated as mergejoinable.
restrictinfo->right_sortop = rightOp; */
restrictinfo->mergeopfamily = opfamily;
}
}
} }
/* /*


@ -14,7 +14,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.98 2007/01/05 22:19:32 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.99 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -110,14 +110,14 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
* for "simple" rels. * for "simple" rels.
* *
* NOTE: in_info_list and append_rel_list were set up by subquery_planner, * NOTE: in_info_list and append_rel_list were set up by subquery_planner,
* do not touch here * do not touch here; eq_classes may contain data already, too.
*/ */
root->simple_rel_array_size = list_length(parse->rtable) + 1; root->simple_rel_array_size = list_length(parse->rtable) + 1;
root->simple_rel_array = (RelOptInfo **) root->simple_rel_array = (RelOptInfo **)
palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *)); palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *));
root->join_rel_list = NIL; root->join_rel_list = NIL;
root->join_rel_hash = NULL; root->join_rel_hash = NULL;
root->equi_key_list = NIL; root->canon_pathkeys = NIL;
root->left_join_clauses = NIL; root->left_join_clauses = NIL;
root->right_join_clauses = NIL; root->right_join_clauses = NIL;
root->full_join_clauses = NIL; root->full_join_clauses = NIL;
@ -165,8 +165,8 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
* Examine the targetlist and qualifications, adding entries to baserel * Examine the targetlist and qualifications, adding entries to baserel
* targetlists for all referenced Vars. Restrict and join clauses are * targetlists for all referenced Vars. Restrict and join clauses are
* added to appropriate lists belonging to the mentioned relations. We * added to appropriate lists belonging to the mentioned relations. We
* also build lists of equijoined keys for pathkey construction, and form * also build EquivalenceClasses for provably equivalent expressions,
* a target joinlist for make_one_rel() to work from. * and form a target joinlist for make_one_rel() to work from.
* *
* Note: all subplan nodes will have "flat" (var-only) tlists. This * Note: all subplan nodes will have "flat" (var-only) tlists. This
* implies that all expression evaluations are done at the root of the * implies that all expression evaluations are done at the root of the
@ -179,16 +179,23 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
joinlist = deconstruct_jointree(root); joinlist = deconstruct_jointree(root);
/* /*
* Use the completed lists of equijoined keys to deduce any implied but * Reconsider any postponed outer-join quals now that we have built up
* unstated equalities (for example, A=B and B=C imply A=C). * equivalence classes. (This could result in further additions or
* mergings of classes.)
*/ */
generate_implied_equalities(root); reconsider_outer_join_clauses(root);
/* /*
* We should now have all the pathkey equivalence sets built, so it's now * If we formed any equivalence classes, generate additional restriction
* possible to convert the requested query_pathkeys to canonical form. * clauses as appropriate. (Implied join clauses are formed on-the-fly
* Also canonicalize the groupClause and sortClause pathkeys for use * later.)
* later. */
generate_base_implied_equalities(root);
/*
* We have completed merging equivalence sets, so it's now possible to
* convert the requested query_pathkeys to canonical form. Also
* canonicalize the groupClause and sortClause pathkeys for use later.
*/ */
root->query_pathkeys = canonicalize_pathkeys(root, root->query_pathkeys); root->query_pathkeys = canonicalize_pathkeys(root, root->query_pathkeys);
root->group_pathkeys = canonicalize_pathkeys(root, root->group_pathkeys); root->group_pathkeys = canonicalize_pathkeys(root, root->group_pathkeys);
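The idea behind merging equivalence sets (A=B and B=C imply A=C) can be shown with a small stand-alone sketch. This is an assumption-laden illustration, not the planner's data structure: it swaps in a union-find over small integer expression IDs, and ToyEqClasses and the toy_ec_* functions are hypothetical names. The commit itself describes an explicit EquivalenceClass representation; see path/pathkeys.c and the rest of the diff for the real thing.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_EXPRS 16

/* Each expression is identified by an index in [0, TOY_MAX_EXPRS). */
typedef struct ToyEqClasses
{
    int         parent[TOY_MAX_EXPRS];
} ToyEqClasses;

/* Start with every expression in its own class. */
void
toy_ec_init(ToyEqClasses *ec)
{
    for (int i = 0; i < TOY_MAX_EXPRS; i++)
        ec->parent[i] = i;
}

/* Return the representative of x's class, shortening the path as we go. */
int
toy_ec_find(ToyEqClasses *ec, int x)
{
    assert(x >= 0 && x < TOY_MAX_EXPRS);
    while (ec->parent[x] != x)
    {
        ec->parent[x] = ec->parent[ec->parent[x]];  /* path halving */
        x = ec->parent[x];
    }
    return x;
}

/* Record "a = b", merging the two classes if they were distinct. */
void
toy_ec_add_equality(ToyEqClasses *ec, int a, int b)
{
    int         ra = toy_ec_find(ec, a);
    int         rb = toy_ec_find(ec, b);

    if (ra != rb)
        ec->parent[rb] = ra;
}

/* Is "a = b" implied by the equalities recorded so far? */
bool
toy_ec_implied_equal(ToyEqClasses *ec, int a, int b)
{
    return toy_ec_find(ec, a) == toy_ec_find(ec, b);
}
```

With expressions A, B and C numbered 0, 1 and 2, recording A=B and then B=C puts all three in one class, so toy_ec_implied_equal() reports A=C without that equality ever being stored as a separate clause.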


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.211 2007/01/10 18:06:03 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.212 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -206,6 +206,8 @@ subquery_planner(Query *parse, double tuple_fraction,
/* Create a PlannerInfo data structure for this subquery */ /* Create a PlannerInfo data structure for this subquery */
root = makeNode(PlannerInfo); root = makeNode(PlannerInfo);
root->parse = parse; root->parse = parse;
root->planner_cxt = CurrentMemoryContext;
root->eq_classes = NIL;
root->in_info_list = NIL; root->in_info_list = NIL;
root->append_rel_list = NIL; root->append_rel_list = NIL;
@ -715,9 +717,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
* operation's result. We have to do this before overwriting the sort * operation's result. We have to do this before overwriting the sort
* key information... * key information...
*/ */
current_pathkeys = make_pathkeys_for_sortclauses(set_sortclauses, current_pathkeys = make_pathkeys_for_sortclauses(root,
result_plan->targetlist); set_sortclauses,
current_pathkeys = canonicalize_pathkeys(root, current_pathkeys); result_plan->targetlist,
true);
/* /*
* We should not need to call preprocess_targetlist, since we must be * We should not need to call preprocess_targetlist, since we must be
@ -742,9 +745,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
/* /*
* Calculate pathkeys that represent result ordering requirements * Calculate pathkeys that represent result ordering requirements
*/ */
sort_pathkeys = make_pathkeys_for_sortclauses(parse->sortClause, sort_pathkeys = make_pathkeys_for_sortclauses(root,
tlist); parse->sortClause,
sort_pathkeys = canonicalize_pathkeys(root, sort_pathkeys); tlist,
true);
} }
else else
{ {
@ -778,12 +782,18 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
/* /*
* Calculate pathkeys that represent grouping/ordering requirements. * Calculate pathkeys that represent grouping/ordering requirements.
* Stash them in PlannerInfo so that query_planner can canonicalize * Stash them in PlannerInfo so that query_planner can canonicalize
* them. * them after EquivalenceClasses have been formed.
*/ */
root->group_pathkeys = root->group_pathkeys =
make_pathkeys_for_sortclauses(parse->groupClause, tlist); make_pathkeys_for_sortclauses(root,
parse->groupClause,
tlist,
false);
root->sort_pathkeys = root->sort_pathkeys =
make_pathkeys_for_sortclauses(parse->sortClause, tlist); make_pathkeys_for_sortclauses(root,
parse->sortClause,
tlist,
false);
/* /*
* Will need actual number of aggregates for estimating costs. * Will need actual number of aggregates for estimating costs.
@ -1069,10 +1079,9 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
{ {
if (!pathkeys_contained_in(sort_pathkeys, current_pathkeys)) if (!pathkeys_contained_in(sort_pathkeys, current_pathkeys))
{ {
result_plan = (Plan *) result_plan = (Plan *) make_sort_from_pathkeys(root,
make_sort_from_sortclauses(root, result_plan,
parse->sortClause, sort_pathkeys);
result_plan);
current_pathkeys = sort_pathkeys; current_pathkeys = sort_pathkeys;
} }
} }


@ -15,7 +15,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/prep/prepjointree.c,v 1.45 2007/01/05 22:19:32 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/prep/prepjointree.c,v 1.46 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -292,6 +292,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
*/ */
subroot = makeNode(PlannerInfo); subroot = makeNode(PlannerInfo);
subroot->parse = subquery; subroot->parse = subquery;
subroot->planner_cxt = CurrentMemoryContext;
subroot->in_info_list = NIL; subroot->in_info_list = NIL;
subroot->append_rel_list = NIL; subroot->append_rel_list = NIL;


@ -22,7 +22,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/prep/prepunion.c,v 1.135 2007/01/05 22:19:32 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/prep/prepunion.c,v 1.136 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -1195,10 +1195,8 @@ adjust_appendrel_attrs_mutator(Node *node, AppendRelInfo *context)
*/ */
newinfo->eval_cost.startup = -1; newinfo->eval_cost.startup = -1;
newinfo->this_selec = -1; newinfo->this_selec = -1;
newinfo->left_pathkey = NIL; newinfo->left_ec = NULL;
newinfo->right_pathkey = NIL; newinfo->right_ec = NULL;
newinfo->left_mergescansel = -1;
newinfo->right_mergescansel = -1;
newinfo->left_bucketsize = -1; newinfo->left_bucketsize = -1;
newinfo->right_bucketsize = -1; newinfo->right_bucketsize = -1;


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.46 2007/01/05 22:19:32 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.47 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -16,6 +16,7 @@
#include "optimizer/joininfo.h" #include "optimizer/joininfo.h"
#include "optimizer/pathnode.h" #include "optimizer/pathnode.h"
#include "optimizer/paths.h"
/* /*
@ -54,6 +55,13 @@ have_relevant_joinclause(PlannerInfo *root,
} }
} }
/*
* We also need to check the EquivalenceClass data structure, which
* might contain relationships not emitted into the joininfo lists.
*/
if (!result && rel1->has_eclass_joins && rel2->has_eclass_joins)
result = have_relevant_eclass_joinclause(root, rel1, rel2);
/* /*
* It's possible that the rels correspond to the left and right sides * It's possible that the rels correspond to the left and right sides
* of a degenerate outer join, that is, one with no joinclause mentioning * of a degenerate outer join, that is, one with no joinclause mentioning
@ -124,37 +132,3 @@ add_join_clause_to_rels(PlannerInfo *root,
} }
bms_free(tmprelids); bms_free(tmprelids);
} }
/*
* remove_join_clause_from_rels
* Delete 'restrictinfo' from all the joininfo lists it is in
*
* This reverses the effect of add_join_clause_to_rels. It's used when we
* discover that a join clause is redundant.
*
* 'restrictinfo' describes the join clause
* 'join_relids' is the list of relations participating in the join clause
* (there must be more than one)
*/
void
remove_join_clause_from_rels(PlannerInfo *root,
RestrictInfo *restrictinfo,
Relids join_relids)
{
Relids tmprelids;
int cur_relid;
tmprelids = bms_copy(join_relids);
while ((cur_relid = bms_first_member(tmprelids)) >= 0)
{
RelOptInfo *rel = find_base_rel(root, cur_relid);
/*
* Remove the restrictinfo from the list. Pointer comparison is
* sufficient.
*/
Assert(list_member_ptr(rel->joininfo, restrictinfo));
rel->joininfo = list_delete_ptr(rel->joininfo, restrictinfo);
}
bms_free(tmprelids);
}


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/pathnode.c,v 1.136 2007/01/10 18:06:04 tgl Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/util/pathnode.c,v 1.137 2007/01/20 20:45:39 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -26,7 +26,6 @@
#include "parser/parse_expr.h" #include "parser/parse_expr.h"
#include "parser/parse_oper.h" #include "parser/parse_oper.h"
#include "parser/parsetree.h" #include "parser/parsetree.h"
#include "utils/memutils.h"
#include "utils/selfuncs.h" #include "utils/selfuncs.h"
#include "utils/lsyscache.h" #include "utils/lsyscache.h"
#include "utils/syscache.h" #include "utils/syscache.h"
@ -747,11 +746,11 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath)
return (UniquePath *) rel->cheapest_unique_path; return (UniquePath *) rel->cheapest_unique_path;
/* /*
* We must ensure path struct is allocated in same context as parent rel; * We must ensure path struct is allocated in main planning context;
* otherwise GEQO memory management causes trouble. (Compare * otherwise GEQO memory management causes trouble. (Compare
* best_inner_indexscan().) * best_inner_indexscan().)
*/ */
oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel)); oldcontext = MemoryContextSwitchTo(root->planner_cxt);
pathnode = makeNode(UniquePath); pathnode = makeNode(UniquePath);
@ -1198,11 +1197,6 @@ create_nestloop_path(PlannerInfo *root,
* 'pathkeys' are the path keys of the new join path * 'pathkeys' are the path keys of the new join path
* 'mergeclauses' are the RestrictInfo nodes to use as merge clauses * 'mergeclauses' are the RestrictInfo nodes to use as merge clauses
* (this should be a subset of the restrict_clauses list) * (this should be a subset of the restrict_clauses list)
* 'mergefamilies' are the btree opfamily OIDs identifying the merge
* ordering for each merge clause
* 'mergestrategies' are the btree operator strategies identifying the merge
* ordering for each merge clause
* 'mergenullsfirst' are the nulls first/last flags for each merge clause
* 'outersortkeys' are the sort varkeys for the outer relation * 'outersortkeys' are the sort varkeys for the outer relation
* 'innersortkeys' are the sort varkeys for the inner relation * 'innersortkeys' are the sort varkeys for the inner relation
*/ */
@ -1215,9 +1209,6 @@ create_mergejoin_path(PlannerInfo *root,
List *restrict_clauses, List *restrict_clauses,
List *pathkeys, List *pathkeys,
List *mergeclauses, List *mergeclauses,
Oid *mergefamilies,
int *mergestrategies,
bool *mergenullsfirst,
List *outersortkeys, List *outersortkeys,
List *innersortkeys) List *innersortkeys)
{ {
@ -1258,9 +1249,6 @@ create_mergejoin_path(PlannerInfo *root,
pathnode->jpath.joinrestrictinfo = restrict_clauses; pathnode->jpath.joinrestrictinfo = restrict_clauses;
pathnode->jpath.path.pathkeys = pathkeys; pathnode->jpath.path.pathkeys = pathkeys;
pathnode->path_mergeclauses = mergeclauses; pathnode->path_mergeclauses = mergeclauses;
pathnode->path_mergeFamilies = mergefamilies;
pathnode->path_mergeStrategies = mergestrategies;
pathnode->path_mergeNullsFirst = mergenullsfirst;
pathnode->outersortkeys = outersortkeys; pathnode->outersortkeys = outersortkeys;
pathnode->innersortkeys = innersortkeys; pathnode->innersortkeys = innersortkeys;


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.84 2007/01/05 22:19:33 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.85 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -16,6 +16,7 @@
#include "optimizer/cost.h" #include "optimizer/cost.h"
#include "optimizer/pathnode.h" #include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "optimizer/plancat.h" #include "optimizer/plancat.h"
#include "optimizer/restrictinfo.h" #include "optimizer/restrictinfo.h"
#include "parser/parsetree.h" #include "parser/parsetree.h"
@ -33,15 +34,16 @@ static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
static List *build_joinrel_restrictlist(PlannerInfo *root, static List *build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *outer_rel,
RelOptInfo *inner_rel, RelOptInfo *inner_rel);
JoinType jointype);
static void build_joinrel_joinlist(RelOptInfo *joinrel, static void build_joinrel_joinlist(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *outer_rel,
RelOptInfo *inner_rel); RelOptInfo *inner_rel);
static List *subbuild_joinrel_restrictlist(RelOptInfo *joinrel, static List *subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
List *joininfo_list); List *joininfo_list,
static void subbuild_joinrel_joinlist(RelOptInfo *joinrel, List *new_restrictlist);
List *joininfo_list); static List *subbuild_joinrel_joinlist(RelOptInfo *joinrel,
List *joininfo_list,
List *new_joininfo);
/* /*
@ -84,6 +86,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
rel->baserestrictcost.startup = 0; rel->baserestrictcost.startup = 0;
rel->baserestrictcost.per_tuple = 0; rel->baserestrictcost.per_tuple = 0;
rel->joininfo = NIL; rel->joininfo = NIL;
rel->has_eclass_joins = false;
rel->index_outer_relids = NULL; rel->index_outer_relids = NULL;
rel->index_inner_paths = NIL; rel->index_inner_paths = NIL;
@ -303,8 +306,7 @@ build_join_rel(PlannerInfo *root,
*restrictlist_ptr = build_joinrel_restrictlist(root, *restrictlist_ptr = build_joinrel_restrictlist(root,
joinrel, joinrel,
outer_rel, outer_rel,
inner_rel, inner_rel);
jointype);
return joinrel; return joinrel;
} }
@ -335,6 +337,7 @@ build_join_rel(PlannerInfo *root,
joinrel->baserestrictcost.startup = 0; joinrel->baserestrictcost.startup = 0;
joinrel->baserestrictcost.per_tuple = 0; joinrel->baserestrictcost.per_tuple = 0;
joinrel->joininfo = NIL; joinrel->joininfo = NIL;
joinrel->has_eclass_joins = false;
joinrel->index_outer_relids = NULL; joinrel->index_outer_relids = NULL;
joinrel->index_inner_paths = NIL; joinrel->index_inner_paths = NIL;
@ -354,15 +357,18 @@ build_join_rel(PlannerInfo *root,
* caller might or might not need the restrictlist, but I need it anyway * caller might or might not need the restrictlist, but I need it anyway
* for set_joinrel_size_estimates().) * for set_joinrel_size_estimates().)
*/ */
restrictlist = build_joinrel_restrictlist(root, restrictlist = build_joinrel_restrictlist(root, joinrel,
joinrel, outer_rel, inner_rel);
outer_rel,
inner_rel,
jointype);
if (restrictlist_ptr) if (restrictlist_ptr)
*restrictlist_ptr = restrictlist; *restrictlist_ptr = restrictlist;
build_joinrel_joinlist(joinrel, outer_rel, inner_rel); build_joinrel_joinlist(joinrel, outer_rel, inner_rel);
/*
* This is also the right place to check whether the joinrel has any
* pending EquivalenceClass joins.
*/
joinrel->has_eclass_joins = has_relevant_eclass_joinclause(root, joinrel);
/* /*
* Set estimates of the joinrel's size. * Set estimates of the joinrel's size.
*/ */
@ -468,15 +474,15 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
* join paths made from this pair of sub-relations. (It will not need to * join paths made from this pair of sub-relations. (It will not need to
* be considered further up the join tree.) * be considered further up the join tree.)
* *
 * When building a restriction list, we eliminate redundant clauses. * In many cases we will find the same RestrictInfos in both input
* We don't try to do that for join clause lists, since the join clauses * relations' joinlists, so be careful to eliminate duplicates.
* aren't really doing anything, just waiting to become part of higher * Pointer equality should be a sufficient test for dups, since all
* levels' restriction lists. * the various joinlist entries ultimately refer to RestrictInfos
* pushed into them by distribute_restrictinfo_to_rels().
* *
* 'joinrel' is a join relation node * 'joinrel' is a join relation node
* 'outer_rel' and 'inner_rel' are a pair of relations that can be joined * 'outer_rel' and 'inner_rel' are a pair of relations that can be joined
* to form joinrel. * to form joinrel.
* 'jointype' is the type of join used.
* *
* build_joinrel_restrictlist() returns a list of relevant restrictinfos, * build_joinrel_restrictlist() returns a list of relevant restrictinfos,
* whereas build_joinrel_joinlist() stores its results in the joinrel's * whereas build_joinrel_joinlist() stores its results in the joinrel's
@ -491,33 +497,27 @@ static List *
build_joinrel_restrictlist(PlannerInfo *root, build_joinrel_restrictlist(PlannerInfo *root,
RelOptInfo *joinrel, RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *outer_rel,
RelOptInfo *inner_rel, RelOptInfo *inner_rel)
JoinType jointype)
{ {
List *result; List *result;
List *rlist;
/* /*
* Collect all the clauses that syntactically belong at this level. * Collect all the clauses that syntactically belong at this level,
* eliminating any duplicates (important since we will see many of the
* same clauses arriving from both input relations).
*/ */
rlist = list_concat(subbuild_joinrel_restrictlist(joinrel, result = subbuild_joinrel_restrictlist(joinrel, outer_rel->joininfo, NIL);
outer_rel->joininfo), result = subbuild_joinrel_restrictlist(joinrel, inner_rel->joininfo, result);
subbuild_joinrel_restrictlist(joinrel,
inner_rel->joininfo));
/* /*
* Eliminate duplicate and redundant clauses. * Add on any clauses derived from EquivalenceClasses. These cannot be
* * redundant with the clauses in the joininfo lists, so don't bother
* We must eliminate duplicates, since we will see many of the same * checking.
* clauses arriving from both input relations. Also, if a clause is a
* mergejoinable clause, it's possible that it is redundant with previous
* clauses (see optimizer/README for discussion). We detect that case and
* omit the redundant clause from the result list.
*/ */
result = remove_redundant_join_clauses(root, rlist, result = list_concat(result,
IS_OUTER_JOIN(jointype)); generate_join_implied_equalities(root,
joinrel,
list_free(rlist); outer_rel,
inner_rel));
return result; return result;
} }
@ -527,15 +527,24 @@ build_joinrel_joinlist(RelOptInfo *joinrel,
RelOptInfo *outer_rel, RelOptInfo *outer_rel,
RelOptInfo *inner_rel) RelOptInfo *inner_rel)
{ {
subbuild_joinrel_joinlist(joinrel, outer_rel->joininfo); List *result;
subbuild_joinrel_joinlist(joinrel, inner_rel->joininfo);
/*
* Collect all the clauses that syntactically belong above this level,
* eliminating any duplicates (important since we will see many of the
* same clauses arriving from both input relations).
*/
result = subbuild_joinrel_joinlist(joinrel, outer_rel->joininfo, NIL);
result = subbuild_joinrel_joinlist(joinrel, inner_rel->joininfo, result);
joinrel->joininfo = result;
} }
static List * static List *
subbuild_joinrel_restrictlist(RelOptInfo *joinrel, subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
List *joininfo_list) List *joininfo_list,
List *new_restrictlist)
{ {
List *restrictlist = NIL;
ListCell *l; ListCell *l;
foreach(l, joininfo_list) foreach(l, joininfo_list)
@ -546,10 +555,12 @@ subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
{ {
/* /*
* This clause becomes a restriction clause for the joinrel, since * This clause becomes a restriction clause for the joinrel, since
* it refers to no outside rels. We don't bother to check for * it refers to no outside rels. Add it to the list, being
* duplicates here --- build_joinrel_restrictlist will do that. * careful to eliminate duplicates. (Since RestrictInfo nodes in
* different joinlists will have been multiply-linked rather than
* copied, pointer equality should be a sufficient test.)
*/ */
restrictlist = lappend(restrictlist, rinfo); new_restrictlist = list_append_unique_ptr(new_restrictlist, rinfo);
} }
else else
{ {
@ -560,12 +571,13 @@ subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
} }
} }
return restrictlist; return new_restrictlist;
} }
static void static List *
subbuild_joinrel_joinlist(RelOptInfo *joinrel, subbuild_joinrel_joinlist(RelOptInfo *joinrel,
List *joininfo_list) List *joininfo_list,
List *new_joininfo)
{ {
ListCell *l; ListCell *l;
@ -585,15 +597,14 @@ subbuild_joinrel_joinlist(RelOptInfo *joinrel,
{ {
/* /*
* This clause is still a join clause at this level, so add it to * This clause is still a join clause at this level, so add it to
* the joininfo list for the joinrel, being careful to eliminate * the new joininfo list, being careful to eliminate
* duplicates. (Since RestrictInfo nodes are normally * duplicates. (Since RestrictInfo nodes in different joinlists
* multiply-linked rather than copied, pointer equality should be * will have been multiply-linked rather than copied, pointer
* a sufficient test. If two equal() nodes should happen to sneak * equality should be a sufficient test.)
* in, no great harm is done --- they'll be detected by
* redundant-clause testing when they reach a restriction list.)
*/ */
joinrel->joininfo = list_append_unique_ptr(joinrel->joininfo, new_joininfo = list_append_unique_ptr(new_joininfo, rinfo);
rinfo);
} }
} }
return new_joininfo;
} }
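
The rewritten subbuild_joinrel_restrictlist() and subbuild_joinrel_joinlist() both thread an accumulator list through two calls and rely on pointer identity to drop duplicates. A minimal sketch of that pattern follows. It assumes a fixed-size array in place of the planner's List type, and the names SketchNode, SketchList, sketch_append_unique_ptr and sketch_merge_unique are hypothetical stand-ins, not PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a RestrictInfo node. */
typedef struct SketchNode
{
    int id;
} SketchNode;

#define SKETCH_MAX_ITEMS 16

/* Hypothetical stand-in for the planner's List: a small fixed array. */
typedef struct SketchList
{
    const SketchNode *items[SKETCH_MAX_ITEMS];
    size_t            len;
} SketchList;

/* Append ptr unless the very same pointer is already in the list. */
void
sketch_append_unique_ptr(SketchList *list, const SketchNode *ptr)
{
    size_t i;

    assert(list != NULL);
    for (i = 0; i < list->len; i++)
    {
        if (list->items[i] == ptr)
            return;             /* same node already linked in, skip it */
    }
    assert(list->len < SKETCH_MAX_ITEMS);
    list->items[list->len++] = ptr;
}

/* Fold one input list into the accumulated result. */
void
sketch_merge_unique(SketchList *result,
                    const SketchNode *const *input, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        sketch_append_unique_ptr(result, input[i]);
}
```

Calling sketch_merge_unique() once for the outer input and once for the inner input mirrors the two-call shape in the diff. Pointer comparison only catches nodes that are shared between lists; two separately allocated nodes with identical contents would both be kept, which is why the comments stress that the RestrictInfos are multiply-linked rather than copied.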


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/optimizer/util/restrictinfo.c,v 1.51 2007/01/05 22:19:33 momjian Exp $ * $PostgreSQL: pgsql/src/backend/optimizer/util/restrictinfo.c,v 1.52 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -33,10 +33,9 @@ static Expr *make_sub_restrictinfos(Expr *clause,
bool outerjoin_delayed, bool outerjoin_delayed,
bool pseudoconstant, bool pseudoconstant,
Relids required_relids); Relids required_relids);
static RestrictInfo *join_clause_is_redundant(PlannerInfo *root, static bool join_clause_is_redundant(PlannerInfo *root,
RestrictInfo *rinfo, RestrictInfo *rinfo,
List *reference_list, List *reference_list);
bool isouterjoin);
/* /*
@ -336,19 +335,17 @@ make_restrictinfo_internal(Expr *clause,
* that happens only if it appears in the right context (top level of a * that happens only if it appears in the right context (top level of a
* joinclause list). * joinclause list).
*/ */
restrictinfo->parent_ec = NULL;
restrictinfo->eval_cost.startup = -1; restrictinfo->eval_cost.startup = -1;
restrictinfo->this_selec = -1; restrictinfo->this_selec = -1;
restrictinfo->mergejoinoperator = InvalidOid; restrictinfo->mergeopfamilies = NIL;
restrictinfo->left_sortop = InvalidOid;
restrictinfo->right_sortop = InvalidOid;
restrictinfo->mergeopfamily = InvalidOid;
restrictinfo->left_pathkey = NIL; restrictinfo->left_ec = NULL;
restrictinfo->right_pathkey = NIL; restrictinfo->right_ec = NULL;
restrictinfo->left_mergescansel = -1; restrictinfo->outer_is_left = false;
restrictinfo->right_mergescansel = -1;
restrictinfo->hashjoinoperator = InvalidOid; restrictinfo->hashjoinoperator = InvalidOid;
@ -529,78 +526,18 @@ extract_actual_join_clauses(List *restrictinfo_list,
} }
} }
/*
* remove_redundant_join_clauses
*
* Given a list of RestrictInfo clauses that are to be applied in a join,
* remove any duplicate or redundant clauses.
*
* We must eliminate duplicates when forming the restrictlist for a joinrel,
* since we will see many of the same clauses arriving from both input
* relations. Also, if a clause is a mergejoinable clause, it's possible that
* it is redundant with previous clauses (see optimizer/README for
* discussion). We detect that case and omit the redundant clause from the
* result list.
*
* The result is a fresh List, but it points to the same member nodes
* as were in the input.
*/
List *
remove_redundant_join_clauses(PlannerInfo *root, List *restrictinfo_list,
bool isouterjoin)
{
List *result = NIL;
ListCell *item;
QualCost cost;
/*
* If there are any redundant clauses, we want to eliminate the ones that
* are more expensive in favor of the ones that are less so. Run
* cost_qual_eval() to ensure the eval_cost fields are set up.
*/
cost_qual_eval(&cost, restrictinfo_list);
/*
* We don't have enough knowledge yet to be able to estimate the number of
* times a clause might be evaluated, so it's hard to weight the startup
* and per-tuple costs appropriately. For now just weight 'em the same.
*/
#define CLAUSECOST(r) ((r)->eval_cost.startup + (r)->eval_cost.per_tuple)
foreach(item, restrictinfo_list)
{
RestrictInfo *rinfo = (RestrictInfo *) lfirst(item);
RestrictInfo *prevrinfo;
/* is it redundant with any prior clause? */
prevrinfo = join_clause_is_redundant(root, rinfo, result, isouterjoin);
if (prevrinfo == NULL)
{
/* no, so add it to result list */
result = lappend(result, rinfo);
}
else if (CLAUSECOST(rinfo) < CLAUSECOST(prevrinfo))
{
/* keep this one, drop the previous one */
result = list_delete_ptr(result, prevrinfo);
result = lappend(result, rinfo);
}
/* else, drop this one */
}
return result;
}
/* /*
* select_nonredundant_join_clauses * select_nonredundant_join_clauses
* *
* Given a list of RestrictInfo clauses that are to be applied in a join, * Given a list of RestrictInfo clauses that are to be applied in a join,
* select the ones that are not redundant with any clause in the * select the ones that are not redundant with any clause in the
* reference_list. * reference_list. This is used only for nestloop-with-inner-indexscan
* joins: any clauses being checked by the index should be removed from
* the qpquals list.
* *
* This is similar to remove_redundant_join_clauses, but we are looking for * "Redundant" means either equal() or derived from the same EquivalenceClass.
* redundancies with a separate list of clauses (i.e., clauses that have * We have to check the latter because indxqual.c may select different derived
* already been applied below the join itself). * clauses than were selected by generate_join_implied_equalities().
* *
* Note that we assume the given restrictinfo_list has already been checked * Note that we assume the given restrictinfo_list has already been checked
* for local redundancies, so we don't check again. * for local redundancies, so we don't check again.
@ -608,8 +545,7 @@ remove_redundant_join_clauses(PlannerInfo *root, List *restrictinfo_list,
List * List *
select_nonredundant_join_clauses(PlannerInfo *root, select_nonredundant_join_clauses(PlannerInfo *root,
List *restrictinfo_list, List *restrictinfo_list,
List *reference_list, List *reference_list)
bool isouterjoin)
{ {
List *result = NIL; List *result = NIL;
ListCell *item; ListCell *item;
@ -619,7 +555,7 @@ select_nonredundant_join_clauses(PlannerInfo *root,
RestrictInfo *rinfo = (RestrictInfo *) lfirst(item); RestrictInfo *rinfo = (RestrictInfo *) lfirst(item);
/* drop it if redundant with any reference clause */ /* drop it if redundant with any reference clause */
if (join_clause_is_redundant(root, rinfo, reference_list, isouterjoin) != NULL) if (join_clause_is_redundant(root, rinfo, reference_list))
continue; continue;
/* otherwise, add it to result list */ /* otherwise, add it to result list */
@ -631,79 +567,28 @@ select_nonredundant_join_clauses(PlannerInfo *root,
/* /*
* join_clause_is_redundant * join_clause_is_redundant
* If rinfo is redundant with any clause in reference_list, * Test whether rinfo is redundant with any clause in reference_list.
* return one such clause; otherwise return NULL.
*
* This is the guts of both remove_redundant_join_clauses and
* select_nonredundant_join_clauses. See the docs above for motivation.
*
* We can detect redundant mergejoinable clauses very cheaply by using their
* left and right pathkeys, which uniquely identify the sets of equijoined
* variables in question. All the members of a pathkey set that are in the
* left relation have already been forced to be equal; likewise for those in
* the right relation. So, we need to have only one clause that checks
* equality between any set member on the left and any member on the right;
* by transitivity, all the rest are then equal.
*
* However, clauses that are of the form "var expr = const expr" cannot be
* eliminated as redundant. This is because when there are const expressions
* in a pathkey set, generate_implied_equalities() suppresses "var = var"
* clauses in favor of "var = const" clauses. We cannot afford to drop any
* of the latter, even though they might seem redundant by the pathkey
* membership test.
*
* Weird special case: if we have two clauses that seem redundant
* except one is pushed down into an outer join and the other isn't,
* then they're not really redundant, because one constrains the
* joined rows after addition of null fill rows, and the other doesn't.
*/ */
static RestrictInfo * static bool
join_clause_is_redundant(PlannerInfo *root, join_clause_is_redundant(PlannerInfo *root,
RestrictInfo *rinfo, RestrictInfo *rinfo,
List *reference_list, List *reference_list)
bool isouterjoin)
{ {
ListCell *refitem; ListCell *refitem;
foreach(refitem, reference_list)
{
RestrictInfo *refrinfo = (RestrictInfo *) lfirst(refitem);
/* always consider exact duplicates redundant */ /* always consider exact duplicates redundant */
foreach(refitem, reference_list)
{
RestrictInfo *refrinfo = (RestrictInfo *) lfirst(refitem);
if (equal(rinfo, refrinfo)) if (equal(rinfo, refrinfo))
return refrinfo; return true;
/* check if derived from same EquivalenceClass */
if (rinfo->parent_ec != NULL &&
rinfo->parent_ec == refrinfo->parent_ec)
return true;
} }
/* check for redundant merge clauses */ return false;
if (rinfo->mergejoinoperator != InvalidOid)
{
/* do the cheap test first: is it a "var = const" clause? */
if (bms_is_empty(rinfo->left_relids) ||
bms_is_empty(rinfo->right_relids))
return NULL; /* var = const, so not redundant */
cache_mergeclause_pathkeys(root, rinfo);
foreach(refitem, reference_list)
{
RestrictInfo *refrinfo = (RestrictInfo *) lfirst(refitem);
if (refrinfo->mergejoinoperator != InvalidOid)
{
cache_mergeclause_pathkeys(root, refrinfo);
if (rinfo->left_pathkey == refrinfo->left_pathkey &&
rinfo->right_pathkey == refrinfo->right_pathkey &&
(rinfo->is_pushed_down == refrinfo->is_pushed_down ||
!isouterjoin))
{
/* Yup, it's redundant */
return refrinfo;
}
}
}
}
/* otherwise, not redundant */
return NULL;
} }
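
The new join_clause_is_redundant() reduces to two checks per reference clause: structural equality, or derivation from the same EquivalenceClass. The sketch below illustrates that logic under simplifying assumptions: SketchEClass, SketchClause and sketch_clause_equal are hypothetical, and the equality test compares two integer fields rather than calling the real equal().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an EquivalenceClass. */
typedef struct SketchEClass
{
    int unused;
} SketchEClass;

/* Hypothetical stand-in for a RestrictInfo. */
typedef struct SketchClause
{
    int                 left_var;
    int                 right_var;
    const SketchEClass *parent_ec;  /* NULL if not derived from a class */
} SketchClause;

/* Stand-in for equal(): compares the clause contents only. */
bool
sketch_clause_equal(const SketchClause *a, const SketchClause *b)
{
    return a->left_var == b->left_var && a->right_var == b->right_var;
}

/* True if c duplicates, or shares a parent class with, any reference. */
bool
sketch_clause_is_redundant(const SketchClause *c,
                           const SketchClause *const *refs, size_t nrefs)
{
    size_t i;

    assert(c != NULL);
    for (i = 0; i < nrefs; i++)
    {
        /* exact duplicates are always redundant */
        if (sketch_clause_equal(c, refs[i]))
            return true;

        /* so are clauses derived from the same class */
        if (c->parent_ec != NULL && c->parent_ec == refs[i]->parent_ec)
            return true;
    }
    return false;
}
```

The NULL guard matters: without it, any two clauses that were not derived from a class would compare as having the same (absent) parent and be treated as redundant.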


@ -8,7 +8,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.75 2007/01/05 22:19:33 momjian Exp $ * $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.76 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -171,6 +171,7 @@ parseCheckAggregates(ParseState *pstate, Query *qry)
{ {
root = makeNode(PlannerInfo); root = makeNode(PlannerInfo);
root->parse = qry; root->parse = qry;
root->planner_cxt = CurrentMemoryContext;
root->hasJoinRTEs = true; root->hasJoinRTEs = true;
groupClauses = (List *) flatten_join_alias_vars(root, groupClauses = (List *) flatten_join_alias_vars(root,


@ -15,7 +15,7 @@
* *
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.219 2007/01/09 02:14:14 tgl Exp $ * $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.220 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -2345,7 +2345,7 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
* expressional index for which we have statistics, then we treat the * expressional index for which we have statistics, then we treat the
* whole expression as though it were just a Var. * whole expression as though it were just a Var.
* 2. If the list contains Vars of different relations that are known equal * 2. If the list contains Vars of different relations that are known equal
* due to equijoin clauses, then drop all but one of the Vars from each * due to equivalence classes, then drop all but one of the Vars from each
* known-equal set, keeping the one with smallest estimated # of values * known-equal set, keeping the one with smallest estimated # of values
* (since the extra values of the others can't appear in joined rows). * (since the extra values of the others can't appear in joined rows).
* Note the reason we only consider Vars of different relations is that * Note the reason we only consider Vars of different relations is that
@ -2365,10 +2365,9 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
* 4. If there are Vars from multiple rels, we repeat step 3 for each such * 4. If there are Vars from multiple rels, we repeat step 3 for each such
* rel, and multiply the results together. * rel, and multiply the results together.
* Note that rels not containing grouped Vars are ignored completely, as are * Note that rels not containing grouped Vars are ignored completely, as are
* join clauses other than the equijoin clauses used in step 2. Such rels * join clauses. Such rels cannot increase the number of groups, and we
* cannot increase the number of groups, and we assume such clauses do not * assume such clauses do not reduce the number either (somewhat bogus,
* reduce the number either (somewhat bogus, but we don't have the info to * but we don't have the info to do better).
* do better).
*/ */
double double
estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows) estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows)


@ -7,7 +7,7 @@
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* IDENTIFICATION * IDENTIFICATION
* $PostgreSQL: pgsql/src/backend/utils/cache/lsyscache.c,v 1.143 2007/01/10 18:06:04 tgl Exp $ * $PostgreSQL: pgsql/src/backend/utils/cache/lsyscache.c,v 1.144 2007/01/20 20:45:40 tgl Exp $
* *
* NOTES * NOTES
* Eventually, the index information should go through here, too. * Eventually, the index information should go through here, too.
@ -138,153 +138,6 @@ get_opfamily_member(Oid opfamily, Oid lefttype, Oid righttype,
return result; return result;
} }
/*
* get_op_mergejoin_info
* Given the OIDs of a (putatively) mergejoinable equality operator
* and a sortop defining the sort ordering of the lefthand input of
* the merge clause, determine whether this sort ordering is actually
* usable for merging. If so, return the required sort ordering op
* for the righthand input, as well as the btree opfamily OID containing
* these operators and the operator strategy number of the two sortops
* (either BTLessStrategyNumber or BTGreaterStrategyNumber).
*
* We can mergejoin if we find the two operators in the same opfamily as
* equality and either less-than or greater-than respectively. If there
* are multiple such opfamilies, assume we can use any one.
*/
#ifdef NOT_YET
/* eventually should look like this */
bool
get_op_mergejoin_info(Oid eq_op, Oid left_sortop,
Oid *right_sortop, Oid *opfamily, int *opstrategy)
{
bool result = false;
Oid lefttype;
Oid righttype;
CatCList *catlist;
int i;
/* Make sure output args are initialized even on failure */
*right_sortop = InvalidOid;
*opfamily = InvalidOid;
*opstrategy = 0;
/* Need the righthand input datatype */
op_input_types(eq_op, &lefttype, &righttype);
/*
* Search through all the pg_amop entries containing the equality operator
*/
catlist = SearchSysCacheList(AMOPOPID, 1,
ObjectIdGetDatum(eq_op),
0, 0, 0);
for (i = 0; i < catlist->n_members; i++)
{
HeapTuple op_tuple = &catlist->members[i]->tuple;
Form_pg_amop op_form = (Form_pg_amop) GETSTRUCT(op_tuple);
Oid opfamily_id;
StrategyNumber op_strategy;
/* must be btree */
if (op_form->amopmethod != BTREE_AM_OID)
continue;
/* must use the operator as equality */
if (op_form->amopstrategy != BTEqualStrategyNumber)
continue;
/* See if sort operator is also in this opfamily with OK semantics */
opfamily_id = op_form->amopfamily;
op_strategy = get_op_opfamily_strategy(left_sortop, opfamily_id);
if (op_strategy == BTLessStrategyNumber ||
op_strategy == BTGreaterStrategyNumber)
{
/* Yes, so find the corresponding righthand sortop */
*right_sortop = get_opfamily_member(opfamily_id,
righttype,
righttype,
op_strategy);
if (OidIsValid(*right_sortop))
{
/* Found a workable mergejoin semantics */
*opfamily = opfamily_id;
*opstrategy = op_strategy;
result = true;
break;
}
}
}
ReleaseSysCacheList(catlist);
return result;
}
#else
/* temp implementation until planner gets smarter: left_sortop is output */
bool
get_op_mergejoin_info(Oid eq_op, Oid *left_sortop,
Oid *right_sortop, Oid *opfamily)
{
bool result = false;
Oid lefttype;
Oid righttype;
CatCList *catlist;
int i;
/* Make sure output args are initialized even on failure */
*left_sortop = InvalidOid;
*right_sortop = InvalidOid;
*opfamily = InvalidOid;
/* Need the input datatypes */
op_input_types(eq_op, &lefttype, &righttype);
/*
* Search through all the pg_amop entries containing the equality operator
*/
catlist = SearchSysCacheList(AMOPOPID, 1,
ObjectIdGetDatum(eq_op),
0, 0, 0);
for (i = 0; i < catlist->n_members; i++)
{
HeapTuple op_tuple = &catlist->members[i]->tuple;
Form_pg_amop op_form = (Form_pg_amop) GETSTRUCT(op_tuple);
Oid opfamily_id;
/* must be btree */
if (op_form->amopmethod != BTREE_AM_OID)
continue;
/* must use the operator as equality */
if (op_form->amopstrategy != BTEqualStrategyNumber)
continue;
opfamily_id = op_form->amopfamily;
/* Find the matching sortops */
*left_sortop = get_opfamily_member(opfamily_id,
lefttype,
lefttype,
BTLessStrategyNumber);
*right_sortop = get_opfamily_member(opfamily_id,
righttype,
righttype,
BTLessStrategyNumber);
if (OidIsValid(*left_sortop) && OidIsValid(*right_sortop))
{
/* Found a workable mergejoin semantics */
*opfamily = opfamily_id;
result = true;
break;
}
}
ReleaseSysCacheList(catlist);
return result;
}
#endif
/* /*
* get_compare_function_for_ordering_op * get_compare_function_for_ordering_op
* Get the OID of the datatype-specific btree comparison function * Get the OID of the datatype-specific btree comparison function
@ -469,6 +322,56 @@ get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type)
return result; return result;
} }
/*
* get_mergejoin_opfamilies
* Given a putatively mergejoinable operator, return a list of the OIDs
* of the btree opfamilies in which it represents equality.
*
* It is possible (though at present unusual) for an operator to be equality
* in more than one opfamily, hence the result is a list. This also lets us
* return NIL if the operator is not found in any opfamilies.
*
* The planner currently uses simple equal() tests to compare the lists
* returned by this function, which makes the list order relevant, though
* strictly speaking it should not be. Because of the way syscache list
* searches are handled, in normal operation the result will be sorted by OID
* so everything works fine. If running with system index usage disabled,
* the result ordering is unspecified and hence the planner might fail to
* recognize optimization opportunities ... but that's hardly a scenario in
* which performance is good anyway, so there's no point in expending code
* or cycles here to guarantee the ordering in that case.
*/
List *
get_mergejoin_opfamilies(Oid opno)
{
List *result = NIL;
CatCList *catlist;
int i;
/*
* Search pg_amop to see if the target operator is registered as the "="
* operator of any btree opfamily.
*/
catlist = SearchSysCacheList(AMOPOPID, 1,
ObjectIdGetDatum(opno),
0, 0, 0);
for (i = 0; i < catlist->n_members; i++)
{
HeapTuple tuple = &catlist->members[i]->tuple;
Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple);
/* must be btree equality */
if (aform->amopmethod == BTREE_AM_OID &&
aform->amopstrategy == BTEqualStrategyNumber)
result = lappend_oid(result, aform->amopfamily);
}
ReleaseSysCacheList(catlist);
return result;
}
/* /*
* get_compatible_hash_operator * get_compatible_hash_operator
* Get the OID of a hash equality operator compatible with the given * Get the OID of a hash equality operator compatible with the given


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.191 2007/01/05 22:19:55 momjian Exp $ * $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.192 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -190,7 +190,9 @@ typedef enum NodeTag
T_ResultPath, T_ResultPath,
T_MaterialPath, T_MaterialPath,
T_UniquePath, T_UniquePath,
T_PathKeyItem, T_EquivalenceClass,
T_EquivalenceMember,
T_PathKey,
T_RestrictInfo, T_RestrictInfo,
T_InnerIndexscanInfo, T_InnerIndexscanInfo,
T_OuterJoinInfo, T_OuterJoinInfo,


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.132 2007/01/10 18:06:04 tgl Exp $ * $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.133 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -69,7 +69,7 @@ typedef struct PlannerInfo
* does not correspond to a base relation, such as a join RTE or an * does not correspond to a base relation, such as a join RTE or an
* unreferenced view RTE; or if the RelOptInfo hasn't been made yet. * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
*/ */
struct RelOptInfo **simple_rel_array; /* All 1-relation RelOptInfos */ struct RelOptInfo **simple_rel_array; /* All 1-rel RelOptInfos */
int simple_rel_array_size; /* allocated size of array */ int simple_rel_array_size; /* allocated size of array */
/* /*
@ -84,18 +84,20 @@ typedef struct PlannerInfo
List *join_rel_list; /* list of join-relation RelOptInfos */ List *join_rel_list; /* list of join-relation RelOptInfos */
struct HTAB *join_rel_hash; /* optional hashtable for join relations */ struct HTAB *join_rel_hash; /* optional hashtable for join relations */
List *equi_key_list; /* list of lists of equijoined PathKeyItems */ List *eq_classes; /* list of active EquivalenceClasses */
List *left_join_clauses; /* list of RestrictInfos for outer List *canon_pathkeys; /* list of "canonical" PathKeys */
* join clauses w/nonnullable var on
* left */
List *right_join_clauses; /* list of RestrictInfos for outer List *left_join_clauses; /* list of RestrictInfos for
* join clauses w/nonnullable var on * mergejoinable outer join clauses
* right */ * w/nonnullable var on left */
List *full_join_clauses; /* list of RestrictInfos for full List *right_join_clauses; /* list of RestrictInfos for
* outer join clauses */ * mergejoinable outer join clauses
* w/nonnullable var on right */
List *full_join_clauses; /* list of RestrictInfos for
* mergejoinable full join clauses */
List *oj_info_list; /* list of OuterJoinInfos */ List *oj_info_list; /* list of OuterJoinInfos */
@ -109,6 +111,8 @@ typedef struct PlannerInfo
List *group_pathkeys; /* groupClause pathkeys, if any */ List *group_pathkeys; /* groupClause pathkeys, if any */
List *sort_pathkeys; /* sortClause pathkeys, if any */ List *sort_pathkeys; /* sortClause pathkeys, if any */
MemoryContext planner_cxt; /* context holding PlannerInfo */
double total_table_pages; /* # of pages in all tables of query */ double total_table_pages; /* # of pages in all tables of query */
double tuple_fraction; /* tuple_fraction passed to query_planner */ double tuple_fraction; /* tuple_fraction passed to query_planner */
@ -209,7 +213,10 @@ typedef struct PlannerInfo
* baserestrictcost - Estimated cost of evaluating the baserestrictinfo * baserestrictcost - Estimated cost of evaluating the baserestrictinfo
* clauses at a single tuple (only used for base rels) * clauses at a single tuple (only used for base rels)
* joininfo - List of RestrictInfo nodes, containing info about each * joininfo - List of RestrictInfo nodes, containing info about each
* join clause in which this relation participates * join clause in which this relation participates (but
* note this excludes clauses that might be derivable from
* EquivalenceClasses)
* has_eclass_joins - flag that EquivalenceClass joins are possible
* index_outer_relids - only used for base rels; set of outer relids * index_outer_relids - only used for base rels; set of outer relids
* that participate in indexable joinclauses for this rel * that participate in indexable joinclauses for this rel
* index_inner_paths - only used for base rels; list of InnerIndexscanInfo * index_inner_paths - only used for base rels; list of InnerIndexscanInfo
@ -278,6 +285,7 @@ typedef struct RelOptInfo
QualCost baserestrictcost; /* cost of evaluating the above */ QualCost baserestrictcost; /* cost of evaluating the above */
List *joininfo; /* RestrictInfo structures for join clauses List *joininfo; /* RestrictInfo structures for join clauses
* involving this rel */ * involving this rel */
bool has_eclass_joins; /* T means joininfo is incomplete */
/* cached info about inner indexscan paths for relation: */ /* cached info about inner indexscan paths for relation: */
Relids index_outer_relids; /* other relids in indexable join Relids index_outer_relids; /* other relids in indexable join
@ -349,31 +357,106 @@ typedef struct IndexOptInfo
/* /*
* PathKeys * EquivalenceClasses
* *
* The sort ordering of a path is represented by a list of sublists of * Whenever we can determine that a mergejoinable equality clause A = B is
* PathKeyItem nodes. An empty list implies no known ordering. Otherwise * not delayed by any outer join, we create an EquivalenceClass containing
* the first sublist represents the primary sort key, the second the * the expressions A and B to record this knowledge. If we later find another
* first secondary sort key, etc. Each sublist contains one or more * equivalence B = C, we add C to the existing EquivalenceClass; this may
* PathKeyItem nodes, each of which can be taken as the attribute that * require merging two existing EquivalenceClasses. At the end of the qual
* appears at that sort position. (See optimizer/README for more * distribution process, we have sets of values that are known all transitively
* information.) * equal to each other, where "equal" is according to the rules of the btree
* operator family(s) shown in ec_opfamilies. (We restrict an EC to contain
* only equalities whose operators belong to the same set of opfamilies. This
* could probably be relaxed, but for now it's not worth the trouble, since
* nearly all equality operators belong to only one btree opclass anyway.)
*
* We also use EquivalenceClasses as the base structure for PathKeys, letting
* us represent knowledge about different sort orderings being equivalent.
* Since every PathKey must reference an EquivalenceClass, we will end up
* with single-member EquivalenceClasses whenever a sort key expression has
* not been equivalenced to anything else. It is also possible that such an
* EquivalenceClass will contain a volatile expression ("ORDER BY random()"),
* which is a case that can't arise otherwise since clauses containing
* volatile functions are never considered mergejoinable. We mark such
* EquivalenceClasses specially to prevent them from being merged with
* ordinary EquivalenceClasses.
*
* We allow equality clauses appearing below the nullable side of an outer join
* to form EquivalenceClasses, but these have a slightly different meaning:
* the included values might be all NULL rather than all the same non-null
* values. See src/backend/optimizer/README for more on that point.
*
* NB: if ec_merged isn't NULL, this class has been merged into another, and
* should be ignored in favor of using the pointed-to class.
*/ */
typedef struct EquivalenceClass
typedef struct PathKeyItem
{ {
NodeTag type; NodeTag type;
Node *key; /* the item that is ordered */ List *ec_opfamilies; /* btree operator family OIDs */
Oid sortop; /* the ordering operator ('<' op) */ List *ec_members; /* list of EquivalenceMembers */
bool nulls_first; /* do NULLs come before normal values? */ List *ec_sources; /* list of generating RestrictInfos */
Relids ec_relids; /* all relids appearing in ec_members */
bool ec_has_const; /* any pseudoconstants in ec_members? */
bool ec_has_volatile; /* the (sole) member is a volatile expr */
bool ec_below_outer_join; /* equivalence applies below an OJ */
bool ec_broken; /* failed to generate needed clauses? */
struct EquivalenceClass *ec_merged; /* set if merged into another EC */
} EquivalenceClass;
/* /*
* key typically points to a Var node, ie a relation attribute, but it can * EquivalenceMember - one member expression of an EquivalenceClass
* also point to an arbitrary expression representing the value indexed by *
* an index expression. * em_is_child signifies that this element was built by transposing a member
* for an inheritance parent relation to represent the corresponding expression
* on an inheritance child. The element should be ignored for all purposes
* except constructing inner-indexscan paths for the child relation. (Other
* types of join are driven from transposed joininfo-list entries.) Note
* that the EC's ec_relids field does NOT include the child relation.
*
* em_datatype is usually the same as exprType(em_expr), but can be
* different when dealing with a binary-compatible opfamily; in particular
* anyarray_ops would never work without this. Use em_datatype when
* looking up a specific btree operator to work with this expression.
*/ */
} PathKeyItem; typedef struct EquivalenceMember
{
NodeTag type;
Expr *em_expr; /* the expression represented */
Relids em_relids; /* all relids appearing in em_expr */
bool em_is_const; /* expression is pseudoconstant? */
bool em_is_child; /* derived version for a child relation? */
Oid em_datatype; /* the "nominal type" used by the opfamily */
} EquivalenceMember;
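
The merging behaviour described in the EquivalenceClasses comment above (add C to an existing class on seeing B = C, merge two classes when an equality links them, and follow ec_merged afterwards) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyEC, ToyRoot and every toy_* function are hypothetical names invented for illustration, expressions are plain strings, and none of this is the planner's actual code in equivclass.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_MAX_MEMBERS 8
#define TOY_MAX_CLASSES 8

/* Hypothetical stand-in for EquivalenceClass; not the planner's struct. */
typedef struct ToyEC
{
    const char *members[TOY_MAX_MEMBERS];   /* plays the role of ec_members */
    int         nmembers;
    struct ToyEC *merged;                   /* plays the role of ec_merged */
} ToyEC;

/* Hypothetical container for all classes found so far. */
typedef struct ToyRoot
{
    ToyEC       classes[TOY_MAX_CLASSES];
    int         nclasses;
} ToyRoot;

/* Follow merged links to the class that should actually be used. */
static ToyEC *
toy_live_class(ToyEC *ec)
{
    while (ec->merged != NULL)
        ec = ec->merged;
    return ec;
}

/* Return the live class containing expr, or NULL if none does. */
static ToyEC *
toy_find_class(ToyRoot *root, const char *expr)
{
    for (int i = 0; i < root->nclasses; i++)
    {
        ToyEC      *ec = &root->classes[i];

        if (ec->merged != NULL)
            continue;           /* merged into another class; ignore it */
        for (int j = 0; j < ec->nmembers; j++)
            if (strcmp(ec->members[j], expr) == 0)
                return ec;
    }
    return NULL;
}

static void
toy_add_member(ToyEC *ec, const char *expr)
{
    assert(ec->nmembers < TOY_MAX_MEMBERS);
    ec->members[ec->nmembers++] = expr;
}

/* Record the equality a = b by creating, extending, or merging classes. */
static void
toy_process_equivalence(ToyRoot *root, const char *a, const char *b)
{
    ToyEC      *eca = toy_find_class(root, a);
    ToyEC      *ecb = toy_find_class(root, b);

    if (eca != NULL && ecb != NULL)
    {
        if (eca == ecb)
            return;             /* already known to be equal */
        /* Move b's class into a's class and leave a forwarding link. */
        for (int j = 0; j < ecb->nmembers; j++)
            toy_add_member(eca, ecb->members[j]);
        ecb->nmembers = 0;
        ecb->merged = eca;
    }
    else if (eca != NULL)
        toy_add_member(eca, b);
    else if (ecb != NULL)
        toy_add_member(ecb, a);
    else
    {
        ToyEC      *ec;

        assert(root->nclasses < TOY_MAX_CLASSES);
        ec = &root->classes[root->nclasses++];
        ec->nmembers = 0;
        ec->merged = NULL;
        toy_add_member(ec, a);
        toy_add_member(ec, b);
    }
}
```

Feeding this model a = b, then c = d, then b = c leaves one live class holding all four expressions, and the second class carries a forwarding pointer to the first. The real structure additionally tracks opfamilies, relids, constants and outer-join status, which this sketch leaves out.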
/*
* PathKeys
*
* The sort ordering of a path is represented by a list of PathKey nodes.
* An empty list implies no known ordering. Otherwise the first item
* represents the primary sort key, the second the first secondary sort key,
* etc. The value being sorted is represented by linking to an
* EquivalenceClass containing that value and including pk_opfamily among its
* ec_opfamilies. This is a convenient method because it makes it trivial
* to detect equivalent and closely-related orderings. (See optimizer/README
* for more information.)
*
* Note: pk_strategy is either BTLessStrategyNumber (for ASC) or
* BTGreaterStrategyNumber (for DESC). We assume that all ordering-capable
* index types will use btree-compatible strategy numbers.
*/
typedef struct PathKey
{
NodeTag type;
EquivalenceClass *pk_eclass; /* the value that is ordered */
Oid pk_opfamily; /* btree opfamily defining the ordering */
int pk_strategy; /* sort direction (ASC or DESC) */
bool pk_nulls_first; /* do NULLs come before normal values? */
} PathKey;
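
The PathKeys comment above says that linking each sort key to an EquivalenceClass makes it trivial to detect equivalent and closely-related orderings. A minimal sketch of why, assuming a toy pathkey whose class is compared by address: ToyPathKey, ToyPathKeysComparison and the toy_* functions are hypothetical illustration names, and the four-way result merely imitates the idea behind PathKeysComparison rather than reproducing compare_pathkeys().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PathKey; field roles follow the struct above. */
typedef struct ToyPathKey
{
    const void *eclass;         /* stands in for pk_eclass */
    unsigned    opfamily;       /* stands in for pk_opfamily */
    int         strategy;       /* stands in for pk_strategy */
    bool        nulls_first;    /* stands in for pk_nulls_first */
} ToyPathKey;

/* Hypothetical result codes for comparing two orderings. */
typedef enum
{
    TOY_EQUAL,                  /* orderings are identical */
    TOY_BETTER1,                /* ordering 1 extends ordering 2 */
    TOY_BETTER2,                /* ordering 2 extends ordering 1 */
    TOY_DIFFERENT               /* neither ordering includes the other */
} ToyPathKeysComparison;

/* Two keys describe the same sort column if all four fields agree. */
static bool
toy_pathkey_same(const ToyPathKey *a, const ToyPathKey *b)
{
    return a->eclass == b->eclass &&
        a->opfamily == b->opfamily &&
        a->strategy == b->strategy &&
        a->nulls_first == b->nulls_first;
}

/* Compare two orderings given as arrays of keys, most significant first. */
static ToyPathKeysComparison
toy_compare_pathkeys(const ToyPathKey *k1, size_t n1,
                     const ToyPathKey *k2, size_t n2)
{
    size_t      common = (n1 < n2) ? n1 : n2;

    assert(k1 != NULL || n1 == 0);
    assert(k2 != NULL || n2 == 0);

    for (size_t i = 0; i < common; i++)
        if (!toy_pathkey_same(&k1[i], &k2[i]))
            return TOY_DIFFERENT;

    if (n1 == n2)
        return TOY_EQUAL;
    return (n1 > n2) ? TOY_BETTER1 : TOY_BETTER2;
}
```

Because both "ORDER BY a.x" and "ORDER BY b.y" would point at the same class once a.x = b.y is known, a field-by-field check like the one above is enough to treat them as the same ordering. A longer key list that starts with a shorter one is reported as the better ordering, since it satisfies every requirement the shorter one does.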
/* /*
* Type "Path" is used as-is for sequential-scan paths. For other * Type "Path" is used as-is for sequential-scan paths. For other
@ -398,7 +481,7 @@ typedef struct Path
Cost total_cost; /* total cost (assuming all tuples fetched) */ Cost total_cost; /* total cost (assuming all tuples fetched) */
List *pathkeys; /* sort ordering of path's output */ List *pathkeys; /* sort ordering of path's output */
/* pathkeys is a List of Lists of PathKeyItem nodes; see above */ /* pathkeys is a List of PathKey nodes; see above */
} Path; } Path;
/*---------- /*----------
@ -618,11 +701,7 @@ typedef JoinPath NestPath;
* A mergejoin path has these fields. * A mergejoin path has these fields.
* *
* path_mergeclauses lists the clauses (in the form of RestrictInfos) * path_mergeclauses lists the clauses (in the form of RestrictInfos)
* that will be used in the merge. The parallel arrays path_mergeFamilies, * that will be used in the merge.
* path_mergeStrategies, and path_mergeNullsFirst specify the merge semantics
* for each clause (i.e., define the relevant sort ordering for each clause).
* (XXX is this the most reasonable path-time representation? It's at least
* partially redundant with the pathkeys of the input paths.)
* *
* Note that the mergeclauses are a subset of the parent relation's * Note that the mergeclauses are a subset of the parent relation's
* restriction-clause list. Any join clauses that are not mergejoinable * restriction-clause list. Any join clauses that are not mergejoinable
@ -639,10 +718,6 @@ typedef struct MergePath
{ {
JoinPath jpath; JoinPath jpath;
List *path_mergeclauses; /* join clauses to be used for merge */ List *path_mergeclauses; /* join clauses to be used for merge */
/* these are arrays, but have the same length as the mergeclauses list: */
Oid *path_mergeFamilies; /* per-clause OIDs of opfamilies */
int *path_mergeStrategies; /* per-clause ordering (ASC or DESC) */
bool *path_mergeNullsFirst; /* per-clause nulls ordering */
List *outersortkeys; /* keys for explicit sort, if any */ List *outersortkeys; /* keys for explicit sort, if any */
List *innersortkeys; /* keys for explicit sort, if any */ List *innersortkeys; /* keys for explicit sort, if any */
} MergePath; } MergePath;
@ -696,6 +771,15 @@ typedef struct HashPath
* sequence we use. So, these clauses cannot be associated directly with * sequence we use. So, these clauses cannot be associated directly with
* the join RelOptInfo, but must be kept track of on a per-join-path basis. * the join RelOptInfo, but must be kept track of on a per-join-path basis.
* *
* RestrictInfos that represent equivalence conditions (i.e., mergejoinable
* equalities that are not outerjoin-delayed) are handled a bit differently.
* Initially we attach them to the EquivalenceClasses that are derived from
* them. When we construct a scan or join path, we look through all the
* EquivalenceClasses and generate derived RestrictInfos representing the
* minimal set of conditions that need to be checked for this particular scan
* or join to enforce that all members of each EquivalenceClass are in fact
* equal in all rows emitted by the scan or join.
*
* When dealing with outer joins we have to be very careful about pushing qual * When dealing with outer joins we have to be very careful about pushing qual
* clauses up and down the tree. An outer join's own JOIN/ON conditions must * clauses up and down the tree. An outer join's own JOIN/ON conditions must
* be evaluated exactly at that join node, and any quals appearing in WHERE or * be evaluated exactly at that join node, and any quals appearing in WHERE or
@ -728,9 +812,9 @@ typedef struct HashPath
* *
* In general, the referenced clause might be arbitrarily complex. The * In general, the referenced clause might be arbitrarily complex. The
* kinds of clauses we can handle as indexscan quals, mergejoin clauses, * kinds of clauses we can handle as indexscan quals, mergejoin clauses,
* or hashjoin clauses are fairly limited --- the code for each kind of * or hashjoin clauses are limited (e.g., no volatile functions). The code
* path is responsible for identifying the restrict clauses it can use * for each kind of path is responsible for identifying the restrict clauses
* and ignoring the rest. Clauses not implemented by an indexscan, * it can use and ignoring the rest. Clauses not implemented by an indexscan,
* mergejoin, or hashjoin will be placed in the plan qual or joinqual field * mergejoin, or hashjoin will be placed in the plan qual or joinqual field
* of the finished Plan node, where they will be enforced by general-purpose * of the finished Plan node, where they will be enforced by general-purpose
* qual-expression-evaluation code. (But we are still entitled to count * qual-expression-evaluation code. (But we are still entitled to count
@ -758,6 +842,12 @@ typedef struct HashPath
* estimates. Note that a pseudoconstant clause can never be an indexqual * estimates. Note that a pseudoconstant clause can never be an indexqual
* or merge or hash join clause, so it's of no interest to large parts of * or merge or hash join clause, so it's of no interest to large parts of
* the planner. * the planner.
*
* When join clauses are generated from EquivalenceClasses, there may be
* several equally valid ways to enforce join equivalence, of which we need
* apply only one. We mark clauses of this kind by setting parent_ec to
* point to the generating EquivalenceClass. Multiple clauses with the same
* parent_ec in the same join are redundant.
*/ */
typedef struct RestrictInfo typedef struct RestrictInfo
@ -787,23 +877,22 @@ typedef struct RestrictInfo
/* This field is NULL unless clause is an OR clause: */ /* This field is NULL unless clause is an OR clause: */
Expr *orclause; /* modified clause with RestrictInfos */ Expr *orclause; /* modified clause with RestrictInfos */
/* This field is NULL unless clause is potentially redundant: */
EquivalenceClass *parent_ec; /* generating EquivalenceClass */
/* cache space for cost and selectivity */ /* cache space for cost and selectivity */
QualCost eval_cost; /* eval cost of clause; -1 if not yet set */ QualCost eval_cost; /* eval cost of clause; -1 if not yet set */
Selectivity this_selec; /* selectivity; -1 if not yet set */ Selectivity this_selec; /* selectivity; -1 if not yet set */
/* valid if clause is mergejoinable, else InvalidOid: */ /* valid if clause is mergejoinable, else NIL */
Oid mergejoinoperator; /* copy of clause operator */ List *mergeopfamilies; /* opfamilies containing clause operator */
Oid left_sortop; /* leftside sortop needed for mergejoin */
Oid right_sortop; /* rightside sortop needed for mergejoin */
Oid mergeopfamily; /* btree opfamily relating these ops */
/* cache space for mergeclause processing; NIL if not yet set */ /* cache space for mergeclause processing; NULL if not yet set */
List *left_pathkey; /* canonical pathkey for left side */ EquivalenceClass *left_ec; /* EquivalenceClass containing lefthand */
List *right_pathkey; /* canonical pathkey for right side */ EquivalenceClass *right_ec; /* EquivalenceClass containing righthand */
/* cache space for mergeclause processing; -1 if not yet set */ /* transient workspace for use while considering a specific join path */
Selectivity left_mergescansel; /* fraction of left side to scan */ bool outer_is_left; /* T = outer var on left, F = on right */
Selectivity right_mergescansel; /* fraction of right side to scan */
/* valid if clause is hashjoinable, else InvalidOid: */ /* valid if clause is hashjoinable, else InvalidOid: */
Oid hashjoinoperator; /* copy of clause operator */ Oid hashjoinoperator; /* copy of clause operator */
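
The RestrictInfo comment above notes that multiple clauses with the same parent_ec in the same join are redundant, so only one of them needs to be applied. One way to picture that filtering is the following sketch. It assumes a toy clause type: ToyClause and toy_select_nonredundant are hypothetical names, clauses are plain strings, and the real select_nonredundant_join_clauses() also takes a reference list that is not modelled here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for RestrictInfo, keeping only what the sketch needs. */
typedef struct ToyClause
{
    const char *text;           /* printable form of the clause */
    const void *parent_ec;      /* NULL unless generated from a class */
} ToyClause;

/*
 * Copy the clauses of in[] to out[], dropping any clause whose non-NULL
 * parent_ec matches that of a clause already kept.  Clauses with a NULL
 * parent_ec are never treated as redundant.  Returns the number kept;
 * out[] must have room for n entries.
 */
static size_t
toy_select_nonredundant(const ToyClause *in, size_t n, ToyClause *out)
{
    size_t      kept = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < n; i++)
    {
        int         redundant = 0;

        if (in[i].parent_ec != NULL)
        {
            for (size_t j = 0; j < kept; j++)
            {
                if (out[j].parent_ec == in[i].parent_ec)
                {
                    redundant = 1;
                    break;
                }
            }
        }
        if (!redundant)
            out[kept++] = in[i];
    }
    return kept;
}
```

Given two clauses derived from the same class, such as a.x = b.y and a.x = c.z in a join where either would enforce the equivalence, the sketch keeps the first and drops the second, while an unrelated clause with a NULL parent_ec always survives.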


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/optimizer/joininfo.h,v 1.33 2007/01/05 22:19:56 momjian Exp $ * $PostgreSQL: pgsql/src/include/optimizer/joininfo.h,v 1.34 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -23,8 +23,5 @@ extern bool have_relevant_joinclause(PlannerInfo *root,
extern void add_join_clause_to_rels(PlannerInfo *root, extern void add_join_clause_to_rels(PlannerInfo *root,
RestrictInfo *restrictinfo, RestrictInfo *restrictinfo,
Relids join_relids); Relids join_relids);
extern void remove_join_clause_from_rels(PlannerInfo *root,
RestrictInfo *restrictinfo,
Relids join_relids);
#endif /* JOININFO_H */ #endif /* JOININFO_H */


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/optimizer/pathnode.h,v 1.75 2007/01/10 18:06:04 tgl Exp $ * $PostgreSQL: pgsql/src/include/optimizer/pathnode.h,v 1.76 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -71,9 +71,6 @@ extern MergePath *create_mergejoin_path(PlannerInfo *root,
List *restrict_clauses, List *restrict_clauses,
List *pathkeys, List *pathkeys,
List *mergeclauses, List *mergeclauses,
Oid *mergefamilies,
int *mergestrategies,
bool *mergenullsfirst,
List *outersortkeys, List *outersortkeys,
List *innersortkeys); List *innersortkeys);


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.94 2007/01/05 22:19:56 momjian Exp $ * $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.95 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -52,6 +52,9 @@ extern List *group_clauses_by_indexkey(IndexOptInfo *index,
Relids outer_relids, Relids outer_relids,
SaOpControl saop_control, SaOpControl saop_control,
bool *found_clause); bool *found_clause);
extern bool eclass_matches_any_index(EquivalenceClass *ec,
EquivalenceMember *em,
RelOptInfo *rel);
extern bool match_index_to_operand(Node *operand, int indexcol, extern bool match_index_to_operand(Node *operand, int indexcol,
IndexOptInfo *index); IndexOptInfo *index);
extern List *expand_indexqual_conditions(IndexOptInfo *index, extern List *expand_indexqual_conditions(IndexOptInfo *index,
@ -89,6 +92,37 @@ extern List *make_rels_by_joins(PlannerInfo *root, int level, List **joinrels);
extern RelOptInfo *make_join_rel(PlannerInfo *root, extern RelOptInfo *make_join_rel(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2); RelOptInfo *rel1, RelOptInfo *rel2);
/*
* equivclass.c
* routines for managing EquivalenceClasses
*/
extern bool process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
bool below_outer_join);
extern void reconsider_outer_join_clauses(PlannerInfo *root);
extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
Expr *expr,
Oid expr_datatype,
List *opfamilies);
extern void generate_base_implied_equalities(PlannerInfo *root);
extern List *generate_join_implied_equalities(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outer_rel,
RelOptInfo *inner_rel);
extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
extern void add_child_rel_equivalences(PlannerInfo *root,
AppendRelInfo *appinfo,
RelOptInfo *parent_rel,
RelOptInfo *child_rel);
extern List *find_eclass_clauses_for_index_join(PlannerInfo *root,
RelOptInfo *rel,
Relids outer_relids);
extern bool have_relevant_eclass_joinclause(PlannerInfo *root,
RelOptInfo *rel1, RelOptInfo *rel2);
extern bool has_relevant_eclass_joinclause(PlannerInfo *root,
RelOptInfo *rel1);
extern bool eclass_useful_for_merging(EquivalenceClass *eclass,
RelOptInfo *rel);
/* /*
* pathkeys.c * pathkeys.c
* utilities for matching and building path keys * utilities for matching and building path keys
@ -101,9 +135,6 @@ typedef enum
PATHKEYS_DIFFERENT /* neither pathkey includes the other */ PATHKEYS_DIFFERENT /* neither pathkey includes the other */
} PathKeysComparison; } PathKeysComparison;
extern void add_equijoined_keys(PlannerInfo *root, RestrictInfo *restrictinfo);
extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
extern void generate_implied_equalities(PlannerInfo *root);
extern List *canonicalize_pathkeys(PlannerInfo *root, List *pathkeys); extern List *canonicalize_pathkeys(PlannerInfo *root, List *pathkeys);
extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2); extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
extern bool pathkeys_contained_in(List *keys1, List *keys2); extern bool pathkeys_contained_in(List *keys1, List *keys2);
@ -113,23 +144,29 @@ extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
List *pathkeys, List *pathkeys,
double fraction); double fraction);
extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index, extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
ScanDirection scandir, bool canonical); ScanDirection scandir);
extern List *convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel, extern List *convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
List *subquery_pathkeys); List *subquery_pathkeys);
extern List *build_join_pathkeys(PlannerInfo *root, extern List *build_join_pathkeys(PlannerInfo *root,
RelOptInfo *joinrel, RelOptInfo *joinrel,
JoinType jointype, JoinType jointype,
List *outer_pathkeys); List *outer_pathkeys);
extern List *make_pathkeys_for_sortclauses(List *sortclauses, extern List *make_pathkeys_for_sortclauses(PlannerInfo *root,
List *tlist); List *sortclauses,
extern void cache_mergeclause_pathkeys(PlannerInfo *root, List *tlist,
bool canonicalize);
extern void cache_mergeclause_eclasses(PlannerInfo *root,
RestrictInfo *restrictinfo); RestrictInfo *restrictinfo);
extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root, extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
List *pathkeys, List *pathkeys,
bool outer_keys,
List *restrictinfos); List *restrictinfos);
extern List *make_pathkeys_for_mergeclauses(PlannerInfo *root, extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses, List *mergeclauses,
RelOptInfo *rel); RelOptInfo *joinrel);
extern List *make_inner_pathkeys_for_merge(PlannerInfo *root,
List *mergeclauses,
List *outer_pathkeys);
extern int pathkeys_useful_for_merging(PlannerInfo *root, extern int pathkeys_useful_for_merging(PlannerInfo *root,
RelOptInfo *rel, RelOptInfo *rel,
List *pathkeys); List *pathkeys);


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.97 2007/01/10 18:06:04 tgl Exp $ * $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.98 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -38,6 +38,8 @@ extern Plan *create_plan(PlannerInfo *root, Path *best_path);
extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual, extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
Index scanrelid, Plan *subplan); Index scanrelid, Plan *subplan);
extern Append *make_append(List *appendplans, bool isTarget, List *tlist); extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
List *pathkeys);
extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls, extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
Plan *lefttree); Plan *lefttree);
extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls, extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
@ -69,12 +71,22 @@ extern int join_collapse_limit;
extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode); extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist); extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed);
extern List *deconstruct_jointree(PlannerInfo *root); extern List *deconstruct_jointree(PlannerInfo *root);
extern void distribute_restrictinfo_to_rels(PlannerInfo *root,
RestrictInfo *restrictinfo);
extern void process_implied_equality(PlannerInfo *root, extern void process_implied_equality(PlannerInfo *root,
Node *item1, Node *item2, Oid opno,
Oid sortop1, Oid sortop2, Expr *item1,
Relids item1_relids, Relids item2_relids, Expr *item2,
bool delete_it); Relids qualscope,
bool below_outer_join,
bool both_const);
extern RestrictInfo *build_implied_join_equality(Oid opno,
Expr *item1,
Expr *item2,
Relids qualscope);
/* /*
* prototypes for plan/setrefs.c * prototypes for plan/setrefs.c


@ -7,7 +7,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/optimizer/restrictinfo.h,v 1.39 2007/01/05 22:19:56 momjian Exp $ * $PostgreSQL: pgsql/src/include/optimizer/restrictinfo.h,v 1.40 2007/01/20 20:45:40 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -32,12 +32,8 @@ extern List *extract_actual_clauses(List *restrictinfo_list,
extern void extract_actual_join_clauses(List *restrictinfo_list, extern void extract_actual_join_clauses(List *restrictinfo_list,
List **joinquals, List **joinquals,
List **otherquals); List **otherquals);
extern List *remove_redundant_join_clauses(PlannerInfo *root,
List *restrictinfo_list,
bool isouterjoin);
extern List *select_nonredundant_join_clauses(PlannerInfo *root, extern List *select_nonredundant_join_clauses(PlannerInfo *root,
List *restrictinfo_list, List *restrictinfo_list,
List *reference_list, List *reference_list);
bool isouterjoin);
#endif /* RESTRICTINFO_H */ #endif /* RESTRICTINFO_H */


@ -6,7 +6,7 @@
* Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
* Portions Copyright (c) 1994, Regents of the University of California * Portions Copyright (c) 1994, Regents of the University of California
* *
* $PostgreSQL: pgsql/src/include/utils/lsyscache.h,v 1.112 2007/01/10 18:06:05 tgl Exp $ * $PostgreSQL: pgsql/src/include/utils/lsyscache.h,v 1.113 2007/01/20 20:45:41 tgl Exp $
* *
*------------------------------------------------------------------------- *-------------------------------------------------------------------------
*/ */
@ -35,12 +35,11 @@ extern void get_op_opfamily_properties(Oid opno, Oid opfamily,
bool *recheck); bool *recheck);
extern Oid get_opfamily_member(Oid opfamily, Oid lefttype, Oid righttype, extern Oid get_opfamily_member(Oid opfamily, Oid lefttype, Oid righttype,
int16 strategy); int16 strategy);
extern bool get_op_mergejoin_info(Oid eq_op, Oid *left_sortop,
Oid *right_sortop, Oid *opfamily);
extern bool get_compare_function_for_ordering_op(Oid opno, extern bool get_compare_function_for_ordering_op(Oid opno,
Oid *cmpfunc, bool *reverse); Oid *cmpfunc, bool *reverse);
extern Oid get_equality_op_for_ordering_op(Oid opno); extern Oid get_equality_op_for_ordering_op(Oid opno);
extern Oid get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type); extern Oid get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type);
extern List *get_mergejoin_opfamilies(Oid opno);
extern Oid get_compatible_hash_operator(Oid opno, bool use_lhs_type); extern Oid get_compatible_hash_operator(Oid opno, bool use_lhs_type);
extern Oid get_op_hash_function(Oid opno); extern Oid get_op_hash_function(Oid opno);
extern void get_op_btree_interpretation(Oid opno, extern void get_op_btree_interpretation(Oid opno,