Refactor planner's pathkeys data structure to create a separate, explicit
representation of equivalence classes of variables.

This is an extensive rewrite, but it brings a number of benefits:
* planner no longer fails in the presence of "incomplete" operator families
  that don't offer operators for every possible combination of datatypes.
* avoid generating and then discarding redundant equality clauses.
* remove bogus assumption that derived equalities always use operators
  named "=".
* mergejoins can work with a variety of sort orders (e.g., descending) now,
  instead of tying each mergejoinable operator to exactly one sort order.
* better recognition of redundant sort columns.
* can make use of equalities appearing underneath an outer join.
2025-12-07 12:02:30 +03:00 · 2007-01-20 20:45:41 +00:00
parent 2b7334d487
commit f41803bb39
35 changed files with 3882 additions and 2719 deletions
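
As a rough illustration of the "explicit representation of equivalence classes" described in the commit message, the sketch below models only the merge link (`ec_merged`) that several hunks in this diff chase before comparing or printing a class. It is not the planner's code: `ToyEC`, `toy_ec_root`, `toy_ec_merge`, `toy_ec_same`, and the `nmembers` counter are hypothetical names invented for this example, and the real EquivalenceClass carries member lists, opfamilies, and relid sets that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for an equivalence class.  Only the merge link is
 * modeled; nmembers is an illustrative counter, not a real planner field.
 */
typedef struct ToyEC
{
	struct ToyEC *ec_merged;	/* set once this class is absorbed by another */
	int			nmembers;		/* how many expressions this class holds */
} ToyEC;

/* Chase ec_merged links up to the topmost surviving class. */
static ToyEC *
toy_ec_root(ToyEC *ec)
{
	assert(ec != NULL);
	while (ec->ec_merged)
		ec = ec->ec_merged;
	return ec;
}

/*
 * Record that the classes holding a and b are now known equal, as when a
 * clause "A = B" is found.  The class of b is absorbed into the class of a.
 */
static ToyEC *
toy_ec_merge(ToyEC *a, ToyEC *b)
{
	ToyEC	   *ra = toy_ec_root(a);
	ToyEC	   *rb = toy_ec_root(b);

	if (ra != rb)
	{
		ra->nmembers += rb->nmembers;
		rb->nmembers = 0;
		rb->ec_merged = ra;		/* old pointers to rb stay valid */
	}
	return ra;
}

/* Two references denote the same class iff their roots are pointer-equal. */
static bool
toy_ec_same(ToyEC *a, ToyEC *b)
{
	return toy_ec_root(a) == toy_ec_root(b);
}
```

Starting from three single-member classes for A, B, and C, merging A with B and then B with C leaves one surviving class of three members, so a comparison of A and C succeeds even though no clause equated them directly. Keeping the absorbed class around with a forwarding link, rather than freeing it, is what lets not-yet-canonicalized references remain usable.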
--- a/doc/src/sgml/xoper.sgml
+++ b/doc/src/sgml/xoper.sgml
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.37 2006/12/23 00:43:08 tgl Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/xoper.sgml,v 1.38 2007/01/20 20:45:38 tgl Exp $ -->
 <sect1 id="xoper">
  <title>User-Defined Operators</title>
@@ -145,29 +145,29 @@ SELECT (a + b) AS c FROM test_complex;
     <itemizedlist>
      <listitem>
       <para>
-	One way is to omit the <literal>COMMUTATOR</> clause in the first operator that
+        One way is to omit the <literal>COMMUTATOR</> clause in the first operator that
-	you define, and then provide one in the second operator's definition.
+        you define, and then provide one in the second operator's definition.
-	Since <productname>PostgreSQL</productname> knows that commutative
+        Since <productname>PostgreSQL</productname> knows that commutative
-	operators come in pairs, when it sees the second definition it will
+        operators come in pairs, when it sees the second definition it will
-	automatically go back and fill in the missing <literal>COMMUTATOR</> clause in
+        automatically go back and fill in the missing <literal>COMMUTATOR</> clause in
-	the first definition.
+        the first definition.
       </para>
      </listitem>
      <listitem>
       <para>
-	The other, more straightforward way is just to include <literal>COMMUTATOR</> clauses
+        The other, more straightforward way is just to include <literal>COMMUTATOR</> clauses
-	in both definitions.  When <productname>PostgreSQL</productname> processes
+        in both definitions.  When <productname>PostgreSQL</productname> processes
-	the first definition and realizes that <literal>COMMUTATOR</> refers to a nonexistent
+        the first definition and realizes that <literal>COMMUTATOR</> refers to a nonexistent
-	operator, the system will make a dummy entry for that operator in the
+        operator, the system will make a dummy entry for that operator in the
-	system catalog.  This dummy entry will have valid data only
+        system catalog.  This dummy entry will have valid data only
-	for the operator name, left and right operand types, and result type,
+        for the operator name, left and right operand types, and result type,
-	since that's all that <productname>PostgreSQL</productname> can deduce
+        since that's all that <productname>PostgreSQL</productname> can deduce
-	at this point.  The first operator's catalog entry will link to this
+        at this point.  The first operator's catalog entry will link to this
-	dummy entry.  Later, when you define the second operator, the system
+        dummy entry.  Later, when you define the second operator, the system
-	updates the dummy entry with the additional information from the second
+        updates the dummy entry with the additional information from the second
-	definition.  If you try to use the dummy operator before it's been filled
+        definition.  If you try to use the dummy operator before it's been filled
-	in, you'll just get an error message.
+        in, you'll just get an error message.
       </para>
      </listitem>
     </itemizedlist>
@@ -240,7 +240,7 @@ column OP constant
    one of the system's standard estimators for many of your own operators.
    These are the standard restriction estimators:
    <simplelist>
-     <member><function>eqsel</>	for <literal>=</></member>
+     <member><function>eqsel</> for <literal>=</></member>
     <member><function>neqsel</> for <literal>&lt;&gt;</></member>
     <member><function>scalarltsel</> for <literal>&lt;</> or <literal>&lt;=</></member>
     <member><function>scalargtsel</> for <literal>&gt;</> or <literal>&gt;=</></member>
@@ -337,7 +337,7 @@ table1.column1 OP table2.column2
     join will never compare them at all, implicitly assuming that the
     result of the join operator must be false.  So it never makes sense
     to specify <literal>HASHES</literal> for operators that do not represent
-     equality.
+     some form of equality.
    </para>
    <para>
@@ -347,7 +347,7 @@ table1.column1 OP table2.column2
     exist yet.  But attempts to use the operator in hash joins will fail
     at run time if no such operator family exists.  The system needs the
     operator family to find the data-type-specific hash function for the
-     operator's input data type.  Of course, you must also supply a suitable
+     operator's input data type.  Of course, you must also create a suitable
     hash function before you can create the operator family.
    </para>
@@ -382,8 +382,9 @@ table1.column1 OP table2.column2
     false, never null, for any two nonnull inputs.  If this rule is
     not followed, hash-optimization of <literal>IN</> operations may
     generate wrong results.  (Specifically, <literal>IN</> might return
-     false where the correct answer according to the standard would be null; or it might
+     false where the correct answer according to the standard would be null;
-     yield an error complaining that it wasn't prepared for a null result.)
+     or it might yield an error complaining that it wasn't prepared for a
     null result.)
    </para>
    </note>
@@ -407,19 +408,18 @@ table1.column1 OP table2.column2
     that can only succeed for pairs of values that fall at the
     <quote>same place</>
     in the sort order.  In practice this means that the join operator must
-     behave like equality.  But unlike hash join, where the left and right
+     behave like equality.  But it is possible to merge-join two
     data types had better be the same (or at least bitwise equivalent),
     it is possible to merge-join two
     distinct data types so long as they are logically compatible.  For
-     example, the <type>smallint</type>-versus-<type>integer</type> equality operator
+     example, the <type>smallint</type>-versus-<type>integer</type>
-     is merge-joinable.
+     equality operator is merge-joinable.
     We only need sorting operators that will bring both data types into a
     logically compatible sequence.
    </para>
    <para>
     To be marked <literal>MERGES</literal>, the join operator must appear
-     in a btree index operator family.  This is not enforced when you create
+     as an equality member of a btree index operator family.
     This is not enforced when you create
     the operator, since of course the referencing operator family couldn't
     exist yet.  But the operator will not actually be used for merge joins
     unless a matching operator family can be found.  The
@@ -428,30 +428,14 @@ table1.column1 OP table2.column2
    </para>
    <para>
-     There are additional restrictions on operators that you mark
+     A merge-joinable operator must have a commutator (itself if the two
-     merge-joinable.  These restrictions are not currently checked by
+     operand data types are the same, or a related equality operator
-     <command>CREATE OPERATOR</command>, but errors may occur when
+     if they are different) that appears in the same operator family.
-     the operator is used if any are not true:
+     If this is not the case, planner errors may occur when the operator
-
+     is used.  Also, it is a good idea (but not strictly required) for
-     <itemizedlist>
+     a btree operator family that supports multiple datatypes to provide
-      <listitem>
+     equality operators for every combination of the datatypes; this
-       <para>
+     allows better optimization.
 	A merge-joinable equality operator must have a merge-joinable
        commutator (itself if the two operand data types are the same, or a related
        equality operator if they are different).
       </para>
      </listitem>
      <listitem>
       <para>
        If there is a merge-joinable operator relating any two data types
 	A and B, and another merge-joinable operator relating B to any
 	third data type C, then A and C must also have a merge-joinable
 	operator; in other words, having a merge-joinable operator must
 	be transitive.
       </para>
      </listitem>
     </itemizedlist>
    </para>
    <note>
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -15,7 +15,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.361 2007/01/10 18:06:02 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/nodes/copyfuncs.c,v 1.362 2007/01/20 20:45:38 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -1284,16 +1284,18 @@ _copyFromExpr(FromExpr *from)
 */
 /*
- * _copyPathKeyItem
+ * _copyPathKey
 */
-static PathKeyItem *
+static PathKey *
-_copyPathKeyItem(PathKeyItem *from)
+_copyPathKey(PathKey *from)
 {
-	PathKeyItem *newnode = makeNode(PathKeyItem);
+	PathKey *newnode = makeNode(PathKey);
-	COPY_NODE_FIELD(key);
+	/* EquivalenceClasses are never moved, so just shallow-copy the pointer */
-	COPY_SCALAR_FIELD(sortop);
+	COPY_SCALAR_FIELD(pk_eclass);
-	COPY_SCALAR_FIELD(nulls_first);
+	COPY_SCALAR_FIELD(pk_opfamily);
 	COPY_SCALAR_FIELD(pk_strategy);
 	COPY_SCALAR_FIELD(pk_nulls_first);
 	return newnode;
 }
@@ -1316,21 +1318,15 @@ _copyRestrictInfo(RestrictInfo *from)
 	COPY_BITMAPSET_FIELD(left_relids);
 	COPY_BITMAPSET_FIELD(right_relids);
 	COPY_NODE_FIELD(orclause);
 	/* EquivalenceClasses are never copied, so shallow-copy the pointers */
 	COPY_SCALAR_FIELD(parent_ec);
 	COPY_SCALAR_FIELD(eval_cost);
 	COPY_SCALAR_FIELD(this_selec);
-	COPY_SCALAR_FIELD(mergejoinoperator);
+	COPY_NODE_FIELD(mergeopfamilies);
-	COPY_SCALAR_FIELD(left_sortop);
+	/* EquivalenceClasses are never copied, so shallow-copy the pointers */
-	COPY_SCALAR_FIELD(right_sortop);
+	COPY_SCALAR_FIELD(left_ec);
-	COPY_SCALAR_FIELD(mergeopfamily);
+	COPY_SCALAR_FIELD(right_ec);
-
+	COPY_SCALAR_FIELD(outer_is_left);
 	/*
 	 * Do not copy pathkeys, since they'd not be canonical in a copied query
 	 */
 	newnode->left_pathkey = NIL;
 	newnode->right_pathkey = NIL;
 	COPY_SCALAR_FIELD(left_mergescansel);
 	COPY_SCALAR_FIELD(right_mergescansel);
 	COPY_SCALAR_FIELD(hashjoinoperator);
 	COPY_SCALAR_FIELD(left_bucketsize);
 	COPY_SCALAR_FIELD(right_bucketsize);
@@ -3033,8 +3029,8 @@ copyObject(void *from)
 			/*
 			 * RELATION NODES
 			 */
-		case T_PathKeyItem:
+		case T_PathKey:
-			retval = _copyPathKeyItem(from);
+			retval = _copyPathKey(from);
 			break;
 		case T_RestrictInfo:
 			retval = _copyRestrictInfo(from);
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -18,7 +18,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.295 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/nodes/equalfuncs.c,v 1.296 2007/01/20 20:45:38 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -596,11 +596,27 @@ _equalFromExpr(FromExpr *a, FromExpr *b)
 */
 static bool
-_equalPathKeyItem(PathKeyItem *a, PathKeyItem *b)
+_equalPathKey(PathKey *a, PathKey *b)
 {
-	COMPARE_NODE_FIELD(key);
+	/*
-	COMPARE_SCALAR_FIELD(sortop);
+	 * This is normally used on non-canonicalized PathKeys, so must chase
-	COMPARE_SCALAR_FIELD(nulls_first);
+	 * up to the topmost merged EquivalenceClass and see if those are the
 	 * same (by pointer equality).
 	 */
 	EquivalenceClass *a_eclass;
 	EquivalenceClass *b_eclass;
 	a_eclass = a->pk_eclass;
 	while (a_eclass->ec_merged)
 		a_eclass = a_eclass->ec_merged;
 	b_eclass = b->pk_eclass;
 	while (b_eclass->ec_merged)
 		b_eclass = b_eclass->ec_merged;
 	if (a_eclass != b_eclass)
 		return false;
 	COMPARE_SCALAR_FIELD(pk_opfamily);
 	COMPARE_SCALAR_FIELD(pk_strategy);
 	COMPARE_SCALAR_FIELD(pk_nulls_first);
 	return true;
 }
@@ -2016,8 +2032,8 @@ equal(void *a, void *b)
 			/*
 			 * RELATION NODES
 			 */
-		case T_PathKeyItem:
+		case T_PathKey:
-			retval = _equalPathKeyItem(a, b);
+			retval = _equalPathKey(a, b);
 			break;
 		case T_RestrictInfo:
 			retval = _equalRestrictInfo(a, b);
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.293 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/nodes/outfuncs.c,v 1.294 2007/01/20 20:45:38 tgl Exp $
 *
 * NOTES
 *	  Every node type that can appear in stored rules' parsetrees *must*
@@ -1196,29 +1196,11 @@ _outNestPath(StringInfo str, NestPath *node)
 static void
 _outMergePath(StringInfo str, MergePath *node)
 {
 	int			numCols;
 	int			i;
 	WRITE_NODE_TYPE("MERGEPATH");
 	_outJoinPathInfo(str, (JoinPath *) node);
 	WRITE_NODE_FIELD(path_mergeclauses);
 	numCols = list_length(node->path_mergeclauses);
 	appendStringInfo(str, " :path_mergeFamilies");
 	for (i = 0; i < numCols; i++)
 		appendStringInfo(str, " %u", node->path_mergeFamilies[i]);
 	appendStringInfo(str, " :path_mergeStrategies");
 	for (i = 0; i < numCols; i++)
 		appendStringInfo(str, " %d", node->path_mergeStrategies[i]);
 	appendStringInfo(str, " :path_mergeNullsFirst");
 	for (i = 0; i < numCols; i++)
 		appendStringInfo(str, " %d", (int) node->path_mergeNullsFirst[i]);
 	WRITE_NODE_FIELD(outersortkeys);
 	WRITE_NODE_FIELD(innersortkeys);
 }
@@ -1241,7 +1223,8 @@ _outPlannerInfo(StringInfo str, PlannerInfo *node)
 	/* NB: this isn't a complete set of fields */
 	WRITE_NODE_FIELD(parse);
 	WRITE_NODE_FIELD(join_rel_list);
-	WRITE_NODE_FIELD(equi_key_list);
+	WRITE_NODE_FIELD(eq_classes);
 	WRITE_NODE_FIELD(canon_pathkeys);
 	WRITE_NODE_FIELD(left_join_clauses);
 	WRITE_NODE_FIELD(right_join_clauses);
 	WRITE_NODE_FIELD(full_join_clauses);
@@ -1284,6 +1267,7 @@ _outRelOptInfo(StringInfo str, RelOptInfo *node)
 	WRITE_NODE_FIELD(subplan);
 	WRITE_NODE_FIELD(baserestrictinfo);
 	WRITE_NODE_FIELD(joininfo);
 	WRITE_BOOL_FIELD(has_eclass_joins);
 	WRITE_BITMAPSET_FIELD(index_outer_relids);
 	WRITE_NODE_FIELD(index_inner_paths);
 }
@@ -1306,13 +1290,48 @@ _outIndexOptInfo(StringInfo str, IndexOptInfo *node)
 }
 static void
-_outPathKeyItem(StringInfo str, PathKeyItem *node)
+_outEquivalenceClass(StringInfo str, EquivalenceClass *node)
 {
-	WRITE_NODE_TYPE("PATHKEYITEM");
+	/*
 	 * To simplify reading, we just chase up to the topmost merged EC and
 	 * print that, without bothering to show the merge-ees separately.
 	 */
 	while (node->ec_merged)
 		node = node->ec_merged;
-	WRITE_NODE_FIELD(key);
+	WRITE_NODE_TYPE("EQUIVALENCECLASS");
-	WRITE_OID_FIELD(sortop);
+
-	WRITE_BOOL_FIELD(nulls_first);
+	WRITE_NODE_FIELD(ec_opfamilies);
 	WRITE_NODE_FIELD(ec_members);
 	WRITE_NODE_FIELD(ec_sources);
 	WRITE_BITMAPSET_FIELD(ec_relids);
 	WRITE_BOOL_FIELD(ec_has_const);
 	WRITE_BOOL_FIELD(ec_has_volatile);
 	WRITE_BOOL_FIELD(ec_below_outer_join);
 	WRITE_BOOL_FIELD(ec_broken);
 }
 static void
 _outEquivalenceMember(StringInfo str, EquivalenceMember *node)
 {
 	WRITE_NODE_TYPE("EQUIVALENCEMEMBER");
 	WRITE_NODE_FIELD(em_expr);
 	WRITE_BITMAPSET_FIELD(em_relids);
 	WRITE_BOOL_FIELD(em_is_const);
 	WRITE_BOOL_FIELD(em_is_child);
 	WRITE_OID_FIELD(em_datatype);
 }
 static void
 _outPathKey(StringInfo str, PathKey *node)
 {
 	WRITE_NODE_TYPE("PATHKEY");
 	WRITE_NODE_FIELD(pk_eclass);
 	WRITE_OID_FIELD(pk_opfamily);
 	WRITE_INT_FIELD(pk_strategy);
 	WRITE_BOOL_FIELD(pk_nulls_first);
 }
 static void
@@ -1331,12 +1350,11 @@ _outRestrictInfo(StringInfo str, RestrictInfo *node)
 	WRITE_BITMAPSET_FIELD(left_relids);
 	WRITE_BITMAPSET_FIELD(right_relids);
 	WRITE_NODE_FIELD(orclause);
-	WRITE_OID_FIELD(mergejoinoperator);
+	WRITE_NODE_FIELD(parent_ec);
-	WRITE_OID_FIELD(left_sortop);
+	WRITE_NODE_FIELD(mergeopfamilies);
-	WRITE_OID_FIELD(right_sortop);
+	WRITE_NODE_FIELD(left_ec);
-	WRITE_OID_FIELD(mergeopfamily);
+	WRITE_NODE_FIELD(right_ec);
-	WRITE_NODE_FIELD(left_pathkey);
+	WRITE_BOOL_FIELD(outer_is_left);
 	WRITE_NODE_FIELD(right_pathkey);
 	WRITE_OID_FIELD(hashjoinoperator);
 }
@@ -2163,8 +2181,14 @@ _outNode(StringInfo str, void *obj)
 			case T_IndexOptInfo:
 				_outIndexOptInfo(str, obj);
 				break;
-			case T_PathKeyItem:
+			case T_EquivalenceClass:
-				_outPathKeyItem(str, obj);
+				_outEquivalenceClass(str, obj);
 				break;
 			case T_EquivalenceMember:
 				_outEquivalenceMember(str, obj);
 				break;
 			case T_PathKey:
 				_outPathKey(str, obj);
 				break;
 			case T_RestrictInfo:
 				_outRestrictInfo(str, obj);
--- a/src/backend/nodes/print.c
+++ b/src/backend/nodes/print.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/nodes/print.c,v 1.82 2007/01/05 22:19:30 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/nodes/print.c,v 1.83 2007/01/20 20:45:38 tgl Exp $
 *
 * HISTORY
 *	  AUTHOR			DATE			MAJOR EVENT
@@ -404,7 +404,7 @@ print_expr(Node *expr, List *rtable)
 /*
 * print_pathkeys -
- *	  pathkeys list of list of PathKeyItems
+ *	  pathkeys list of PathKeys
 */
 void
 print_pathkeys(List *pathkeys, List *rtable)
@@ -414,17 +414,26 @@ print_pathkeys(List *pathkeys, List *rtable)
 	printf("(");
 	foreach(i, pathkeys)
 	{
-		List	   *pathkey = (List *) lfirst(i);
+		PathKey	   *pathkey = (PathKey *) lfirst(i);
 		EquivalenceClass *eclass;
 		ListCell   *k;
 		bool		first = true;
 		eclass = pathkey->pk_eclass;
 		/* chase up, in case pathkey is non-canonical */
 		while (eclass->ec_merged)
 			eclass = eclass->ec_merged;
 		printf("(");
-		foreach(k, pathkey)
+		foreach(k, eclass->ec_members)
 		{
-			PathKeyItem *item = (PathKeyItem *) lfirst(k);
+			EquivalenceMember *mem = (EquivalenceMember *) lfirst(k);
-			print_expr(item->key, rtable);
+			if (first)
-			if (lnext(k))
+				first = false;
 			else
 				printf(", ");
 			print_expr((Node *) mem->em_expr, rtable);
 		}
 		printf(")");
 		if (lnext(i))
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -90,21 +90,19 @@ have a list of relations to join.  However, FULL OUTER JOIN clauses are
 never flattened, and other kinds of JOIN might not be either, if the
 flattening process is stopped by join_collapse_limit or from_collapse_limit
 restrictions.  Therefore, we end up with a planning problem that contains
-both lists of relations to be joined in any order, and JOIN nodes that
+lists of relations to be joined in any order, where any individual item
-force a particular join order.  For each un-flattened JOIN node, we join
+might be a sub-list that has to be joined together before we can consider
-exactly that pair of relations (after recursively planning their inputs,
+joining it to its siblings.  We process these sub-problems recursively,
-if the inputs aren't single base relations).  We generate a Path for each
+bottom up.  Note that the join list structure constrains the possible join
-feasible join method, and select the cheapest Path.  Note that the JOIN
+orders, but it doesn't constrain the join implementation method at each
-clause structure determines the join Path structure, but it doesn't
+join (nestloop, merge, hash), nor does it say which rel is considered outer
-constrain the join implementation method at each join (nestloop, merge,
+or inner at each join.  We consider all these possibilities in building
-hash), nor does it say which rel is considered outer or inner at each
+Paths. We generate a Path for each feasible join method, and select the
-join.  We consider all these possibilities in building Paths.
+cheapest Path.
-3) At the top level of the FROM clause we will have a list of relations
+For each planning problem, therefore, we will have a list of relations
-that are either base rels or joinrels constructed per un-flattened JOIN
+that are either base rels or joinrels constructed per sub-join-lists.
-directives.  (This is also the situation, recursively, when we can flatten
+We can join these rels together in any order the planner sees fit.
 sub-joins underneath an un-flattenable JOIN into a list of relations to
 join.)  We can join these rels together in any order the planner sees fit.
 The standard (non-GEQO) planner does this as follows:
 Consider joining each RelOptInfo to each other RelOptInfo specified in its
@@ -114,17 +112,17 @@ choice but to generate a clauseless Cartesian-product join; so we consider
 joining that rel to each other available rel.  But in the presence of join
 clauses we will only consider joins that use available join clauses.)
-If we only had two relations in the FROM list, we are done: we just pick
+If we only had two relations in the list, we are done: we just pick
 the cheapest path for the join RelOptInfo.  If we had more than two, we now
 need to consider ways of joining join RelOptInfos to each other to make
-join RelOptInfos that represent more than two FROM items.
+join RelOptInfos that represent more than two list items.
 The join tree is constructed using a "dynamic programming" algorithm:
 in the first pass (already described) we consider ways to create join rels
-representing exactly two FROM items.  The second pass considers ways
+representing exactly two list items.  The second pass considers ways
-to make join rels that represent exactly three FROM items; the next pass,
+to make join rels that represent exactly three list items; the next pass,
 four items, etc.  The last pass considers how to make the final join
-relation that includes all FROM items --- obviously there can be only one
+relation that includes all list items --- obviously there can be only one
 join rel at this top level, whereas there can be more than one join rel
 at lower levels.  At each level we use joins that follow available join
 clauses, if possible, just as described for the first level.
@@ -155,7 +153,7 @@ For example:
    {1 2 3 4}
 We consider left-handed plans (the outer rel of an upper join is a joinrel,
-but the inner is always a single FROM item); right-handed plans (outer rel
+but the inner is always a single list item); right-handed plans (outer rel
 is always a single item); and bushy plans (both inner and outer can be
 joins themselves).  For example, when building {1 2 3 4} we consider
 joining {1 2 3} to {4} (left-handed), {4} to {1 2 3} (right-handed), and
@@ -336,7 +334,9 @@ RelOptInfo      - a relation or joined relations
  MergePath     - merge joins
  HashPath      - hash joins
- PathKeys       - a data structure representing the ordering of a path
+ EquivalenceClass - a data structure representing a set of values known equal
 PathKey        - a data structure representing the sort ordering of a path
 The optimizer spends a good deal of its time worrying about the ordering
 of the tuples returned by a path.  The reason this is useful is that by
@@ -363,213 +363,250 @@ without sorting, since it can pick from any of the paths retained for its
 inputs.
 EquivalenceClasses
 ------------------
 During the deconstruct_jointree() scan of the query's qual clauses, we look
 for mergejoinable equality clauses A = B whose applicability is not delayed
 by an outer join; these are called "equivalence clauses".  When we find
 one, we create an EquivalenceClass containing the expressions A and B to
 record this knowledge.  If we later find another equivalence clause B = C,
 we add C to the existing EquivalenceClass for {A B}; this may require
 merging two existing EquivalenceClasses.  At the end of the scan, we have
 sets of values that are known all transitively equal to each other.  We can
 therefore use a comparison of any pair of the values as a restriction or
 join clause (when these values are available at the scan or join, of
 course); furthermore, we need test only one such comparison, not all of
 them.  Therefore, equivalence clauses are removed from the standard qual
 distribution process.  Instead, when preparing a restriction or join clause
 list, we examine each EquivalenceClass to see if it can contribute a
 clause, and if so we select an appropriate pair of values to compare.  For
 example, if we are trying to join A's relation to C's, we can generate the
 clause A = C, even though this appeared nowhere explicitly in the original
 query.  This may allow us to explore join paths that otherwise would have
 been rejected as requiring Cartesian-product joins.
 Sometimes an EquivalenceClass may contain a pseudo-constant expression
 (i.e., one not containing Vars or Aggs of the current query level, nor
 volatile functions).  In this case we do not follow the policy of
 dynamically generating join clauses: instead, we dynamically generate
 restriction clauses "var = const" wherever one of the variable members of
 the class can first be computed.  For example, if we have A = B and B = 42,
 we effectively generate the restriction clauses A = 42 and B = 42, and then
 we need not bother with explicitly testing the join clause A = B when the
 relations are joined.  In effect, all the class members can be tested at
 relation-scan level and there's never a need for join tests.
 The precise technical interpretation of an EquivalenceClass is that it
 asserts that at any plan node where more than one of its member values
 can be computed, output rows in which the values are not all equal may
 be discarded without affecting the query result.  (We require all levels
 of the plan to enforce EquivalenceClasses, hence a join need not recheck
 equality of values that were computable by one of its children.)  For an
 ordinary EquivalenceClass that is "valid everywhere", we can further infer
 that the values are all non-null, because all mergejoinable operators are
 strict.  However, we also allow equivalence clauses that appear below the
 nullable side of an outer join to form EquivalenceClasses; for these
 classes, the interpretation is that either all the values are equal, or
 all (except pseudo-constants) have gone to null.  (This requires a
 limitation that non-constant members be strict, else they might not go
 to null when the other members do.)  Consider for example
 	SELECT *
 	  FROM a LEFT JOIN
 	       (SELECT * FROM b JOIN c ON b.y = c.z WHERE b.y = 10) ss
 	       ON a.x = ss.y
 	  WHERE a.x = 42;
 We can form the below-outer-join EquivalenceClass {b.y c.z 10} and thereby
 apply c.z = 10 while scanning c.  (The reason we disallow outerjoin-delayed
 clauses from forming EquivalenceClasses is exactly that we want to be able
 to push any derived clauses as far down as possible.)  But once above the
 outer join it's no longer necessarily the case that b.y = 10, and thus we
 cannot use such EquivalenceClasses to conclude that sorting is unnecessary
 (see discussion of PathKeys below).
 In this example, notice also that a.x = ss.y (really a.x = b.y) is not an
 equivalence clause because its applicability to b is delayed by the outer
 join; thus we do not try to insert b.y into the equivalence class {a.x 42}.
 But since we see that a.x has been equated to 42 above the outer join, we
 are able to form a below-outer-join class {b.y 42}; this restriction can be
 added because no b/c row not having b.y = 42 can contribute to the result
 of the outer join, and so we need not compute such rows.  Now this class
 will get merged with {b.y c.z 10}, leading to the contradiction 10 = 42,
 which lets the planner deduce that the b/c join need not be computed at all
 because none of its rows can contribute to the outer join.  (This gets
 implemented as a gating Result filter, since more usually the potential
 contradiction involves Param values rather than just Consts, and thus has
 to be checked at runtime.)
 To aid in determining the sort ordering(s) that can work with a mergejoin,
 we mark each mergejoinable clause with the EquivalenceClasses of its left
 and right inputs.  For an equivalence clause, these are of course the same
 EquivalenceClass.  For a non-equivalence mergejoinable clause (such as an
 outer-join qualification), we generate two separate EquivalenceClasses for
 the left and right inputs.  This may result in creating single-item
 equivalence "classes", though of course these are still subject to merging
 if other equivalence clauses are later found to bear on the same
 expressions.
 Another way that we may form a single-item EquivalenceClass is in creation
 of a PathKey to represent a desired sort order (see below).  This is a bit
 different from the above cases because such an EquivalenceClass might
 contain an aggregate function or volatile expression.  (A clause containing
 a volatile function will never be considered mergejoinable, even if its top
 operator is mergejoinable, so there is no way for a volatile expression to
 get into EquivalenceClasses otherwise.  Aggregates are disallowed in WHERE
 altogether, so will never be found in a mergejoinable clause.)  This is just
 a convenience to maintain a uniform PathKey representation: such an
 EquivalenceClass will never be merged with any other.
 An EquivalenceClass also contains a list of btree opfamily OIDs, which
 determines what the equalities it represents actually "mean".  All the
 equivalence clauses that contribute to an EquivalenceClass must have
 equality operators that belong to the same set of opfamilies.  (Note: most
 of the time, a particular equality operator belongs to only one family, but
 it's possible that it belongs to more than one.  We keep track of all the
 families to ensure that we can make use of an index belonging to any one of
 the families for mergejoin purposes.)
 PathKeys
 --------
 The PathKeys data structure represents what is known about the sort order
-of a particular Path.
+of the tuples generated by a particular Path.  A path's pathkeys field is a
 list of PathKey nodes, where the n'th item represents the n'th sort key of
 the result.  Each PathKey contains these fields:
-Path.pathkeys is a List of Lists of PathKeyItem nodes that represent
+	* a reference to an EquivalenceClass
-the sort order of the result generated by the Path.  The n'th sublist
+	* a btree opfamily OID (must match one of those in the EC)
-represents the n'th sort key of the result.
+	* a sort direction (ascending or descending)
 	* a nulls-first-or-last flag
 The EquivalenceClass represents the value being sorted on.  Since the
 various members of an EquivalenceClass are known equal according to the
 opfamily, we can consider a path sorted by any one of them to be sorted by
 any other too; this is what justifies referencing the whole
 EquivalenceClass rather than just one member of it.
 In single/base relation RelOptInfo's, the Paths represent various ways
 of scanning the relation and the resulting ordering of the tuples.
 Sequential scan Paths have NIL pathkeys, indicating no known ordering.
 Index scans have Path.pathkeys that represent the chosen index's ordering,
-if any.  A single-key index would create a pathkey with a single sublist,
+if any.  A single-key index would create a single-PathKey list, while a
-e.g. ( (tab1.indexkey1/sortop1) ).  A multi-key index generates a sublist
+multi-column index generates a list with one element per index column.
-per key, e.g. ( (tab1.indexkey1/sortop1) (tab1.indexkey2/sortop2) ) which
+(Actually, since an index can be scanned either forward or backward, there
-shows major sort by indexkey1 (ordering by sortop1) and minor sort by
+are two possible sort orders and two possible PathKey lists it can
-indexkey2 with sortop2.
+generate.)
-Note that a multi-pass indexscan (OR clause scan) has NIL pathkeys since
+Note that a bitmap scan or multi-pass indexscan (OR clause scan) has NIL
-we can say nothing about the overall order of its result.  Also, an
+pathkeys since we can say nothing about the overall order of its result.
-indexscan on an unordered type of index generates NIL pathkeys.  However,
+Also, an indexscan on an unordered type of index generates NIL pathkeys.
-we can always create a pathkey by doing an explicit sort.  The pathkeys
+However, we can always create a pathkey by doing an explicit sort.  The
-for a Sort plan's output just represent the sort key fields and the
+pathkeys for a Sort plan's output just represent the sort key fields and
-ordering operators used.
+the ordering operators used.
 Things get more interesting when we consider joins.  Suppose we do a
 mergejoin between A and B using the mergeclause A.X = B.Y.  The output
-of the mergejoin is sorted by X --- but it is also sorted by Y.  We
+of the mergejoin is sorted by X --- but it is also sorted by Y.  Again,
-represent this fact by listing both keys in a single pathkey sublist:
+this can be represented by a PathKey referencing an EquivalenceClass
-( (A.X/xsortop B.Y/ysortop) ).  This pathkey asserts that the major
+containing both X and Y.
-sort order of the Path can be taken to be *either* A.X or B.Y.
-They are equal, so they are both primary sort keys.  By doing this,
-we allow future joins to use either var as a pre-sorted key, so upper
-Mergejoins may be able to avoid having to re-sort the Path.  This is
-why pathkeys is a List of Lists.
-We keep a sortop associated with each PathKeyItem because cross-data-type
+With a little further thought, it becomes apparent that nestloop joins
-mergejoins are possible; for example int4 = int8 is mergejoinable.
+can also produce sorted output.  For example, if we do a nestloop join
-In this case we need to remember that the left var is ordered by int4lt
+between outer relation A and inner relation B, then any pathkeys relevant
-while the right var is ordered by int8lt.  So the different members of
+to A are still valid for the join result: we have not altered the order of
-each sublist could have different sortops.
+the tuples from A.  Even more interesting, if there was an equivalence clause
-
+A.X=B.Y, and A.X was a pathkey for the outer relation A, then we can assert
-Note that while the order of the top list is meaningful (primary vs.
+that B.Y is a pathkey for the join result; X was ordered before and still
-secondary sort key), the order of each sublist is arbitrary.  Each sublist
+is, and the joined values of Y are equal to the joined values of X, so Y
-should be regarded as a set of equivalent keys, with no significance
-to the list order.
-With a little further thought, it becomes apparent that pathkeys for
-joins need not only come from mergejoins.  For example, if we do a
-nestloop join between outer relation A and inner relation B, then any
-pathkeys relevant to A are still valid for the join result: we have
-not altered the order of the tuples from A.  Even more interesting,
-if there was a mergeclause (more formally, an "equijoin clause") A.X=B.Y,
-and A.X was a pathkey for the outer relation A, then we can assert that
-B.Y is a pathkey for the join result; X was ordered before and still is,
-and the joined values of Y are equal to the joined values of X, so Y
 must now be ordered too.  This is true even though we used neither an
-explicit sort nor a mergejoin on Y.
+explicit sort nor a mergejoin on Y.  (Note: hash joins cannot be counted
+on to preserve the order of their outer relation, because the executor
+might decide to "batch" the join, so we always set pathkeys to NIL for
+a hashjoin path.)  Exception: a RIGHT or FULL join doesn't preserve the
+ordering of its outer relation, because it might insert nulls at random
+points in the ordering.
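The rules in the paragraph above about when a join keeps its outer input's ordering can be condensed into a small predicate. This is a sketch assuming simplified enums; the `Sketch`-prefixed names and the function are hypothetical and are not part of the planner.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical join method and join type codes, for illustration only */
typedef enum
{
	SKETCH_NESTLOOP,
	SKETCH_MERGEJOIN,
	SKETCH_HASHJOIN
} SketchJoinMethod;

typedef enum
{
	SKETCH_JOIN_INNER,
	SKETCH_JOIN_LEFT,
	SKETCH_JOIN_RIGHT,
	SKETCH_JOIN_FULL
} SketchJoinType;

/*
 * Can the join result be labeled with the outer path's pathkeys?
 *
 * Hash joins never qualify, because the executor may batch the join.
 * RIGHT and FULL joins never qualify, because null-extended rows may be
 * emitted at arbitrary points in the ordering.
 */
bool
sketch_join_keeps_outer_order(SketchJoinMethod method, SketchJoinType jointype)
{
	if (method == SKETCH_HASHJOIN)
		return false;
	if (jointype == SKETCH_JOIN_RIGHT || jointype == SKETCH_JOIN_FULL)
		return false;
	return true;
}
```

When the predicate holds and the equivalence A.X=B.Y is enforced by the join, a pathkey on the class containing A.X also describes B.Y, with no extra bookkeeping.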
-More generally, whenever we have an equijoin clause A.X = B.Y and a
+In general, we can justify using EquivalenceClasses as the basis for
-pathkey A.X, we can add B.Y to that pathkey if B is part of the joined
+pathkeys because, whenever we scan a relation containing multiple
-relation the pathkey is for, *no matter how we formed the join*.  It works
+EquivalenceClass members or join two relations each containing
-as long as the clause has been applied at some point while forming the
+EquivalenceClass members, we apply restriction or join clauses derived from
-join relation.  (In the current implementation, we always apply qual
+the EquivalenceClass.  This guarantees that any two values listed in the
-clauses as soon as possible, ie, as far down in the plan tree as possible.
+EquivalenceClass are in fact equal in all tuples emitted by the scan or
-So we can treat the pathkeys as equivalent everywhere.  The exception is
+join, and therefore that if the tuples are sorted by one of the values,
-when the relations A and B are joined inside the nullable side of an
+they can be considered sorted by any other as well.  It does not matter
-OUTER JOIN and the equijoin clause comes from above the OUTER JOIN.  In this
+whether the test clause is used as a mergeclause, or merely enforced
-case we cannot apply the qual as soon as A and B are joined, so we do not
+after-the-fact as a qpqual filter.
-consider the pathkeys to be equivalent.  This could be improved if we wanted
-to go to the trouble of making pathkey equivalence be context-dependent,
-but that seems much more complex than it's worth.)
-In short, then: when producing the pathkeys for a merge or nestloop join,
+Note that there is no particular difficulty in labeling a path's sort
-we can keep all of the keys of the outer path, since the ordering of the
+order with a PathKey referencing an EquivalenceClass that contains
-outer path will be preserved in the result.  Furthermore, we can add to
+variables not yet joined into the path's output.  We can simply ignore
-each pathkey sublist any inner vars that are equijoined to any of the
+such entries as not being relevant (yet).  This makes it possible to
-outer vars in the sublist; this works regardless of whether we are
+use the same EquivalenceClasses throughout the join planning process.
-implementing the join using that equijoin clause as a mergeclause,
+In fact, by being careful not to generate multiple identical PathKey
-or merely enforcing the clause after-the-fact as a qpqual filter.
+objects, we can reduce comparison of EquivalenceClasses and PathKeys
-
+to simple pointer comparison, which is a huge savings because add_path
-Although Hashjoins also work only with equijoin operators, it is *not*
+has to make a large number of PathKey comparisons in deciding whether
-safe to consider the output of a Hashjoin to be sorted in any particular
+competing Paths are equivalently sorted.
-order --- not even the outer path's order.  This is true because the
-executor might have to split the join into multiple batches.  Therefore
-a Hashjoin is always given NIL pathkeys.  (Also, we need to use only
-mergejoinable operators when deducing which inner vars are now sorted,
-because a mergejoin operator tells us which left- and right-datatype
-sortops can be considered equivalent, whereas a hashjoin operator
-doesn't imply anything about sort order.)
 Pathkeys are also useful to represent an ordering that we wish to achieve,
 since they are easily compared to the pathkeys of a potential candidate
 path.  So, SortClause lists are turned into pathkeys lists for use inside
 the optimizer.
-OK, now for how it *really* works:
+Because we have to generate pathkeys lists from the sort clauses before
-
+we've finished EquivalenceClass merging, we cannot use the pointer-equality
-We did implement pathkeys just as described above, and found that the
+method of comparing PathKeys in the earliest stages of the planning
-planner spent a huge amount of time comparing pathkeys, because the
+process.  Instead, we generate "non canonical" PathKeys that reference
-representation of pathkeys as unordered lists made it expensive to decide
+single-element EquivalenceClasses that might get merged later.  After we
-whether two were equal or not.  So, we've modified the representation
+complete EquivalenceClass merging, we replace these with "canonical"
-as described next.
+PathKeys that reference only fully-merged classes, and after that we make
-
+sure we don't generate more than one copy of each "canonical" PathKey.
-If we scan the WHERE clause for equijoin clauses (mergejoinable clauses)
+Then it is safe to use pointer comparison on canonical PathKeys.
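The "one copy per canonical PathKey" rule described above amounts to interning: look up an existing identical key before creating a new one, so that equality can later be tested by comparing addresses. The following is a minimal sketch of that idea only. The fixed-size table, the `Sketch`-prefixed types and the function name are hypothetical, and the real planner keeps its keys in its own list structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_CANON 64

typedef struct SketchPathKey
{
	const void *pk_eclass;		/* fully merged equivalence class */
	unsigned int pk_opfamily;
	int			pk_strategy;
	bool		pk_nulls_first;
} SketchPathKey;

/* Hypothetical per-query state holding the single copy of each key */
typedef struct SketchPlannerState
{
	SketchPathKey canon[SKETCH_MAX_CANON];
	size_t		ncanon;
} SketchPlannerState;

/*
 * Return the one canonical key with these field values, creating it if it
 * does not exist yet.  Returns NULL if the sketch's fixed table is full.
 */
const SketchPathKey *
sketch_make_canonical_pathkey(SketchPlannerState *state,
							  const void *eclass,
							  unsigned int opfamily,
							  int strategy,
							  bool nulls_first)
{
	size_t		i;
	SketchPathKey *key;

	for (i = 0; i < state->ncanon; i++)
	{
		key = &state->canon[i];
		if (key->pk_eclass == eclass &&
			key->pk_opfamily == opfamily &&
			key->pk_strategy == strategy &&
			key->pk_nulls_first == nulls_first)
			return key;			/* reuse the existing copy */
	}

	if (state->ncanon >= SKETCH_MAX_CANON)
		return NULL;

	key = &state->canon[state->ncanon++];
	key->pk_eclass = eclass;
	key->pk_opfamily = opfamily;
	key->pk_strategy = strategy;
	key->pk_nulls_first = nulls_first;
	return key;
}
```

Two requests with identical field values return the same address, so callers that compare many keys can use `a == b` instead of a field-by-field check.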
-during planner startup, we can construct lists of equivalent pathkey items
-for the query.  There could be more than two items per equivalence set;
-for example, WHERE A.X = B.Y AND B.Y = C.Z AND D.R = E.S creates the
-equivalence sets { A.X B.Y C.Z } and { D.R E.S } (plus associated sortops).
-Any pathkey item that belongs to an equivalence set implies that all the
-other items in its set apply to the relation too, or at least all the ones
-that are for fields present in the relation.  (Some of the items in the
-set might be for as-yet-unjoined relations.)  Furthermore, any multi-item
-pathkey sublist that appears at any stage of planning the query *must* be
-a subset of one or another of these equivalence sets; there's no way we'd
-have put two items in the same pathkey sublist unless they were equijoined
-in WHERE.
-Now suppose that we allow a pathkey sublist to contain pathkey items for
-vars that are not yet part of the pathkey's relation.  This introduces
-no logical difficulty, because such items can easily be seen to be
-irrelevant; we just mandate that they be ignored.  But having allowed
-this, we can declare (by fiat) that any multiple-item pathkey sublist
-must be "equal()" to the appropriate equivalence set.  In effect,
-whenever we make a pathkey sublist that mentions any var appearing in an
-equivalence set, we instantly add all the other vars equivalenced to it,
-whether they appear yet in the pathkey's relation or not.  And we also
-mandate that the pathkey sublist appear in the same order as the
-equivalence set it comes from.
-In fact, we can go even further, and say that the canonical representation
-of a pathkey sublist is a pointer directly to the relevant equivalence set,
-which is kept in a list of pathkey equivalence sets for the query.  Then
-pathkey sublist comparison reduces to pointer-equality checking!  To do this
-we also have to add single-element pathkey sublists to the query's list of
-equivalence sets, but that's a small price to pay.
-By the way, it's OK and even useful for us to build equivalence sets
-that mention multiple vars from the same relation.  For example, if
-we have WHERE A.X = A.Y and we are scanning A using an index on X,
-we can legitimately conclude that the path is sorted by Y as well;
-and this could be handy if Y is the variable used in other join clauses
-or ORDER BY.  So, any WHERE clause with a mergejoinable operator can
-contribute to an equivalence set, even if it's not a join clause.
-As sketched so far, equijoin operators allow us to conclude that
-A.X = B.Y and B.Y = C.Z together imply A.X = C.Z, even when different
-datatypes are involved.  What is not immediately obvious is that to use
-the "canonical pathkey" representation, we *must* make this deduction.
-An example (from a real bug in Postgres 7.0) is a mergejoin for a query
-like
-	SELECT * FROM t1, t2 WHERE t1.f2 = t2.f3 AND t1.f1 = t2.f3;
-The canonical-pathkey mechanism is able to deduce that t1.f1 = t1.f2
-(ie, both appear in the same canonical pathkey set).  If we sort t1
-and then apply a mergejoin, we *must* filter the t1 tuples using the
-implied qualification f1 = f2, because otherwise the output of the sort
-will be ordered by f1 or f2 (whichever we sort on) but not both.  The
-merge will then fail since (depending on which qual clause it applies
-first) it's expecting either ORDER BY f1,f2 or ORDER BY f2,f1, but the
-actual output of the sort has neither of these orderings.  The best fix
-for this is to generate all the implied equality constraints for each
-equijoin set and add these clauses to the query's qualification list.
-In other words, we *explicitly* deduce f1 = f2 and add this to the WHERE
-clause.  The constraint will be applied as a qpqual to the output of the
-scan on t1, resulting in sort output that is indeed ordered by both vars.
-This approach provides more information to the selectivity estimation
-code than it would otherwise have, and reduces the number of tuples
-processed in join stages, so it's a win to make these deductions even
-if we weren't forced to.
-When we generate implied equality constraints, we may find ourselves
-adding redundant clauses to specific relations.  For example, consider
-	SELECT * FROM t1, t2, t3 WHERE t1.a = t2.b AND t2.b = t3.c;
-We will generate the implied clause t1.a = t3.c and add it to the tree.
-This is good since it allows us to consider joining t1 and t3 directly,
-which we otherwise wouldn't do.  But when we reach the stage of joining
-all three relations, we will have redundant join clauses --- eg, if we
-join t1 and t2 first, then the path that joins (t1 t2) to t3 will have
-both t2.b = t3.c and t1.a = t3.c as restriction clauses.  This is bad;
-not only is evaluation of the extra clause useless work at runtime,
-but the selectivity estimator routines will underestimate the number
-of tuples produced since they won't know that the two clauses are
-perfectly redundant.  We fix this by detecting and removing redundant
-clauses as the restriction clause list is built for each join.  (We
-can't do it sooner, since which clauses are redundant will vary depending
-on the join order.)
-Yet another implication of all this is that mergejoinable operators
-must form closed equivalence sets.  For example, if "int2 = int4"
-and "int4 = int8" are both marked mergejoinable, then there had better
-be a mergejoinable "int2 = int8" operator as well.  Otherwise, when
-we're given WHERE int2var = int4var AND int4var = int8var, we'll fail
-while trying to create a representation of the implied clause
-int2var = int8var.
 An additional refinement we can make is to insist that canonical pathkey
-lists (sort orderings) do not mention the same pathkey set more than once.
+lists (sort orderings) do not mention the same EquivalenceClass more than
-For example, a pathkey list ((A) (B) (A)) is redundant --- the second
+once.  For example, in all these cases the second sort column is redundant,
-occurrence of (A) does not change the ordering, since the data must already
+because it cannot distinguish values that are the same according to the
-be sorted by A.  Although a user probably wouldn't write ORDER BY A,B,A
+first sort column:
-directly, such redundancies are more probable once equijoin equivalences
+	SELECT ... ORDER BY x, x
-have been considered.  Also, the system is likely to generate redundant
+	SELECT ... ORDER BY x, x DESC
-pathkey lists when computing the sort ordering needed for a mergejoin.  By
+	SELECT ... WHERE x = y ORDER BY x, y
-eliminating the redundancy, we save time and improve planning, since the
+Although a user probably wouldn't write "ORDER BY x,x" directly, such
-planner will more easily recognize equivalent orderings as being equivalent.
+redundancies are more probable once equivalence classes have been
+considered.  Also, the system may generate redundant pathkey lists when
+computing the sort ordering needed for a mergejoin.  By eliminating the
+redundancy, we save time and improve planning, since the planner will more
+easily recognize equivalent orderings as being equivalent.
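The redundancy rule above can be sketched as a single pass that keeps only the first key for each EquivalenceClass. This is an illustrative sketch, not the planner's code: the `Sketch`-prefixed type and the function name are hypothetical, and class identity is modeled as plain pointer equality.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct SketchPathKey
{
	const void *pk_eclass;		/* class of values this key sorts on */
	int			pk_strategy;	/* sort direction */
} SketchPathKey;

/*
 * Copy "in" to "out", dropping every key whose class already appeared
 * earlier in the list.  A later key on the same class cannot distinguish
 * rows that the earlier key considers equal, whatever its direction.
 * "out" must have room for n entries; returns the number of keys kept.
 */
size_t
sketch_drop_redundant_pathkeys(const SketchPathKey *in, size_t n,
							   SketchPathKey *out)
{
	size_t		nout = 0;
	size_t		i;
	size_t		j;

	for (i = 0; i < n; i++)
	{
		bool		seen = false;

		for (j = 0; j < nout; j++)
		{
			if (out[j].pk_eclass == in[i].pk_eclass)
			{
				seen = true;
				break;
			}
		}
		if (!seen)
			out[nout++] = in[i];
	}
	return nout;
}
```

Under this model, "ORDER BY x, x DESC" and "WHERE x = y ORDER BY x, y" both reduce to a single key, because x and y map to the same class once equivalences have been merged.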
+Another interesting property is that if the underlying EquivalenceClass
+contains a constant and is not below an outer join, then the pathkey is
+completely redundant and need not be sorted by at all!  Every row must
+contain the same constant value, so there's no need to sort.  (If the EC is
+below an outer join, we still have to sort, since some of the rows might
+have gone to null and others not.  In this case we must be careful to pick
+a non-const member to sort by.  The assumption that all the non-const
+members go to null at the same plan level is critical here, else they might
+not produce the same sort order.)  This might seem pointless because users
+are unlikely to write "... WHERE x = 42 ORDER BY x", but it allows us to
+recognize when particular index columns are irrelevant to the sort order:
+if we have "... WHERE x = 42 ORDER BY y", scanning an index on (x,y)
+produces correctly ordered data without a sort step.  We used to have very
+ugly ad-hoc code to recognize that in limited contexts, but discarding
+constant ECs from pathkeys makes it happen cleanly and automatically.
+You might object that a below-outer-join EquivalenceClass doesn't always
+represent the same values at every level of the join tree, and so using
+it to uniquely identify a sort order is dubious.  This is true, but we
+can avoid dealing with the fact explicitly because we always consider that
+an outer join destroys any ordering of its nullable inputs.  Thus, even
+if a path was sorted by {a.x} below an outer join, we'll re-sort if that
+sort ordering was important; and so using the same PathKey for both sort
+orderings doesn't create any real problem.
 Though Bob Devine <bob.devine@worldnet.att.net> was not involved in the 
 coding of our optimizer, he is available to field questions about
--- a/src/backend/optimizer/path/Makefile
+++ b/src/backend/optimizer/path/Makefile
@@ -4,7 +4,7 @@
 #    Makefile for optimizer/path
 #
 # IDENTIFICATION
-#    $PostgreSQL: pgsql/src/backend/optimizer/path/Makefile,v 1.17 2007/01/20 17:16:11 petere Exp $
+#    $PostgreSQL: pgsql/src/backend/optimizer/path/Makefile,v 1.18 2007/01/20 20:45:38 tgl Exp $
 #
 #-------------------------------------------------------------------------
@@ -12,7 +12,7 @@ subdir = src/backend/optimizer/path
 top_builddir = ../../../..
 include $(top_builddir)/src/Makefile.global
-OBJS = allpaths.o clausesel.o costsize.o indxpath.o \
+OBJS = allpaths.o clausesel.o costsize.o equivclass.o indxpath.o \
       joinpath.o joinrels.o orindxpath.o pathkeys.o tidpath.o
 all: SUBSYS.o
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.156 2007/01/09 02:14:12 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/allpaths.c,v 1.157 2007/01/20 20:45:38 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -325,6 +325,16 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 			adjust_appendrel_attrs((Node *) rel->joininfo,
 								   appinfo);
+		/*
+		 * We have to make child entries in the EquivalenceClass data
+		 * structures as well.
+		 */
+		if (rel->has_eclass_joins)
+		{
+			add_child_rel_equivalences(root, appinfo, rel, childrel);
+			childrel->has_eclass_joins = true;
+		}
 		/*
 		 * Copy the parent's attr_needed data as well, with appropriate
 		 * adjustment of relids and attribute numbers.
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -54,7 +54,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.174 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/costsize.c,v 1.175 2007/01/20 20:45:38 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -1258,8 +1258,6 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
 	Path	   *outer_path = path->jpath.outerjoinpath;
 	Path	   *inner_path = path->jpath.innerjoinpath;
 	List	   *mergeclauses = path->path_mergeclauses;
-	Oid		   *mergeFamilies = path->path_mergeFamilies;
-	int		   *mergeStrategies = path->path_mergeStrategies;
 	List	   *outersortkeys = path->outersortkeys;
 	List	   *innersortkeys = path->innersortkeys;
 	Cost		startup_cost = 0;
@@ -1268,7 +1266,6 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
 	Selectivity merge_selec;
 	QualCost	merge_qual_cost;
 	QualCost	qp_qual_cost;
-	RestrictInfo *firstclause;
 	double		outer_path_rows = PATH_ROWS(outer_path);
 	double		inner_path_rows = PATH_ROWS(inner_path);
 	double		outer_rows,
@@ -1347,32 +1344,47 @@ cost_mergejoin(MergePath *path, PlannerInfo *root)
 	 * inputs that will actually need to be scanned. We use only the first
 	 * (most significant) merge clause for this purpose.
 	 *
-	 * Since this calculation is somewhat expensive, and will be the same for
-	 * all mergejoin paths associated with the merge clause, we cache the
-	 * results in the RestrictInfo node.  XXX that won't work anymore once
-	 * we support multiple possible orderings!
+	 * XXX mergejoinscansel is a bit expensive, can we cache its results?
 	 */
 	if (mergeclauses && path->jpath.jointype != JOIN_FULL)
 	{
-		firstclause = (RestrictInfo *) linitial(mergeclauses);
+		RestrictInfo *firstclause = (RestrictInfo *) linitial(mergeclauses);
-		if (firstclause->left_mergescansel < 0) /* not computed yet? */
+		List	   *opathkeys;
-			mergejoinscansel(root, (Node *) firstclause->clause,
+		List	   *ipathkeys;
-							 mergeFamilies[0],
+		PathKey	   *opathkey;
-							 mergeStrategies[0],
+		PathKey	   *ipathkey;
-							 &firstclause->left_mergescansel,
+		Selectivity leftscansel,
-							 &firstclause->right_mergescansel);
+					rightscansel;
-		if (bms_is_subset(firstclause->left_relids, outer_path->parent->relids))
+		/* Get the input pathkeys to determine the sort-order details */
+		opathkeys = outersortkeys ? outersortkeys : outer_path->pathkeys;
+		ipathkeys = innersortkeys ? innersortkeys : inner_path->pathkeys;
+		Assert(opathkeys);
+		Assert(ipathkeys);
+		opathkey = (PathKey *) linitial(opathkeys);
+		ipathkey = (PathKey *) linitial(ipathkeys);
+		/* debugging check */
+		if (opathkey->pk_opfamily != ipathkey->pk_opfamily ||
+			opathkey->pk_strategy != ipathkey->pk_strategy ||
+			opathkey->pk_nulls_first != ipathkey->pk_nulls_first)
+			elog(ERROR, "left and right pathkeys do not match in mergejoin");
+		mergejoinscansel(root, (Node *) firstclause->clause,
+						 opathkey->pk_opfamily, opathkey->pk_strategy,
+						 &leftscansel, &rightscansel);
+		if (bms_is_subset(firstclause->left_relids,
+						  outer_path->parent->relids))
 		{
 			/* left side of clause is outer */
-			outerscansel = firstclause->left_mergescansel;
+			outerscansel = leftscansel;
-			innerscansel = firstclause->right_mergescansel;
+			innerscansel = rightscansel;
 		}
 		else
 		{
 			/* left side of clause is inner */
-			outerscansel = firstclause->right_mergescansel;
+			outerscansel = rightscansel;
-			innerscansel = firstclause->left_mergescansel;
+			innerscansel = leftscansel;
 		}
 		if (path->jpath.jointype == JOIN_LEFT)
 			outerscansel = 1.0;
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -9,7 +9,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.215 2007/01/09 02:14:12 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/indxpath.c,v 1.216 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -32,7 +32,6 @@
 #include "optimizer/var.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
-#include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/selfuncs.h"
@@ -72,21 +71,11 @@ static bool match_rowcompare_to_indexcol(IndexOptInfo *index,
 							 Oid opfamily,
 							 RowCompareExpr *clause,
 							 Relids outer_relids);
-static Relids indexable_outerrelids(RelOptInfo *rel);
+static Relids indexable_outerrelids(PlannerInfo *root, RelOptInfo *rel);
 static bool matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel,
 				  Relids outer_relids);
 static List *find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
 					  Relids outer_relids, bool isouterjoin);
-static ScanDirection match_variant_ordering(PlannerInfo *root,
-					   IndexOptInfo *index,
-					   List *restrictclauses);
-static List *identify_ignorable_ordering_cols(PlannerInfo *root,
-								 IndexOptInfo *index,
-								 List *restrictclauses);
-static bool match_index_to_query_keys(PlannerInfo *root,
-						  IndexOptInfo *index,
-						  ScanDirection indexscandir,
-						  List *ignorables);
 static bool match_boolean_index_clause(Node *clause, int indexcol,
 						   IndexOptInfo *index);
 static bool match_special_index_operator(Expr *clause, Oid opfamily,
@@ -157,7 +146,7 @@ create_index_paths(PlannerInfo *root, RelOptInfo *rel)
 	 * participate in such join clauses.  We'll use this set later to
 	 * recognize outer rels that are equivalent for joining purposes.
 	 */
-	rel->index_outer_relids = indexable_outerrelids(rel);
+	rel->index_outer_relids = indexable_outerrelids(root, rel);
 	/*
 	 * Find all the index paths that are directly usable for this relation
@@ -351,8 +340,7 @@ find_usable_indexes(PlannerInfo *root, RelOptInfo *rel,
 		if (index_is_ordered && istoplevel && outer_rel == NULL)
 		{
 			index_pathkeys = build_index_pathkeys(root, index,
-												  ForwardScanDirection,
+												  ForwardScanDirection);
-												  true);
 			useful_pathkeys = truncate_useless_pathkeys(root, rel,
 														index_pathkeys);
 		}
@@ -378,23 +366,21 @@ find_usable_indexes(PlannerInfo *root, RelOptInfo *rel,
 		}
 		/*
-		 * 4. If the index is ordered, and there is a requested query ordering
-		 * that we failed to match, consider variant ways of achieving the
-		 * ordering.  Again, this is only interesting at top level.
+		 * 4. If the index is ordered, a backwards scan might be
+		 * interesting.  Again, this is only interesting at top level.
 		 */
-		if (index_is_ordered && istoplevel && outer_rel == NULL &&
-			root->query_pathkeys != NIL &&
-			pathkeys_useful_for_ordering(root, useful_pathkeys) == 0)
+		if (index_is_ordered && istoplevel && outer_rel == NULL)
 		{
-			ScanDirection scandir;
+			index_pathkeys = build_index_pathkeys(root, index,
-
+												  BackwardScanDirection);
-			scandir = match_variant_ordering(root, index, restrictclauses);
+			useful_pathkeys = truncate_useless_pathkeys(root, rel,
-			if (!ScanDirectionIsNoMovement(scandir))
+														index_pathkeys);
+			if (useful_pathkeys != NIL)
 			{
 				ipath = create_index_path(root, index,
 										  restrictclauses,
-										  root->query_pathkeys,
+										  useful_pathkeys,
-										  scandir,
+										  BackwardScanDirection,
 										  outer_rel);
 				result = lappend(result, ipath);
 			}
@@ -1207,19 +1193,6 @@ check_partial_indexes(PlannerInfo *root, RelOptInfo *rel)
 	List	   *restrictinfo_list = rel->baserestrictinfo;
 	ListCell   *ilist;
-	/*
-	 * Note: if Postgres tried to optimize queries by forming equivalence
-	 * classes over equi-joined attributes (i.e., if it recognized that a
-	 * qualification such as "where a.b=c.d and a.b=5" could make use of an
-	 * index on c.d), then we could use that equivalence class info here with
-	 * joininfo lists to do more complete tests for the usability of a partial
-	 * index.  For now, the test only uses restriction clauses (those in
-	 * baserestrictinfo). --Nels, Dec '92
-	 *
-	 * XXX as of 7.1, equivalence class info *is* available.  Consider
-	 * improving this code as foreseen by Nels.
-	 */
 	foreach(ilist, rel->indexlist)
 	{
 		IndexOptInfo *index = (IndexOptInfo *) lfirst(ilist);
@@ -1242,18 +1215,19 @@ check_partial_indexes(PlannerInfo *root, RelOptInfo *rel)
 *	  for the specified table.	Returns a set of relids.
 */
 static Relids
-indexable_outerrelids(RelOptInfo *rel)
+indexable_outerrelids(PlannerInfo *root, RelOptInfo *rel)
 {
 	Relids		outer_relids = NULL;
-	ListCell   *l;
+	bool		is_child_rel = (rel->reloptkind == RELOPT_OTHER_MEMBER_REL);
+	ListCell   *lc1;
 	/*
 	 * Examine each joinclause in the joininfo list to see if it matches any
 	 * key of any index.  If so, add the clause's other rels to the result.
 	 */
-	foreach(l, rel->joininfo)
+	foreach(lc1, rel->joininfo)
 	{
-		RestrictInfo *joininfo = (RestrictInfo *) lfirst(l);
+		RestrictInfo *joininfo = (RestrictInfo *) lfirst(lc1);
 		Relids		other_rels;
 		other_rels = bms_difference(joininfo->required_relids, rel->relids);
@@ -1263,6 +1237,71 @@ indexable_outerrelids(RelOptInfo *rel)
 			bms_free(other_rels);
 	}
+	/*
+	 * We also have to look through the query's EquivalenceClasses to see
+	 * if any of them could generate indexable join conditions for this rel.
+	 */
+	if (rel->has_eclass_joins)
+	{
+		foreach(lc1, root->eq_classes)
+		{
+			EquivalenceClass *cur_ec = (EquivalenceClass *) lfirst(lc1);
+			Relids		other_rels = NULL;
+			bool		found_index = false;
+			ListCell   *lc2;
+			/*
+			 * Won't generate joinclauses if const or single-member (the latter
+			 * test covers the volatile case too)
+			 */
+			if (cur_ec->ec_has_const || list_length(cur_ec->ec_members) <= 1)
+				continue;
+			/*
+			 * Note we don't test ec_broken; if we did, we'd need a separate
+			 * code path to look through ec_sources.  Checking the members
+			 * anyway is OK as a possibly-overoptimistic heuristic.
+			 */
+			/*
+			 * No point in searching if rel not mentioned in eclass (but we
+			 * can't tell that for a child rel).
+			 */
+			if (!is_child_rel &&
+				!bms_is_subset(rel->relids, cur_ec->ec_relids))
+				continue;
+			/*
+			 * Scan members, looking for both an index match and join
+			 * candidates
+			 */
+			foreach(lc2, cur_ec->ec_members)
+			{
+				EquivalenceMember *cur_em = (EquivalenceMember *) lfirst(lc2);
+				/* Join candidate? */
+				if (!cur_em->em_is_child &&
+					!bms_overlap(cur_em->em_relids, rel->relids))
+				{
+					other_rels = bms_add_members(other_rels,
+												 cur_em->em_relids);
+					continue;
+				}
+				/* Check for index match (only need one) */
+				if (!found_index &&
+					bms_equal(cur_em->em_relids, rel->relids) &&
+					eclass_matches_any_index(cur_ec, cur_em, rel))
+					found_index = true;
+			}
+			if (found_index)
+				outer_relids = bms_join(outer_relids, other_rels);
+			else
+				bms_free(other_rels);
+		}
+	}
 	return outer_relids;
 }
@@ -1339,6 +1378,42 @@ matches_any_index(RestrictInfo *rinfo, RelOptInfo *rel, Relids outer_relids)
 	return false;
 }
+/*
+ * eclass_matches_any_index
+ *	  Workhorse for indexable_outerrelids: see if an EquivalenceClass member
+ *	  can be matched to any index column of the given rel.
+ *
+ * This is also exported for use by find_eclass_clauses_for_index_join.
+ */
+bool
+eclass_matches_any_index(EquivalenceClass *ec, EquivalenceMember *em,
+						 RelOptInfo *rel)
+{
+	ListCell   *l;
+	foreach(l, rel->indexlist)
+	{
+		IndexOptInfo *index = (IndexOptInfo *) lfirst(l);
+		int			indexcol = 0;
+		Oid		   *families = index->opfamily;
+		do
+		{
+			Oid			curFamily = families[0];
+			if (list_member_oid(ec->ec_opfamilies, curFamily) &&
+				match_index_to_operand((Node *) em->em_expr, indexcol, index))
+				return true;
+			indexcol++;
+			families++;
+		} while (!DoneMatchingIndexKeys(families));
+	}
+	return false;
+}
 /*
 * best_inner_indexscan
 *	  Finds the best available inner indexscan for a nestloop join
@@ -1393,12 +1468,12 @@ best_inner_indexscan(PlannerInfo *root, RelOptInfo *rel,
 		return NULL;
 	/*
-	 * Otherwise, we have to do path selection in the memory context of the
+	 * Otherwise, we have to do path selection in the main planning context,
-	 * given rel, so that any created path can be safely attached to the rel's
+	 * so that any created path can be safely attached to the rel's cache of
-	 * cache of best inner paths.  (This is not currently an issue for normal
+	 * best inner paths.  (This is not currently an issue for normal planning,
-	 * planning, but it is an issue for GEQO planning.)
+	 * but it is an issue for GEQO planning.)
 	 */
-	oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));
+	oldcontext = MemoryContextSwitchTo(root->planner_cxt);
 	/*
 	 * Intersect the given outer relids with index_outer_relids to find the
@@ -1539,7 +1614,12 @@ find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
 	Relids		join_relids;
 	ListCell   *l;
-	/* Look for joinclauses that are usable with given outer_relids */
+	/*
 	 * Look for joinclauses that are usable with given outer_relids.  Note
 	 * we'll take anything that's applicable to the join whether it has
 	 * anything to do with an index or not; since we're only building a list,
 	 * it's not worth filtering more finely here.
 	 */
 	join_relids = bms_union(rel->relids, outer_relids);
 	foreach(l, rel->joininfo)
@@ -1557,276 +1637,27 @@ find_clauses_for_join(PlannerInfo *root, RelOptInfo *rel,
 	bms_free(join_relids);
-	/* if no join clause was matched then forget it, per comments above */
+	/*
 	 * Also check to see if any EquivalenceClasses can produce a relevant
 	 * joinclause.  Since all such clauses are effectively pushed-down,
 	 * this doesn't apply to outer joins.
 	 */
 	if (!isouterjoin && rel->has_eclass_joins)
 		clause_list = list_concat(clause_list,
 								  find_eclass_clauses_for_index_join(root,
 																	 rel,
 															   outer_relids));
 	/* If no join clause was matched then forget it, per comments above */
 	if (clause_list == NIL)
 		return NIL;
-	/*
+	/* We can also use any plain restriction clauses for the rel */
-	 * We can also use any plain restriction clauses for the rel.  We put
-	 * these at the front of the clause list for the convenience of
-	 * remove_redundant_join_clauses, which can never remove non-join clauses
-	 * and hence won't be able to get rid of a non-join clause if it appears
-	 * after a join clause it is redundant with.
-	 */
 	clause_list = list_concat(list_copy(rel->baserestrictinfo), clause_list);
 	/*
 	 * We may now have clauses that are known redundant.  Get rid of 'em.
 	 */
 	if (list_length(clause_list) > 1)
 	{
 		clause_list = remove_redundant_join_clauses(root,
 													clause_list,
 													isouterjoin);
 	}
 	return clause_list;
 }
 /****************************************************************************
 *				----  ROUTINES TO HANDLE PATHKEYS  ----
 ****************************************************************************/
 /*
 * match_variant_ordering
 *		Try to match an index's ordering to the query's requested ordering
 *
 * This is used when the index is ordered but a naive comparison fails to
 * match its ordering (pathkeys) to root->query_pathkeys.  It may be that
 * we need to scan the index backwards.  Also, a less naive comparison can
 * help for both forward and backward indexscans.  Columns of the index
 * that have an equality restriction clause can be ignored in the match;
 * that is, an index on (x,y) can be considered to match the ordering of
 *		... WHERE x = 42 ORDER BY y;
 *
 * Note: it would be possible to similarly ignore useless ORDER BY items;
 * that is, an index on just y could be considered to match the ordering of
 *		... WHERE x = 42 ORDER BY x, y;
 * But proving that this is safe would require finding a btree opfamily
 * containing both the = operator and the < or > operator in the ORDER BY
 * item.  That's significantly more expensive than what we do here, since
 * we'd have to look at restriction clauses unrelated to the current index
 * and search for opfamilies without any hint from the index.  The practical
 * use-cases seem to be mostly covered by ignoring index columns, so that's
 * all we do for now.
 *
 * Inputs:
 * 'index' is the index of interest.
 * 'restrictclauses' is the list of sublists of restriction clauses
 *		matching the columns of the index (NIL if none)
 *
 * If able to match the requested query pathkeys, returns either
 * ForwardScanDirection or BackwardScanDirection to indicate the proper index
 * scan direction.	If no match, returns NoMovementScanDirection.
 */
 static ScanDirection
 match_variant_ordering(PlannerInfo *root,
 					   IndexOptInfo *index,
 					   List *restrictclauses)
 {
 	List	   *ignorables;
 	/*
 	 * Forget the whole thing if not a btree index; our check for ignorable
 	 * columns assumes we are dealing with btree opfamilies.  (It'd be possible
 	 * to factor out just the try for backwards indexscan, but considering
 	 * that we presently have no orderable indexes except btrees anyway, it's
 	 * hardly worth contorting this code for that case.)
 	 *
 	 * Note: if you remove this, you probably need to put in a check on
 	 * amoptionalkey to prevent possible clauseless scan on an index that
 	 * won't cope.
 	 */
 	if (index->relam != BTREE_AM_OID)
 		return NoMovementScanDirection;
 	/*
 	 * Figure out which index columns can be optionally ignored because they
 	 * have an equality constraint.  This is the same set for either forward
 	 * or backward scan, so we do it just once.
 	 */
 	ignorables = identify_ignorable_ordering_cols(root, index,
 												  restrictclauses);
 	/*
 	 * Try to match to forward scan, then backward scan.  However, we can skip
 	 * the forward-scan case if there are no ignorable columns, because
 	 * find_usable_indexes() would have found the match already.
 	 */
 	if (ignorables &&
 		match_index_to_query_keys(root, index, ForwardScanDirection,
 								  ignorables))
 		return ForwardScanDirection;
 	if (match_index_to_query_keys(root, index, BackwardScanDirection,
 								  ignorables))
 		return BackwardScanDirection;
 	return NoMovementScanDirection;
 }
 /*
 * identify_ignorable_ordering_cols
 *		Determine which index columns can be ignored for ordering purposes
 *
 * Returns an integer List of column numbers (1-based) of ignorable
 * columns.  The ignorable columns are those that have equality constraints
 * against pseudoconstants.
 */
 static List *
 identify_ignorable_ordering_cols(PlannerInfo *root,
 								 IndexOptInfo *index,
 								 List *restrictclauses)
 {
 	List	   *result = NIL;
 	int			indexcol = 0;	/* note this is 0-based */
 	ListCell   *l;
 	/* restrictclauses is either NIL or has a sublist per column */
 	foreach(l, restrictclauses)
 	{
 		List	   *sublist = (List *) lfirst(l);
 		Oid			opfamily = index->opfamily[indexcol];
 		ListCell   *l2;
 		foreach(l2, sublist)
 		{
 			RestrictInfo *rinfo = (RestrictInfo *) lfirst(l2);
 			OpExpr	   *clause = (OpExpr *) rinfo->clause;
 			Oid			clause_op;
 			int			op_strategy;
 			bool		varonleft;
 			bool		ispc;
 			/* First check for boolean-index cases. */
 			if (IsBooleanOpfamily(opfamily))
 			{
 				if (match_boolean_index_clause((Node *) clause, indexcol,
 											   index))
 				{
 					/*
 					 * The clause means either col = TRUE or col = FALSE; we
 					 * do not care which, it's an equality constraint either
 					 * way.
 					 */
 					result = lappend_int(result, indexcol + 1);
 					break;
 				}
 			}
 			/* Otherwise, ignore if not a binary opclause */
 			if (!is_opclause(clause) || list_length(clause->args) != 2)
 				continue;
 			/* Determine left/right sides and check the operator */
 			clause_op = clause->opno;
 			if (match_index_to_operand(linitial(clause->args), indexcol,
 									   index))
 			{
 				/* clause_op is correct */
 				varonleft = true;
 			}
 			else
 			{
 				Assert(match_index_to_operand(lsecond(clause->args), indexcol,
 											  index));
 				/* Must flip operator to get the opfamily member */
 				clause_op = get_commutator(clause_op);
 				varonleft = false;
 			}
 			if (!OidIsValid(clause_op))
 				continue;		/* ignore non match, per next comment */
 			op_strategy = get_op_opfamily_strategy(clause_op, opfamily);
 			/*
 			 * You might expect to see Assert(op_strategy != 0) here, but you
 			 * won't: the clause might contain a special indexable operator
 			 * rather than an ordinary opfamily member.	Currently none of the
 			 * special operators are very likely to expand to an equality
 			 * operator; we do not bother to check, but just assume no match.
 			 */
 			if (op_strategy != BTEqualStrategyNumber)
 				continue;
 			/* Now check that other side is pseudoconstant */
 			if (varonleft)
 				ispc = is_pseudo_constant_clause_relids(lsecond(clause->args),
 														rinfo->right_relids);
 			else
 				ispc = is_pseudo_constant_clause_relids(linitial(clause->args),
 														rinfo->left_relids);
 			if (ispc)
 			{
 				result = lappend_int(result, indexcol + 1);
 				break;
 			}
 		}
 		indexcol++;
 	}
 	return result;
 }
 /*
 * match_index_to_query_keys
 *		Check a single scan direction for "intelligent" match to query keys
 *
 * 'index' is the index of interest.
 * 'indexscandir' is the scan direction to consider
 * 'ignorables' is an integer list of indexes of ignorable index columns
 *
 * Returns TRUE on successful match (ie, the query_pathkeys can be considered
 * to match this index).
 */
 static bool
 match_index_to_query_keys(PlannerInfo *root,
 						  IndexOptInfo *index,
 						  ScanDirection indexscandir,
 						  List *ignorables)
 {
 	List	   *index_pathkeys;
 	ListCell   *index_cell;
 	int			index_col;
 	ListCell   *r;
 	/* Get the pathkeys that exactly describe the index */
 	index_pathkeys = build_index_pathkeys(root, index, indexscandir, false);
 	/*
 	 * Can we match to the query's requested pathkeys?  The inner loop skips
 	 * over ignorable index columns while trying to match.
 	 */
 	index_cell = list_head(index_pathkeys);
 	index_col = 0;
 	foreach(r, root->query_pathkeys)
 	{
 		List	   *rsubkey = (List *) lfirst(r);
 		for (;;)
 		{
 			List	   *isubkey;
 			if (index_cell == NULL)
 				return false;
 			isubkey = (List *) lfirst(index_cell);
 			index_cell = lnext(index_cell);
 			index_col++;		/* index_col is now 1-based */
 			/*
 			 * Since we are dealing with canonicalized pathkeys, pointer
 			 * comparison is sufficient to determine a match.
 			 */
 			if (rsubkey == isubkey)
 				break;			/* matched current query pathkey */
 			if (!list_member_int(ignorables, index_col))
 				return false;	/* definite failure to match */
 			/* otherwise loop around and try to match to next index col */
 		}
 	}
 	return true;
 }
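The matching loop in match_index_to_query_keys can be illustrated outside the planner. The following is a minimal sketch, not the PostgreSQL code: integer key ids stand in for canonical pathkeys (equal ids play the role of pointer-equal pathkeys), and the names `column_is_ignorable` and `keys_match_skipping_ignorables` are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: is the 1-based column number in the ignorable set? */
static bool
column_is_ignorable(const int *ignorables, size_t n_ignorables, int col)
{
	for (size_t i = 0; i < n_ignorables; i++)
		if (ignorables[i] == col)
			return true;
	return false;
}

/*
 * Walk the query's sort keys in order.  For each one, advance through the
 * index columns; a non-matching index column may be skipped only if it is
 * ignorable (i.e. constrained by "col = constant").
 */
static bool
keys_match_skipping_ignorables(const int *query_keys, size_t n_query,
							   const int *index_keys, size_t n_index,
							   const int *ignorables, size_t n_ignorables)
{
	size_t		next = 0;		/* next index column to examine, 0-based */

	for (size_t q = 0; q < n_query; q++)
	{
		for (;;)
		{
			int			index_col;

			if (next >= n_index)
				return false;	/* ran out of index columns */
			index_col = (int) (next + 1);	/* 1-based column number */
			if (index_keys[next++] == query_keys[q])
				break;			/* matched current query key */
			if (!column_is_ignorable(ignorables, n_ignorables, index_col))
				return false;	/* definite failure to match */
			/* otherwise try the next index column */
		}
	}
	return true;
}
```

With an index on (x, y) and a query ordered by y alone, the sketch reports a match only when column 1 is in the ignorable set, which mirrors the `WHERE x = 42 ORDER BY y` case described in the header comment of match_variant_ordering.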
 /****************************************************************************
 *				----  PATH CREATION UTILITIES  ----
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/joinpath.c,v 1.110 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/joinpath.c,v 1.111 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -16,7 +16,6 @@
 #include <math.h>
 #include "access/skey.h"
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -40,10 +39,6 @@ static List *select_mergejoin_clauses(RelOptInfo *joinrel,
 						 RelOptInfo *innerrel,
 						 List *restrictlist,
 						 JoinType jointype);
 static void build_mergejoin_strat_arrays(List *mergeclauses,
 										 Oid **mergefamilies,
 										 int **mergestrategies,
 										 bool **mergenullsfirst);
 /*
@@ -205,9 +200,9 @@ sort_inner_and_outer(PlannerInfo *root,
 	 *
 	 * Actually, it's not quite true that every mergeclause ordering will
 	 * generate a different path order, because some of the clauses may be
-	 * redundant.  Therefore, what we do is convert the mergeclause list to a
+	 * partially redundant (refer to the same EquivalenceClasses).  Therefore,
-	 * list of canonical pathkeys, and then consider different orderings of
+	 * what we do is convert the mergeclause list to a list of canonical
-	 * the pathkeys.
+	 * pathkeys, and then consider different orderings of the pathkeys.
 	 *
 	 * Generating a path for *every* permutation of the pathkeys doesn't seem
 	 * like a winning strategy; the cost in planning time is too high. For
@@ -216,76 +211,59 @@ sort_inner_and_outer(PlannerInfo *root,
 	 * mergejoin without re-sorting against any other possible mergejoin
 	 * partner path.  But if we've not guessed the right ordering of secondary
 	 * keys, we may end up evaluating clauses as qpquals when they could have
-	 * been done as mergeclauses. We need to figure out a better way.  (Two
+	 * been done as mergeclauses.  (In practice, it's rare that there's more
-	 * possible approaches: look at all the relevant index relations to
+	 * than two or three mergeclauses, so expending a huge amount of thought
-	 * suggest plausible sort orders, or make just one output path and somehow
+	 * on that is probably not worth it.)
-	 * mark it as having a sort-order that can be rearranged freely.)
+	 *
 	 * The pathkey order returned by select_outer_pathkeys_for_merge() has
 	 * some heuristics behind it (see that function), so be sure to try it
 	 * exactly as-is as well as making variants.
 	 */
-	all_pathkeys = make_pathkeys_for_mergeclauses(root,
+	all_pathkeys = select_outer_pathkeys_for_merge(root,
-												  mergeclause_list,
+												   mergeclause_list,
-												  outerrel);
+												   joinrel);
 	foreach(l, all_pathkeys)
 	{
 		List	   *front_pathkey = (List *) lfirst(l);
 		List	   *cur_pathkeys;
 		List	   *cur_mergeclauses;
 		Oid		   *mergefamilies;
 		int		   *mergestrategies;
 		bool	   *mergenullsfirst;
 		List	   *outerkeys;
 		List	   *innerkeys;
 		List	   *merge_pathkeys;
-		/* Make a pathkey list with this guy first. */
+		/* Make a pathkey list with this guy first */
 		if (l != list_head(all_pathkeys))
-			cur_pathkeys = lcons(front_pathkey,
+			outerkeys = lcons(front_pathkey,
-								 list_delete_ptr(list_copy(all_pathkeys),
+							  list_delete_ptr(list_copy(all_pathkeys),
-												 front_pathkey));
+											  front_pathkey));
 		else
-			cur_pathkeys = all_pathkeys;		/* no work at first one... */
+			outerkeys = all_pathkeys;		/* no work at first one... */
-		/*
+		/* Sort the mergeclauses into the corresponding ordering */
-		 * Select mergeclause(s) that match this sort ordering.  If we had
-		 * redundant merge clauses then we will get a subset of the original
-		 * clause list.  There had better be some match, however...
-		 */
 		cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
-														  cur_pathkeys,
+														  outerkeys,
 														  true,
 														  mergeclause_list);
 		Assert(cur_mergeclauses != NIL);
-		/* Forget it if can't use all the clauses in right/full join */
+		/* Should have used them all... */
-		if (useallclauses &&
+		Assert(list_length(cur_mergeclauses) == list_length(mergeclause_list));
-			list_length(cur_mergeclauses) != list_length(mergeclause_list))
+
-			continue;
+		/* Build sort pathkeys for the inner side */
 		innerkeys = make_inner_pathkeys_for_merge(root,
 												  cur_mergeclauses,
 												  outerkeys);
 		/* Build pathkeys representing output sort order */
 		merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
 											 outerkeys);
 		/*
-		 * Build sort pathkeys for both sides.
+		 * And now we can make the path.
 		 *
 		 * Note: it's possible that the cheapest paths will already be sorted
 		 * properly.  create_mergejoin_path will detect that case and suppress
 		 * an explicit sort step, so we needn't do so here.
 		 */
-		outerkeys = make_pathkeys_for_mergeclauses(root,
-												   cur_mergeclauses,
-												   outerrel);
-		innerkeys = make_pathkeys_for_mergeclauses(root,
-												   cur_mergeclauses,
-												   innerrel);
-		/* Build pathkeys representing output sort order. */
-		merge_pathkeys = build_join_pathkeys(root, joinrel, jointype,
-											 outerkeys);
-		/* Build opfamily info for execution */
-		build_mergejoin_strat_arrays(cur_mergeclauses,
-									 &mergefamilies,
-									 &mergestrategies,
-									 &mergenullsfirst);
-		/*
-		 * And now we can make the path.
-		 */
 		add_path(joinrel, (Path *)
 				 create_mergejoin_path(root,
 									   joinrel,
@@ -295,9 +273,6 @@ sort_inner_and_outer(PlannerInfo *root,
 									   restrictlist,
 									   merge_pathkeys,
 									   cur_mergeclauses,
 									   mergefamilies,
 									   mergestrategies,
 									   mergenullsfirst,
 									   outerkeys,
 									   innerkeys));
 	}
@@ -427,9 +402,6 @@ match_unsorted_outer(PlannerInfo *root,
 		Path	   *outerpath = (Path *) lfirst(l);
 		List	   *merge_pathkeys;
 		List	   *mergeclauses;
 		Oid		   *mergefamilies;
 		int		   *mergestrategies;
 		bool	   *mergenullsfirst;
 		List	   *innersortkeys;
 		List	   *trialsortkeys;
 		Path	   *cheapest_startup_inner;
@@ -510,6 +482,7 @@ match_unsorted_outer(PlannerInfo *root,
 		/* Look for useful mergeclauses (if any) */
 		mergeclauses = find_mergeclauses_for_pathkeys(root,
 													  outerpath->pathkeys,
 													  true,
 													  mergeclause_list);
 		/*
@@ -532,15 +505,9 @@ match_unsorted_outer(PlannerInfo *root,
 			continue;
 		/* Compute the required ordering of the inner path */
-		innersortkeys = make_pathkeys_for_mergeclauses(root,
+		innersortkeys = make_inner_pathkeys_for_merge(root,
-													   mergeclauses,
+													  mergeclauses,
-													   innerrel);
+													  outerpath->pathkeys);
 		/* Build opfamily info for execution */
 		build_mergejoin_strat_arrays(mergeclauses,
 									 &mergefamilies,
 									 &mergestrategies,
 									 &mergenullsfirst);
 		/*
 		 * Generate a mergejoin on the basis of sorting the cheapest inner.
@@ -557,9 +524,6 @@ match_unsorted_outer(PlannerInfo *root,
 									   restrictlist,
 									   merge_pathkeys,
 									   mergeclauses,
 									   mergefamilies,
 									   mergestrategies,
 									   mergenullsfirst,
 									   NIL,
 									   innersortkeys));
@@ -613,18 +577,12 @@ match_unsorted_outer(PlannerInfo *root,
 					newclauses =
 						find_mergeclauses_for_pathkeys(root,
 													   trialsortkeys,
 													   false,
 													   mergeclauses);
 					Assert(newclauses != NIL);
 				}
 				else
 					newclauses = mergeclauses;
 				/* Build opfamily info for execution */
 				build_mergejoin_strat_arrays(newclauses,
 											 &mergefamilies,
 											 &mergestrategies,
 											 &mergenullsfirst);
 				add_path(joinrel, (Path *)
 						 create_mergejoin_path(root,
 											   joinrel,
@@ -634,9 +592,6 @@ match_unsorted_outer(PlannerInfo *root,
 											   restrictlist,
 											   merge_pathkeys,
 											   newclauses,
 											   mergefamilies,
 											   mergestrategies,
 											   mergenullsfirst,
 											   NIL,
 											   NIL));
 				cheapest_total_inner = innerpath;
@@ -666,19 +621,13 @@ match_unsorted_outer(PlannerInfo *root,
 							newclauses =
 								find_mergeclauses_for_pathkeys(root,
 															   trialsortkeys,
 															   false,
 															   mergeclauses);
 							Assert(newclauses != NIL);
 						}
 						else
 							newclauses = mergeclauses;
 					}
 					/* Build opfamily info for execution */
 					build_mergejoin_strat_arrays(newclauses,
 												 &mergefamilies,
 												 &mergestrategies,
 												 &mergenullsfirst);
 					add_path(joinrel, (Path *)
 							 create_mergejoin_path(root,
 												   joinrel,
@@ -688,9 +637,6 @@ match_unsorted_outer(PlannerInfo *root,
 												   restrictlist,
 												   merge_pathkeys,
 												   newclauses,
 												   mergefamilies,
 												   mergestrategies,
 												   mergenullsfirst,
 												   NIL,
 												   NIL));
 				}
@@ -909,6 +855,10 @@ best_appendrel_indexscan(PlannerInfo *root, RelOptInfo *rel,
 *	  Select mergejoin clauses that are usable for a particular join.
 *	  Returns a list of RestrictInfo nodes for those clauses.
 *
 * We also mark each selected RestrictInfo to show which side is currently
 * being considered as outer.  These are transient markings that are only
 * good for the duration of the current add_paths_to_joinrel() call!
 *
 * We examine each restrictinfo clause known for the join to see
 * if it is mergejoinable and involves vars from the two sub-relations
 * currently of interest.
@@ -939,7 +889,7 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
 			continue;
 		if (!restrictinfo->can_join ||
-			restrictinfo->mergejoinoperator == InvalidOid)
+			restrictinfo->mergeopfamilies == NIL)
 		{
 			have_nonmergeable_joinclause = true;
 			continue;			/* not mergejoinable */
@@ -954,11 +904,13 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
 			bms_is_subset(restrictinfo->right_relids, innerrel->relids))
 		{
 			/* righthand side is inner */
 			restrictinfo->outer_is_left = true;
 		}
 		else if (bms_is_subset(restrictinfo->left_relids, innerrel->relids) &&
 				 bms_is_subset(restrictinfo->right_relids, outerrel->relids))
 		{
 			/* lefthand side is inner */
 			restrictinfo->outer_is_left = false;
 		}
 		else
 		{
@@ -966,7 +918,7 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
 			continue;			/* no good for these input relations */
 		}
-		result_list = lcons(restrictinfo, result_list);
+		result_list = lappend(result_list, restrictinfo);
 	}
 	/*
@@ -995,46 +947,3 @@ select_mergejoin_clauses(RelOptInfo *joinrel,
 	return result_list;
 }
 /*
 * Temporary hack to build opfamily and strategy info needed for mergejoin
 * by the executor.  We need to rethink the planner's handling of merge
 * planning so that it can deal with multiple possible merge orders, but
 * that's not done yet.
 */
 static void
 build_mergejoin_strat_arrays(List *mergeclauses,
 							 Oid **mergefamilies,
 							 int **mergestrategies,
 							 bool **mergenullsfirst)
 {
 	int			nClauses = list_length(mergeclauses);
 	int			i;
 	ListCell   *l;
 	*mergefamilies = (Oid *) palloc(nClauses * sizeof(Oid));
 	*mergestrategies = (int *) palloc(nClauses * sizeof(int));
 	*mergenullsfirst = (bool *) palloc(nClauses * sizeof(bool));
 	i = 0;
 	foreach(l, mergeclauses)
 	{
 		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(l);
 		/*
 		 * We do not need to worry about whether the mergeclause will be
 		 * commuted at runtime --- it's the same opfamily either way.
 		 */
 		(*mergefamilies)[i] = restrictinfo->mergeopfamily;
 		/*
 		 * For the moment, strategy must always be LessThan --- see
 		 * hack version of get_op_mergejoin_info
 		 */
 		(*mergestrategies)[i] = BTLessStrategyNumber;
 		/* And we only allow NULLS LAST, too */
 		(*mergenullsfirst)[i] = false;
 		i++;
 	}
 }
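The helper removed above hard-wired one strategy and NULLS LAST for every merge clause. In this commit, create_mergejoin_plan instead reads the per-clause values from the input pathkeys, allowing consecutive merge clauses to share a pathkey. The following is a minimal sketch of that walk for one side only. It is an illustration under stated assumptions, not the planner code: `SketchPathKey`, `fill_merge_info`, and the integer `eclass_id` (standing in for an EquivalenceClass pointer) are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a PathKey */
typedef struct SketchPathKey
{
	int			eclass_id;		/* stands in for the pk_eclass pointer */
	int			strategy;		/* sort strategy number */
	bool		nulls_first;
} SketchPathKey;

/*
 * clause_eclass[i] is the equivalence class of the i'th merge clause.
 * Advance to the next pathkey only when the clause's class differs from
 * the class of the pathkey currently in use; clauses in the same class
 * share that pathkey.  Returns false if the pathkeys run out or do not
 * line up with the clauses.
 */
static bool
fill_merge_info(const int *clause_eclass, size_t n_clauses,
				const SketchPathKey *keys, size_t n_keys,
				int *strategies, bool *nulls_first)
{
	const SketchPathKey *cur = NULL;
	size_t		next = 0;

	for (size_t i = 0; i < n_clauses; i++)
	{
		if (cur == NULL || cur->eclass_id != clause_eclass[i])
		{
			if (next >= n_keys)
				return false;	/* too few pathkeys for the clauses */
			cur = &keys[next++];
			if (cur->eclass_id != clause_eclass[i])
				return false;	/* pathkeys do not match the clause */
		}
		strategies[i] = cur->strategy;
		nulls_first[i] = cur->nulls_first;
	}
	return true;
}
```

For three clauses whose classes are 1, 1, 2 and two pathkeys for classes 1 and 2, the first two clauses take their strategy and nulls ordering from the first pathkey and the third clause from the second. The real code performs this walk for the outer and inner sides in parallel and also cross-checks that the two pathkeys agree.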
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.83 2007/01/05 22:19:31 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/path/joinrels.c,v 1.84 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -72,7 +72,7 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
 			other_rels = list_head(joinrels[1]);		/* consider all initial
 														 * rels */
-		if (old_rel->joininfo != NIL)
+		if (old_rel->joininfo != NIL || old_rel->has_eclass_joins)
 		{
 			/*
 			 * Note that if all available join clauses for this rel require
@@ -152,7 +152,8 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
 			 * outer joins --- then we might have to force a bushy outer
 			 * join.  See have_relevant_joinclause().
 			 */
-			if (old_rel->joininfo == NIL && root->oj_info_list == NIL)
+			if (old_rel->joininfo == NIL && !old_rel->has_eclass_joins &&
 				root->oj_info_list == NIL)
 				continue;
 			if (k == other_level)
@@ -251,8 +252,7 @@ make_rels_by_joins(PlannerInfo *root, int level, List **joinrels)
 /*
 * make_rels_by_clause_joins
 *	  Build joins between the given relation 'old_rel' and other relations
- *	  that are mentioned within old_rel's joininfo list (i.e., relations
+ *	  that participate in join clauses that 'old_rel' also participates in.
- *	  that participate in join clauses that 'old_rel' also participates in).
 *	  The join rel nodes are returned in a list.
 *
 * 'old_rel' is the relation entry for the relation to be joined
--- a/src/backend/optimizer/path/pathkeys.c
+++ b/src/backend/optimizer/path/pathkeys.c
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -10,7 +10,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.221 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/createplan.c,v 1.222 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -121,8 +121,6 @@ static MergeJoin *make_mergejoin(List *tlist,
 			   JoinType jointype);
 static Sort *make_sort(PlannerInfo *root, Plan *lefttree, int numCols,
 		  AttrNumber *sortColIdx, Oid *sortOperators, bool *nullsFirst);
 static Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
 						List *pathkeys);
 /*
@@ -1425,23 +1423,21 @@ create_nestloop_plan(PlannerInfo *root,
 		 * that have to be checked as qpquals at the join node.
 		 *
 		 * We can also remove any join clauses that are redundant with those
-		 * being used in the index scan; prior redundancy checks will not have
+		 * being used in the index scan; this check is needed because
-		 * caught this case because the join clauses would never have been put
+		 * find_eclass_clauses_for_index_join() may emit different clauses
-		 * in the same joininfo list.
+		 * than generate_join_implied_equalities() did.
 		 *
 		 * We can skip this if the index path is an ordinary indexpath and not
-		 * a special innerjoin path.
+		 * a special innerjoin path, since it then wouldn't be using any join
 		 * clauses.
 		 */
 		IndexPath  *innerpath = (IndexPath *) best_path->innerjoinpath;
 		if (innerpath->isjoininner)
 		{
 			joinrestrictclauses =
 				select_nonredundant_join_clauses(root,
 												 joinrestrictclauses,
-												 innerpath->indexclauses,
+												 innerpath->indexclauses);
-										 IS_OUTER_JOIN(best_path->jointype));
 		}
 	}
 	else if (IsA(best_path->innerjoinpath, BitmapHeapPath))
 	{
@@ -1471,8 +1467,7 @@ create_nestloop_plan(PlannerInfo *root,
 			joinrestrictclauses =
 				select_nonredundant_join_clauses(root,
 												 joinrestrictclauses,
-												 bitmapclauses,
+												 bitmapclauses);
-										 IS_OUTER_JOIN(best_path->jointype));
 		}
 	}
@@ -1516,7 +1511,21 @@ create_mergejoin_plan(PlannerInfo *root,
 	List	   *joinclauses;
 	List	   *otherclauses;
 	List	   *mergeclauses;
 	List	   *outerpathkeys;
 	List	   *innerpathkeys;
 	int			nClauses;
 	Oid		   *mergefamilies;
 	int		   *mergestrategies;
 	bool	   *mergenullsfirst;
 	MergeJoin  *join_plan;
 	int			i;
 	EquivalenceClass *lastoeclass;
 	EquivalenceClass *lastieclass;
 	PathKey	   *opathkey;
 	PathKey	   *ipathkey;
 	ListCell   *lc;
 	ListCell   *lop;
 	ListCell   *lip;
 	/* Get the join qual clauses (in plain expression form) */
 	/* Any pseudoconstant clauses are ignored here */
@@ -1542,7 +1551,8 @@ create_mergejoin_plan(PlannerInfo *root,
 	/*
 	 * Rearrange mergeclauses, if needed, so that the outer variable is always
-	 * on the left.
+	 * on the left; mark the mergeclause restrictinfos with correct
 	 * outer_is_left status.
 	 */
 	mergeclauses = get_switched_clauses(best_path->path_mergeclauses,
 							 best_path->jpath.outerjoinpath->parent->relids);
@@ -1564,7 +1574,10 @@ create_mergejoin_plan(PlannerInfo *root,
 			make_sort_from_pathkeys(root,
 									outer_plan,
 									best_path->outersortkeys);
 		outerpathkeys = best_path->outersortkeys;
 	}
 	else
 		outerpathkeys = best_path->jpath.outerjoinpath->pathkeys;
 	if (best_path->innersortkeys)
 	{
@@ -1573,7 +1586,86 @@ create_mergejoin_plan(PlannerInfo *root,
 			make_sort_from_pathkeys(root,
 									inner_plan,
 									best_path->innersortkeys);
 		innerpathkeys = best_path->innersortkeys;
 	}
 	else
 		innerpathkeys = best_path->jpath.innerjoinpath->pathkeys;
 	/*
 	 * Compute the opfamily/strategy/nullsfirst arrays needed by the executor.
 	 * The information is in the pathkeys for the two inputs, but we need to
 	 * be careful about the possibility of mergeclauses sharing a pathkey
 	 * (compare find_mergeclauses_for_pathkeys()).
 	 */
 	nClauses = list_length(mergeclauses);
 	Assert(nClauses == list_length(best_path->path_mergeclauses));
 	mergefamilies = (Oid *) palloc(nClauses * sizeof(Oid));
 	mergestrategies = (int *) palloc(nClauses * sizeof(int));
 	mergenullsfirst = (bool *) palloc(nClauses * sizeof(bool));
 	lastoeclass = NULL;
 	lastieclass = NULL;
 	opathkey = NULL;
 	ipathkey = NULL;
 	lop = list_head(outerpathkeys);
 	lip = list_head(innerpathkeys);
 	i = 0;
 	foreach(lc, best_path->path_mergeclauses)
 	{
 		RestrictInfo   *rinfo = (RestrictInfo *) lfirst(lc);
 		EquivalenceClass *oeclass;
 		EquivalenceClass *ieclass;
 		/* fetch outer/inner eclass from mergeclause */
 		Assert(IsA(rinfo, RestrictInfo));
 		if (rinfo->outer_is_left)
 		{
 			oeclass = rinfo->left_ec;
 			ieclass = rinfo->right_ec;
 		}
 		else
 		{
 			oeclass = rinfo->right_ec;
 			ieclass = rinfo->left_ec;
 		}
 		Assert(oeclass != NULL);
 		Assert(ieclass != NULL);
 		/* should match current or next pathkeys */
 		/* we check this carefully for debugging reasons */
 		if (oeclass != lastoeclass)
 		{
 			if (!lop)
 				elog(ERROR, "too few pathkeys for mergeclauses");
 			opathkey = (PathKey *) lfirst(lop);
 			lop = lnext(lop);
 			lastoeclass = opathkey->pk_eclass;
 			if (oeclass != lastoeclass)
 				elog(ERROR, "outer pathkeys do not match mergeclause");
 		}
 		if (ieclass != lastieclass)
 		{
 			if (!lip)
 				elog(ERROR, "too few pathkeys for mergeclauses");
 			ipathkey = (PathKey *) lfirst(lip);
 			lip = lnext(lip);
 			lastieclass = ipathkey->pk_eclass;
 			if (ieclass != lastieclass)
 				elog(ERROR, "inner pathkeys do not match mergeclause");
 		}
 		/* pathkeys should match each other too (more debugging) */
 		if (opathkey->pk_opfamily != ipathkey->pk_opfamily ||
 			opathkey->pk_strategy != ipathkey->pk_strategy ||
 			opathkey->pk_nulls_first != ipathkey->pk_nulls_first)
 			elog(ERROR, "left and right pathkeys do not match in mergejoin");
 		/* OK, save info for executor */
 		mergefamilies[i] = opathkey->pk_opfamily;
 		mergestrategies[i] = opathkey->pk_strategy;
 		mergenullsfirst[i] = opathkey->pk_nulls_first;
 		i++;
 	}
 	/*
 	 * Now we can build the mergejoin node.
@@ -1582,9 +1674,9 @@ create_mergejoin_plan(PlannerInfo *root,
 							   joinclauses,
 							   otherclauses,
 							   mergeclauses,
-							   best_path->path_mergeFamilies,
+							   mergefamilies,
-							   best_path->path_mergeStrategies,
+							   mergestrategies,
-							   best_path->path_mergeNullsFirst,
+							   mergenullsfirst,
 							   outer_plan,
 							   inner_plan,
 							   best_path->jpath.jointype);
@@ -1921,8 +2013,9 @@ fix_indexqual_operand(Node *node, IndexOptInfo *index, Oid *opfamily)
 *	  Given a list of merge or hash joinclauses (as RestrictInfo nodes),
 *	  extract the bare clauses, and rearrange the elements within the
 *	  clauses, if needed, so the outer join variable is on the left and
- *	  the inner is on the right.  The original data structure is not touched;
+ *	  the inner is on the right.  The original clause data structure is not
- *	  a modified list is returned.
+ *	  touched; a modified list is returned.  We do, however, set the transient
 *	  outer_is_left field in each RestrictInfo to show which side was which.
 */
 static List *
 get_switched_clauses(List *clauses, Relids outerrelids)
@@ -1953,9 +2046,14 @@ get_switched_clauses(List *clauses, Relids outerrelids)
 			/* Commute it --- note this modifies the temp node in-place. */
 			CommuteOpExpr(temp);
 			t_list = lappend(t_list, temp);
 			restrictinfo->outer_is_left = false;
 		}
 		else
 		{
 			Assert(bms_is_subset(restrictinfo->left_relids, outerrelids));
 			t_list = lappend(t_list, clause);
 			restrictinfo->outer_is_left = true;
 		}
 	}
 	return t_list;
 }
@@ -2490,7 +2588,7 @@ add_sort_column(AttrNumber colIdx, Oid sortOp, bool nulls_first,
 * If the input plan type isn't one that can do projections, this means
 * adding a Result node just to do the projection.
 */
-static Sort *
+Sort *
 make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
 {
 	List	   *tlist = lefttree->targetlist;
@@ -2512,41 +2610,55 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
 	foreach(i, pathkeys)
 	{
-		List	   *keysublist = (List *) lfirst(i);
+		PathKey	   *pathkey = (PathKey *) lfirst(i);
-		PathKeyItem *pathkey = NULL;
 		TargetEntry *tle = NULL;
 		Oid			pk_datatype = InvalidOid;
 		Oid			sortop;
 		ListCell   *j;
 		/*
-		 * We can sort by any one of the sort key items listed in this
+		 * We can sort by any non-constant expression listed in the pathkey's
-		 * sublist.  For now, we take the first one that corresponds to an
+		 * EquivalenceClass.  For now, we take the first one that corresponds
-		 * available Var in the tlist.	If there isn't any, use the first one
+		 * to an available Var in the tlist. If there isn't any, use the first
-		 * that is an expression in the input's vars.
+		 * one that is an expression in the input's vars.  (The non-const
 		 * restriction only matters if the EC is below_outer_join; but if it
 		 * isn't, it won't contain consts anyway, else we'd have discarded
 		 * the pathkey as redundant.)
 		 *
 		 * XXX if we have a choice, is there any way of figuring out which
 		 * might be cheapest to execute?  (For example, int4lt is likely much
 		 * cheaper to execute than numericlt, but both might appear in the
-		 * same pathkey sublist...)  Not clear that we ever will have a choice
+		 * same equivalence class...)  Not clear that we ever will have an
-		 * in practice, so it may not matter.
+		 * interesting choice in practice, so it may not matter.
 		 */
-		foreach(j, keysublist)
+		foreach(j, pathkey->pk_eclass->ec_members)
 		{
-			pathkey = (PathKeyItem *) lfirst(j);
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
-			Assert(IsA(pathkey, PathKeyItem));
+
-			tle = tlist_member(pathkey->key, tlist);
+			if (em->em_is_const || em->em_is_child)
 				continue;
 			tle = tlist_member((Node *) em->em_expr, tlist);
 			if (tle)
-				break;
+			{
 				pk_datatype = em->em_datatype;
 				break;			/* found expr already in tlist */
 			}
 		}
 		if (!tle)
 		{
 			/* No matching Var; look for a computable expression */
-			foreach(j, keysublist)
+			Expr   *sortexpr = NULL;
 			foreach(j, pathkey->pk_eclass->ec_members)
 			{
 				EquivalenceMember *em = (EquivalenceMember *) lfirst(j);
 				List	   *exprvars;
 				ListCell   *k;
-				pathkey = (PathKeyItem *) lfirst(j);
+				if (em->em_is_const || em->em_is_child)
-				exprvars = pull_var_clause(pathkey->key, false);
+					continue;
 				sortexpr = em->em_expr;
 				exprvars = pull_var_clause((Node *) sortexpr, false);
 				foreach(k, exprvars)
 				{
 					if (!tlist_member(lfirst(k), tlist))
@@ -2554,7 +2666,10 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
 				}
 				list_free(exprvars);
 				if (!k)
 				{
 					pk_datatype = em->em_datatype;
 					break;		/* found usable expression */
 				}
 			}
 			if (!j)
 				elog(ERROR, "could not find pathkey item to sort");
@@ -2571,7 +2686,7 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
 			/*
 			 * Add resjunk entry to input's tlist
 			 */
-			tle = makeTargetEntry((Expr *) pathkey->key,
+			tle = makeTargetEntry(sortexpr,
 								  list_length(tlist) + 1,
 								  NULL,
 								  true);
@@ -2579,14 +2694,28 @@ make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree, List *pathkeys)
 			lefttree->targetlist = tlist;		/* just in case NIL before */
 		}
 		/*
 		 * Look up the correct sort operator from the PathKey's slightly
 		 * abstracted representation.
 		 */
 		sortop = get_opfamily_member(pathkey->pk_opfamily,
 									 pk_datatype,
 									 pk_datatype,
 									 pathkey->pk_strategy);
 		if (!OidIsValid(sortop))	/* should not happen */
 			elog(ERROR, "could not find member %d(%u,%u) of opfamily %u",
 				 pathkey->pk_strategy, pk_datatype, pk_datatype,
 				 pathkey->pk_opfamily);
 		/*
 		 * The column might already be selected as a sort key, if the pathkeys
 		 * contain duplicate entries.  (This can happen in scenarios where
 		 * multiple mergejoinable clauses mention the same var, for example.)
 		 * So enter it only once in the sort arrays.
 		 */
-		numsortkeys = add_sort_column(tle->resno, pathkey->sortop,
+		numsortkeys = add_sort_column(tle->resno,
-									  pathkey->nulls_first,
+									  sortop,
 									  pathkey->pk_nulls_first,
 									  numsortkeys,
 									  sortColIdx, sortOperators, nullsFirst);
 	}
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/initsplan.c,v 1.127 2007/01/08 16:47:30 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/initsplan.c,v 1.128 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -37,8 +37,6 @@ int			from_collapse_limit;
 int			join_collapse_limit;
 static void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 					   Relids where_needed);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
 					bool below_outer_join, Relids *qualscope);
 static OuterJoinInfo *make_outerjoininfo(PlannerInfo *root,
@@ -51,8 +49,7 @@ static void distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 						Relids qualscope,
 						Relids ojscope,
 						Relids outerjoin_nonnullable);
-static bool qual_is_redundant(PlannerInfo *root, RestrictInfo *restrictinfo,
+static bool check_outerjoin_delay(PlannerInfo *root, Relids *relids_p);
-				  List *restrictlist);
 static void check_mergejoinable(RestrictInfo *restrictinfo);
 static void check_hashjoinable(RestrictInfo *restrictinfo);
@@ -144,7 +141,7 @@ build_base_rel_tlists(PlannerInfo *root, List *final_tlist)
 *	  as being needed for the indicated join (or for final output if
 *	  where_needed includes "relation 0").
 */
-static void
+void
 add_vars_to_targetlist(PlannerInfo *root, List *vars, Relids where_needed)
 {
 	ListCell   *temp;
@@ -590,17 +587,17 @@ make_outerjoininfo(PlannerInfo *root,
 *	  Add clause information to either the baserestrictinfo or joininfo list
 *	  (depending on whether the clause is a join) of each base relation
 *	  mentioned in the clause.	A RestrictInfo node is created and added to
- *	  the appropriate list for each rel.  Also, if the clause uses a
+ *	  the appropriate list for each rel.  Alternatively, if the clause uses a
 *	  mergejoinable operator and is not delayed by outer-join rules, enter
- *	  the left- and right-side expressions into the query's lists of
+ *	  the left- and right-side expressions into the query's list of
- *	  equijoined vars.
+ *	  EquivalenceClasses.
 *
 * 'clause': the qual clause to be distributed
 * 'is_pushed_down': if TRUE, force the clause to be marked 'is_pushed_down'
 *		(this indicates the clause came from a FromExpr, not a JoinExpr)
 * 'is_deduced': TRUE if the qual came from implied-equality deduction
 * 'below_outer_join': TRUE if the qual is from a JOIN/ON that is below the
- *		nullable side of a higher-level outer join.
+ *		nullable side of a higher-level outer join
 * 'qualscope': set of baserels the qual's syntactic scope covers
 * 'ojscope': NULL if not an outer-join qual, else the minimum set of baserels
 *		needed to form this join
@@ -625,11 +622,9 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 	Relids		relids;
 	bool		outerjoin_delayed;
 	bool		pseudoconstant = false;
-	bool		maybe_equijoin;
+	bool		maybe_equivalence;
 	bool		maybe_outer_join;
 	RestrictInfo *restrictinfo;
 	RelOptInfo *rel;
 	List	   *vars;
 	/*
 	 * Retrieve all relids mentioned within the clause.
@@ -705,108 +700,57 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 	if (is_deduced)
 	{
 		/*
-		 * If the qual came from implied-equality deduction, we always
+		 * If the qual came from implied-equality deduction, it should
-		 * evaluate the qual at its natural semantic level.  It is the
+		 * not be outerjoin-delayed, else deducer blew it.  But we can't
-		 * responsibility of the deducer not to create any quals that should
+		 * check this because the ojinfo list may now contain OJs above
-		 * be delayed by outer-join rules.
+		 * where the qual belongs.
 		 */
 		Assert(bms_equal(relids, qualscope));
 		Assert(!ojscope);
 		Assert(!pseudoconstant);
 		/* Needn't feed it back for more deductions */
 		outerjoin_delayed = false;
-		maybe_equijoin = false;
+		/* Don't feed it back for more deductions */
 		maybe_equivalence = false;
 		maybe_outer_join = false;
 	}
 	else if (bms_overlap(relids, outerjoin_nonnullable))
 	{
 		/*
 		 * The qual is attached to an outer join and mentions (some of the)
-		 * rels on the nonnullable side.  Force the qual to be evaluated
+		 * rels on the nonnullable side.
 		 * exactly at the level of joining corresponding to the outer join. We
 		 * cannot let it get pushed down into the nonnullable side, since then
 		 * we'd produce no output rows, rather than the intended single
 		 * null-extended row, for any nonnullable-side rows failing the qual.
 		 *
 		 * Note: an outer-join qual that mentions only nullable-side rels can
 		 * be pushed down into the nullable side without changing the join
-		 * result, so we treat it the same as an ordinary inner-join qual,
+		 * result, so we treat it almost the same as an ordinary inner-join
-		 * except for not setting maybe_equijoin (see below).
+		 * qual (see below).
 		 *
 		 * We can't use such a clause to deduce equivalence (the left and right
 		 * sides might be unequal above the join because one of them has gone
 		 * to NULL) ... but we might be able to use it for more limited
 		 * deductions, if there are no lower outer joins that delay its
 		 * application.  If so, consider adding it to the lists of set-aside
 		 * clauses.
 		 */
 		maybe_equivalence = false;
 		maybe_outer_join = !check_outerjoin_delay(root, &relids);
 		/*
 		 * Now force the qual to be evaluated exactly at the level of joining
 		 * corresponding to the outer join.  We cannot let it get pushed down
 		 * into the nonnullable side, since then we'd produce no output rows,
 		 * rather than the intended single null-extended row, for any
 		 * nonnullable-side rows failing the qual.
 		 *
 		 * (Do this step after calling check_outerjoin_delay, because that
 		 * trashes relids.)
 		 */
 		Assert(ojscope);
 		relids = ojscope;
 		outerjoin_delayed = true;
 		Assert(!pseudoconstant);
-		/*
-		 * We can't use such a clause to deduce equijoin (the left and right
-		 * sides might be unequal above the join because one of them has gone
-		 * to NULL) ... but we might be able to use it for more limited
-		 * purposes.  Note: for the current uses of deductions from an
-		 * outer-join clause, it seems safe to make the deductions even when
-		 * the clause is below a higher-level outer join; so we do not check
-		 * below_outer_join here.
-		 */
-		maybe_equijoin = false;
-		maybe_outer_join = true;
 	}
 	else
 	{
-		/*
+		/* Normal qual clause; check to see if it must be delayed by outer join */
-		 * For a non-outer-join qual, we can evaluate the qual as soon as (1)
+		outerjoin_delayed = check_outerjoin_delay(root, &relids);
 		 * we have all the rels it mentions, and (2) we are at or above any
 		 * outer joins that can null any of these rels and are below the
 		 * syntactic location of the given qual.  We must enforce (2) because
 		 * pushing down such a clause below the OJ might cause the OJ to emit
 		 * null-extended rows that should not have been formed, or that should
 		 * have been rejected by the clause.  (This is only an issue for
 		 * non-strict quals, since if we can prove a qual mentioning only
 		 * nullable rels is strict, we'd have reduced the outer join to an
 		 * inner join in reduce_outer_joins().)
 		 *
 		 * To enforce (2), scan the oj_info_list and merge the required-relid
 		 * sets of any such OJs into the clause's own reference list.  At the
 		 * time we are called, the oj_info_list contains only outer joins
 		 * below this qual.  We have to repeat the scan until no new relids
 		 * get added; this ensures that the qual is suitably delayed regardless
 		 * of the order in which OJs get executed.  As an example, if we have
 		 * one OJ with LHS=A, RHS=B, and one with LHS=B, RHS=C, it is implied
 		 * that these can be done in either order; if the B/C join is done
 		 * first then the join to A can null C, so a qual actually mentioning
 		 * only C cannot be applied below the join to A.
 		 */
 		bool		found_some;
 		outerjoin_delayed = false;
 		do {
 			ListCell   *l;
 			found_some = false;
 			foreach(l, root->oj_info_list)
 			{
 				OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
 				/* do we have any nullable rels of this OJ? */
 				if (bms_overlap(relids, ojinfo->min_righthand) ||
 					(ojinfo->is_full_join &&
 					 bms_overlap(relids, ojinfo->min_lefthand)))
 				{
 					/* yes; do we have all its rels? */
 					if (!bms_is_subset(ojinfo->min_lefthand, relids) ||
 						!bms_is_subset(ojinfo->min_righthand, relids))
 					{
 						/* no, so add them in */
 						relids = bms_add_members(relids,
 												 ojinfo->min_lefthand);
 						relids = bms_add_members(relids,
 												 ojinfo->min_righthand);
 						outerjoin_delayed = true;
 						/* we'll need another iteration */
 						found_some = true;
 					}
 				}
 			}
 		} while (found_some);
 		if (outerjoin_delayed)
 		{
@@ -816,26 +760,27 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 			 * Because application of the qual will be delayed by outer join,
 			 * we mustn't assume its vars are equal everywhere.
 			 */
-			maybe_equijoin = false;
+			maybe_equivalence = false;
 		}
 		else
 		{
 			/*
-			 * Qual is not delayed by any lower outer-join restriction. If it
+			 * Qual is not delayed by any lower outer-join restriction, so
-			 * is not itself below or within an outer join, we can consider it
+			 * we can consider feeding it to the equivalence machinery.
-			 * "valid everywhere", so consider feeding it to the equijoin
+			 * However, if it's itself within an outer-join clause, treat it
-			 * machinery.  (If it is within an outer join, we can't consider
+			 * as though it appeared below that outer join (note that we can
-			 * it "valid everywhere": once the contained variables have gone
+			 * only get here when the clause references only nullable-side
-			 * to NULL, we'd be asserting things like NULL = NULL, which is
+			 * rels).
-			 * not true.)
 			 */
-			if (!below_outer_join && outerjoin_nonnullable == NULL)
+			maybe_equivalence = true;
-				maybe_equijoin = true;
+			if (outerjoin_nonnullable != NULL)
-			else
+				below_outer_join = true;
-				maybe_equijoin = false;
 		}
-		/* Since it doesn't mention the LHS, it's certainly not an OJ clause */
+		/*
 		 * Since it doesn't mention the LHS, it's certainly not useful as a
 		 * set-aside OJ clause, even if it's in an OJ.
 		 */
 		maybe_outer_join = false;
 	}
@@ -860,118 +805,65 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 									 relids);
 	/*
-	 * Figure out where to attach it.
+	 * If it's a join clause (either naturally, or because delayed by
 	 * outer-join rules), add vars used in the clause to targetlists of
 	 * their relations, so that they will be emitted by the plan nodes that
 	 * scan those relations (else they won't be available at the join node!).
 	 *
 	 * Note: if the clause gets absorbed into an EquivalenceClass then this
 	 * may be unnecessary, but for now we have to do it to cover the case
 	 * where the EC becomes ec_broken and we end up reinserting the original
 	 * clauses into the plan.
 	 */
-	switch (bms_membership(relids))
+	if (bms_membership(relids) == BMS_MULTIPLE)
 	{
-		case BMS_SINGLETON:
+		List	   *vars = pull_var_clause(clause, false);
-			/*
+		add_vars_to_targetlist(root, vars, relids);
-			 * There is only one relation participating in 'clause', so
+		list_free(vars);
 			 * 'clause' is a restriction clause for that relation.
 			 */
 			rel = find_base_rel(root, bms_singleton_member(relids));
 			/*
 			 * Check for a "mergejoinable" clause even though it's not a join
 			 * clause.	This is so that we can recognize that "a.x = a.y"
 			 * makes x and y eligible to be considered equal, even when they
 			 * belong to the same rel.	Without this, we would not recognize
 			 * that "a.x = a.y AND a.x = b.z AND a.y = c.q" allows us to
 			 * consider z and q equal after their rels are joined.
 			 */
 			check_mergejoinable(restrictinfo);
 			/*
 			 * If the clause was deduced from implied equality, check to see
 			 * whether it is redundant with restriction clauses we already
 			 * have for this rel.  Note we cannot apply this check to
 			 * user-written clauses, since we haven't found the canonical
 			 * pathkey sets yet while processing user clauses. (NB: no
 			 * comparable check is done in the join-clause case; redundancy
 			 * will be detected when the join clause is moved into a join
 			 * rel's restriction list.)
 			 */
 			if (!is_deduced ||
 				!qual_is_redundant(root, restrictinfo,
 								   rel->baserestrictinfo))
 			{
 				/* Add clause to rel's restriction list */
 				rel->baserestrictinfo = lappend(rel->baserestrictinfo,
 												restrictinfo);
 			}
 			break;
 		case BMS_MULTIPLE:
 			/*
 			 * 'clause' is a join clause, since there is more than one rel in
 			 * the relid set.
 			 */
 			/*
 			 * Check for hash or mergejoinable operators.
 			 *
 			 * We don't bother setting the hashjoin info if we're not going to
 			 * need it.  We do want to know about mergejoinable ops in all
 			 * cases, however, because we use mergejoinable ops for other
 			 * purposes such as detecting redundant clauses.
 			 */
 			check_mergejoinable(restrictinfo);
 			if (enable_hashjoin)
 				check_hashjoinable(restrictinfo);
 			/*
 			 * Add clause to the join lists of all the relevant relations.
 			 */
 			add_join_clause_to_rels(root, restrictinfo, relids);
 			/*
 			 * Add vars used in the join clause to targetlists of their
 			 * relations, so that they will be emitted by the plan nodes that
 			 * scan those relations (else they won't be available at the join
 			 * node!).
 			 */
 			vars = pull_var_clause(clause, false);
 			add_vars_to_targetlist(root, vars, relids);
 			list_free(vars);
 			break;
 		default:
 			/*
 			 * 'clause' references no rels, and therefore we have no place to
 			 * attach it.  Shouldn't get here if callers are working properly.
 			 */
 			elog(ERROR, "cannot cope with variable-free clause");
 			break;
 	}
 	/*
-	 * If the clause has a mergejoinable operator, we may be able to deduce
+	 * We check "mergejoinability" of every clause, not only join clauses,
-	 * more things from it under the principle of transitivity.
+	 * because we want to know about equivalences between vars of the same
 	 * relation, or between vars and consts.
 	 */
 	check_mergejoinable(restrictinfo);
 	/*
 	 * If it is a true equivalence clause, send it to the EquivalenceClass
 	 * machinery.  We do *not* attach it directly to any restriction or join
 	 * lists.  The EC code will propagate it to the appropriate places later.
 	 *
-	 * If it is not an outer-join qualification nor bubbled up due to an outer
+	 * If the clause has a mergejoinable operator and is not outerjoin-delayed,
-	 * join, then the two sides represent equivalent PathKeyItems for path
+	 * yet isn't an equivalence because it is an outer-join clause, the EC
-	 * keys: any path that is sorted by one side will also be sorted by the
+	 * code may yet be able to do something with it.  We add it to appropriate
-	 * other (as soon as the two rels are joined, that is).  Pass such clauses
+	 * lists for further consideration later.  Specifically:
-	 * to add_equijoined_keys.
 	 *
-	 * If it is a left or right outer-join qualification that relates the two
+	 * If it is a left or right outer-join qualification that relates the
-	 * sides of the outer join (no funny business like leftvar1 = leftvar2 +
+	 * two sides of the outer join (no funny business like leftvar1 =
-	 * rightvar), we add it to root->left_join_clauses or
+	 * leftvar2 + rightvar), we add it to root->left_join_clauses or
 	 * root->right_join_clauses according to which side the nonnullable
 	 * variable appears on.
 	 *
 	 * If it is a full outer-join qualification, we add it to
 	 * root->full_join_clauses.  (Ideally we'd discard cases that aren't
 	 * leftvar = rightvar, as we do for left/right joins, but this routine
-	 * doesn't have the info needed to do that; and the current usage of the
+	 * doesn't have the info needed to do that; and the current usage of
-	 * full_join_clauses list doesn't require that, so it's not currently
+	 * the full_join_clauses list doesn't require that, so it's not
-	 * worth complicating this routine's API to make it possible.)
+	 * currently worth complicating this routine's API to make it possible.)
 	 *
 	 * If none of the above hold, pass it off to
 	 * distribute_restrictinfo_to_rels().
 	 */
-	if (restrictinfo->mergejoinoperator != InvalidOid)
+	if (restrictinfo->mergeopfamilies)
 	{
-		if (maybe_equijoin)
+		if (maybe_equivalence)
-			add_equijoined_keys(root, restrictinfo);
+		{
 			if (process_equivalence(root, restrictinfo, below_outer_join))
 				return;
 			/* EC rejected it, so pass to distribute_restrictinfo_to_rels */
 		}
 		else if (maybe_outer_join && restrictinfo->can_join)
 		{
 			if (bms_is_subset(restrictinfo->left_relids,
@@ -982,8 +874,9 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 				/* we have outervar = innervar */
 				root->left_join_clauses = lappend(root->left_join_clauses,
 												  restrictinfo);
 				return;
 			}
-			else if (bms_is_subset(restrictinfo->right_relids,
+			if (bms_is_subset(restrictinfo->right_relids,
 								   outerjoin_nonnullable) &&
 					 !bms_overlap(restrictinfo->left_relids,
 								  outerjoin_nonnullable))
@@ -991,166 +884,213 @@ distribute_qual_to_rels(PlannerInfo *root, Node *clause,
 				/* we have innervar = outervar */
 				root->right_join_clauses = lappend(root->right_join_clauses,
 												   restrictinfo);
 				return;
 			}
-			else if (bms_equal(outerjoin_nonnullable, qualscope))
+			if (bms_equal(outerjoin_nonnullable, qualscope))
 			{
 				/* FULL JOIN (above tests cannot match in this case) */
 				root->full_join_clauses = lappend(root->full_join_clauses,
 												  restrictinfo);
 				return;
 			}
 		}
 	}
 	/* No EC special case applies, so push it into the clause lists */
 	distribute_restrictinfo_to_rels(root, restrictinfo);
 }
 /*
 * check_outerjoin_delay
 *		Detect whether a qual referencing the given relids must be delayed
 *		in application due to the presence of a lower outer join.
 *
 * If so, add relids to *relids_p to reflect the lowest safe level for
 * evaluating the qual, and return TRUE.
 *
 * For a non-outer-join qual, we can evaluate the qual as soon as (1) we have
 * all the rels it mentions, and (2) we are at or above any outer joins that
 * can null any of these rels and are below the syntactic location of the
 * given qual.  We must enforce (2) because pushing down such a clause below
 * the OJ might cause the OJ to emit null-extended rows that should not have
 * been formed, or that should have been rejected by the clause.  (This is
 * only an issue for non-strict quals, since if we can prove a qual mentioning
 * only nullable rels is strict, we'd have reduced the outer join to an inner
 * join in reduce_outer_joins().)
 *
 * To enforce (2), scan the oj_info_list and merge the required-relid sets of
 * any such OJs into the clause's own reference list.  At the time we are
 * called, the oj_info_list contains only outer joins below this qual.  We
 * have to repeat the scan until no new relids get added; this ensures that
 * the qual is suitably delayed regardless of the order in which OJs get
 * executed.  As an example, if we have one OJ with LHS=A, RHS=B, and one with
 * LHS=B, RHS=C, it is implied that these can be done in either order; if the
 * B/C join is done first then the join to A can null C, so a qual actually
 * mentioning only C cannot be applied below the join to A.
 *
 * For an outer-join qual, this isn't going to determine where we place the
 * qual, but we need to determine outerjoin_delayed anyway so we can decide
 * whether the qual is potentially useful for equivalence deductions.
 */
 static bool
 check_outerjoin_delay(PlannerInfo *root, Relids *relids_p)
 {
 	Relids		relids = *relids_p;
 	bool		outerjoin_delayed;
 	bool		found_some;
 	outerjoin_delayed = false;
 	do {
 		ListCell   *l;
 		found_some = false;
 		foreach(l, root->oj_info_list)
 		{
 			OuterJoinInfo *ojinfo = (OuterJoinInfo *) lfirst(l);
 			/* do we reference any nullable rels of this OJ? */
 			if (bms_overlap(relids, ojinfo->min_righthand) ||
 				(ojinfo->is_full_join &&
 				 bms_overlap(relids, ojinfo->min_lefthand)))
 			{
 				/* yes; have we included all its rels in relids? */
 				if (!bms_is_subset(ojinfo->min_lefthand, relids) ||
 					!bms_is_subset(ojinfo->min_righthand, relids))
 				{
 					/* no, so add them in */
 					relids = bms_add_members(relids, ojinfo->min_lefthand);
 					relids = bms_add_members(relids, ojinfo->min_righthand);
 					outerjoin_delayed = true;
 					/* we'll need another iteration */
 					found_some = true;
 				}
 			}
 		}
 	} while (found_some);
 	*relids_p = relids;
 	return outerjoin_delayed;
 }
 /*
 * distribute_restrictinfo_to_rels
 *	  Push a completed RestrictInfo into the proper restriction or join
 *	  clause list(s).
 *
 * This is the last step of distribute_qual_to_rels() for ordinary qual
 * clauses.  Clauses that are interesting for equivalence-class processing
 * are diverted to the EC machinery, but may ultimately get fed back here.
 */
 void
 distribute_restrictinfo_to_rels(PlannerInfo *root,
 								RestrictInfo *restrictinfo)
 {
 	Relids		relids = restrictinfo->required_relids;
 	RelOptInfo *rel;
 	switch (bms_membership(relids))
 	{
 		case BMS_SINGLETON:
 			/*
 			 * There is only one relation participating in the clause, so
 			 * it is a restriction clause for that relation.
 			 */
 			rel = find_base_rel(root, bms_singleton_member(relids));
 			/* Add clause to rel's restriction list */
 			rel->baserestrictinfo = lappend(rel->baserestrictinfo,
 											restrictinfo);
 			break;
 		case BMS_MULTIPLE:
 			/*
 			 * The clause is a join clause, since there is more than one rel
 			 * in its relid set.
 			 */
 			/*
 			 * Check for hashjoinable operators.  (We don't bother setting
 			 * the hashjoin info if we're not going to need it.)
 			 */
 			if (enable_hashjoin)
 				check_hashjoinable(restrictinfo);
 			/*
 			 * Add clause to the join lists of all the relevant relations.
 			 */
 			add_join_clause_to_rels(root, restrictinfo, relids);
 			break;
 		default:
 			/*
 			 * clause references no rels, and therefore we have no place to
 			 * attach it.  Shouldn't get here if callers are working properly.
 			 */
 			elog(ERROR, "cannot cope with variable-free clause");
 			break;
 	}
 }
 /*
 * process_implied_equality
- *	  Check to see whether we already have a restrictinfo item that says
+ *	  Create a restrictinfo item that says "item1 op item2", and push it
- *	  item1 = item2, and create one if not; or if delete_it is true,
+ *	  into the appropriate lists.  (In practice opno is always a btree
- *	  remove any such restrictinfo item.
+ *	  equality operator.)
 *
- * This processing is a consequence of transitivity of mergejoin equality:
+ * "qualscope" is the nominal syntactic level to impute to the restrictinfo.
- * if we have mergejoinable clauses A = B and B = C, we can deduce A = C
+ * This must contain at least all the rels used in the expressions, but it
- * (where = is an appropriate mergejoinable operator).	See path/pathkeys.c
+ * is used only to set the qual application level when both exprs are
- * for more details.
+ * variable-free.  Otherwise the qual is applied at the lowest join level
 * that provides all its variables.
 *
 * "both_const" indicates whether both items are known pseudo-constant;
 * in this case it is worth applying eval_const_expressions() in case we
 * can produce constant TRUE or constant FALSE.  (Otherwise it's not,
 * because the expressions went through eval_const_expressions already.)
 *
 * This is currently used only when an EquivalenceClass is found to
 * contain pseudoconstants.  See path/pathkeys.c for more details.
 */
 void
 process_implied_equality(PlannerInfo *root,
-						 Node *item1, Node *item2,
+						 Oid opno,
-						 Oid sortop1, Oid sortop2,
+						 Expr *item1,
-						 Relids item1_relids, Relids item2_relids,
+						 Expr *item2,
-						 bool delete_it)
+						 Relids qualscope,
 						 bool below_outer_join,
 						 bool both_const)
 {
 	Relids		relids;
 	BMS_Membership membership;
 	RelOptInfo *rel1;
 	List	   *restrictlist;
 	ListCell   *itm;
 	Oid			ltype,
 				rtype;
 	Operator	eq_operator;
 	Form_pg_operator pgopform;
 	Expr	   *clause;
 	/* Get set of relids referenced in the two expressions */
 	relids = bms_union(item1_relids, item2_relids);
 	membership = bms_membership(relids);
 	/*
-	 * generate_implied_equalities() shouldn't call me on two constants.
+	 * Build the new clause.  Copy to ensure it shares no substructure with
 	 * original (this is necessary in case there are subselects in there...)
 	 */
-	Assert(membership != BMS_EMPTY_SET);
+	clause = make_opclause(opno,
 	/*
 	 * If the exprs involve a single rel, we need to look at that rel's
 	 * baserestrictinfo list.  If multiple rels, we can scan the joininfo list
 	 * of any of 'em.
 	 */
 	if (membership == BMS_SINGLETON)
 	{
 		rel1 = find_base_rel(root, bms_singleton_member(relids));
 		restrictlist = rel1->baserestrictinfo;
 	}
 	else
 	{
 		Relids		other_rels;
 		int			first_rel;
 		/* Copy relids, find and remove one member */
 		other_rels = bms_copy(relids);
 		first_rel = bms_first_member(other_rels);
 		bms_free(other_rels);
 		rel1 = find_base_rel(root, first_rel);
 		restrictlist = rel1->joininfo;
 	}
 	/*
 	 * Scan to see if equality is already known.  If so, we're done in the add
 	 * case, and done after removing it in the delete case.
 	 */
 	foreach(itm, restrictlist)
 	{
 		RestrictInfo *restrictinfo = (RestrictInfo *) lfirst(itm);
 		Node	   *left,
 				   *right;
 		if (restrictinfo->mergejoinoperator == InvalidOid)
 			continue;			/* ignore non-mergejoinable clauses */
 		/* We now know the restrictinfo clause is a binary opclause */
 		left = get_leftop(restrictinfo->clause);
 		right = get_rightop(restrictinfo->clause);
 		if ((equal(item1, left) && equal(item2, right)) ||
 			(equal(item2, left) && equal(item1, right)))
 		{
 			/* found a matching clause */
 			if (delete_it)
 			{
 				if (membership == BMS_SINGLETON)
 				{
 					/* delete it from local restrictinfo list */
 					rel1->baserestrictinfo = list_delete_ptr(rel1->baserestrictinfo,
 															 restrictinfo);
 				}
 				else
 				{
 					/* let joininfo.c do it */
 					remove_join_clause_from_rels(root, restrictinfo, relids);
 				}
 			}
 			return;				/* done */
 		}
 	}
 	/* Didn't find it.  Done if deletion requested */
 	if (delete_it)
 		return;
 	/*
 	 * This equality is new information, so construct a clause representing it
 	 * to add to the query data structures.
 	 */
 	ltype = exprType(item1);
 	rtype = exprType(item2);
 	eq_operator = compatible_oper(NULL, list_make1(makeString("=")),
 								  ltype, rtype,
 								  true, -1);
 	if (!HeapTupleIsValid(eq_operator))
 	{
 		/*
 		 * Would it be safe to just not add the equality to the query if we
 		 * have no suitable equality operator for the combination of
 		 * datatypes?  NO, because sortkey selection may screw up anyway.
 		 */
 		ereport(ERROR,
 				(errcode(ERRCODE_UNDEFINED_FUNCTION),
 		errmsg("could not identify an equality operator for types %s and %s",
 			   format_type_be(ltype), format_type_be(rtype))));
 	}
 	pgopform = (Form_pg_operator) GETSTRUCT(eq_operator);
 	/*
 	 * Let's just make sure this appears to be a compatible operator.
 	 *
 	 * XXX needs work
 	 */
 	if (pgopform->oprresult != BOOLOID)
 		ereport(ERROR,
 				(errcode(ERRCODE_INVALID_FUNCTION_DEFINITION),
 				 errmsg("equality operator for types %s and %s should be merge-joinable, but isn't",
 						format_type_be(ltype), format_type_be(rtype))));
 	/*
 	 * Now we can build the new clause.  Copy to ensure it shares no
 	 * substructure with original (this is necessary in case there are
 	 * subselects in there...)
 	 */
 	clause = make_opclause(oprid(eq_operator),	/* opno */
 						   BOOLOID,		/* opresulttype */
 						   false,		/* opretset */
 						   (Expr *) copyObject(item1),
 						   (Expr *) copyObject(item2));
-	ReleaseSysCache(eq_operator);
+	/* If both constant, try to reduce to a boolean constant. */
 	if (both_const)
 	{
 		clause = (Expr *) eval_const_expressions((Node *) clause);
 		/* If we produced const TRUE, just drop the clause */
 		if (clause && IsA(clause, Const))
 		{
 			Const	*cclause = (Const *) clause;
 			Assert(cclause->consttype == BOOLOID);
 			if (!cclause->constisnull && DatumGetBool(cclause->constvalue))
 				return;
 		}
 	}
 	/* Make a copy of qualscope to avoid problems if source EC changes */
 	qualscope = bms_copy(qualscope);
 	/*
 	 * Push the new clause into all the appropriate restrictinfo lists.
@@ -1159,119 +1099,53 @@ process_implied_equality(PlannerInfo *root,
 	 * taken for an original JOIN/ON clause.
 	 */
 	distribute_qual_to_rels(root, (Node *) clause,
-							true, true, false, relids, NULL, NULL);
+							true, true, below_outer_join,
 							qualscope, NULL, NULL);
 }
 /*
- * qual_is_redundant
+ * build_implied_join_equality --- build a RestrictInfo for a derived equality
 *	  Detect whether an implied-equality qual that turns out to be a
 *	  restriction clause for a single base relation is redundant with
 *	  already-known restriction clauses for that rel.  This occurs with,
 *	  for example,
 *				SELECT * FROM tab WHERE f1 = f2 AND f2 = f3;
 *	  We need to suppress the redundant condition to avoid computing
 *	  too-small selectivity, not to mention wasting time at execution.
 *
- * Note: quals of the form "var = const" are never considered redundant,
+ * This overlaps the functionality of process_implied_equality(), but we
- * only those of the form "var = var".	This is needed because when we
+ * must return the RestrictInfo, not push it into the joininfo tree.
 * have constants in an implied-equality set, we use a different strategy
 * that suppresses all "var = var" deductions.	We must therefore keep
 * all the "var = const" quals.
 */
-static bool
+RestrictInfo *
-qual_is_redundant(PlannerInfo *root,
+build_implied_join_equality(Oid opno,
-				  RestrictInfo *restrictinfo,
+							Expr *item1,
-				  List *restrictlist)
+							Expr *item2,
 							Relids qualscope)
 {
-	Node	   *newleft;
+	RestrictInfo *restrictinfo;
-	Node	   *newright;
+	Expr	   *clause;
 	List	   *oldquals;
 	ListCell   *olditem;
 	List	   *equalexprs;
 	bool		someadded;
 	/* Never redundant unless vars appear on both sides */
 	if (bms_is_empty(restrictinfo->left_relids) ||
 		bms_is_empty(restrictinfo->right_relids))
 		return false;
 	newleft = get_leftop(restrictinfo->clause);
 	newright = get_rightop(restrictinfo->clause);
 	/*
-	 * Set cached pathkeys.  NB: it is okay to do this now because this
+	 * Build the new clause.  Copy to ensure it shares no substructure with
-	 * routine is only invoked while we are generating implied equalities.
+	 * original (this is necessary in case there are subselects in there...)
 	 * Therefore, the equi_key_list is already complete and so we can
 	 * correctly determine canonical pathkeys.
 	 */
-	cache_mergeclause_pathkeys(root, restrictinfo);
+	clause = make_opclause(opno,
-	/* If different, say "not redundant" (should never happen) */
+						   BOOLOID,		/* opresulttype */
-	if (restrictinfo->left_pathkey != restrictinfo->right_pathkey)
+						   false,		/* opretset */
-		return false;
+						   (Expr *) copyObject(item1),
 						   (Expr *) copyObject(item2));
 	/* Make a copy of qualscope to avoid problems if source EC changes */
 	qualscope = bms_copy(qualscope);
 	/*
-	 * Scan existing quals to find those referencing same pathkeys. Usually
+	 * Build the RestrictInfo node itself.
 	 * there will be few, if any, so build a list of just the interesting
 	 * ones.
 	 */
-	oldquals = NIL;
+	restrictinfo = make_restrictinfo(clause,
-	foreach(olditem, restrictlist)
+									 true, /* is_pushed_down */
-	{
+									 false,	/* outerjoin_delayed */
-		RestrictInfo *oldrinfo = (RestrictInfo *) lfirst(olditem);
+									 false,	/* pseudoconstant */
 									 qualscope);
-		if (oldrinfo->mergejoinoperator != InvalidOid)
+	/* Set mergejoinability info always, and hashjoinability if enabled */
-		{
+	check_mergejoinable(restrictinfo);
-			cache_mergeclause_pathkeys(root, oldrinfo);
+	if (enable_hashjoin)
-			if (restrictinfo->left_pathkey == oldrinfo->left_pathkey &&
+		check_hashjoinable(restrictinfo);
 				restrictinfo->right_pathkey == oldrinfo->right_pathkey)
 				oldquals = lcons(oldrinfo, oldquals);
 		}
 	}
 	if (oldquals == NIL)
 		return false;
-	/*
+	return restrictinfo;
 	 * Now, we want to develop a list of exprs that are known equal to the
 	 * left side of the new qual.  We traverse the old-quals list repeatedly
 	 * to transitively expand the exprs list.  If at any point we find we can
 	 * reach the right-side expr of the new qual, we are done.	We give up
 	 * when we can't expand the equalexprs list any more.
 	 */
 	equalexprs = list_make1(newleft);
 	do
 	{
 		someadded = false;
 		/* cannot use foreach here because of possible list_delete */
 		olditem = list_head(oldquals);
 		while (olditem)
 		{
 			RestrictInfo *oldrinfo = (RestrictInfo *) lfirst(olditem);
 			Node	   *oldleft = get_leftop(oldrinfo->clause);
 			Node	   *oldright = get_rightop(oldrinfo->clause);
 			Node	   *newguy = NULL;
 			/* must advance olditem before list_delete possibly pfree's it */
 			olditem = lnext(olditem);
 			if (list_member(equalexprs, oldleft))
 				newguy = oldright;
 			else if (list_member(equalexprs, oldright))
 				newguy = oldleft;
 			else
 				continue;
 			if (equal(newguy, newright))
 				return true;	/* we proved new clause is redundant */
 			equalexprs = lcons(newguy, equalexprs);
 			someadded = true;
 			/*
 			 * Remove this qual from list, since we don't need it anymore.
 			 */
 			oldquals = list_delete_ptr(oldquals, oldrinfo);
 		}
 	} while (someadded);
 	return false;				/* it's not redundant */
 }
@@ -1294,10 +1168,7 @@ static void
 check_mergejoinable(RestrictInfo *restrictinfo)
 {
 	Expr	   *clause = restrictinfo->clause;
-	Oid			opno,
+	Oid			opno;
 				leftOp,
 				rightOp;
 	Oid			opfamily;
 	if (restrictinfo->pseudoconstant)
 		return;
@@ -1310,16 +1181,13 @@ check_mergejoinable(RestrictInfo *restrictinfo)
 	if (op_mergejoinable(opno) &&
 		!contain_volatile_functions((Node *) clause))
-	{
+		restrictinfo->mergeopfamilies = get_mergejoin_opfamilies(opno);
-		/* XXX for the moment, continue to force use of particular sortops */
+
-		if (get_op_mergejoin_info(opno, &leftOp, &rightOp, &opfamily))
+	/*
-		{
+	 * Note: op_mergejoinable is just a hint; if we fail to find the
-			restrictinfo->mergejoinoperator = opno;
+	 * operator in any btree opfamilies, mergeopfamilies remains NIL
-			restrictinfo->left_sortop = leftOp;
+	 * and so the clause is not treated as mergejoinable.
-			restrictinfo->right_sortop = rightOp;
+	 */
 			restrictinfo->mergeopfamily = opfamily;
 		}
 	}
 }
 /*
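Reviewer note: the removed qual_is_redundant() proved redundancy by repeatedly scanning old quals to grow a list of expressions known equal to one side. The equivalence-class representation makes the same question a class-membership test. The following is a minimal sketch of that idea using a union-find array; it is not the planner's actual data structure, and every name in it (ToyEqClasses, toy_add_equality, toy_known_equal) is hypothetical, with expressions stood in for by small integer ids.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_MAX_EXPRS 16

/* Hypothetical toy structure: parent[i] links expression id i toward
 * the representative of its equivalence class. */
typedef struct ToyEqClasses
{
    int parent[TOY_MAX_EXPRS];
} ToyEqClasses;

/* Start with every expression alone in its own class. */
static void
toy_init(ToyEqClasses *ec)
{
    for (int i = 0; i < TOY_MAX_EXPRS; i++)
        ec->parent[i] = i;
}

/* Return the class representative of x, shortening the chain as we go. */
static int
toy_find(ToyEqClasses *ec, int x)
{
    while (ec->parent[x] != x)
    {
        ec->parent[x] = ec->parent[ec->parent[x]];  /* path halving */
        x = ec->parent[x];
    }
    return x;
}

/* Record "a = b" by merging the two classes (no-op if already merged). */
static void
toy_add_equality(ToyEqClasses *ec, int a, int b)
{
    int ra = toy_find(ec, a);
    int rb = toy_find(ec, b);

    if (ra != rb)
        ec->parent[rb] = ra;
}

/* Two expressions are known equal iff they share a representative. */
static bool
toy_known_equal(ToyEqClasses *ec, int a, int b)
{
    return toy_find(ec, a) == toy_find(ec, b);
}
```

With ids 0, 1, 2 standing for f1, f2, f3, recording f1 = f2 and f2 = f3 makes toy_known_equal() report f1 = f3 without any third clause having been generated and then discarded.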
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -14,7 +14,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.98 2007/01/05 22:19:32 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/planmain.c,v 1.99 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -110,14 +110,14 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
 	 * for "simple" rels.
 	 *
 	 * NOTE: in_info_list and append_rel_list were set up by subquery_planner,
-	 * do not touch here
+	 * do not touch here; eq_classes may contain data already, too.
 	 */
 	root->simple_rel_array_size = list_length(parse->rtable) + 1;
 	root->simple_rel_array = (RelOptInfo **)
 		palloc0(root->simple_rel_array_size * sizeof(RelOptInfo *));
 	root->join_rel_list = NIL;
 	root->join_rel_hash = NULL;
-	root->equi_key_list = NIL;
+	root->canon_pathkeys = NIL;
 	root->left_join_clauses = NIL;
 	root->right_join_clauses = NIL;
 	root->full_join_clauses = NIL;
@@ -165,8 +165,8 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
 	 * Examine the targetlist and qualifications, adding entries to baserel
 	 * targetlists for all referenced Vars.  Restrict and join clauses are
 	 * added to appropriate lists belonging to the mentioned relations.  We
-	 * also build lists of equijoined keys for pathkey construction, and form
+	 * also build EquivalenceClasses for provably equivalent expressions,
-	 * a target joinlist for make_one_rel() to work from.
+	 * and form a target joinlist for make_one_rel() to work from.
 	 *
 	 * Note: all subplan nodes will have "flat" (var-only) tlists. This
 	 * implies that all expression evaluations are done at the root of the
@@ -179,16 +179,23 @@ query_planner(PlannerInfo *root, List *tlist, double tuple_fraction,
 	joinlist = deconstruct_jointree(root);
 	/*
-	 * Use the completed lists of equijoined keys to deduce any implied but
+	 * Reconsider any postponed outer-join quals now that we have built up
-	 * unstated equalities (for example, A=B and B=C imply A=C).
+	 * equivalence classes.  (This could result in further additions or
 	 * mergings of classes.)
 	 */
-	generate_implied_equalities(root);
+	reconsider_outer_join_clauses(root);
 	/*
-	 * We should now have all the pathkey equivalence sets built, so it's now
+	 * If we formed any equivalence classes, generate additional restriction
-	 * possible to convert the requested query_pathkeys to canonical form.
+	 * clauses as appropriate.  (Implied join clauses are formed on-the-fly
-	 * Also canonicalize the groupClause and sortClause pathkeys for use
+	 * later.)
-	 * later.
+	 */
 	generate_base_implied_equalities(root);
 	/*
 	 * We have completed merging equivalence sets, so it's now possible to
 	 * convert the requested query_pathkeys to canonical form.  Also
 	 * canonicalize the groupClause and sortClause pathkeys for use later.
 	 */
 	root->query_pathkeys = canonicalize_pathkeys(root, root->query_pathkeys);
 	root->group_pathkeys = canonicalize_pathkeys(root, root->group_pathkeys);
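Reviewer note: the commit message lists "better recognition of redundant sort columns" as a benefit of canonicalizing pathkeys after equivalence classes are complete. A rough sketch of why class membership helps, assuming a sort column is redundant when an earlier column already belongs to the same class (the function name and the integer class ids are hypothetical, not the planner's API):

```c
#include <assert.h>
#include <stddef.h>

/*
 * class_ids[i] is a hypothetical equivalence-class id for sort column i.
 * Copies the first occurrence of each class id into out[] (which must
 * have room for n entries) and returns how many were kept.
 */
static size_t
toy_drop_redundant_sort_keys(const int *class_ids, size_t n, int *out)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        int seen = 0;

        /* Has an earlier sort column already covered this class? */
        for (size_t j = 0; j < kept; j++)
        {
            if (out[j] == class_ids[i])
            {
                seen = 1;
                break;
            }
        }
        if (!seen)
            out[kept++] = class_ids[i];
    }
    return kept;
}
```

For example, with ORDER BY a, b, c and a = c known, the third column carries the same class id as the first and is dropped, since rows that tie on a also tie on c.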
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.211 2007/01/10 18:06:03 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/plan/planner.c,v 1.212 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -206,6 +206,8 @@ subquery_planner(Query *parse, double tuple_fraction,
 	/* Create a PlannerInfo data structure for this subquery */
 	root = makeNode(PlannerInfo);
 	root->parse = parse;
 	root->planner_cxt = CurrentMemoryContext;
 	root->eq_classes = NIL;
 	root->in_info_list = NIL;
 	root->append_rel_list = NIL;
@@ -715,9 +717,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		 * operation's result.  We have to do this before overwriting the sort
 		 * key information...
 		 */
-		current_pathkeys = make_pathkeys_for_sortclauses(set_sortclauses,
+		current_pathkeys = make_pathkeys_for_sortclauses(root,
-													result_plan->targetlist);
+														 set_sortclauses,
-		current_pathkeys = canonicalize_pathkeys(root, current_pathkeys);
+													result_plan->targetlist,
 														 true);
 		/*
 		 * We should not need to call preprocess_targetlist, since we must be
@@ -742,9 +745,10 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		/*
 		 * Calculate pathkeys that represent result ordering requirements
 		 */
-		sort_pathkeys = make_pathkeys_for_sortclauses(parse->sortClause,
+		sort_pathkeys = make_pathkeys_for_sortclauses(root,
-													  tlist);
+													  parse->sortClause,
-		sort_pathkeys = canonicalize_pathkeys(root, sort_pathkeys);
+													  tlist,
 													  true);
 	}
 	else
 	{
@@ -778,12 +782,18 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 		/*
 		 * Calculate pathkeys that represent grouping/ordering requirements.
 		 * Stash them in PlannerInfo so that query_planner can canonicalize
-		 * them.
+		 * them after EquivalenceClasses have been formed.
 		 */
 		root->group_pathkeys =
-			make_pathkeys_for_sortclauses(parse->groupClause, tlist);
+			make_pathkeys_for_sortclauses(root,
 										  parse->groupClause,
 										  tlist,
 										  false);
 		root->sort_pathkeys =
-			make_pathkeys_for_sortclauses(parse->sortClause, tlist);
+			make_pathkeys_for_sortclauses(root,
 										  parse->sortClause,
 										  tlist,
 										  false);
 		/*
 		 * Will need actual number of aggregates for estimating costs.
@@ -1069,10 +1079,9 @@ grouping_planner(PlannerInfo *root, double tuple_fraction)
 	{
 		if (!pathkeys_contained_in(sort_pathkeys, current_pathkeys))
 		{
-			result_plan = (Plan *)
+			result_plan = (Plan *) make_sort_from_pathkeys(root,
-				make_sort_from_sortclauses(root,
+														   result_plan,
-										   parse->sortClause,
+														   sort_pathkeys);
 										   result_plan);
 			current_pathkeys = sort_pathkeys;
 		}
 	}
--- a/src/backend/optimizer/prep/prepjointree.c
+++ b/src/backend/optimizer/prep/prepjointree.c
@@ -15,7 +15,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/prep/prepjointree.c,v 1.45 2007/01/05 22:19:32 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/prep/prepjointree.c,v 1.46 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -292,6 +292,7 @@ pull_up_simple_subquery(PlannerInfo *root, Node *jtnode, RangeTblEntry *rte,
 	 */
 	subroot = makeNode(PlannerInfo);
 	subroot->parse = subquery;
 	subroot->planner_cxt = CurrentMemoryContext;
 	subroot->in_info_list = NIL;
 	subroot->append_rel_list = NIL;
--- a/src/backend/optimizer/prep/prepunion.c
+++ b/src/backend/optimizer/prep/prepunion.c
@@ -22,7 +22,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/prep/prepunion.c,v 1.135 2007/01/05 22:19:32 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/prep/prepunion.c,v 1.136 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -1195,10 +1195,8 @@ adjust_appendrel_attrs_mutator(Node *node, AppendRelInfo *context)
 		 */
 		newinfo->eval_cost.startup = -1;
 		newinfo->this_selec = -1;
-		newinfo->left_pathkey = NIL;
+		newinfo->left_ec = NULL;
-		newinfo->right_pathkey = NIL;
+		newinfo->right_ec = NULL;
 		newinfo->left_mergescansel = -1;
 		newinfo->right_mergescansel = -1;
 		newinfo->left_bucketsize = -1;
 		newinfo->right_bucketsize = -1;
--- a/src/backend/optimizer/util/joininfo.c
+++ b/src/backend/optimizer/util/joininfo.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.46 2007/01/05 22:19:32 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/util/joininfo.c,v 1.47 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -16,6 +16,7 @@
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 /*
@@ -54,6 +55,13 @@ have_relevant_joinclause(PlannerInfo *root,
 		}
 	}
 	/*
 	 * We also need to check the EquivalenceClass data structure, which
 	 * might contain relationships not emitted into the joininfo lists.
 	 */
 	if (!result && rel1->has_eclass_joins && rel2->has_eclass_joins)
 		result = have_relevant_eclass_joinclause(root, rel1, rel2);
 	/*
 	 * It's possible that the rels correspond to the left and right sides
 	 * of a degenerate outer join, that is, one with no joinclause mentioning
@@ -124,37 +132,3 @@ add_join_clause_to_rels(PlannerInfo *root,
 	}
 	bms_free(tmprelids);
 }
 /*
 * remove_join_clause_from_rels
 *	  Delete 'restrictinfo' from all the joininfo lists it is in
 *
 * This reverses the effect of add_join_clause_to_rels.  It's used when we
 * discover that a join clause is redundant.
 *
 * 'restrictinfo' describes the join clause
 * 'join_relids' is the list of relations participating in the join clause
 *				 (there must be more than one)
 */
 void
 remove_join_clause_from_rels(PlannerInfo *root,
 							 RestrictInfo *restrictinfo,
 							 Relids join_relids)
 {
 	Relids		tmprelids;
 	int			cur_relid;
 	tmprelids = bms_copy(join_relids);
 	while ((cur_relid = bms_first_member(tmprelids)) >= 0)
 	{
 		RelOptInfo *rel = find_base_rel(root, cur_relid);
 		/*
 		 * Remove the restrictinfo from the list.  Pointer comparison is
 		 * sufficient.
 		 */
 		Assert(list_member_ptr(rel->joininfo, restrictinfo));
 		rel->joininfo = list_delete_ptr(rel->joininfo, restrictinfo);
 	}
 	bms_free(tmprelids);
 }
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/util/pathnode.c,v 1.136 2007/01/10 18:06:04 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/util/pathnode.c,v 1.137 2007/01/20 20:45:39 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -26,7 +26,6 @@
 #include "parser/parse_expr.h"
 #include "parser/parse_oper.h"
 #include "parser/parsetree.h"
 #include "utils/memutils.h"
 #include "utils/selfuncs.h"
 #include "utils/lsyscache.h"
 #include "utils/syscache.h"
@@ -747,11 +746,11 @@ create_unique_path(PlannerInfo *root, RelOptInfo *rel, Path *subpath)
 		return (UniquePath *) rel->cheapest_unique_path;
 	/*
-	 * We must ensure path struct is allocated in same context as parent rel;
+	 * We must ensure path struct is allocated in main planning context;
 	 * otherwise GEQO memory management causes trouble.  (Compare
 	 * best_inner_indexscan().)
 	 */
-	oldcontext = MemoryContextSwitchTo(GetMemoryChunkContext(rel));
+	oldcontext = MemoryContextSwitchTo(root->planner_cxt);
 	pathnode = makeNode(UniquePath);
@@ -1198,11 +1197,6 @@ create_nestloop_path(PlannerInfo *root,
 * 'pathkeys' are the path keys of the new join path
 * 'mergeclauses' are the RestrictInfo nodes to use as merge clauses
 *		(this should be a subset of the restrict_clauses list)
 * 'mergefamilies' are the btree opfamily OIDs identifying the merge
 *		ordering for each merge clause
 * 'mergestrategies' are the btree operator strategies identifying the merge
 *		ordering for each merge clause
 * 'mergenullsfirst' are the nulls first/last flags for each merge clause
 * 'outersortkeys' are the sort varkeys for the outer relation
 * 'innersortkeys' are the sort varkeys for the inner relation
 */
@@ -1215,9 +1209,6 @@ create_mergejoin_path(PlannerInfo *root,
 					  List *restrict_clauses,
 					  List *pathkeys,
 					  List *mergeclauses,
 					  Oid *mergefamilies,
 					  int *mergestrategies,
 					  bool *mergenullsfirst,
 					  List *outersortkeys,
 					  List *innersortkeys)
 {
@@ -1258,9 +1249,6 @@ create_mergejoin_path(PlannerInfo *root,
 	pathnode->jpath.joinrestrictinfo = restrict_clauses;
 	pathnode->jpath.path.pathkeys = pathkeys;
 	pathnode->path_mergeclauses = mergeclauses;
 	pathnode->path_mergeFamilies = mergefamilies;
 	pathnode->path_mergeStrategies = mergestrategies;
 	pathnode->path_mergeNullsFirst = mergenullsfirst;
 	pathnode->outersortkeys = outersortkeys;
 	pathnode->innersortkeys = innersortkeys;
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.84 2007/01/05 22:19:33 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/util/relnode.c,v 1.85 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -16,6 +16,7 @@
 #include "optimizer/cost.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/restrictinfo.h"
 #include "parser/parsetree.h"
@@ -31,17 +32,18 @@ typedef struct JoinHashEntry
 static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
 					RelOptInfo *input_rel);
 static List *build_joinrel_restrictlist(PlannerInfo *root,
-						   RelOptInfo *joinrel,
+										RelOptInfo *joinrel,
-						   RelOptInfo *outer_rel,
+										RelOptInfo *outer_rel,
-						   RelOptInfo *inner_rel,
+										RelOptInfo *inner_rel);
 						   JoinType jointype);
 static void build_joinrel_joinlist(RelOptInfo *joinrel,
 					   RelOptInfo *outer_rel,
 					   RelOptInfo *inner_rel);
 static List *subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
-							  List *joininfo_list);
+							  List *joininfo_list,
-static void subbuild_joinrel_joinlist(RelOptInfo *joinrel,
+							  List *new_restrictlist);
-						  List *joininfo_list);
+static List *subbuild_joinrel_joinlist(RelOptInfo *joinrel,
 						  List *joininfo_list,
 						  List *new_joininfo);
 /*
@@ -84,6 +86,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptKind reloptkind)
 	rel->baserestrictcost.startup = 0;
 	rel->baserestrictcost.per_tuple = 0;
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->index_outer_relids = NULL;
 	rel->index_inner_paths = NIL;
@@ -303,8 +306,7 @@ build_join_rel(PlannerInfo *root,
 			*restrictlist_ptr = build_joinrel_restrictlist(root,
 														   joinrel,
 														   outer_rel,
-														   inner_rel,
+														   inner_rel);
 														   jointype);
 		return joinrel;
 	}
@@ -335,6 +337,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->baserestrictcost.startup = 0;
 	joinrel->baserestrictcost.per_tuple = 0;
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->index_outer_relids = NULL;
 	joinrel->index_inner_paths = NIL;
@@ -354,15 +357,18 @@ build_join_rel(PlannerInfo *root,
 	 * caller might or might not need the restrictlist, but I need it anyway
 	 * for set_joinrel_size_estimates().)
 	 */
-	restrictlist = build_joinrel_restrictlist(root,
+	restrictlist = build_joinrel_restrictlist(root, joinrel,
-											  joinrel,
+											  outer_rel, inner_rel);
 											  outer_rel,
 											  inner_rel,
 											  jointype);
 	if (restrictlist_ptr)
 		*restrictlist_ptr = restrictlist;
 	build_joinrel_joinlist(joinrel, outer_rel, inner_rel);
 	/*
 	 * This is also the right place to check whether the joinrel has any
 	 * pending EquivalenceClass joins.
 	 */
 	joinrel->has_eclass_joins = has_relevant_eclass_joinclause(root, joinrel);
 	/*
 	 * Set estimates of the joinrel's size.
 	 */
@@ -468,15 +474,15 @@ build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
 *	  join paths made from this pair of sub-relations.	(It will not need to
 *	  be considered further up the join tree.)
 *
- *	  When building a restriction list, we eliminate redundant clauses.
+ *	  In many cases we will find the same RestrictInfos in both input
- *	  We don't try to do that for join clause lists, since the join clauses
+ *	  relations' joinlists, so be careful to eliminate duplicates.
- *	  aren't really doing anything, just waiting to become part of higher
+ *	  Pointer equality should be a sufficient test for dups, since all
- *	  levels' restriction lists.
+ *	  the various joinlist entries ultimately refer to RestrictInfos
 *	  pushed into them by distribute_restrictinfo_to_rels().
 *
 * 'joinrel' is a join relation node
 * 'outer_rel' and 'inner_rel' are a pair of relations that can be joined
 *		to form joinrel.
 * 'jointype' is the type of join used.
 *
 * build_joinrel_restrictlist() returns a list of relevant restrictinfos,
 * whereas build_joinrel_joinlist() stores its results in the joinrel's
@@ -491,33 +497,27 @@ static List *
 build_joinrel_restrictlist(PlannerInfo *root,
 						   RelOptInfo *joinrel,
 						   RelOptInfo *outer_rel,
-						   RelOptInfo *inner_rel,
+						   RelOptInfo *inner_rel)
 						   JoinType jointype)
 {
 	List	   *result;
 	List	   *rlist;
 	/*
-	 * Collect all the clauses that syntactically belong at this level.
+	 * Collect all the clauses that syntactically belong at this level,
 	 * eliminating any duplicates (important since we will see many of the
 	 * same clauses arriving from both input relations).
 	 */
-	rlist = list_concat(subbuild_joinrel_restrictlist(joinrel,
+	result = subbuild_joinrel_restrictlist(joinrel, outer_rel->joininfo, NIL);
-													  outer_rel->joininfo),
+	result = subbuild_joinrel_restrictlist(joinrel, inner_rel->joininfo, result);
 						subbuild_joinrel_restrictlist(joinrel,
 													  inner_rel->joininfo));
 	/*
-	 * Eliminate duplicate and redundant clauses.
+	 * Add on any clauses derived from EquivalenceClasses.  These cannot be
-	 *
+	 * redundant with the clauses in the joininfo lists, so don't bother
-	 * We must eliminate duplicates, since we will see many of the same
+	 * checking.
 	 * clauses arriving from both input relations.	Also, if a clause is a
 	 * mergejoinable clause, it's possible that it is redundant with previous
 	 * clauses (see optimizer/README for discussion).  We detect that case and
 	 * omit the redundant clause from the result list.
 	 */
-	result = remove_redundant_join_clauses(root, rlist,
+	result = list_concat(result,
-										   IS_OUTER_JOIN(jointype));
+						 generate_join_implied_equalities(root,
-
+														  joinrel,
-	list_free(rlist);
+														  outer_rel,
 														  inner_rel));
 	return result;
 }
@@ -527,15 +527,24 @@ build_joinrel_joinlist(RelOptInfo *joinrel,
 					   RelOptInfo *outer_rel,
 					   RelOptInfo *inner_rel)
 {
-	subbuild_joinrel_joinlist(joinrel, outer_rel->joininfo);
+	List	   *result;
-	subbuild_joinrel_joinlist(joinrel, inner_rel->joininfo);
+
 	/*
 	 * Collect all the clauses that syntactically belong above this level,
 	 * eliminating any duplicates (important since we will see many of the
 	 * same clauses arriving from both input relations).
 	 */
 	result = subbuild_joinrel_joinlist(joinrel, outer_rel->joininfo, NIL);
 	result = subbuild_joinrel_joinlist(joinrel, inner_rel->joininfo, result);
 	joinrel->joininfo = result;
 }
 static List *
 subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
-							  List *joininfo_list)
+							  List *joininfo_list,
 							  List *new_restrictlist)
 {
 	List	   *restrictlist = NIL;
 	ListCell   *l;
 	foreach(l, joininfo_list)
@@ -546,10 +555,12 @@ subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
 		{
 			/*
 			 * This clause becomes a restriction clause for the joinrel, since
-			 * it refers to no outside rels.  We don't bother to check for
+			 * it refers to no outside rels.  Add it to the list, being
-			 * duplicates here --- build_joinrel_restrictlist will do that.
+			 * careful to eliminate duplicates. (Since RestrictInfo nodes in
 			 * different joinlists will have been multiply-linked rather than
 			 * copied, pointer equality should be a sufficient test.)
 			 */
-			restrictlist = lappend(restrictlist, rinfo);
+			new_restrictlist = list_append_unique_ptr(new_restrictlist, rinfo);
 		}
 		else
 		{
@@ -560,12 +571,13 @@ subbuild_joinrel_restrictlist(RelOptInfo *joinrel,
 		}
 	}
-	return restrictlist;
+	return new_restrictlist;
 }
-static void
+static List *
 subbuild_joinrel_joinlist(RelOptInfo *joinrel,
-						  List *joininfo_list)
+						  List *joininfo_list,
 						  List *new_joininfo)
 {
 	ListCell   *l;
@@ -585,15 +597,14 @@ subbuild_joinrel_joinlist(RelOptInfo *joinrel,
 		{
 			/*
 			 * This clause is still a join clause at this level, so add it to
-			 * the joininfo list for the joinrel, being careful to eliminate
+			 * the new joininfo list, being careful to eliminate
-			 * duplicates.	(Since RestrictInfo nodes are normally
+			 * duplicates. (Since RestrictInfo nodes in different joinlists
-			 * multiply-linked rather than copied, pointer equality should be
+			 * will have been multiply-linked rather than copied, pointer
-			 * a sufficient test.  If two equal() nodes should happen to sneak
+			 * equality should be a sufficient test.)
 			 * in, no great harm is done --- they'll be detected by
 			 * redundant-clause testing when they reach a restriction list.)
 			 */
-			joinrel->joininfo = list_append_unique_ptr(joinrel->joininfo,
+			new_joininfo = list_append_unique_ptr(new_joininfo, rinfo);
 													   rinfo);
 		}
 	}
 	return new_joininfo;
 }
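Reviewer note: the new subbuild_joinrel_restrictlist() and subbuild_joinrel_joinlist() rely on pointer equality to drop duplicates, on the stated grounds that RestrictInfo nodes are multiply-linked rather than copied. A minimal sketch of append-unique-by-pointer over a fixed-size array follows; ToyList and toy_append_unique_ptr are hypothetical stand-ins, not the List API used in the diff.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_LIST_MAX 8

/* Hypothetical fixed-capacity list of untyped pointers. */
typedef struct ToyList
{
    const void *items[TOY_LIST_MAX];
    size_t      len;
} ToyList;

/*
 * Append item unless the very same pointer is already present.
 * Returns 1 if appended, 0 if it was a duplicate.  Contents are never
 * compared, so two distinct objects with equal values both get in.
 */
static int
toy_append_unique_ptr(ToyList *list, const void *item)
{
    for (size_t i = 0; i < list->len; i++)
    {
        if (list->items[i] == item)
            return 0;
    }
    assert(list->len < TOY_LIST_MAX);
    list->items[list->len++] = item;
    return 1;
}
```

The test is cheap (no deep comparison), but it is only sufficient when duplicates are guaranteed to be the same node, which is the assumption the comments in this hunk spell out.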
--- a/src/backend/optimizer/util/restrictinfo.c
+++ b/src/backend/optimizer/util/restrictinfo.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/optimizer/util/restrictinfo.c,v 1.51 2007/01/05 22:19:33 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/optimizer/util/restrictinfo.c,v 1.52 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -33,10 +33,9 @@ static Expr *make_sub_restrictinfos(Expr *clause,
 					   bool outerjoin_delayed,
 					   bool pseudoconstant,
 					   Relids required_relids);
-static RestrictInfo *join_clause_is_redundant(PlannerInfo *root,
+static bool join_clause_is_redundant(PlannerInfo *root,
 						 RestrictInfo *rinfo,
-						 List *reference_list,
+						 List *reference_list);
 						 bool isouterjoin);
 /*
@@ -336,19 +335,17 @@ make_restrictinfo_internal(Expr *clause,
 	 * that happens only if it appears in the right context (top level of a
 	 * joinclause list).
 	 */
 	restrictinfo->parent_ec = NULL;
 	restrictinfo->eval_cost.startup = -1;
 	restrictinfo->this_selec = -1;
-	restrictinfo->mergejoinoperator = InvalidOid;
+	restrictinfo->mergeopfamilies = NIL;
 	restrictinfo->left_sortop = InvalidOid;
 	restrictinfo->right_sortop = InvalidOid;
 	restrictinfo->mergeopfamily = InvalidOid;
-	restrictinfo->left_pathkey = NIL;
+	restrictinfo->left_ec = NULL;
-	restrictinfo->right_pathkey = NIL;
+	restrictinfo->right_ec = NULL;
-	restrictinfo->left_mergescansel = -1;
+	restrictinfo->outer_is_left = false;
 	restrictinfo->right_mergescansel = -1;
 	restrictinfo->hashjoinoperator = InvalidOid;
@@ -529,78 +526,18 @@ extract_actual_join_clauses(List *restrictinfo_list,
 	}
 }
-/*
- * remove_redundant_join_clauses
- *
- * Given a list of RestrictInfo clauses that are to be applied in a join,
- * remove any duplicate or redundant clauses.
- *
- * We must eliminate duplicates when forming the restrictlist for a joinrel,
- * since we will see many of the same clauses arriving from both input
- * relations. Also, if a clause is a mergejoinable clause, it's possible that
- * it is redundant with previous clauses (see optimizer/README for
- * discussion). We detect that case and omit the redundant clause from the
- * result list.
- *
- * The result is a fresh List, but it points to the same member nodes
- * as were in the input.
- */
-List *
-remove_redundant_join_clauses(PlannerInfo *root, List *restrictinfo_list,
-							  bool isouterjoin)
-{
-	List	   *result = NIL;
-	ListCell   *item;
-	QualCost	cost;
-	/*
-	 * If there are any redundant clauses, we want to eliminate the ones that
-	 * are more expensive in favor of the ones that are less so. Run
-	 * cost_qual_eval() to ensure the eval_cost fields are set up.
-	 */
-	cost_qual_eval(&cost, restrictinfo_list);
-	/*
-	 * We don't have enough knowledge yet to be able to estimate the number of
-	 * times a clause might be evaluated, so it's hard to weight the startup
-	 * and per-tuple costs appropriately.  For now just weight 'em the same.
-	 */
-#define CLAUSECOST(r)  ((r)->eval_cost.startup + (r)->eval_cost.per_tuple)
-	foreach(item, restrictinfo_list)
-	{
-		RestrictInfo *rinfo = (RestrictInfo *) lfirst(item);
-		RestrictInfo *prevrinfo;
-		/* is it redundant with any prior clause? */
-		prevrinfo = join_clause_is_redundant(root, rinfo, result, isouterjoin);
-		if (prevrinfo == NULL)
-		{
-			/* no, so add it to result list */
-			result = lappend(result, rinfo);
-		}
-		else if (CLAUSECOST(rinfo) < CLAUSECOST(prevrinfo))
-		{
-			/* keep this one, drop the previous one */
-			result = list_delete_ptr(result, prevrinfo);
-			result = lappend(result, rinfo);
-		}
-		/* else, drop this one */
-	}
-	return result;
-}
 /*
 * select_nonredundant_join_clauses
 *
 * Given a list of RestrictInfo clauses that are to be applied in a join,
 * select the ones that are not redundant with any clause in the
- * reference_list.
+ * reference_list.  This is used only for nestloop-with-inner-indexscan
+ * joins: any clauses being checked by the index should be removed from
+ * the qpquals list.
+ *
- * This is similar to remove_redundant_join_clauses, but we are looking for
+ * "Redundant" means either equal() or derived from the same EquivalenceClass.
- * redundancies with a separate list of clauses (i.e., clauses that have
+ * We have to check the latter because indxqual.c may select different derived
- * already been applied below the join itself).
+ * clauses than were selected by generate_join_implied_equalities().
 *
 * Note that we assume the given restrictinfo_list has already been checked
 * for local redundancies, so we don't check again.
@@ -608,8 +545,7 @@ remove_redundant_join_clauses(PlannerInfo *root, List *restrictinfo_list,
 List *
 select_nonredundant_join_clauses(PlannerInfo *root,
 								 List *restrictinfo_list,
-								 List *reference_list,
+								 List *reference_list)
-								 bool isouterjoin)
 {
 	List	   *result = NIL;
 	ListCell   *item;
@@ -619,7 +555,7 @@ select_nonredundant_join_clauses(PlannerInfo *root,
 		RestrictInfo *rinfo = (RestrictInfo *) lfirst(item);
 		/* drop it if redundant with any reference clause */
-		if (join_clause_is_redundant(root, rinfo, reference_list, isouterjoin) != NULL)
+		if (join_clause_is_redundant(root, rinfo, reference_list))
 			continue;
 		/* otherwise, add it to result list */
@@ -631,79 +567,28 @@ select_nonredundant_join_clauses(PlannerInfo *root,
 /*
 * join_clause_is_redundant
- *		If rinfo is redundant with any clause in reference_list,
+ *		Test whether rinfo is redundant with any clause in reference_list.
- *		return one such clause; otherwise return NULL.
- *
- * This is the guts of both remove_redundant_join_clauses and
- * select_nonredundant_join_clauses.  See the docs above for motivation.
- *
- * We can detect redundant mergejoinable clauses very cheaply by using their
- * left and right pathkeys, which uniquely identify the sets of equijoined
- * variables in question.  All the members of a pathkey set that are in the
- * left relation have already been forced to be equal; likewise for those in
- * the right relation.	So, we need to have only one clause that checks
- * equality between any set member on the left and any member on the right;
- * by transitivity, all the rest are then equal.
- *
- * However, clauses that are of the form "var expr = const expr" cannot be
- * eliminated as redundant.  This is because when there are const expressions
- * in a pathkey set, generate_implied_equalities() suppresses "var = var"
- * clauses in favor of "var = const" clauses.  We cannot afford to drop any
- * of the latter, even though they might seem redundant by the pathkey
- * membership test.
- *
- * Weird special case: if we have two clauses that seem redundant
- * except one is pushed down into an outer join and the other isn't,
- * then they're not really redundant, because one constrains the
- * joined rows after addition of null fill rows, and the other doesn't.
 */
-static RestrictInfo *
+static bool
 join_clause_is_redundant(PlannerInfo *root,
 						 RestrictInfo *rinfo,
-						 List *reference_list,
+						 List *reference_list)
-						 bool isouterjoin)
 {
 	ListCell   *refitem;
-	/* always consider exact duplicates redundant */
 	foreach(refitem, reference_list)
 	{
 		RestrictInfo *refrinfo = (RestrictInfo *) lfirst(refitem);
+		/* always consider exact duplicates redundant */
 		if (equal(rinfo, refrinfo))
-			return refrinfo;
+			return true;
+		/* check if derived from same EquivalenceClass */
+		if (rinfo->parent_ec != NULL &&
+			rinfo->parent_ec == refrinfo->parent_ec)
+			return true;
 	}
-	/* check for redundant merge clauses */
+	return false;
-	if (rinfo->mergejoinoperator != InvalidOid)
-	{
-		/* do the cheap test first: is it a "var = const" clause? */
-		if (bms_is_empty(rinfo->left_relids) ||
-			bms_is_empty(rinfo->right_relids))
-			return NULL;		/* var = const, so not redundant */
-		cache_mergeclause_pathkeys(root, rinfo);
-		foreach(refitem, reference_list)
-		{
-			RestrictInfo *refrinfo = (RestrictInfo *) lfirst(refitem);
-			if (refrinfo->mergejoinoperator != InvalidOid)
-			{
-				cache_mergeclause_pathkeys(root, refrinfo);
-				if (rinfo->left_pathkey == refrinfo->left_pathkey &&
-					rinfo->right_pathkey == refrinfo->right_pathkey &&
-					(rinfo->is_pushed_down == refrinfo->is_pushed_down ||
-					 !isouterjoin))
-				{
-					/* Yup, it's redundant */
-					return refrinfo;
-				}
-			}
-		}
-	}
-	/* otherwise, not redundant */
-	return NULL;
 }
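The rewritten join_clause_is_redundant() above reduces to two tests per reference clause: exact duplication, or derivation from the same EquivalenceClass. A minimal sketch of that logic follows, assuming simplified stand-in types. DemoEC, DemoClause and demo_clause_is_redundant are hypothetical names, not planner structures, and equal() is approximated here by comparing an integer id.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an EquivalenceClass; only identity matters. */
typedef struct DemoEC
{
	int			id;
} DemoEC;

/* Hypothetical stand-in for a RestrictInfo. */
typedef struct DemoClause
{
	int			clause_id;	/* assumption: same id means equal() clauses */
	DemoEC	   *parent_ec;	/* NULL unless derived from an EC */
} DemoClause;

/*
 * Return true if rinfo duplicates a reference clause, or if both were
 * derived from the same (non-NULL) parent EC.
 */
static bool
demo_clause_is_redundant(const DemoClause *rinfo,
						 const DemoClause *const *refs, size_t nrefs)
{
	size_t		i;

	assert(rinfo != NULL);
	for (i = 0; i < nrefs; i++)
	{
		/* exact duplicates are always redundant */
		if (rinfo->clause_id == refs[i]->clause_id)
			return true;
		/* so are clauses generated from the same equivalence class */
		if (rinfo->parent_ec != NULL &&
			rinfo->parent_ec == refs[i]->parent_ec)
			return true;
	}
	return false;
}
```

Note that two clauses with a NULL parent_ec are never matched by the second test, which mirrors the explicit NULL check in the patched function.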
--- a/src/backend/parser/parse_agg.c
+++ b/src/backend/parser/parse_agg.c
@@ -8,7 +8,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.75 2007/01/05 22:19:33 momjian Exp $
+ *	  $PostgreSQL: pgsql/src/backend/parser/parse_agg.c,v 1.76 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -171,6 +171,7 @@ parseCheckAggregates(ParseState *pstate, Query *qry)
 	{
 		root = makeNode(PlannerInfo);
 		root->parse = qry;
+		root->planner_cxt = CurrentMemoryContext;
 		root->hasJoinRTEs = true;
 		groupClauses = (List *) flatten_join_alias_vars(root,
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -15,7 +15,7 @@
 *
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.219 2007/01/09 02:14:14 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/adt/selfuncs.c,v 1.220 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -2345,7 +2345,7 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
 *		expressional index for which we have statistics, then we treat the
 *		whole expression as though it were just a Var.
 *	2.	If the list contains Vars of different relations that are known equal
- *		due to equijoin clauses, then drop all but one of the Vars from each
+ *		due to equivalence classes, then drop all but one of the Vars from each
 *		known-equal set, keeping the one with smallest estimated # of values
 *		(since the extra values of the others can't appear in joined rows).
 *		Note the reason we only consider Vars of different relations is that
@@ -2365,10 +2365,9 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
 *	4.	If there are Vars from multiple rels, we repeat step 3 for each such
 *		rel, and multiply the results together.
 * Note that rels not containing grouped Vars are ignored completely, as are
- * join clauses other than the equijoin clauses used in step 2.  Such rels
+ * join clauses.  Such rels cannot increase the number of groups, and we
- * cannot increase the number of groups, and we assume such clauses do not
+ * assume such clauses do not reduce the number either (somewhat bogus,
- * reduce the number either (somewhat bogus, but we don't have the info to
+ * but we don't have the info to do better).
- * do better).
 */
 double
 estimate_num_groups(PlannerInfo *root, List *groupExprs, double input_rows)
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * IDENTIFICATION
- *	  $PostgreSQL: pgsql/src/backend/utils/cache/lsyscache.c,v 1.143 2007/01/10 18:06:04 tgl Exp $
+ *	  $PostgreSQL: pgsql/src/backend/utils/cache/lsyscache.c,v 1.144 2007/01/20 20:45:40 tgl Exp $
 *
 * NOTES
 *	  Eventually, the index information should go through here, too.
@@ -138,153 +138,6 @@ get_opfamily_member(Oid opfamily, Oid lefttype, Oid righttype,
 	return result;
 }
-/*
- * get_op_mergejoin_info
- *		Given the OIDs of a (putatively) mergejoinable equality operator
- *		and a sortop defining the sort ordering of the lefthand input of
- *		the merge clause, determine whether this sort ordering is actually
- *		usable for merging.  If so, return the required sort ordering op
- *		for the righthand input, as well as the btree opfamily OID containing
- *		these operators and the operator strategy number of the two sortops
- *		(either BTLessStrategyNumber or BTGreaterStrategyNumber).
- *
- * We can mergejoin if we find the two operators in the same opfamily as
- * equality and either less-than or greater-than respectively.  If there
- * are multiple such opfamilies, assume we can use any one.
- */
-#ifdef NOT_YET
-/* eventually should look like this */
-bool
-get_op_mergejoin_info(Oid eq_op, Oid left_sortop,
-					  Oid *right_sortop, Oid *opfamily, int *opstrategy)
-{
-	bool		result = false;
-	Oid			lefttype;
-	Oid			righttype;
-	CatCList   *catlist;
-	int			i;
-	/* Make sure output args are initialized even on failure */
-	*right_sortop = InvalidOid;
-	*opfamily = InvalidOid;
-	*opstrategy = 0;
-	/* Need the righthand input datatype */
-	op_input_types(eq_op, &lefttype, &righttype);
-	/*
-	 * Search through all the pg_amop entries containing the equality operator
-	 */
-	catlist = SearchSysCacheList(AMOPOPID, 1,
-								 ObjectIdGetDatum(eq_op),
-								 0, 0, 0);
-	for (i = 0; i < catlist->n_members; i++)
-	{
-		HeapTuple	op_tuple = &catlist->members[i]->tuple;
-		Form_pg_amop op_form = (Form_pg_amop) GETSTRUCT(op_tuple);
-		Oid			opfamily_id;
-		StrategyNumber op_strategy;
-		/* must be btree */
-		if (op_form->amopmethod != BTREE_AM_OID)
-			continue;
-		/* must use the operator as equality */
-		if (op_form->amopstrategy != BTEqualStrategyNumber)
-			continue;
-		/* See if sort operator is also in this opfamily with OK semantics */
-		opfamily_id = op_form->amopfamily;
-		op_strategy = get_op_opfamily_strategy(left_sortop, opfamily_id);
-		if (op_strategy == BTLessStrategyNumber ||
-			op_strategy == BTGreaterStrategyNumber)
-		{
-			/* Yes, so find the corresponding righthand sortop */
-			*right_sortop = get_opfamily_member(opfamily_id,
-												righttype,
-												righttype,
-												op_strategy);
-			if (OidIsValid(*right_sortop))
-			{
-				/* Found a workable mergejoin semantics */
-				*opfamily = opfamily_id;
-				*opstrategy = op_strategy;
-				result = true;
-				break;
-			}
-		}
-	}
-	ReleaseSysCacheList(catlist);
-	return result;
-}
-#else
-/* temp implementation until planner gets smarter: left_sortop is output */
-bool
-get_op_mergejoin_info(Oid eq_op, Oid *left_sortop,
-					  Oid *right_sortop, Oid *opfamily)
-{
-	bool		result = false;
-	Oid			lefttype;
-	Oid			righttype;
-	CatCList   *catlist;
-	int			i;
-	/* Make sure output args are initialized even on failure */
-	*left_sortop = InvalidOid;
-	*right_sortop = InvalidOid;
-	*opfamily = InvalidOid;
-	/* Need the input datatypes */
-	op_input_types(eq_op, &lefttype, &righttype);
-	/*
-	 * Search through all the pg_amop entries containing the equality operator
-	 */
-	catlist = SearchSysCacheList(AMOPOPID, 1,
-								 ObjectIdGetDatum(eq_op),
-								 0, 0, 0);
-	for (i = 0; i < catlist->n_members; i++)
-	{
-		HeapTuple	op_tuple = &catlist->members[i]->tuple;
-		Form_pg_amop op_form = (Form_pg_amop) GETSTRUCT(op_tuple);
-		Oid			opfamily_id;
-		/* must be btree */
-		if (op_form->amopmethod != BTREE_AM_OID)
-			continue;
-		/* must use the operator as equality */
-		if (op_form->amopstrategy != BTEqualStrategyNumber)
-			continue;
-		opfamily_id = op_form->amopfamily;
-		/* Find the matching sortops */
-		*left_sortop = get_opfamily_member(opfamily_id,
-										   lefttype,
-										   lefttype,
-										   BTLessStrategyNumber);
-		*right_sortop = get_opfamily_member(opfamily_id,
-											righttype,
-											righttype,
-											BTLessStrategyNumber);
-		if (OidIsValid(*left_sortop) && OidIsValid(*right_sortop))
-		{
-			/* Found a workable mergejoin semantics */
-			*opfamily = opfamily_id;
-			result = true;
-			break;
-		}
-	}
-	ReleaseSysCacheList(catlist);
-	return result;
-}
-#endif
 /*
 * get_compare_function_for_ordering_op
 *		Get the OID of the datatype-specific btree comparison function
@@ -469,6 +322,56 @@ get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type)
 	return result;
 }
+/*
+ * get_mergejoin_opfamilies
+ *		Given a putatively mergejoinable operator, return a list of the OIDs
+ *		of the btree opfamilies in which it represents equality.
+ *
+ * It is possible (though at present unusual) for an operator to be equality
+ * in more than one opfamily, hence the result is a list.  This also lets us
+ * return NIL if the operator is not found in any opfamilies.
+ *
+ * The planner currently uses simple equal() tests to compare the lists
+ * returned by this function, which makes the list order relevant, though
+ * strictly speaking it should not be.  Because of the way syscache list
+ * searches are handled, in normal operation the result will be sorted by OID
+ * so everything works fine.  If running with system index usage disabled,
+ * the result ordering is unspecified and hence the planner might fail to
+ * recognize optimization opportunities ... but that's hardly a scenario in
+ * which performance is good anyway, so there's no point in expending code
+ * or cycles here to guarantee the ordering in that case.
+ */
+List *
+get_mergejoin_opfamilies(Oid opno)
+{
+	List	   *result = NIL;
+	CatCList   *catlist;
+	int			i;
+	/*
+	 * Search pg_amop to see if the target operator is registered as the "="
+	 * operator of any btree opfamily.
+	 */
+	catlist = SearchSysCacheList(AMOPOPID, 1,
+								 ObjectIdGetDatum(opno),
+								 0, 0, 0);
+	for (i = 0; i < catlist->n_members; i++)
+	{
+		HeapTuple	tuple = &catlist->members[i]->tuple;
+		Form_pg_amop aform = (Form_pg_amop) GETSTRUCT(tuple);
+		/* must be btree equality */
+		if (aform->amopmethod == BTREE_AM_OID &&
+			aform->amopstrategy == BTEqualStrategyNumber)
+			result = lappend_oid(result, aform->amopfamily);
+	}
+	ReleaseSysCacheList(catlist);
+	return result;
+}
 /*
 * get_compatible_hash_operator
 *		Get the OID of a hash equality operator compatible with the given
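The header comment of get_mergejoin_opfamilies() above warns that the planner compares the returned lists with a simple equal() test, so element order matters. A minimal sketch of why an order-sensitive comparison behaves that way is given below. DemoOid and demo_oid_lists_equal are hypothetical names, plain arrays stand in for the planner's List type, and the OID values used with it are arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int DemoOid;	/* hypothetical stand-in for Oid */

/*
 * Element-by-element comparison: two lists holding the same OIDs in a
 * different order compare as unequal, which is the hazard the comment
 * describes when the syscache result is not sorted by OID.
 */
static bool
demo_oid_lists_equal(const DemoOid *a, size_t na,
					 const DemoOid *b, size_t nb)
{
	size_t		i;

	assert(na == 0 || a != NULL);
	assert(nb == 0 || b != NULL);
	if (na != nb)
		return false;
	for (i = 0; i < na; i++)
	{
		if (a[i] != b[i])
			return false;
	}
	return true;
}
```

An order-insensitive comparison would need sorting or a set lookup; the comment in the patch argues that the extra work is not justified for the unsorted case.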
--- a/src/include/nodes/nodes.h
+++ b/src/include/nodes/nodes.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.191 2007/01/05 22:19:55 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/nodes/nodes.h,v 1.192 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -190,7 +190,9 @@ typedef enum NodeTag
 	T_ResultPath,
 	T_MaterialPath,
 	T_UniquePath,
-	T_PathKeyItem,
+	T_EquivalenceClass,
+	T_EquivalenceMember,
+	T_PathKey,
 	T_RestrictInfo,
 	T_InnerIndexscanInfo,
 	T_OuterJoinInfo,
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.132 2007/01/10 18:06:04 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.133 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -69,7 +69,7 @@ typedef struct PlannerInfo
 	 * does not correspond to a base relation, such as a join RTE or an
 	 * unreferenced view RTE; or if the RelOptInfo hasn't been made yet.
 	 */
-	struct RelOptInfo **simple_rel_array;		/* All 1-relation RelOptInfos */
+	struct RelOptInfo **simple_rel_array;		/* All 1-rel RelOptInfos */
 	int			simple_rel_array_size;	/* allocated size of array */
 	/*
@@ -84,18 +84,20 @@ typedef struct PlannerInfo
 	List	   *join_rel_list;	/* list of join-relation RelOptInfos */
 	struct HTAB *join_rel_hash; /* optional hashtable for join relations */
-	List	   *equi_key_list;	/* list of lists of equijoined PathKeyItems */
+	List	   *eq_classes;				/* list of active EquivalenceClasses */
-	List	   *left_join_clauses;		/* list of RestrictInfos for outer
+	List	   *canon_pathkeys;			/* list of "canonical" PathKeys */
-										 * join clauses w/nonnullable var on
-										 * left */
-	List	   *right_join_clauses;		/* list of RestrictInfos for outer
+	List	   *left_join_clauses;		/* list of RestrictInfos for
-										 * join clauses w/nonnullable var on
+										 * mergejoinable outer join clauses
-										 * right */
+										 * w/nonnullable var on left */
-	List	   *full_join_clauses;		/* list of RestrictInfos for full
+	List	   *right_join_clauses;		/* list of RestrictInfos for
-										 * outer join clauses */
+										 * mergejoinable outer join clauses
+										 * w/nonnullable var on right */
+	List	   *full_join_clauses;		/* list of RestrictInfos for
+										 * mergejoinable full join clauses */
 	List	   *oj_info_list;	/* list of OuterJoinInfos */
@@ -109,6 +111,8 @@ typedef struct PlannerInfo
 	List	   *group_pathkeys; /* groupClause pathkeys, if any */
 	List	   *sort_pathkeys;	/* sortClause pathkeys, if any */
+	MemoryContext planner_cxt;	/* context holding PlannerInfo */
 	double		total_table_pages;		/* # of pages in all tables of query */
 	double		tuple_fraction; /* tuple_fraction passed to query_planner */
@@ -209,7 +213,10 @@ typedef struct PlannerInfo
 *		baserestrictcost - Estimated cost of evaluating the baserestrictinfo
 *					clauses at a single tuple (only used for base rels)
 *		joininfo  - List of RestrictInfo nodes, containing info about each
- *					join clause in which this relation participates
+ *					join clause in which this relation participates (but
+ *					note this excludes clauses that might be derivable from
+ *					EquivalenceClasses)
+ *		has_eclass_joins - flag that EquivalenceClass joins are possible
 *		index_outer_relids - only used for base rels; set of outer relids
 *					that participate in indexable joinclauses for this rel
 *		index_inner_paths - only used for base rels; list of InnerIndexscanInfo
@@ -278,6 +285,7 @@ typedef struct RelOptInfo
 	QualCost	baserestrictcost;		/* cost of evaluating the above */
 	List	   *joininfo;		/* RestrictInfo structures for join clauses
 								 * involving this rel */
+	bool		has_eclass_joins;		/* T means joininfo is incomplete */
 	/* cached info about inner indexscan paths for relation: */
 	Relids		index_outer_relids;		/* other relids in indexable join
@@ -349,31 +357,106 @@ typedef struct IndexOptInfo
 /*
- * PathKeys
+ * EquivalenceClasses
 *
- *	The sort ordering of a path is represented by a list of sublists of
+ * Whenever we can determine that a mergejoinable equality clause A = B is
- *	PathKeyItem nodes.	An empty list implies no known ordering.  Otherwise
+ * not delayed by any outer join, we create an EquivalenceClass containing
- *	the first sublist represents the primary sort key, the second the
+ * the expressions A and B to record this knowledge.  If we later find another
- *	first secondary sort key, etc.	Each sublist contains one or more
+ * equivalence B = C, we add C to the existing EquivalenceClass; this may
- *	PathKeyItem nodes, each of which can be taken as the attribute that
+ * require merging two existing EquivalenceClasses.  At the end of the qual
- *	appears at that sort position.	(See optimizer/README for more
+ * distribution process, we have sets of values that are known all transitively
- *	information.)
+ * equal to each other, where "equal" is according to the rules of the btree
+ * operator family(s) shown in ec_opfamilies.  (We restrict an EC to contain
+ * only equalities whose operators belong to the same set of opfamilies.  This
+ * could probably be relaxed, but for now it's not worth the trouble, since
+ * nearly all equality operators belong to only one btree opclass anyway.)
+ *
+ * We also use EquivalenceClasses as the base structure for PathKeys, letting
+ * us represent knowledge about different sort orderings being equivalent.
+ * Since every PathKey must reference an EquivalenceClass, we will end up
+ * with single-member EquivalenceClasses whenever a sort key expression has
+ * not been equivalenced to anything else.  It is also possible that such an
+ * EquivalenceClass will contain a volatile expression ("ORDER BY random()"),
+ * which is a case that can't arise otherwise since clauses containing
+ * volatile functions are never considered mergejoinable.  We mark such
+ * EquivalenceClasses specially to prevent them from being merged with
+ * ordinary EquivalenceClasses.
+ *
+ * We allow equality clauses appearing below the nullable side of an outer join
+ * to form EquivalenceClasses, but these have a slightly different meaning:
+ * the included values might be all NULL rather than all the same non-null
+ * values.  See src/backend/optimizer/README for more on that point.
+ *
+ * NB: if ec_merged isn't NULL, this class has been merged into another, and
+ * should be ignored in favor of using the pointed-to class.
+ */
-
+typedef struct EquivalenceClass
-typedef struct PathKeyItem
 {
 	NodeTag		type;
-	Node	   *key;			/* the item that is ordered */
+	List	   *ec_opfamilies;		/* btree operator family OIDs */
-	Oid			sortop;			/* the ordering operator ('<' op) */
+	List	   *ec_members;			/* list of EquivalenceMembers */
-	bool		nulls_first;	/* do NULLs come before normal values? */
+	List	   *ec_sources;			/* list of generating RestrictInfos */
+	Relids		ec_relids;			/* all relids appearing in ec_members */
+	bool		ec_has_const;		/* any pseudoconstants in ec_members? */
+	bool		ec_has_volatile;	/* the (sole) member is a volatile expr */
+	bool		ec_below_outer_join;	/* equivalence applies below an OJ */
+	bool		ec_broken;			/* failed to generate needed clauses? */
+	struct EquivalenceClass *ec_merged;		/* set if merged into another EC */
+} EquivalenceClass;
-	/*
+/*
-	 * key typically points to a Var node, ie a relation attribute, but it can
+ * EquivalenceMember - one member expression of an EquivalenceClass
-	 * also point to an arbitrary expression representing the value indexed by
+ *
-	 * an index expression.
+ * em_is_child signifies that this element was built by transposing a member
-	 */
+ * for an inheritance parent relation to represent the corresponding expression
-} PathKeyItem;
+ * on an inheritance child.  The element should be ignored for all purposes
+ * except constructing inner-indexscan paths for the child relation.  (Other
+ * types of join are driven from transposed joininfo-list entries.)  Note
+ * that the EC's ec_relids field does NOT include the child relation.
+ *
+ * em_datatype is usually the same as exprType(em_expr), but can be
+ * different when dealing with a binary-compatible opfamily; in particular
+ * anyarray_ops would never work without this.  Use em_datatype when
+ * looking up a specific btree operator to work with this expression.
+ */
+typedef struct EquivalenceMember
+{
+	NodeTag		type;
+	Expr	   *em_expr;		/* the expression represented */
+	Relids		em_relids;		/* all relids appearing in em_expr */
+	bool		em_is_const;	/* expression is pseudoconstant? */
+	bool		em_is_child;	/* derived version for a child relation? */
+	Oid			em_datatype;	/* the "nominal type" used by the opfamily */
+} EquivalenceMember;
+/*
+ * PathKeys
+ *
+ * The sort ordering of a path is represented by a list of PathKey nodes.
+ * An empty list implies no known ordering.  Otherwise the first item
+ * represents the primary sort key, the second the first secondary sort key,
+ * etc.  The value being sorted is represented by linking to an
+ * EquivalenceClass containing that value and including pk_opfamily among its
+ * ec_opfamilies.  This is a convenient method because it makes it trivial
+ * to detect equivalent and closely-related orderings.  (See optimizer/README
+ * for more information.)
+ *
+ * Note: pk_strategy is either BTLessStrategyNumber (for ASC) or
+ * BTGreaterStrategyNumber (for DESC).  We assume that all ordering-capable
+ * index types will use btree-compatible strategy numbers.
+ */
+typedef struct PathKey
+{
+	NodeTag		type;
+	EquivalenceClass *pk_eclass;	/* the value that is ordered */
+	Oid			pk_opfamily;		/* btree opfamily defining the ordering */
+	int			pk_strategy;		/* sort direction (ASC or DESC) */
+	bool		pk_nulls_first;		/* do NULLs come before normal values? */
+} PathKey;
 /*
 * Type "Path" is used as-is for sequential-scan paths.  For other
@@ -398,7 +481,7 @@ typedef struct Path
 	Cost		total_cost;		/* total cost (assuming all tuples fetched) */
 	List	   *pathkeys;		/* sort ordering of path's output */
-	/* pathkeys is a List of Lists of PathKeyItem nodes; see above */
+	/* pathkeys is a List of PathKey nodes; see above */
 } Path;
 /*----------
@@ -618,11 +701,7 @@ typedef JoinPath NestPath;
 * A mergejoin path has these fields.
 *
 * path_mergeclauses lists the clauses (in the form of RestrictInfos)
- * that will be used in the merge.  The parallel arrays path_mergeFamilies,
+ * that will be used in the merge.
- * path_mergeStrategies, and path_mergeNullsFirst specify the merge semantics
- * for each clause (i.e., define the relevant sort ordering for each clause).
- * (XXX is this the most reasonable path-time representation?  It's at least
- * partially redundant with the pathkeys of the input paths.)
 *
 * Note that the mergeclauses are a subset of the parent relation's
 * restriction-clause list.  Any join clauses that are not mergejoinable
@@ -639,10 +718,6 @@ typedef struct MergePath
 {
 	JoinPath	jpath;
 	List	   *path_mergeclauses;		/* join clauses to be used for merge */
-	/* these are arrays, but have the same length as the mergeclauses list: */
-	Oid		   *path_mergeFamilies;		/* per-clause OIDs of opfamilies */
-	int		   *path_mergeStrategies;	/* per-clause ordering (ASC or DESC) */
-	bool	   *path_mergeNullsFirst;	/* per-clause nulls ordering */
 	List	   *outersortkeys;	/* keys for explicit sort, if any */
 	List	   *innersortkeys;	/* keys for explicit sort, if any */
 } MergePath;
@@ -696,6 +771,15 @@ typedef struct HashPath
 * sequence we use.  So, these clauses cannot be associated directly with
 * the join RelOptInfo, but must be kept track of on a per-join-path basis.
 *
+ * RestrictInfos that represent equivalence conditions (i.e., mergejoinable
+ * equalities that are not outerjoin-delayed) are handled a bit differently.
+ * Initially we attach them to the EquivalenceClasses that are derived from
+ * them.  When we construct a scan or join path, we look through all the
+ * EquivalenceClasses and generate derived RestrictInfos representing the
+ * minimal set of conditions that need to be checked for this particular scan
+ * or join to enforce that all members of each EquivalenceClass are in fact
+ * equal in all rows emitted by the scan or join.
+ *
 * When dealing with outer joins we have to be very careful about pushing qual
 * clauses up and down the tree.  An outer join's own JOIN/ON conditions must
 * be evaluated exactly at that join node, and any quals appearing in WHERE or
@@ -728,9 +812,9 @@ typedef struct HashPath
 *
 * In general, the referenced clause might be arbitrarily complex.	The
 * kinds of clauses we can handle as indexscan quals, mergejoin clauses,
- * or hashjoin clauses are fairly limited --- the code for each kind of
+ * or hashjoin clauses are limited (e.g., no volatile functions).  The code
- * path is responsible for identifying the restrict clauses it can use
+ * for each kind of path is responsible for identifying the restrict clauses
- * and ignoring the rest.  Clauses not implemented by an indexscan,
+ * it can use and ignoring the rest.  Clauses not implemented by an indexscan,
 * mergejoin, or hashjoin will be placed in the plan qual or joinqual field
 * of the finished Plan node, where they will be enforced by general-purpose
 * qual-expression-evaluation code.  (But we are still entitled to count
@@ -758,6 +842,12 @@ typedef struct HashPath
 * estimates.  Note that a pseudoconstant clause can never be an indexqual
 * or merge or hash join clause, so it's of no interest to large parts of
 * the planner.
+ *
+ * When join clauses are generated from EquivalenceClasses, there may be
+ * several equally valid ways to enforce join equivalence, of which we need
+ * apply only one.  We mark clauses of this kind by setting parent_ec to
+ * point to the generating EquivalenceClass.  Multiple clauses with the same
+ * parent_ec in the same join are redundant.
 */
 typedef struct RestrictInfo
@@ -787,23 +877,22 @@ typedef struct RestrictInfo
 	/* This field is NULL unless clause is an OR clause: */
 	Expr	   *orclause;		/* modified clause with RestrictInfos */
+	/* This field is NULL unless clause is potentially redundant: */
+	EquivalenceClass *parent_ec;	/* generating EquivalenceClass */
 	/* cache space for cost and selectivity */
 	QualCost	eval_cost;		/* eval cost of clause; -1 if not yet set */
 	Selectivity this_selec;		/* selectivity; -1 if not yet set */
-	/* valid if clause is mergejoinable, else InvalidOid: */
+	/* valid if clause is mergejoinable, else NIL */
-	Oid			mergejoinoperator;		/* copy of clause operator */
+	List	   *mergeopfamilies;	/* opfamilies containing clause operator */
-	Oid			left_sortop;	/* leftside sortop needed for mergejoin */
-	Oid			right_sortop;	/* rightside sortop needed for mergejoin */
-	Oid			mergeopfamily;	/* btree opfamily relating these ops */
-	/* cache space for mergeclause processing; NIL if not yet set */
+	/* cache space for mergeclause processing; NULL if not yet set */
-	List	   *left_pathkey;	/* canonical pathkey for left side */
+	EquivalenceClass *left_ec;	/* EquivalenceClass containing lefthand */
-	List	   *right_pathkey;	/* canonical pathkey for right side */
+	EquivalenceClass *right_ec;	/* EquivalenceClass containing righthand */
-	/* cache space for mergeclause processing; -1 if not yet set */
+	/* transient workspace for use while considering a specific join path */
-	Selectivity left_mergescansel;		/* fraction of left side to scan */
+	bool		outer_is_left;	/* T = outer var on left, F = on right */
-	Selectivity right_mergescansel;		/* fraction of right side to scan */
 	/* valid if clause is hashjoinable, else InvalidOid: */
 	Oid			hashjoinoperator;		/* copy of clause operator */
--- a/src/include/optimizer/joininfo.h
+++ b/src/include/optimizer/joininfo.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/joininfo.h,v 1.33 2007/01/05 22:19:56 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/joininfo.h,v 1.34 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -23,8 +23,5 @@ extern bool have_relevant_joinclause(PlannerInfo *root,
 extern void add_join_clause_to_rels(PlannerInfo *root,
 						RestrictInfo *restrictinfo,
 						Relids join_relids);
-extern void remove_join_clause_from_rels(PlannerInfo *root,
-							 RestrictInfo *restrictinfo,
-							 Relids join_relids);
 #endif   /* JOININFO_H */
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/pathnode.h,v 1.75 2007/01/10 18:06:04 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/pathnode.h,v 1.76 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -71,9 +71,6 @@ extern MergePath *create_mergejoin_path(PlannerInfo *root,
 					  List *restrict_clauses,
 					  List *pathkeys,
 					  List *mergeclauses,
-					  Oid *mergefamilies,
-					  int *mergestrategies,
-					  bool *mergenullsfirst,
 					  List *outersortkeys,
 					  List *innersortkeys);
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.94 2007/01/05 22:19:56 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/paths.h,v 1.95 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -52,6 +52,9 @@ extern List *group_clauses_by_indexkey(IndexOptInfo *index,
 						  Relids outer_relids,
 						  SaOpControl saop_control,
 						  bool *found_clause);
+extern bool eclass_matches_any_index(EquivalenceClass *ec,
+									 EquivalenceMember *em,
+									 RelOptInfo *rel);
 extern bool match_index_to_operand(Node *operand, int indexcol,
 					   IndexOptInfo *index);
 extern List *expand_indexqual_conditions(IndexOptInfo *index,
@@ -89,6 +92,37 @@ extern List *make_rels_by_joins(PlannerInfo *root, int level, List **joinrels);
 extern RelOptInfo *make_join_rel(PlannerInfo *root,
 			  RelOptInfo *rel1, RelOptInfo *rel2);
+/*
+ * equivclass.c
+ *	  routines for managing EquivalenceClasses
+ */
+extern bool process_equivalence(PlannerInfo *root, RestrictInfo *restrictinfo,
+								bool below_outer_join);
+extern void reconsider_outer_join_clauses(PlannerInfo *root);
+extern EquivalenceClass *get_eclass_for_sort_expr(PlannerInfo *root,
+						 Expr *expr,
+						 Oid expr_datatype,
+						 List *opfamilies);
+extern void generate_base_implied_equalities(PlannerInfo *root);
+extern List *generate_join_implied_equalities(PlannerInfo *root,
+											  RelOptInfo *joinrel,
+											  RelOptInfo *outer_rel,
+											  RelOptInfo *inner_rel);
+extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
+extern void add_child_rel_equivalences(PlannerInfo *root,
+									   AppendRelInfo *appinfo,
+									   RelOptInfo *parent_rel,
+									   RelOptInfo *child_rel);
+extern List *find_eclass_clauses_for_index_join(PlannerInfo *root,
+												RelOptInfo *rel,
+												Relids outer_relids);
+extern bool have_relevant_eclass_joinclause(PlannerInfo *root,
+								RelOptInfo *rel1, RelOptInfo *rel2);
+extern bool has_relevant_eclass_joinclause(PlannerInfo *root,
+										   RelOptInfo *rel1);
+extern bool eclass_useful_for_merging(EquivalenceClass *eclass,
+									  RelOptInfo *rel);
 /*
 * pathkeys.c
 *	  utilities for matching and building path keys
@@ -101,9 +135,6 @@ typedef enum
 	PATHKEYS_DIFFERENT			/* neither pathkey includes the other */
 } PathKeysComparison;
-extern void add_equijoined_keys(PlannerInfo *root, RestrictInfo *restrictinfo);
-extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
-extern void generate_implied_equalities(PlannerInfo *root);
 extern List *canonicalize_pathkeys(PlannerInfo *root, List *pathkeys);
 extern PathKeysComparison compare_pathkeys(List *keys1, List *keys2);
 extern bool pathkeys_contained_in(List *keys1, List *keys2);
@@ -113,23 +144,29 @@ extern Path *get_cheapest_fractional_path_for_pathkeys(List *paths,
 										  List *pathkeys,
 										  double fraction);
 extern List *build_index_pathkeys(PlannerInfo *root, IndexOptInfo *index,
-					 ScanDirection scandir, bool canonical);
+					 ScanDirection scandir);
 extern List *convert_subquery_pathkeys(PlannerInfo *root, RelOptInfo *rel,
 						  List *subquery_pathkeys);
 extern List *build_join_pathkeys(PlannerInfo *root,
 					RelOptInfo *joinrel,
 					JoinType jointype,
 					List *outer_pathkeys);
-extern List *make_pathkeys_for_sortclauses(List *sortclauses,
-							  List *tlist);
-extern void cache_mergeclause_pathkeys(PlannerInfo *root,
+extern List *make_pathkeys_for_sortclauses(PlannerInfo *root,
+							  List *sortclauses,
+							  List *tlist,
+							  bool canonicalize);
+extern void cache_mergeclause_eclasses(PlannerInfo *root,
 						   RestrictInfo *restrictinfo);
 extern List *find_mergeclauses_for_pathkeys(PlannerInfo *root,
 							   List *pathkeys,
+							   bool outer_keys,
 							   List *restrictinfos);
-extern List *make_pathkeys_for_mergeclauses(PlannerInfo *root,
-							   List *mergeclauses,
-							   RelOptInfo *rel);
+extern List *select_outer_pathkeys_for_merge(PlannerInfo *root,
+											 List *mergeclauses,
+											 RelOptInfo *joinrel);
+extern List *make_inner_pathkeys_for_merge(PlannerInfo *root,
+										   List *mergeclauses,
+										   List *outer_pathkeys);
 extern int pathkeys_useful_for_merging(PlannerInfo *root,
 							RelOptInfo *rel,
 							List *pathkeys);
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.97 2007/01/10 18:06:04 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/planmain.h,v 1.98 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -38,6 +38,8 @@ extern Plan *create_plan(PlannerInfo *root, Path *best_path);
 extern SubqueryScan *make_subqueryscan(List *qptlist, List *qpqual,
 				  Index scanrelid, Plan *subplan);
 extern Append *make_append(List *appendplans, bool isTarget, List *tlist);
+extern Sort *make_sort_from_pathkeys(PlannerInfo *root, Plan *lefttree,
+						List *pathkeys);
 extern Sort *make_sort_from_sortclauses(PlannerInfo *root, List *sortcls,
 						   Plan *lefttree);
 extern Sort *make_sort_from_groupcols(PlannerInfo *root, List *groupcls,
@@ -69,12 +71,22 @@ extern int	join_collapse_limit;
 extern void add_base_rels_to_query(PlannerInfo *root, Node *jtnode);
 extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
+extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
+								   Relids where_needed);
 extern List *deconstruct_jointree(PlannerInfo *root);
+extern void distribute_restrictinfo_to_rels(PlannerInfo *root,
+											RestrictInfo *restrictinfo);
 extern void process_implied_equality(PlannerInfo *root,
-						 Node *item1, Node *item2,
-						 Oid sortop1, Oid sortop2,
-						 Relids item1_relids, Relids item2_relids,
-						 bool delete_it);
+									 Oid opno,
+									 Expr *item1,
+									 Expr *item2,
+									 Relids qualscope,
+									 bool below_outer_join,
+									 bool both_const);
+extern RestrictInfo *build_implied_join_equality(Oid opno,
+							Expr *item1,
+							Expr *item2,
+							Relids qualscope);
 /*
 * prototypes for plan/setrefs.c
--- a/src/include/optimizer/restrictinfo.h
+++ b/src/include/optimizer/restrictinfo.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/optimizer/restrictinfo.h,v 1.39 2007/01/05 22:19:56 momjian Exp $
+ * $PostgreSQL: pgsql/src/include/optimizer/restrictinfo.h,v 1.40 2007/01/20 20:45:40 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -32,12 +32,8 @@ extern List *extract_actual_clauses(List *restrictinfo_list,
 extern void extract_actual_join_clauses(List *restrictinfo_list,
 							List **joinquals,
 							List **otherquals);
-extern List *remove_redundant_join_clauses(PlannerInfo *root,
-							  List *restrictinfo_list,
-							  bool isouterjoin);
 extern List *select_nonredundant_join_clauses(PlannerInfo *root,
 								 List *restrictinfo_list,
-								 List *reference_list,
-								 bool isouterjoin);
+								 List *reference_list);
 #endif   /* RESTRICTINFO_H */
--- a/src/include/utils/lsyscache.h
+++ b/src/include/utils/lsyscache.h
@@ -6,7 +6,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/utils/lsyscache.h,v 1.112 2007/01/10 18:06:05 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/utils/lsyscache.h,v 1.113 2007/01/20 20:45:41 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -35,12 +35,11 @@ extern void get_op_opfamily_properties(Oid opno, Oid opfamily,
 						  bool *recheck);
 extern Oid	get_opfamily_member(Oid opfamily, Oid lefttype, Oid righttype,
 								int16 strategy);
-extern bool get_op_mergejoin_info(Oid eq_op, Oid *left_sortop,
-					  Oid *right_sortop, Oid *opfamily);
 extern bool get_compare_function_for_ordering_op(Oid opno,
 												 Oid *cmpfunc, bool *reverse);
 extern Oid	get_equality_op_for_ordering_op(Oid opno);
 extern Oid	get_ordering_op_for_equality_op(Oid opno, bool use_lhs_type);
+extern List *get_mergejoin_opfamilies(Oid opno);
 extern Oid	get_compatible_hash_operator(Oid opno, bool use_lhs_type);
 extern Oid	get_op_hash_function(Oid opno);
 extern void get_op_btree_interpretation(Oid opno,