Fix best_inner_indexscan to return both the cheapest-total-cost and

cheapest-startup-cost innerjoin indexscans, and make joinpath.c consider both of these (when different) as the inside of a nestloop join. The original design was based on the assumption that indexscan paths always have negligible startup cost, and so total cost is the only important figure of merit; an assumption that's obviously broken by bitmap indexscans. This oversight could lead to choosing poor plans in cases where fast-start behavior is more important than total cost, such as LIMIT and IN queries. 8.1-vintage brain fade exposed by an example from Chuck D.
2026-01-05 23:38:41 +03:00 · 2007-05-22 01:40:33 +00:00
parent 2415ad9831
commit d7153c5fad
5 changed files with 94 additions and 63 deletions
--- a/src/include/nodes/relation.h
+++ b/src/include/nodes/relation.h
@@ -7,7 +7,7 @@
 * Portions Copyright (c) 1996-2007, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
- * $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.141 2007/04/21 21:01:45 tgl Exp $
+ * $PostgreSQL: pgsql/src/include/nodes/relation.h,v 1.142 2007/05/22 01:40:33 tgl Exp $
 *
 *-------------------------------------------------------------------------
 */
@@ -996,20 +996,20 @@ typedef struct MergeScanSelCache
 * relation includes all other relids appearing in those joinclauses.
 * The set of usable joinclauses, and thus the best inner indexscan,
 * thus varies depending on which outer relation we consider; so we have
- * to recompute the best such path for every join.	To avoid lots of
+ * to recompute the best such paths for every join.  To avoid lots of
 * redundant computation, we cache the results of such searches.  For
 * each relation we compute the set of possible otherrelids (all relids
 * appearing in joinquals that could become indexquals for this table).
 * Two outer relations whose relids have the same intersection with this
 * set will have the same set of available joinclauses and thus the same
- * best inner indexscan for the inner relation.  By taking the intersection
+ * best inner indexscans for the inner relation.  By taking the intersection
 * before scanning the cache, we avoid recomputing when considering
 * join rels that differ only by the inclusion of irrelevant other rels.
 *
 * The search key also includes a bool showing whether the join being
 * considered is an outer join.  Since we constrain the join order for
 * outer joins, I believe that this bool can only have one possible value
- * for any particular base relation; but store it anyway to avoid confusion.
+ * for any particular lookup key; but store it anyway to avoid confusion.
 */

 typedef struct InnerIndexscanInfo
@@ -1018,8 +1018,9 @@ typedef struct InnerIndexscanInfo
 	/* The lookup key: */
 	Relids		other_relids;	/* a set of relevant other relids */
 	bool		isouterjoin;	/* true if join is outer */
-	/* Best path for this lookup key: */
-	Path	   *best_innerpath; /* best inner indexscan, or NULL if none */
+	/* Best paths for this lookup key (NULL if no available indexscans): */
+	Path	   *cheapest_startup_innerpath;	/* cheapest startup cost */
+	Path	   *cheapest_total_innerpath;	/* cheapest total cost */
 } InnerIndexscanInfo;

 /*