mirror of https://github.com/postgres/postgres.git synced 2025-11-07 19:06:32 +03:00

886 Commits

Author SHA1 Message Date
Tom Lane
39cee73889 Revise searching of subplan target lists to use something more efficient
than tlist_member calls.  Building a large join tlist is still O(N^2),
but with a much smaller constant factor than before.
2005-06-10 00:28:54 +00:00
Tom Lane
a31ad27fc5 Simplify the planner's join clause management by storing join clauses
of a relation in a flat 'joininfo' list.  The former arrangement grouped
the join clauses according to the set of unjoined relids used in each;
however, profiling on test cases involving lots of joins proves that
that data structure is a net loss.  It takes more time to group the
join clauses together than is saved by avoiding duplicate tests later.
It doesn't help any that there are usually not more than one or two
clauses per group ...
2005-06-09 04:19:00 +00:00
Tom Lane
e3a33a9a9f Marginal hack to avoid spending a lot of time in find_join_rel during
large planning problems: when the list of join rels gets too long, make
an auxiliary hash table that hashes on the identifying Bitmapset.
2005-06-08 23:02:05 +00:00
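
The idea in this commit (keep the plain list, and add a hash lookup only once the list grows long) can be illustrated with a minimal Python sketch. This is not the PostgreSQL code: the class name, the threshold of 32, and the use of a frozenset in place of a Bitmapset are all assumptions made for illustration.

```python
# Hypothetical sketch; names and the threshold are illustrative assumptions.
HASH_THRESHOLD = 32  # assumed cutoff at which the auxiliary table is built


class JoinRelIndex:
    def __init__(self):
        self.join_rels = []    # list of (relids, rel) pairs, always maintained
        self.by_relids = None  # auxiliary hash table, built lazily

    def add(self, relids, rel):
        key = frozenset(relids)  # stands in for the identifying Bitmapset
        self.join_rels.append((key, rel))
        if self.by_relids is not None:
            self.by_relids[key] = rel  # keep the hash table in sync

    def find(self, relids):
        key = frozenset(relids)
        # Build the hash table the first time a lookup sees a long list.
        if self.by_relids is None and len(self.join_rels) > HASH_THRESHOLD:
            self.by_relids = dict(self.join_rels)
        if self.by_relids is not None:
            return self.by_relids.get(key)
        # Short list: a linear scan is cheap enough.
        for existing_key, rel in self.join_rels:
            if existing_key == key:
                return rel
        return None
```

Small planning problems never pay for building the table, while large ones stop scanning the whole list on every lookup.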
Tom Lane
9a586fe0c5 Nab some low-hanging fruit: replace the planner's base_rel_list and
other_rel_list with a single array indexed by rangetable index.
This reduces find_base_rel from O(N) to O(1) without any real penalty.
While find_base_rel isn't one of the major bottlenecks in any profile
I've seen so far, it was starting to creep up on the radar screen
for complex queries --- so might as well fix it.
2005-06-06 04:13:36 +00:00
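
As a rough illustration of the O(N) to O(1) change described here, the following Python sketch replaces a list search with a direct array lookup by rangetable index. The class and method names are hypothetical, and leaving slot 0 unused to match 1-based rangetable indexes is an assumption of the sketch.

```python
# Hypothetical sketch; not the PostgreSQL data structures.
class PlannerRels:
    def __init__(self, rangetable_size):
        # Slot 0 is left unused so a 1-based rangetable index maps
        # directly to an array slot (an assumption of this sketch).
        self.rel_array = [None] * (rangetable_size + 1)

    def build_base_rel(self, rti, rel):
        self.rel_array[rti] = rel

    def find_base_rel(self, rti):
        # Direct indexing: no scan over a list of relations.
        rel = self.rel_array[rti] if 0 < rti < len(self.rel_array) else None
        if rel is None:
            raise LookupError("no relation entry for relid %d" % rti)
        return rel
```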
Tom Lane
9ab4d98168 Remove planner's private fields from Query struct, and put them into
a new PlannerInfo struct, which is passed around instead of the bare
Query in all the planning code.  This commit is essentially just a
code-beautification exercise, but it does open the door to making
larger changes to the planner data structures without having to muck
with the widely-known Query struct.
2005-06-05 22:32:58 +00:00
Tom Lane
e18e8f8735 Change expandRTE() and ResolveNew() back to taking just the single
RTE of interest, rather than the whole rangetable list.  This makes
the API more understandable and avoids duplicate RTE lookups.  This
patch reverts no-longer-needed portions of my patch of 2004-08-19.
2005-06-04 19:19:42 +00:00
Tom Lane
ba42002461 Revise handling of dropped columns in JOIN alias lists to avoid a
performance problem pointed out by phil@vodafone: to wit, we were
spending O(N^2) time to check dropped-ness in an N-deep join tree,
even in the case where the tree was freshly constructed and couldn't
possibly mention any dropped columns.  Instead of recursing in
get_rte_attribute_is_dropped(), change the data structure definition:
the joinaliasvars list of a JOIN RTE must have a NULL Const instead
of a Var at any position that references a now-dropped column.  This
costs nothing during normal parse-rewrite-plan path, and instead we
have a linear-time update to make when loading a stored rule that
might contain now-dropped columns.  While at it, move the responsibility
for acquiring locks on relations referenced by rules into this separate
function (which I therefore chose to call AcquireRewriteLocks).
This saves effort --- namely, duplicated lock grabs in parser and rewriter
--- in the normal path at a cost of one extra non-locked heap_open()
in the stored-rule path; seems a good tradeoff.  A fringe benefit is
that it is now *much* clearer that we acquire lock on relations referenced
in rules before we make any rewriter decisions based on their properties.
(I don't know of any bug of that ilk, but it wasn't exactly clear before.)
2005-06-03 23:05:30 +00:00
Tom Lane
3531383224 Just noticed that you can't Query-Cancel a long planner run, because
no part of the planner did CHECK_FOR_INTERRUPTS().  Add one in a
suitably strategic spot.
2005-06-03 19:00:12 +00:00
Tom Lane
ac25dbd84b Add support for FUNCTION RTEs to build_physical_tlist(), so that the
physical-tlist optimization can be applied to FunctionScan nodes as well
as regular tables and SubqueryScans.
2005-05-30 18:55:49 +00:00
Tom Lane
c8f81df41b Skip eval_const_expressions when the query is such that the expression
would be evaluated only once anyway (ie, it's just a SELECT with no
FROM or an INSERT ... VALUES).  The planner can't do it any faster than
the executor, so no point in an extra copying of the expression tree.
2005-05-30 01:04:44 +00:00
Tom Lane
872c1497fc Previous fix for "x FULL JOIN y ON true" failed to handle the case
where there was also a WHERE-clause restriction that applied to the
join.  The check on restrictlist == NIL is really unnecessary anyway,
because select_mergejoin_clauses already checked for and complained
about any unmergejoinable join clauses.  So just take it out.
2005-05-24 18:02:31 +00:00
Tom Lane
c1393173aa Avoid redundant relation lock grabs during planning, and make sure
that we acquire a lock on relations added to the query due to inheritance.
Formerly, no such lock was held throughout planning, which meant that
a schema change could occur to invalidate the plan before it's even
been completed.
2005-05-23 03:01:14 +00:00
Tom Lane
e2159f3842 Teach the planner to remove SubqueryScan nodes from the plan if they
aren't doing anything useful (ie, neither selection nor projection).
Also, extend to SubqueryScan the hacks already in place to avoid
unnecessary ExecProject calls when the result would just be the same
tuple the subquery already delivered.  This saves some overhead in
UNION and other set operations, as well as avoiding overhead for
unflatten-able subqueries.  Per example from Sokolov Yura.
2005-05-22 22:30:20 +00:00
Tom Lane
278bd0cc22 For some reason access/tupmacs.h has been #including utils/memutils.h,
which is neither needed by nor related to that header.  Remove the bogus
inclusion and instead include the header in those C files that actually
need it.  Also fix unnecessary inclusions and bad inclusion order in
tsearch2 files.
2005-05-06 17:24:55 +00:00
Tom Lane
bedb78d386 Implement sharable row-level locks, and use them for foreign key references
to eliminate unnecessary deadlocks.  This commit adds SELECT ... FOR SHARE
paralleling SELECT ... FOR UPDATE.  The implementation uses a new SLRU
data structure (managed much like pg_subtrans) to represent multiple-
transaction-ID sets.  When more than one transaction is holding a shared
lock on a particular row, we create a MultiXactId representing that set
of transactions and store its ID in the row's XMAX.  This scheme allows
an effectively unlimited number of row locks, just as we did before,
while not costing any extra overhead except when a shared lock actually
has to be shared.   Still TODO: use the regular lock manager to control
the grant order when multiple backends are waiting for a row lock.

Alvaro Herrera and Tom Lane.
2005-04-28 21:47:18 +00:00
Tom Lane
a0ea71333a Avoid rechecking lossy operators twice in a bitmap scan plan.
2005-04-25 04:27:12 +00:00
Tom Lane
1fcd4b7a07 While determining the filter clauses for an index scan (either plain
or bitmap), use pred_test to be a little smarter about cases where a
filter clause is logically unnecessary.  This may be overkill for the
plain indexscan case, but it's definitely useful for OR'd bitmap scans.
2005-04-25 03:58:30 +00:00
Tom Lane
79a1b00226 Replace slightly klugy create_bitmap_restriction() function with a
more efficient routine in restrictinfo.c (which can make use of
make_restrictinfo_internal).
2005-04-25 02:14:48 +00:00
Tom Lane
5b05185262 Remove support for OR'd indexscans internal to a single IndexScan plan
node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
2005-04-25 01:30:14 +00:00
Tom Lane
56c8877291 Turns out that my recent elimination of the 'redundant' flatten_andors()
code in prepqual.c had a small drawback: the flatten_andors code was
able to cope with deeply nested AND/OR structures (like 10000 ORs in
a row), whereas eval_const_expressions tends to recurse until it
overruns the stack.  Revise eval_const_expressions so that it doesn't
choke on deeply nested ANDs or ORs.
2005-04-23 04:42:53 +00:00
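
The commit message does not say how eval_const_expressions was revised. One generic way to avoid stack overruns on deeply nested OR chains is to walk the tree with an explicit stack instead of recursion; the Python sketch below shows that technique only. The tuple representation `("OR", arg, ...)` is an assumption of the sketch.

```python
# Hypothetical sketch of non-recursive flattening; not the PostgreSQL code.
def flatten_or(expr):
    """Flatten nested ("OR", ...) tuples into one N-argument OR."""
    if not (isinstance(expr, tuple) and expr and expr[0] == "OR"):
        return expr  # not an OR node: nothing to flatten
    args = []
    stack = [expr]  # explicit stack replaces the call stack
    while stack:
        node = stack.pop()
        if isinstance(node, tuple) and node and node[0] == "OR":
            # Push children reversed so they pop in left-to-right order.
            stack.extend(reversed(node[1:]))
        else:
            args.append(node)
    return ("OR", *args)
```

Because the depth of nesting only affects the length of the explicit stack, a chain of 10000 ORs is handled the same way as a chain of three.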
Tom Lane
e092828241 Teach choose_bitmap_and() to actually be choosy --- that is, try to
make some estimate of which available indexes to AND together, rather
than blindly taking 'em all.  This could probably stand further
improvement, but it seems to do OK in simple tests.
2005-04-23 01:57:34 +00:00
Tom Lane
4b89126ccc Fix bogus EXPLAIN display of rowcount estimates for BitmapAnd and
BitmapOr nodes.
2005-04-23 01:29:15 +00:00
Tom Lane
bc843d3960 First cut at planner support for bitmap index scans. Lots to do yet,
but the code is basically working.  Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever.  orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
2005-04-22 21:58:32 +00:00
Tom Lane
14c7fba3f7 Rethink original decision to use AND/OR Expr nodes to represent bitmap
logic operations during planning.  Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
2005-04-21 19:18:13 +00:00
Tom Lane
e6f7edb9d5 Install some slightly realistic cost estimation for bitmap index scans.
2005-04-21 02:28:02 +00:00
Tom Lane
eb4f58ad40 Don't try to run clauseless index scans on index types that don't support
it.  Per report from Marinos Yannikos.
2005-04-20 21:48:04 +00:00
Tom Lane
4a8c5d0375 Create executor and planner-backend support for decoupled heap and index
scans, using in-memory tuple ID bitmaps as the intermediary.  The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed.  I have tested it using some hacked planner
code that is far too ugly to see the light of day, however.  Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
2005-04-19 22:35:18 +00:00
Tom Lane
939712ee73 Don't try to constant-fold functions returning RECORD, since the optimizer
isn't presently set up to pass them an expected tuple descriptor.  Bug has
been there since 7.3 but was just recently reported by Thomas Hallgren.
2005-04-14 21:44:09 +00:00
Tom Lane
162bd08b3f Completion of project to use fixed OIDs for all system catalogs and
indexes.  Replace all heap_openr and index_openr calls by heap_open
and index_open.  Remove runtime lookups of catalog OID numbers in
various places.  Remove relcache's support for looking up system
catalogs by name.  Bulky but mostly very boring patch ...
2005-04-14 20:03:27 +00:00
Tom Lane
7ace43e0c2 Fix oversight in MIN/MAX optimization: must not return NULL entries
from index, since the aggregates ignore NULLs.
2005-04-12 05:11:28 +00:00
Tom Lane
2e7a68896b Add aggsortop column to pg_aggregate, so that MIN/MAX optimization can
be supported for all datatypes.  Add CREATE AGGREGATE and pg_dump support
too.  Add specialized min/max aggregates for bpchar, instead of depending
on text's min/max, because otherwise the possible use of bpchar indexes
cannot be recognized.
initdb forced because of catalog changes.
2005-04-12 04:26:34 +00:00
Tom Lane
addc42c339 Create the planner mechanism for optimizing simple MIN and MAX queries
into indexscans on matching indexes.  For the moment, it only handles
int4 and text datatypes; next step is to add a column to pg_aggregate
so that all MIN/MAX aggregates can be handled.  Per my recent proposal.
2005-04-11 23:06:57 +00:00
Tom Lane
acde8b3cab Make constant-folding produce sane output for COALESCE(NULL,NULL),
that is a plain NULL and not a COALESCE with no inputs.  Fixes crash
reported by Michael Williamson.
2005-04-10 20:57:32 +00:00
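
A minimal Python sketch of the intended folding result, assuming that NULL constants are dropped from the argument list and that `None` stands in for a NULL constant. The function name and the tuple form `("COALESCE", ...)` are hypothetical.

```python
# Hypothetical sketch; None represents a NULL constant.
NULL = None


def fold_coalesce(args):
    """Drop NULL-constant arguments; an empty result folds to plain NULL."""
    remaining = [a for a in args if a is not NULL]
    if not remaining:
        # Never emit a COALESCE node with no inputs.
        return NULL
    return ("COALESCE", *remaining)
```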
Tom Lane
6985592967 Split out into a separate function the code in grouping_planner() that
decides whether to use hashed grouping instead of sort-plus-uniq
grouping. The function needs an annoyingly large number of parameters,
but this still seems like a win for legibility, since it removes over
a hundred lines from grouping_planner (which is still too big :-().
2005-04-10 19:50:08 +00:00
Tom Lane
ad161bcc8a Merge Resdom nodes into TargetEntry nodes to simplify code and save a
few palloc's.  I also chose to eliminate the restype and restypmod fields
entirely, since they are redundant with information stored in the node's
contained expression; re-examining the expression at need seems simpler
and more reliable than trying to keep restype/restypmod up to date.

initdb forced due to change in contents of stored rules.
2005-04-06 16:34:07 +00:00
Tom Lane
280de290d7 In cost_mergejoin, the early-exit effect should not apply to the
outer side of an outer join.  Per andrew@supernews.
2005-04-04 01:43:12 +00:00
Tom Lane
47888fe842 First phase of OUT-parameters project. We can now define and use SQL
functions with OUT parameters.  The various PLs still need work, as does
pg_dump.  Rudimentary docs and regression tests included.
2005-03-31 22:46:33 +00:00
Tom Lane
70c9763d48 Convert oidvector and int2vector into variable-length arrays. This
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value.  INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not.  I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass.  However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.
2005-03-29 00:17:27 +00:00
Tom Lane
5db2e83852 Rethink the order of expression preprocessing: eval_const_expressions
really ought to run before canonicalize_qual, because it can now produce
forms that canonicalize_qual knows how to improve (eg, NOT clauses).
Also, because eval_const_expressions already knows about flattening
nested ANDs and ORs into N-argument form, the initial flatten_andors
pass in canonicalize_qual is now completely redundant and can be
removed.  This doesn't save a whole lot of code, but the time and
palloc traffic eliminated is a useful gain on large expression trees.
2005-03-28 00:58:26 +00:00
Tom Lane
bf3dbb5881 First steps towards index scans with heap access decoupled from index
access: define new index access method functions 'amgetmulti' that can
fetch multiple TIDs per call.  (The functions exist but are totally
untested as yet.)  Since I was modifying pg_am anyway, remove the
no-longer-needed 'rel' parameter from amcostestimate functions, and
also remove the vestigial amowner column that was creating useless
work for Alvaro's shared-object-dependencies project.
Initdb forced due to changes in pg_am.
2005-03-27 23:53:05 +00:00
Tom Lane
351519affc Teach const-expression simplification to simplify boolean equality cases,
that is 'x = true' becomes 'x' and 'x = false' becomes 'NOT x'.  This isn't
all that amazingly useful in itself, but it ensures that we will recognize
the different forms as being logically equivalent when checking partial
index predicates.  Per example from Patrick Clery.
2005-03-27 19:18:02 +00:00
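
The two rewrites named in this message can be sketched in Python as follows. The tuple representation `("=", left, right)` and the function name are assumptions for illustration, and only the two forms the message mentions are handled.

```python
# Hypothetical sketch; expressions are modeled as nested tuples.
def simplify_bool_equality(expr):
    """Rewrite 'x = true' to 'x' and 'x = false' to 'NOT x'."""
    if isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "=":
        _, left, right = expr
        if right is True:
            return left
        if right is False:
            return ("NOT", left)
    return expr  # anything else is left untouched
```

Reducing both spellings to one form is what lets a later comparison against a partial index predicate treat them as the same condition.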
Tom Lane
926e8a00d3 Add a back-link from IndexOptInfo structs to their parent RelOptInfo
structs.  There are many places in the planner where we were passing
both a rel and an index to subroutines, and now need only pass the
index struct.  Notationally simpler, and perhaps a tad faster.
2005-03-27 06:29:49 +00:00
Tom Lane
febc9a613c Expand the 'special index operator' machinery to handle special cases
for boolean indexes.  Previously we would only use such an index with
WHERE clauses like 'indexkey = true' or 'indexkey = false'.  The new
code transforms the cases 'indexkey', 'NOT indexkey', 'indexkey IS TRUE',
and 'indexkey IS FALSE' into one of these.  While this is only marginally
useful in itself, I intend soon to change constant-expression simplification
so that 'foo = true' and 'foo = false' are reduced to just 'foo' and
'NOT foo' ... which would lose the ability to use boolean indexes for
such queries at all, if the indexscan machinery couldn't make the
reverse transformation.
2005-03-26 23:29:20 +00:00
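
The four transformations listed in this message amount to a small mapping back onto the 'indexkey = true' and 'indexkey = false' forms. A Python sketch under assumed names and an assumed tuple representation:

```python
# Hypothetical sketch; clause shapes and names are illustrative only.
def to_boolean_index_qual(clause, indexkey):
    """Map a boolean clause onto 'indexkey = true/false', or return None."""
    if clause == indexkey:
        return ("=", indexkey, True)    # indexkey
    if clause == ("NOT", indexkey):
        return ("=", indexkey, False)   # NOT indexkey
    if clause == ("IS TRUE", indexkey):
        return ("=", indexkey, True)    # indexkey IS TRUE
    if clause == ("IS FALSE", indexkey):
        return ("=", indexkey, False)   # indexkey IS FALSE
    return None  # not usable with a boolean index in this sketch
```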
Tom Lane
208ec47ba3 Tweak planner to use a minimum size estimate of 10 pages for a
never-yet-vacuumed relation.  This restores the pre-8.0 behavior of
avoiding seqscans during initial data loading, while still allowing
reasonable optimization after a table has been vacuumed.  Several
regression test cases revert to 7.4-like behavior, which is probably
a good sign.  Per gripes from Keith Browne and others.
2005-03-24 19:14:49 +00:00
Neil Conway
d344505d1b This patch moves some code for preprocessing FOR UPDATE from
grouping_planner() to preprocess_targetlist(), according to a comment
in grouping_planner(). I think the refactoring makes sense, and moves
some extraneous details out of grouping_planner().
2005-03-17 23:45:09 +00:00
Tom Lane
595ed2a855 Make the behavior of HAVING without GROUP BY conform to the SQL spec.
Formerly, if such a clause contained no aggregate functions we mistakenly
treated it as equivalent to WHERE.  Per spec it must cause the query to
be treated as a grouped query of a single group, the same as appearance
of aggregate functions would do.  Also, the HAVING filter must execute
after aggregate function computation even if it itself contains no
aggregate functions.
2005-03-10 23:21:26 +00:00
Tom Lane
849074f9ae Revise hash join code so that we can increase the number of batches
on-the-fly, and thereby avoid blowing out memory when the planner has
underestimated the hash table size.  Hash join will now obey the
work_mem limit with some faithfulness.  Per my recent proposal
(hash aggregate part isn't done yet though).
2005-03-06 22:15:05 +00:00
Tom Lane
3104a92866 Another go at making pred_test() handle all reasonable combinations
of AND and OR clauses.  The key point here is that an OR on the
predicate side has to be treated gingerly: we may be able to prove
that the OR is implied even when no one of its components is implied.
For example (x OR y) implies (x OR y OR z) even though no one of x,
y, or z can be individually proven.  This code handles both the
example shown recently by Sergey Koshcheyev and the one shown last
October by Dawid Kuroczko.
2005-03-02 04:10:53 +00:00
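
The point about ORs can be illustrated with a small Python sketch of an implication test. This is a simplified, conservative stand-in for pred_test(), not its actual algorithm: atoms are plain strings compared for equality, whereas the real code reasons about operators, and the function name and tuple representation are assumptions.

```python
# Hypothetical sketch; sound but incomplete, with atoms compared by equality.
def _is(kind, node):
    return isinstance(node, tuple) and bool(node) and node[0] == kind


def implies(clause, pred):
    """Return True if 'clause' can be shown to imply 'pred'."""
    if _is("AND", pred):
        # Every arm of an AND predicate must be proven.
        return all(implies(clause, p) for p in pred[1:])
    if _is("OR", clause):
        # Split the clause's OR first: each arm must imply the whole predicate.
        return all(implies(c, pred) for c in clause[1:])
    if _is("AND", clause) and any(implies(c, pred) for c in clause[1:]):
        # Any one arm of an AND clause proving the predicate is enough.
        return True
    if _is("OR", pred):
        # Only now try to prove a single arm of the predicate's OR.
        return any(implies(clause, p) for p in pred[1:])
    return clause == pred
```

With this ordering, `("OR", "x", "y")` is shown to imply `("OR", "x", "y", "z")` even though it does not imply `"x"`, `"y"`, or `"z"` on their own.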
Tom Lane
95871703e3 Adjust OR indexscan logic to not generate redundant condition-free OR
indexscans involving partial indexes.  These would always be dominated
by a simple indexscan on such an index, so there's no point in considering
them.  Fixes overoptimism in a patch I applied last October.
2005-03-01 01:40:05 +00:00
Tom Lane
4e89bae704 Revert the logic for expanding AND/OR conditions in pred_test() to what
it was in 7.4, and add some comments explaining why it has to be this way.
I broke it for OR'd index predicates in a fit of code cleanup last summer.
Per example from Sergey Koshcheyev.
2005-03-01 00:24:52 +00:00