node, as this behavior is now better done as a bitmap OR indexscan.
This allows considerable simplification in nodeIndexscan.c itself as
well as several planner modules concerned with indexscan plan generation.
Also we can improve the sharing of code between regular and bitmap
indexscans, since they are now working with nigh-identical Plan nodes.
but the code is basically working. Along the way, rewrite the entire
approach to processing OR index conditions, and make it work in join
cases for the first time ever. orindxpath.c is now basically obsolete,
but I left it in for the time being to allow easy comparison testing
against the old implementation.
logic operations during planning. Seems cleaner to create two new Path
node types, instead --- this avoids duplication of cost-estimation code.
Also, create an enable_bitmapscan GUC parameter to control use of bitmap
plans.
scans, using in-memory tuple ID bitmaps as the intermediary. The planner
frontend (path creation and cost estimation) is not there yet, so none
of this code can be executed. I have tested it using some hacked planner
code that is far too ugly to see the light of day, however. Committing
now so that the bulk of the infrastructure changes go in before the tree
drifts under me.
few palloc's. I also chose to eliminate the restype and restypmod fields
entirely, since they are redundant with information stored in the node's
contained expression; re-examining the expression at need seems simpler
and more reliable than trying to keep restype/restypmod up to date.
initdb forced due to change in contents of stored rules.
structs. There are many places in the planner where we were passing
both a rel and an index to subroutines, and now need only pass the
index struct. Notationally simpler, and perhaps a tad faster.
Formerly, if such a clause contained no aggregate functions we mistakenly
treated it as equivalent to WHERE. Per spec it must cause the query to
be treated as a grouped query of a single group, the same as appearance
of aggregate functions would do. Also, the HAVING filter must execute
after aggregate function computation even if it itself contains no
aggregate functions.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
In the past, we used a 'Lispy' linked list implementation: a "list" was
merely a pointer to the head node of the list. The problem with that
design is that it makes lappend() and length() linear time. This patch
fixes that problem (and others) by maintaining a count of the list
length and a pointer to the tail node along with each head node pointer.
A "list" is now a pointer to a structure containing some meta-data
about the list; the head and tail pointers in that structure refer
to ListCell structures that maintain the actual linked list of nodes.
The function names of the list API have also been changed to, I hope,
be more logically consistent. By default, the old function names are
still available; they will be disabled-by-default once the rest of
the tree has been updated to use the new API names.
equivalent sort expressions to use was broken: you can't just look
at the relation membership, you have to actually grovel over the
individual Vars in each expression. I think this did work when it
was written, but it was broken by subsequent optimizations that made
join relations not propagate every single input variable upward.
Must find the Var that got propagated, not choose one at random.
Per bug report from Daniel O'Neill.
check instead of hardwiring assumptions that only certain plan node types
can appear at the places where we are testing. This was always a pretty
fragile assumption, and it turns out to be broken in 7.4 for certain cases
involving IN-subselect tests that need type coercion.
Also, modify code that builds finished Plan tree so that node types that
don't do projection always copy their input node's targetlist, rather than
having the tlist passed in from the caller. The old method makes it too
easy to write broken code that thinks it can modify the tlist when it
cannot.
pointer type when it is not necessary to do so.
For future reference, casting NULL to a pointer type is only necessary
when (a) invoking a function AND either (b) the function has no prototype
OR (c) the function is a varargs function.
regular qpqual ('filter condition'), add special-purpose code to
nodeIndexscan.c to recheck them. This ends being almost no net addition
of code, because the removal of planner code balances out the extra
executor code, but it is significantly more efficient when a lossy
operator is involved in an OR indexscan. The old implementation had
to recheck the entire indexqual in such cases.
with index qual clauses in the Path representation. This saves a little
work during createplan and (probably more importantly) allows reuse of
cached selectivity estimates during indexscan planning. Also fix latent
bug: wrong plan would have been generated for a 'special operator' used
in a nestloop-inner-indexscan join qual, because the special operator
would not have gotten into the list of quals to recheck. This bug is
only latent because at present the special-operator code could never
trigger on a join qual, but sooner or later someone will want to do it.
join conditions in which each OR subclause includes a constraint on
the same relation. This implements the other useful side-effect of
conversion to CNF format, without its unpleasant side-effects. As
per pghackers discussion of a few weeks ago.
the hashclauses field of the parent HashJoin. This avoids problems with
duplicated links to SubPlans in hash clauses, as per report from
Andrew Holm-Hansen.
pghackers proposal of 8-Nov. All the existing cross-type comparison
operators (int2/int4/int8 and float4/float8) have appropriate support.
The original proposal of storing the right-hand-side datatype as part of
the primary key for pg_amop and pg_amproc got modified a bit in the event;
it is easier to store zero as the 'default' case and only store a nonzero
when the operator is actually cross-type. Along the way, remove the
long-since-defunct bigbox_ops operator class.
Remove the 'strategy map' code, which was a large amount of mechanism
that no longer had any use except reverse-mapping from procedure OID to
strategy number. Passing the strategy number to the index AM in the
first place is simpler and faster.
This is a preliminary step in planned support for cross-datatype index
operations. I'm committing it now since the ScanKeyEntryInitialize()
API change touches quite a lot of files, and I want to commit those
changes before the tree drifts under me.
datatype by array_eq and array_cmp; use this to solve problems with memory
leaks in array indexing support. The parser's equality_oper and ordering_oper
routines also use the cache. Change the operator search algorithms to look
for appropriate btree or hash index opclasses, instead of assuming operators
named '<' or '=' have the right semantics. (ORDER BY ASC/DESC now also look
at opclasses, instead of assuming '<' and '>' are the right things.) Add
several more index opclasses so that there is no regression in functionality
for base datatypes. initdb forced due to catalog additions.
yet, though). Avoid using nth() to fetch tlist entries; provide a
common routine get_tle_by_resno() to search a tlist for a particular
resno. This replaces a couple uses of nth() and a dozen hand-coded
search loops. Also, replace a few uses of nth(length-1, list) with
llast().
subplan it starts with, as they may be needed at upper join levels.
See comments added to code for the non-obvious reason why. Per bug report
from Robert Creager.
node emits only those vars that are actually needed above it in the
plan tree. (There were comments in the code suggesting that this was
done at some point in the dim past, but for a long time we have just
made join nodes emit everything that either input emitted.) Aside from
being marginally more efficient, this fixes the problem noted by Peter
Eisentraut where a join above an IN-implemented-as-join might fail,
because the subplan targetlist constructed in the latter case didn't
meet the expectation of including everything.
Along the way, fix some places that were O(N^2) in the targetlist
length. This is not all the trouble spots for wide queries by any
means, but it's a step forward.
silently resolving them to type TEXT. This is comparable to what we
do when faced with UNKNOWN in CASE, UNION, and other contexts. It gets
rid of this and related annoyances:
select distinct f1, '' from int4_tbl;
ERROR: Unable to identify an ordering operator '<' for type unknown
This was discussed many moons ago, but no one got round to fixing it.
some cases of redundant clauses that were formerly not caught. We have
to special-case this because the clauses involved never get attached to
the same join restrictlist and so the existing logic does not notice
that they are redundant.
of an index can now be a computed expression instead of a simple variable.
Restrictions on expressions are the same as for predicates (only immutable
functions, no sub-selects). This fixes problems recently introduced with
inlining SQL functions, because the inlining transformation is applied to
both expression trees so the planner can still match them up. Along the
way, improve efficiency of handling index predicates (both predicates and
index expressions are now cached by the relcache) and fix 7.3 oversight
that didn't record dependencies of predicate expressions.
dropped. The simplest fix for INSERT/UPDATE cases turns out to be for
preptlist.c to insert NULLs of a known-good type (I used INT4) rather
than making them match the deleted column's type. Since the representation
of NULL is actually datatype-independent, this should work fine.
I also re-reverted the patch to disable the use_physical_tlist optimization
in the presence of dropped columns. It still doesn't look worth the
trouble to be smarter, if there are no other bugs to fix.
Added a regression test to catch future problems in this area.
where the table contains dropped columns. If the columns are dropped,
then their types may be gone as well, which causes ExecTypeFromTL() to
fail if the dropped columns appear in a plan node's tlist. This could
be worked around but I don't think the optimization is valuable enough
to be worth the trouble.
the column by table OID and column number, if it's a simple column
reference. Along the way, get rid of reskey/reskeyop fields in Resdoms.
Turns out that representation was not convenient for either the planner
or the executor; we can make the planner deliver exactly what the
executor wants with no more effort.
initdb forced due to change in stored rule representation.
utility statement (DeclareCursorStmt) with a SELECT query dangling from
it, rather than a SELECT query with a few unusual fields in it. Add
code to determine whether a planned query can safely be run backwards.
If DECLARE CURSOR specifies SCROLL, ensure that the plan can be run
backwards by adding a Materialize plan node if it can't. Without SCROLL,
you get an error if you try to fetch backwards from a cursor that can't
handle it. (There is still some discussion about what the exact
behavior should be, but this is necessary infrastructure in any case.)
Along the way, make EXPLAIN DECLARE CURSOR work.
in the case where the node immediately above the scan is a Hash, Sort,
or Material node. In these cases it's better to do the projection
so that we don't store unneeded columns in the hash/sort/materialize
table. Per discussion a few days ago with Anagh Lal.
rid of the assumption that sizeof(Oid)==sizeof(int). This is one small
step towards someday supporting 8-byte OIDs. For the moment, it doesn't
do much except get rid of a lot of unsightly casts.