match in antijoin mode, we should advance to next outer tuple not next inner.
We know we don't want to return this outer tuple, and there is no point in
advancing over matching inner tuples now, because we'd just have to do it
again if the next outer tuple has the same merge key. This makes a noticeable
difference if there are lots of duplicate keys in both inputs.
Similarly, after finding a match in semijoin mode, arrange to advance to
the next outer tuple after returning the current match; or immediately,
if it fails the extra quals. The rationale is the same. (This is a
performance bug in existing releases; perhaps worth back-patching? The
planner tries to avoid using mergejoin with lots of duplicates, so it may
not be a big issue in practice.)
Nestloop and hash got this right to start with, but I made some cosmetic
adjustments there to make the corresponding bits of logic look more similar.
the old JOIN_IN code, but antijoins are new functionality.) Teach the planner
to convert appropriate EXISTS and NOT EXISTS subqueries into semi and anti
joins respectively. Also, LEFT JOINs with suitable upper-level IS NULL
filters are recognized as being anti joins. Unify the InClauseInfo and
OuterJoinInfo infrastructure into "SpecialJoinInfo". With that change,
it becomes possible to associate a SpecialJoinInfo with every join attempt,
which permits some cleanup of join selectivity estimation. That needs to be
taken much further than this patch does, but the next step is to change the
API for oprjoin selectivity functions, which seems like material for a
separate patch. So for the moment the output size estimates for semi and
especially anti joins are quite bogus.
made query plan. Use of ALTER COLUMN TYPE creates a hazard for cached
query plans: they could contain Vars that claim a column has a different
type than it now has. Fix this by checking during plan startup that Vars
at relation scan level match the current relation tuple descriptor. Since
at that point we already have at least AccessShareLock, we can be sure the
column type will not change underneath us later in the query. However,
since a backend's locks do not conflict against itself, there is still a
hole for an attacker to exploit: he could try to execute ALTER COLUMN TYPE
while a query is in progress in the current backend. Seal that hole by
rejecting ALTER TABLE whenever the target relation is already open in
the current backend.
This is a significant security hole: not only can one trivially crash the
backend, but with appropriate misuse of pass-by-reference datatypes it is
possible to read out arbitrary locations in the server process's memory,
which could allow retrieving database content the user should not be able
to see. Our thanks to Jeff Trout for the initial report.
Security: CVE-2007-0556
bits indicating which optional capabilities can actually be exercised
at runtime. This will allow Sort and Material nodes, and perhaps later
other nodes, to avoid unnecessary overhead in common cases.
This commit just adds the infrastructure and arranges to pass the correct
flag values down to plan nodes; none of the actual optimizations are here
yet. I'm committing this separately in case anyone wants to measure the
added overhead. (It should be negligible.)
Simon Riggs and Tom Lane
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...
which does the same thing. Perhaps at one time there was a reason to
allow plan nodes to store their result types in different places, but
AFAICT that's been unnecessary for a good while.
Try to model the effect of rescanning input tuples in mergejoins;
account for JOIN_IN short-circuiting where appropriate. Also, recognize
that mergejoin and hashjoin clauses may now be more than single operator
calls, so we have to charge appropriate execution costs.
There are two implementation techniques: the executor understands a new
JOIN_IN jointype, which emits at most one matching row per left-hand row,
or the result of the IN's sub-select can be fed through a DISTINCT filter
and then joined as an ordinary relation.
Along the way, some minor code cleanup in the optimizer; notably, break
out most of the jointree-rearrangement preprocessing in planner.c and
put it in a new file prep/prepjointree.c.
a per-query memory context created by CreateExecutorState --- and destroyed
by FreeExecutorState. This provides a final solution to the longstanding
problem of memory leaked by various ExecEndNode calls.
execution state trees, and ExecEvalExpr takes an expression state tree
not an expression plan tree. The plan tree is now read-only as far as
the executor is concerned. Next step is to begin actually exploiting
this property.
to plan nodes, not vice-versa. All executor state nodes now inherit from
struct PlanState. Copying of plan trees has been simplified by not
storing a list of SubPlans in Plan nodes (eliminating duplicate links).
The executor still needs such a list, but it can build it during
ExecutorStart since it has to scan the plan tree anyway.
No initdb forced since no stored-on-disk structures changed, but you
will need a full recompile because of node-numbering changes.
for example, an SQL function can be used in a functional index. (I make
no promises about speed, but it'll work ;-).) Clean up and simplify
handling of functions returning sets.
right thing with variable-free clauses that contain noncachable functions,
such as 'WHERE random() < 0.5' --- these are evaluated once per
potential output tuple. Expressions that contain only Params are
now candidates to be indexscan quals --- for example, 'var = ($1 + 1)'
can now be indexed. Cope with RelabelType nodes atop potential indexscan
variables --- this oversight prevents 7.0.* from recognizing some
potentially indexscanable situations.
There's now only one transition value and transition function.
NULL handling in aggregates is a lot cleaner. Also, use Numeric
accumulators instead of integer accumulators for sum/avg on integer
datatypes --- this avoids overflow at the cost of being a little slower.
Implement VARIANCE() and STDDEV() aggregates in the standard backend.
Also, enable new LIKE selectivity estimators by default. Unrelated
change, but as long as I had to force initdb anyway...
memory contexts. Currently, only leaks in expressions executed as
quals or projections are handled. Clean up some old dead cruft in
executor while at it --- unused fields in state nodes, that sort of thing.
from a constraint condition does not violate the constraint (cf. discussion
on pghackers 12/9/99). Implemented by adding a parameter to ExecQual,
specifying whether to return TRUE or FALSE when the qual result is
really NULL in three-valued boolean logic. Currently, ExecRelCheck is
the only caller that asks for TRUE, but if we find any other places that
have the wrong response to NULL, it'll be easy to fix them.
ExecReScan for nodeAgg, nodeHash, nodeHashjoin, nodeNestloop and nodeResult.
Fixed ExecReScan for nodeMaterial.
Get rid of #ifdef INDEXSCAN_PATCH.
Get rid of ExecMarkPos and ExecRestrPos in nodeNestloop.