Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.
Custom scan providers can use this in a similar way. Previously,
it was only possible for a custom scan provider to scan a single
relation. Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.
KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
The original security barrier view implementation, on which RLS is
built, prevented all non-leakproof functions from being pushed down to
below the view, even when the function was not receiving any data from
the view. This optimization improves on that situation by, instead of
checking strictly for non-leakproof functions, it checks for Vars being
passed to non-leakproof functions and allows functions which do not
accept arguments or whose arguments are not from the current query level
(eg: constants can be particularly useful) to be pushed down.
As discussed, this does mean that a function which is pushed down might
gain some idea that there are rows meeting a certain criteria based on
the number of times the function is called, but this isn't a
particularly new issue and the documentation in rules.sgml already
addressed similar covert-channel risks. That documentation is updated
to reflect that non-leakproof functions may be pushed down now, if
they meet the above-described criteria.
Author: Dean Rasheed, with a bit of rework to make things clearer,
along with comment and documentation updates from me.
The cross-reference to set_append_rel_pathlist() was obsoleted by
commit e2fa76d80ba571d4de8992de6386536867250474, which split what
had been set_rel_pathlist() and child routines into two sets of
functions. But I (tgl) evidently missed updating this comment.
Back-patch to 9.2 to avoid unnecessary divergence among branches.
Amit Langote
This adds a new GiST opclass method, 'fetch', which is used to reconstruct
the original Datum from the value stored in the index. Also, the 'canreturn'
index AM interface function gains a new 'attno' argument. That makes it
possible to use index-only scans on a multi-column index where some of the
opclasses support index-only scans but some do not.
This patch adds support in the box and point opclasses. Other opclasses
can added later as follow-on patches (btree_gist would be particularly
interesting).
Anastasia Lubennikova, with additional fixes and modifications by me.
While poking at David Kubečka's issue I noticed an ancient logic error
in get_loop_count(): it used 1.0 as a "no data yet" indicator, but since
that is actually a valid rowcount estimate, this doesn't work. If we
have one input relation with 1.0 as rowcount and then another one with
a larger rowcount, we should use 1.0 as the result, but we picked the
larger rowcount instead. (I think when I coded this, I recognized the
conflict, but mistakenly thought that the logic would pick the desired
count anyway.)
Fixing this changed the plan for one existing regression test case.
Since the point of that test is to exercise creation of a particular
shape of nestloop plan, I tweaked the query a little bit so it still
results in the same plan choice.
This is definitely a bug, but I'm hesitant to back-patch since it might
change plan choices unexpectedly, and anyway failure to implement a
heuristic precisely as intended is a pretty low-grade bug.
If we have a semijoin, say
SELECT * FROM x WHERE x1 IN (SELECT y1 FROM y)
and we're estimating the cost of a parameterized indexscan on x, the number
of repetitions of the indexscan should not be taken as the size of y; it'll
really only be the number of distinct values of y1, because the only valid
plan with y on the outside of a nestloop would require y to be unique-ified
before joining it to x. Most of the time this doesn't make that much
difference, but sometimes it can lead to drastically underestimating the
cost of the indexscan and hence choosing a bad plan, as pointed out by
David Kubečka.
Fixing this is a bit difficult because parameterized indexscans are costed
out quite early in the planning process, before we have the information
that would be needed to call estimate_num_groups() and thereby estimate the
number of distinct values of the join column(s). However we can move the
code that extracts a semijoin RHS's unique-ification columns, so that it's
done in initsplan.c rather than on-the-fly in create_unique_path(). That
shouldn't make any difference speed-wise and it's really a bit cleaner too.
The other bit of information we need is the size of the semijoin RHS,
which is easy if it's a single relation (we make those estimates before
considering indexscan costs) but problematic if it's a join relation.
The solution adopted here is just to use the product of the sizes of the
join component rels. That will generally be an overestimate, but since
estimate_num_groups() only uses this input as a clamp, an overestimate
shouldn't hurt us too badly. In any case we don't allow this new logic
to produce a value larger than we would have chosen before, so that at
worst an overestimate leaves us no wiser than we were before.
This code relied on pointer equality to identify which restriction clauses
also appear in the indexquals (and, therefore, don't need to be applied as
simple filter conditions). That was okay once upon a time, years ago,
before we introduced the equivalence-class machinery. Now there's about a
50-50 chance that an equality clause appearing in the indexquals will be
the mirror image (commutator) of its mate in the restriction list. When
that happens, we'd erroneously think that the clause would be re-evaluated
at each visited row, and therefore inflate the cost estimate for the
indexscan by the clause's cost.
Add some logic to catch this case. It seems to me that it continues not to
be worthwhile to expend the extra predicate-proof work that createplan.c
will do on the finally-selected plan, but this case is common enough and
cheap enough to handle that we should do so.
This will make a small difference (about one cpu_operator_cost per row)
in simple cases; but in situations where there's an expensive function in
the indexquals, it can make a very large difference, as seen in recent
example from Jeff Janes.
This is a long-standing bug, but I'm hesitant to back-patch because of the
possibility of destabilizing plan choices that people may be happy with.
Part of the intent of the parameterized-path mechanism was to handle
star-schema queries efficiently, but some overly-restrictive search
limiting logic added in commit e2fa76d80ba571d4de8992de6386536867250474
prevented such cases from working as desired. Fix that and add a
regression test about it. Per gripe from Marc Cousin.
This is arguably a bug rather than a new feature, so back-patch to 9.2
where parameterized paths were introduced.
This requires changing quite a few places that were depending on
sizeof(HeapTupleHeaderData), but it seems for the best.
Michael Paquier, some adjustments by me
This patch adds a function that replaces a bms_membership() test followed
by a bms_singleton_member() call, performing both the test and the
extraction of a singleton set's member in one scan of the bitmapset.
The performance advantage over the old way is probably minimal in current
usage, but it seems worthwhile on notational grounds anyway.
David Rowley
This patch adds a way of iterating through the members of a bitmapset
nondestructively, unlike the old way with bms_first_member(). While
bms_next_member() is very slightly slower than bms_first_member()
(at least for typical-size bitmapsets), eliminating the need to palloc
and pfree a temporary copy of the target bitmapset is a significant win.
So this method should be preferred in all cases where a temporary copy
would be necessary.
Tom Lane, with suggestions from Dean Rasheed and David Rowley
As pointed out by Robert, we should really have named pg_rowsecurity
pg_policy, as the objects stored in that catalog are policies. This
patch fixes that and updates the column names to start with 'pol' to
match the new catalog name.
The security consideration for COPY with row level security, also
pointed out by Robert, has also been addressed by remembering and
re-checking the OID of the relation initially referenced during COPY
processing, to make sure it hasn't changed under us by the time we
finish planning out the query which has been built.
Robert and Alvaro also commented on missing OCLASS and OBJECT entries
for POLICY (formerly ROWSECURITY or POLICY, depending) in various
places. This patch fixes that too, which also happens to add the
ability to COMMENT on policies.
In passing, attempt to improve the consistency of messages, comments,
and documentation as well. This removes various incarnations of
'row-security', 'row-level security', 'Row-security', etc, in favor
of 'policy', 'row level security' or 'row_security' as appropriate.
Happy Thanksgiving!
Instead of register_custom_path_provider and a CreateCustomScanPath
callback, let's just provide a standard function hook in set_rel_pathlist.
This is more flexible than what was previously committed, is more like the
usual conventions for planner hooks, and requires less support code in the
core. We had discussed this design (including centralizing the
set_cheapest() calls) back in March or so, so I'm not sure why it wasn't
done like this already.
This allows extension modules to define their own methods for
scanning a relation, and get the core code to use them. It's
unclear as yet how much use this capability will find, but we
won't find out if we never commit it.
KaiGai Kohei, reviewed at various times and in various levels
of detail by Shigeru Hanada, Tom Lane, Andres Freund, Álvaro
Herrera, and myself.
Since we taught btree to handle ScalarArrayOpExpr quals natively (commit
9e8da0f75731aaa7605cf4656c21ea09e84d2eb1), the planner has always included
ScalarArrayOpExpr quals in index conditions if possible. However, if the
qual is for a non-first index column, this could result in an inferior plan
because we can no longer take advantage of index ordering (cf. commit
807a40c551dd30c8dd5a0b3bd82f5bbb1e7fd285). It can be better to omit the
ScalarArrayOpExpr qual from the index condition and let it be done as a
filter, so that the output doesn't need to get sorted. Indeed, this is
true for the query introduced as a test case by the latter commit.
To fix, restructure get_index_paths and build_index_paths so that we
consider paths both with and without ScalarArrayOpExpr quals in non-first
index columns. Redesign the API of build_index_paths so that it reports
what it found, saving useless second or third calls.
Report and patch by Andrew Gierth (though rather heavily modified by me).
Back-patch to 9.2 where this code was introduced, since the issue can
result in significant performance regressions compared to plans produced
by 9.1 and earlier.
As of commit a87c72915 (which later got backpatched as far as 9.1),
we're explicitly supporting the notion that append relations can be
nested; this can occur when UNION ALL constructs are nested, or when
a UNION ALL contains a table with inheritance children.
Bug #11457 from Nelson Page, as well as an earlier report from Elvis
Pranskevichus, showed that there were still nasty bugs associated with such
cases: in particular the EquivalenceClass mechanism could try to generate
"join" clauses connecting an appendrel child to some grandparent appendrel,
which would result in assertion failures or bogus plans.
Upon investigation I concluded that all current callers of
find_childrel_appendrelinfo() need to be fixed to explicitly consider
multiple levels of parent appendrels. The most complex fix was in
processing of "broken" EquivalenceClasses, which are ECs for which we have
been unable to generate all the derived equality clauses we would like to
because of missing cross-type equality operators in the underlying btree
operator family. That code path is more or less entirely untested by
the regression tests to date, because no standard opfamilies have such
holes in them. So I wrote a new regression test script to try to exercise
it a bit, which turned out to be quite a worthwhile activity as it exposed
existing bugs in all supported branches.
The present patch is essentially the same as far back as 9.2, which is
where parameterized paths were introduced. In 9.0 and 9.1, we only need
to back-patch a small fragment of commit 5b7b5518d, which fixes failure to
propagate out the original WHERE clauses when a broken EC contains constant
members. (The regression test case results show that these older branches
are noticeably stupider than 9.2+ in terms of the quality of the plans
generated; but we don't really care about plan quality in such cases,
only that the plan not be outright wrong. A more invasive fix in the
older branches would not be a good idea anyway from a plan-stability
standpoint.)
We can allow this even without any specific knowledge of the semantics
of the window function, so long as pushed-down quals will either accept
every row in a given window partition, or reject every such row. Because
window functions act only within a partition, such a case can't result
in changing the window functions' outputs for any surviving row.
Eliminating entire partitions in this way obviously can reduce the cost
of the window-function computations substantially.
The fly in the ointment is that it's hard to be entirely sure whether
this is true for an arbitrary qual condition. This patch allows pushdown
if (a) the qual references only partitioning columns, and (b) the qual
contains no volatile functions. We are at risk of incorrect results if
the qual can produce different answers for values that the partitioning
equality operator sees as equal. While it's not hard to invent cases
for which that can happen, it seems to seldom be a problem in practice,
since no one has complained about a similar assumption that we've had
for many years with respect to DISTINCT. The potential performance
gains seem to be worth the risk.
David Rowley, reviewed by Vik Fearing; some credit is due also to
Thomas Mayer who did considerable preliminary investigation.
A WHERE clause applied to the output of a subquery with DISTINCT should
theoretically be applied only once per distinct row; but if we push it
into the subquery then it will be evaluated at each row before duplicate
elimination occurs. If the qual is volatile this can give rise to
observably wrong results, so don't do that.
While at it, refactor a little bit to allow subquery_is_pushdown_safe
to report more than one kind of restrictive condition without indefinitely
expanding its argument list.
Although this is a bug fix, it seems unwise to back-patch it into released
branches, since it might de-optimize plans for queries that aren't giving
any trouble in practice. So apply to 9.4 but not further back.
If a sub-select-in-FROM gets flattened into the upper query, then we
naturally get rid of any output columns that are defined in the sub-select
text but not actually used in the upper query. However, this doesn't
happen when it's not possible to flatten the subquery, for example because
it contains GROUP BY, LIMIT, etc. Allowing the subquery to compute useless
output columns is often fairly harmless, but sometimes it has significant
performance cost: the unused output might be an expensive expression,
or it might be a Var from a relation that we could remove entirely (via
the join-removal logic) if only we realized that we didn't really need
that Var. Situations like this are common when expanding views, so it
seems worth taking the trouble to detect and remove unused outputs.
Because the upper query's Var numbering for subquery references depends on
positions in the subquery targetlist, we don't want to renumber the items
we leave behind. Instead, we can implement "removal" by replacing the
unwanted expressions with simple NULL constants. This wastes a few cycles
at runtime, but not enough to justify more work in the planner.
This reverts commit ee1e5662d8d8330726eaef7d3110cb7add24d058, as well as
a remarkably large number of followup commits, which were mostly concerned
with the fact that the implementation didn't work terribly well. It still
doesn't: we probably need some rather basic work in the GUC infrastructure
if we want to fully support GUCs whose default varies depending on the
value of another GUC. Meanwhile, it also emerged that there wasn't really
consensus in favor of the definition the patch tried to implement (ie,
effective_cache_size should default to 4 times shared_buffers). So whack
it all back to where it was. In a followup commit, I'll do what was
recently agreed to, which is to simply change the default to a higher
value.
The original coding of EquivalenceClasses didn't foresee that appendrel
child relations might themselves be appendrels; but this is possible for
example when a UNION ALL subquery scans a table with inheritance children.
The oversight led to failure to optimize ordering-related issues very well
for the grandchild tables. After some false starts involving explicitly
flattening the appendrel representation, we found that this could be fixed
easily by removing a few implicit assumptions about appendrel parent rels
not being children themselves.
Kyotaro Horiguchi and Tom Lane, reviewed by Noah Misch
The previous method was overly complex and underly correct; in particular,
by assigning the default value with PGC_S_OVERRIDE, it prevented later
attempts to change the setting in postgresql.conf, as noted by Jeff Janes.
We should just assign the default value with source PGC_S_DYNAMIC_DEFAULT,
which will have the desired priority relative to the boot_val as well as
user-set values.
There is still a gap in this method: if there's an explicit assignment of
effective_cache_size = -1 in the postgresql.conf file, and that assignment
appears before shared_buffers is assigned, the code will substitute 4 times
the bootstrap default for shared_buffers, and that value will then persist
(since it will have source PGC_S_FILE). I don't see any very nice way
to avoid that though, and it's not a case to be expected in practice.
The existing comments in guc-file.l look forward to a redesign of the
DYNAMIC_DEFAULT mechanism; if that ever happens, we should consider this
case as one of the things we'd like to improve.
Fix integer overflow issue noted by Magnus Hagander, as well as a bunch
of other infelicities in commit ee1e5662d8d8330726eaef7d3110cb7add24d058
and its unreasonably large number of followups.
We don't need make_restrictinfo_from_bitmapqual() anymore at all.
generate_bitmap_or_paths() doesn't need to be exported, and we can
drop its rather klugy restriction_only flag.
It's possible to extract a restriction OR clause from a join clause that
has the form of an OR-of-ANDs, if each sub-AND includes a clause that
mentions only one specific relation. While PG has been aware of that idea
for many years, the code previously only did it if it could extract an
indexable OR clause. On reflection, though, that seems a silly limitation:
adding a restriction clause can be a win by reducing the number of rows
that have to be filtered at the join step, even if we have to test the
clause as a plain filter clause during the scan. This should be especially
useful for foreign tables, where the change can cut the number of rows that
have to be retrieved from the foreign server; but testing shows it can win
even on local tables. Per a suggestion from Robert Haas.
As a heuristic, I made the code accept an extracted restriction clause
if its estimated selectivity is less than 0.9, which will probably result
in accepting extracted clauses just about always. We might need to tweak
that later based on experience.
Since the code no longer has even a weak connection to Path creation,
remove orindxpath.c and create a new file optimizer/util/orclauses.c.
There's some additional janitorial cleanup of now-dead code that needs
to happen, but it seems like that's a fit subject for a separate commit.
This patch adds the ability to write TABLE( function1(), function2(), ...)
as a single FROM-clause entry. The result is the concatenation of the
first row from each function, followed by the second row from each
function, etc; with NULLs inserted if any function produces fewer rows than
others. This is believed to be a much more useful behavior than what
Postgres currently does with multiple SRFs in a SELECT list.
This syntax also provides a reasonable way to combine use of column
definition lists with WITH ORDINALITY: put the column definition list
inside TABLE(), where it's clear that it doesn't control the ordinality
column as well.
Also implement SQL-compliant multiple-argument UNNEST(), by turning
UNNEST(a,b,c) into TABLE(unnest(a), unnest(b), unnest(c)).
The SQL standard specifies TABLE() with only a single function, not
multiple functions, and it seems to require an implicit UNNEST() which is
not what this patch does. There may be something wrong with that reading
of the spec, though, because if it's right then the spec's TABLE() is just
a pointless alternative spelling of UNNEST(). After further review of
that, we might choose to adopt a different syntax for what this patch does,
but in any case this functionality seems clearly worthwhile.
Andrew Gierth, reviewed by Zoltán Böszörményi and Heikki Linnakangas, and
significantly revised by me
Bug #8591 from Claudio Freire demonstrates that get_eclass_for_sort_expr
must be able to compute valid em_nullable_relids for any new equivalence
class members it creates. I'd worried about this in the commit message
for db9f0e1d9a4a0842c814a464cdc9758c3f20b96c, but claimed that it wasn't a
problem because multi-member ECs should already exist when it runs. That
is transparently wrong, though, because this function is also called by
initialize_mergeclause_eclasses, which runs during deconstruct_jointree.
The example given in the bug report (which the new regression test item
is based upon) fails because the COALESCE() expression is first seen by
initialize_mergeclause_eclasses rather than process_equivalence.
Fixing this requires passing the appropriate nullable_relids set to
get_eclass_for_sort_expr, and it requires new code to compute that set
for top-level expressions such as ORDER BY, GROUP BY, etc. We store
the top-level nullable_relids in a new field in PlannerInfo to avoid
computing it many times. In the back branches, I've added the new
field at the end of the struct to minimize ABI breakage for planner
plugins. There doesn't seem to be a good alternative to changing
get_eclass_for_sort_expr's API signature, though. There probably aren't
any third-party extensions calling that function directly; moreover,
if there are, they probably need to think about what to pass for
nullable_relids anyway.
Back-patch to 9.2, like the previous patch in this area.
The planner largely failed to consider the possibility that a
PlaceHolderVar's expression might contain a lateral reference to a Var
coming from somewhere outside the PHV's syntactic scope. We had a previous
report of a problem in this area, which I tried to fix in a quick-hack way
in commit 4da6439bd8553059766011e2a42c6e39df08717f, but Antonin Houska
pointed out that there were still some problems, and investigation turned
up other issues. This patch largely reverts that commit in favor of a more
thoroughly thought-through solution. The new theory is that a PHV's
ph_eval_at level cannot be higher than its original syntactic level. If it
contains lateral references, those don't change the ph_eval_at level, but
rather they create a lateral-reference requirement for the ph_eval_at join
relation. The code in joinpath.c needs to handle that.
Another issue is that createplan.c wasn't handling nested PlaceHolderVars
properly.
In passing, push knowledge of lateral-reference checks for join clauses
into join_clause_is_movable_to. This is mainly so that FDWs don't need
to deal with it.
This patch doesn't fix the original join-qual-placement problem reported by
Jeremy Evans (and indeed, one of the new regression test cases shows the
wrong answer because of that). But the PlaceHolderVar problems need to be
fixed before that issue can be addressed, so committing this separately
seems reasonable.
This is SQL-standard with a few extensions, namely support for
subqueries and outer references in clause expressions.
catversion bump due to change in Aggref and WindowFunc.
David Fetter, reviewed by Dean Rasheed.
The code in set_append_rel_pathlist() for building parameterized paths
for append relations (inheritance and UNION ALL combinations) supposed
that the cheapest regular path for a child relation would still be cheapest
when reparameterized. Which might not be the case, particularly if the
added join conditions are expensive to compute, as in a recent example from
Jeff Janes. Fix it to compare child path costs *after* reparameterizing.
We can short-circuit that if the cheapest pre-existing path is already
parameterized correctly, which seems likely to be true often enough to be
worth checking for.
Back-patch to 9.2 where parameterized paths were introduced.
The planner is aware that it mustn't push down upper-level quals into
subqueries if the quals reference subquery output columns that contain
set-returning functions or volatile functions, or are non-DISTINCT outputs
of a DISTINCT ON subquery. However, it missed making this check when
there were one or more levels of UNION or INTERSECT above the dangerous
expression. This could lead to "set-valued function called in context that
cannot accept a set" errors, as seen in bug #8213 from Eric Soroos, or to
silently wrong answers in the other cases.
To fix, refactor the checks so that we make the column-is-unsafe checks
during subquery_is_pushdown_safe(), which already has to recursively
inspect all arms of a set-operation tree. This makes
qual_is_pushdown_safe() considerably simpler, at the cost that we will
spend some cycles checking output columns that possibly aren't referenced
in any upper qual. But the cases where this code gets executed at all
are already nontrivial queries, so it's unlikely anybody will notice any
slowdown of planning.
This has been broken since commit 05f916e6add9726bf4ee046e4060c1b03c9961f2,
which makes the bug over ten years old. A bit surprising nobody noticed it
before now.
This reverts the code changes in 50c137487c96e629e0e5372bb3d1b5f1a2f71a88,
which turned out to induce crashes and not completely fix the problem
anyway. That commit only considered single subqueries that were excluded
by constraint-exclusion logic, but actually the problem also exists for
subqueries that are appendrel members (ie part of a UNION ALL list). In
such cases we can't add a dummy subpath to the appendrel's AppendPath list
without defeating the logic that recognizes when an appendrel is completely
excluded. Instead, fix the problem by having setrefs.c scan the rangetable
an extra time looking for subqueries that didn't get into the plan tree.
(This approach depends on the 9.2 change that made set_subquery_pathlist
generate dummy paths for excluded single subqueries, so that the exclusion
behavior is the same for single subqueries and appendrel members.)
Note: it turns out that the appendrel form of the missed-permissions-checks
bug exists as far back as 8.4. However, since the practical effect of that
bug seems pretty minimal, consensus is to not attempt to fix it in the back
branches, at least not yet. Possibly we could back-port this patch once
it's gotten a reasonable amount of testing in HEAD. For the moment I'm
just going to revert the previous patch in 9.2.
A view defined as "select <something> where false" had the curious property
that the system wouldn't check whether users had the privileges necessary
to select from it. More generally, permissions checks could be skipped
for tables referenced in sub-selects or views that were proven empty by
constraint exclusion (although some quick testing suggests this seldom
happens in cases of practical interest). This happened because the planner
failed to include rangetable entries for such tables in the finished plan.
This was noticed in connection with erroneous handling of materialized
views, but actually the issue is quite unrelated to matviews. Therefore,
revert commit 200ba1667b3a8d7a9d559d2f05f83d209c9d8267 in favor of a more
direct test for the real problem.
Back-patch to 9.2 where the bug was introduced (by commit
7741dd6590073719688891898e85f0cb73453159).
This patch gets rid of the concept of, and infrastructure for,
non-canonical PathKeys; we now only ever create canonical pathkey lists.
The need for non-canonical pathkeys came from the desire to have
grouping_planner initialize query_pathkeys and related pathkey lists before
calling query_planner. However, since query_planner didn't actually *do*
anything with those lists before they'd been made canonical, we can get rid
of the whole mess by just not creating the lists at all until the point
where we formerly canonicalized them.
There are several ways in which we could implement that without making
query_planner itself deal with grouping/sorting features (which are
supposed to be the province of grouping_planner). I chose to add a
callback function to query_planner's API; other alternatives would have
required adding more fields to PlannerInfo, which while not bad in itself
would create an ABI break for planner-related plugins in the 9.2 release
series. This still breaks ABI for anything that calls query_planner
directly, but it seems somewhat unlikely that there are any such plugins.
I had originally conceived of this change as merely a step on the way to
fixing bug #8049 from Teun Hoogendoorn; but it turns out that this fixes
that bug all by itself, as per the added regression test. The reason is
that now get_eclass_for_sort_expr is adding the ORDER BY expression at the
end of EquivalenceClass creation not the start, and so anything that is in
a multi-member EquivalenceClass has already been created with correct
em_nullable_relids. I am suspicious that there are related scenarios in
which we still need to teach get_eclass_for_sort_expr to compute correct
nullable_relids, but am not eager to risk destabilizing either 9.2 or 9.3
to fix bugs that are only hypothetical. So for the moment, do this and
stop here.
Back-patch to 9.2 but not to earlier branches, since they don't exhibit
this bug for lack of join-clause-movement logic that depends on
em_nullable_relids being correct. (We might have to revisit that choice
if any related bugs turn up.) In 9.2, don't change the signature of
make_pathkeys_for_sortclauses nor remove canonicalize_pathkeys, so as
not to risk more plugin breakage than we have to.
In commit 0f61d4dd1b4f95832dcd81c9688dac56fd6b5687, I added code to copy up
column width estimates for each column of a subquery. That code supposed
that the subquery couldn't have any output columns that didn't correspond
to known columns of the current query level --- which is true when a query
is parsed from scratch, but the assumption fails when planning a view that
depends on another view that's been redefined (adding output columns) since
the upper view was made. This results in an assertion failure or even a
crash, as per bug #8025 from lindebg. Remove the Assert and instead skip
the column if its resno is out of the expected range.
I wasn't going to ship this without having at least some example of how
to do that. This version isn't terribly bright; in particular it won't
consider any combinations of multiple join clauses. Given the cost of
executing a remote EXPLAIN, I'm not sure we want to be very aggressive
about doing that, anyway.
In support of this, refactor generate_implied_equalities_for_indexcol
so that it can be used to extract equivalence clauses that aren't
necessarily tied to an index.
This saves several catalog lookups per reference. It's not all that
exciting right now, because we'd managed to minimize the number of places
that need to fetch the data; but the upcoming writable-foreign-tables patch
needs this info in a lot more places.
In a query such as "SELECT DISTINCT min(x) FROM tab", the DISTINCT is
pretty useless (there being only one output row), but nonetheless it
shouldn't fail. But it could fail if "tab" is an inheritance parent,
because planagg.c's code for fixing up equivalence classes after making the
index-optimized MIN/MAX transformation wasn't prepared to find child-table
versions of the aggregate expression. The least ugly fix seems to be
to add an option to mutate_eclass_expressions() to skip child-table
equivalence class members, which aren't used anymore at this stage of
planning so it's not really necessary to fix them. Since child members
are ignored in many cases already, it seems plausible for
mutate_eclass_expressions() to have an option to ignore them too.
Per bug #7703 from Maxim Boguk.
Back-patch to 9.1. Although the same code exists before that, it cannot
encounter child-table aggregates AFAICS, because the index optimization
transformation cannot succeed on inheritance trees before 9.1 (for lack
of MergeAppend).
Traditionally check_partial_indexes() has only looked at restriction
clauses while trying to prove partial indexes usable in queries. However,
join clauses can also be used in some cases; mainly, that a strict operator
on "x" proves an "x IS NOT NULL" index predicate, even if the operator is
in a join clause rather than a restriction clause. Adding this code fixes
a regression in 9.2, because previously we would take join clauses into
account when considering whether a partial index could be used in a
nestloop inner indexscan path. 9.2 doesn't handle nestloop inner
indexscans in the same way, and this consideration was overlooked in the
rewrite. Moving the work to check_partial_indexes() is a better solution
anyway, since the proof applies whether or not we actually use the index
in that particular way, and we don't have to do it over again for each
possible outer relation. Per report from Dave Cramer.
This function currently lacks the option to throw error if the provided
targetlist doesn't have any matching entry for a Var to be replaced.
Two of the four existing call sites would be better off with an error,
as would the usage in the pending auto-updatable-views patch, so it seems
past time to extend the API to support that. To do so, replace the "event"
parameter (historically of type CmdType, though it was declared plain int)
with a special-purpose enum type.
It's unclear whether this function might be called by third-party code.
Since many C compilers wouldn't warn about a call site continuing to use
the old calling convention, rename the function to forcibly break any
such code that hasn't been updated. The old name was none too well chosen
anyhow.
In bug #7626, Brian Dunavant exposes a performance problem created by
commit 3b8968f25232ad09001bf35ab4cc59f5a501193e: that commit attempted to
consider *all* possible combinations of indexable join clauses, but if said
clauses join to enough different relations, there's an exponential increase
in the number of outer-relation sets considered.
In Brian's example, all the clauses come from the same equivalence class,
which means it's redundant to use more than one of them in an indexscan
anyway. So we can prevent the problem in this class of cases (which is
probably the majority of real examples) by rejecting combinations that
would only serve to add a known-redundant clause.
But that still leaves us exposed to exponential growth of planning time
when the query has a lot of non-equivalence join clauses that are usable
with the same index. I chose to prevent such cases by setting an upper
limit on the number of relation sets considered, equal to ten times the
number of index clauses considered so far. (This sliding limit still
allows new relsets to be added on as we move to additional index columns,
which is probably more important than considering even more combinations of
clauses for the previous column.) This should keep the amount of work done
roughly linear rather than exponential in the apparent query complexity.
This part of the fix is pretty ad-hoc; but without a clearer idea of
real-world cases for which this would result in markedly inferior plans,
it's hard to see how to do better.
generate_base_implied_equalities_const() should prefer plain Consts over
other em_is_const eclass members when choosing the "pivot" value that
all the other members will be equated to. This makes it more likely that
the generated equalities will be useful in constraint-exclusion proofs.
Per report from Rushabh Lathia.