1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-15 19:15:29 +03:00

2016 Commits

Author SHA1 Message Date
Amit Kapila
40ca70ebcc Allow using the updated tuple while moving it to a different partition.
An update that causes the tuple to be moved to a different partition was
missing out on re-constructing the to-be-updated tuple, based on the latest
tuple in the update chain.  Instead, it's simply deleting the latest tuple
and inserting a new tuple in the new partition based on the old tuple.
Commit 2f17844104 didn't consider this case, so some of the updates were
getting lost.

In passing, change the argument order for output parameter in ExecDelete
and add some commentary about it.

Reported-by: Pavan Deolasee
Author: Amit Khandekar, with minor changes by me
Reviewed-by: Dilip Kumar, Amit Kapila and Alvaro Herrera
Backpatch-through: 11
Discussion: https://postgr.es/m/CAJ3gD9fRbEzDqdeDq1jxqZUb47kJn+tQ7=Bcgjc8quqKsDViKQ@mail.gmail.com
2018-07-12 12:51:39 +05:30
Tom Lane
ff4f889164 Fix bugs with degenerate window ORDER BY clauses in GROUPS/RANGE mode.
nodeWindowAgg.c failed to cope with the possibility that no ordering
columns are defined in the window frame for GROUPS mode or RANGE OFFSET
mode, leading to assertion failures or odd errors, as reported by Masahiko
Sawada and Lukas Eder.  In RANGE OFFSET mode, an ordering column is really
required, so add an Assert about that.  In GROUPS mode, the code would
work, except that the node initialization code wasn't in sync with the
execution code about when to set up tuplestore read pointers and spare
slots.  Fix the latter for consistency's sake (even though I think the
changes described below make the out-of-sync cases unreachable for now).

Per SQL spec, a single ordering column is required for RANGE OFFSET mode,
and at least one ordering column is required for GROUPS mode.  The parser
enforced the former but not the latter; add a check for that.

We were able to reach the no-ordering-column cases even with fully spec
compliant queries, though, because the planner would drop partitioning
and ordering columns from the generated plan if they were redundant with
earlier columns according to the redundant-pathkey logic, for instance
"PARTITION BY x ORDER BY y" in the presence of a "WHERE x=y" qual.
While in principle that's an optimization that could save some pointless
comparisons at runtime, it seems unlikely to be meaningful in the real
world.  I think this behavior was not so much an intentional optimization
as a side-effect of an ancient decision to construct the plan node's
ordering-column info by reverse-engineering the PathKeys of the input
path.  If we give up redundant-column removal then it takes very little
code to generate the plan node info directly from the WindowClause,
ensuring that we have the expected number of ordering columns in all
cases.  (If anyone does complain about this, the planner could perhaps
be taught to remove redundant columns only when it's safe to do so,
ie *not* in RANGE OFFSET mode.  But I doubt anyone ever will.)

With these changes, the WindowAggPath.winpathkeys field is not used for
anything anymore, so remove it.

The test cases added here are not actually very interesting given the
removal of the redundant-column-removal logic, but they would represent
important corner cases if anyone ever tries to put that back.

Tom Lane and Masahiko Sawada.  Back-patch to v11 where RANGE OFFSET
and GROUPS modes were added.

Discussion: https://postgr.es/m/CAD21AoDrWqycq-w_+Bx1cjc+YUhZ11XTj9rfxNiNDojjBx8Fjw@mail.gmail.com
Discussion: https://postgr.es/m/153086788677.17476.8002640580496698831@wrigleys.postgresql.org
2018-07-11 12:07:20 -04:00
Peter Eisentraut
2e78c5b522 Fix assert in nested SQL procedure call
When executing CALL in PL/pgSQL, we need to set a snapshot before
invoking the to-be-called procedure.  Otherwise, the to-be-called
procedure might end up running without a snapshot.  For LANGUAGE SQL
procedures, this would result in an assertion failure.  (For most other
languages, this is usually not a problem, because those use SPI and SPI
sets snapshots in most cases.)  Setting the snapshot restores the
behavior of how CALL worked when it was handled as a generic SQL
statement in PL/pgSQL (exec_stmt_execsql()).

This change revealed another problem:  In SPI_commit(), we popped the
active snapshot before committing the transaction, to avoid "snapshot %p
still active" errors.  However, there is no particular reason why only
at most one snapshot should be on the stack.  So change this to pop all
active snapshots instead of only one.
2018-07-06 23:25:44 +02:00
Andrew Dunstan
1e9c858090 pgindent run prior to branching 2018-06-30 12:25:49 -04:00
Amit Kapila
2e61c50785 Fix thinko in comments.
A slot can not be stored in a tuple but it's vice versa.

Reported-by: Ashutosh Bapat
Author: Ashutosh Bapat
Reviewed-by: Amit Kapila
Discussion: https://postgr.es/m/CAFjFpRcHhNhXdegyJv3KKDWrwO1_NB_KYZM_ZSDeMOZaL1A5jQ@mail.gmail.com
2018-06-27 18:05:24 +05:30
Tom Lane
19832753f1 Fix some ill-chosen names for globally-visible partition support functions.
"compute_hash_value" is particularly gratuitously generic, but IMO
all of these ought to have names clearly related to partitioning.
2018-06-13 13:18:02 -04:00
Tom Lane
e23bae82cf Fix up run-time partition pruning's use of relcache's partition data.
The previous coding saved pointers into the partitioned table's relcache
entry, but then closed the relcache entry, causing those pointers to
nominally become dangling.  Actual trouble would be seen in the field
only if a relcache flush occurred mid-query, but that's hardly out of
the question.

While we could fix this by copying all the data in question at query
start, it seems better to just hold the relcache entry open for the
whole query.

While at it, improve the handling of support-function lookups: do that
once per query not once per pruning test.  There's still something to be
desired here, in that we fail to exploit the possibility of caching data
across queries in the fn_extra fields of the relcache's FmgrInfo structs,
which could happen if we just used those structs in-place rather than
copying them.  However, combining that with the possibility of per-query
lookups of cross-type comparison functions seems to require changes in the
APIs of a lot of the pruning support functions, so it's too invasive to
consider as part of this patch.  A win would ensue only for complex
partition key data types (e.g. arrays), so it may not be worth the
trouble.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/17850.1528755844@sss.pgh.pa.us
2018-06-13 12:03:26 -04:00
Tom Lane
69025c5a07 Improve ExecFindInitialMatchingSubPlans's subplan renumbering logic.
We don't need two passes if we scan child partitions before parents,
as that way the children's present_parts are up to date before they're
needed.  I (tgl) think there's actually a bug being fixed here, for the
case of an intermediate partitioned table with no direct leaf children,
but haven't attempted to construct a test case to prove it.

David Rowley

Discussion: https://postgr.es/m/CAKJS1f-6GODRNgEtdPxCnAPme2h2hTztB6LmtfdmcYAAOE0kQg@mail.gmail.com
2018-06-11 17:35:53 -04:00
Alvaro Herrera
5b0c7e2f75 Don't needlessly check the partition contraint twice
Starting with commit f0e44751d717, ExecConstraints was in charge of
running the partition constraint; commit 19c47e7c8202 modified that so
that caller could request to skip that checking depending on some
conditions, but that commit and 15ce775faa42 together introduced a small
bug there which caused ExecInsert to request skipping the constraint
check but have this not be honored -- in effect doing the check twice.
This could have been fixed in a very small patch, but on further
analysis of the involved function and its callsites, it turns out to be
simpler to give the responsibility of checking the partition constraint
fully to the caller, and return ExecConstraints to its original
(pre-partitioning) shape where it only checked tuple descriptor-related
constraints.  Each caller must do partition constraint checking on its
own schedule, which is more convenient after commit 2f178441044 anyway.

Reported-by: David Rowley
Author: David Rowley, Álvaro Herrera
Reviewed-by: Amit Langote, Amit Khandekar, Simon Riggs
Discussion: https://postgr.es/m/CAKJS1f8w8+awsxgea8wt7_UX8qzOQ=Tm1LD+U1fHqBAkXxkW2w@mail.gmail.com
2018-06-11 17:12:16 -04:00
Peter Eisentraut
387543f7bd Make new error code name match SQL standard more closely
Discussion: https://www.postgresql.org/message-id/dff3d555-bea4-ac24-29b2-29521b9d08e8%402ndquadrant.com
2018-06-11 11:15:28 -04:00
Tom Lane
321f648a31 Assorted cosmetic cleanup of run-time-partition-pruning code.
Use "subplan" rather than "subnode" to refer to the child plans of
a partitioning Append; this seems a bit more specific and hence
clearer.  Improve assorted comments.  No non-cosmetic changes.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com
2018-06-10 18:24:34 -04:00
Tom Lane
939449de0e Relocate partition pruning structs to a saner place.
These struct definitions were originally dropped into primnodes.h,
which is a poor choice since that's mainly intended for primitive
expression node types; these are not in that category.  What they
are is auxiliary info in Plan trees, so move them to plannodes.h.

For consistency, also relocate some related code that was apparently
placed with the aid of a dartboard.

There's no interesting code changes in this commit, just reshuffling.

David Rowley and Tom Lane

Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com
2018-06-10 16:30:14 -04:00
Tom Lane
73b7f48f78 Improve run-time partition pruning to handle any stable expression.
The initial coding of the run-time-pruning feature only coped with cases
where the partition key(s) are compared to Params.  That is a bit silly;
we can allow it to work with any non-Var-containing stable expression, as
long as we take special care with expressions containing PARAM_EXEC Params.
The code is hardly any longer this way, and it's considerably clearer
(IMO at least).  Per gripe from Pavel Stehule.

David Rowley, whacked around a bit by me

Discussion: https://postgr.es/m/CAFj8pRBjrufA3ocDm8o4LPGNye9Y+pm1b9kCwode4X04CULG3g@mail.gmail.com
2018-06-10 15:22:32 -04:00
Thomas Munro
86a2218eb0 Limit Parallel Hash's bucket array to MaxAllocSize.
Make sure that we don't exceed MaxAllocSize when increasing the number of
buckets.  Perhaps later we'll remove that limit and use DSA_ALLOC_HUGE, but
for now just prevent further increases like the non-parallel code.  This
change avoids the error from bug report #15225.

Author: Thomas Munro
Reviewed-By: Tom Lane
Reported-by: Frits Jalvingh
Discussion: https://postgr.es/m/152802081668.26724.16985037679312485972%40wrigleys.postgresql.org
2018-06-10 20:30:25 +12:00
Peter Eisentraut
acad8b409a Fix typo 2018-06-08 11:55:12 -04:00
Heikki Linnakangas
57f06a7611 Fix obsolete comment.
The 'orig_slot' argument was removed in commit c0a8ae7be392, but that
commit forgot to update the comment.

Author: Amit Langote
Discussion: https://www.postgresql.org/message-id/194ac4bf-7b4a-c887-bf26-bc1a85ea995a@lab.ntt.co.jp
2018-06-07 09:56:22 +03:00
Tom Lane
f755a152d4 Improve spelling of new FINALFUNC_MODIFY aggregate attribute.
I'd used SHARABLE as a value originally, but Peter Eisentraut points out
that dictionaries agree that SHAREABLE is the preferred spelling.
Run around and change that before it's too late.

Discussion: https://postgr.es/m/d2e1afd4-659c-50d6-1b20-7cfd3675e909@2ndquadrant.com
2018-05-21 11:41:42 -04:00
Robert Haas
b949bbcb7e Further adjust comment in get_partition_dispatch_recurse.
In editing 09b12d52db1cf1a4c72d876f3fb6c9d06919e51a I made it wrong;
fix that and try to more clearly explain the situation.

Patch by me, reviewed by David Rowley and Amit Langote

Discussion: http://postgr.es/m/CA+TgmobAq+mA5hzm0a5OS38qQY5758DDDGqa3sBJN4hvir-H9w@mail.gmail.com
2018-05-18 16:11:22 -04:00
Robert Haas
09b12d52db Improve comment in get_partition_dispatch_recurse.
David Rowley, reviewed by Amit Langote, and revised a bit by me.

Discussion: http://postgr.es/m/CAKJS1f9yyimYyFzbHM4EwE+tkj4jvrHqSH0H4S4Kbas=UFpc9Q@mail.gmail.com
2018-05-16 10:47:57 -04:00
Tom Lane
05ca21b878 Fix type checking for support functions of parallel VARIADIC aggregates.
The impact of VARIADIC on the combine/serialize/deserialize support
functions of an aggregate wasn't thought through carefully.  There is
actually no impact, because variadicity isn't passed through to these
functions (and it doesn't seem like it would need to be).  However,
lookup_agg_function was mistakenly told to check things as though it were
passed through.  The net result was that it was impossible to declare an
aggregate that had both VARIADIC input and parallelism support functions.

In passing, fix a runtime check in nodeAgg.c for the combine function's
strictness to make its error message agree with the creation-time check.
The previous message was actually backwards, and it doesn't seem like
there's a good reason to have two versions of this message text anyway.

Back-patch to 9.6 where parallel aggregation was introduced.

Alexey Bashtanov; message fix by me

Discussion: https://postgr.es/m/f86dde87-fef4-71eb-0480-62754aaca01b@imap.cc
2018-05-15 15:06:53 -04:00
Peter Eisentraut
30c66e77be Fix SPI error cleanup and memory leak
Since the SPI stack has been moved from TopTransactionContext to
TopMemoryContext, setting _SPI_stack to NULL in AtEOXact_SPI() leaks
memory.  In fact, we don't need to do that anymore: We just leave the
allocated stack around for the next SPI use.

Also, refactor the SPI cleanup so that it is run both at transaction end
and when returning to the main loop on an exception.  The latter is
necessary when a procedure calls a COMMIT or ROLLBACK command that
itself causes an error.
2018-05-03 08:39:15 -04:00
Tom Lane
41c912cad1 Clean up warnings from -Wimplicit-fallthrough.
Recent gcc can warn about switch-case fall throughs that are not
explicitly labeled as intentional.  This seems like a good thing,
so clean up the warnings exposed thereby by labeling all such
cases with comments that gcc will recognize.

In files that already had one or more suitable comments, I generally
matched the existing style of those.  Otherwise I went with
/* FALLTHROUGH */, which is one of the spellings approved at the
more-restrictive-than-default level -Wimplicit-fallthrough=4.
(At the default level you can also spell it /* FALL ?THRU */,
and it's not picky about case.  What you can't do is include
additional text in the same comment, so some existing comments
containing versions of this aren't good enough.)

Testing with gcc 8.0.1 (Fedora 28's current version), I found that
I also had to put explicit "break"s after elog(ERROR) or ereport(ERROR);
apparently, for this purpose gcc doesn't recognize that those don't
return.  That seems like possibly a gcc bug, but it's fine because
in most places we did that anyway; so this amounts to a visit from the
style police.

Discussion: https://postgr.es/m/15083.1525207729@sss.pgh.pa.us
2018-05-01 19:35:08 -04:00
Robert Haas
37a3058bc7 Fix interaction of foreign tuple routing with remote triggers.
Without these fixes, changes to the inserted tuple made by remote
triggers are ignored when building local RETURNING tuples.

In the core code, call ExecInitRoutingInfo at a later point from
within ExecInitPartitionInfo so that the FDW callback gets invoked
after the returning list has been built.  But move CheckValidResultRel
out of ExecInitRoutingInfo so that it can happen at an earlier stage.

In postgres_fdw, refactor assorted deparsing functions to work with
the RTE rather than the PlannerInfo, which saves us having to
construct a fake PlannerInfo in cases where we don't have a real one.
Then, we can pass down a constructed RTE that yields the correct
deparse result when no real one exists.  Unfortunately, this
necessitates a hack that understands how the core code manages RT
indexes for update tuple routing, which is ugly, but we don't have a
better idea right now.

Original report, analysis, and patch by Etsuro Fujita.  Heavily
refactored by me.  Then worked over some more by Amit Langote.

Discussion: http://postgr.es/m/5AD4882B.10002@lab.ntt.co.jp
2018-05-01 13:21:46 -04:00
Tom Lane
bdf46af748 Post-feature-freeze pgindent run.
Discussion: https://postgr.es/m/15719.1523984266@sss.pgh.pa.us
2018-04-26 14:47:16 -04:00
Alvaro Herrera
bd4aad3239 Update ExecInitPartitionInfo comment
Remove the words "if not already done."  This obsolete wording
corresponds to an early development version of what became edd44738bc8.

Author: Etsuro Fujita
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/5ADF117B.5030606@lab.ntt.co.jp
2018-04-24 23:00:48 -03:00
Alvaro Herrera
1957f8dabf Initialize ExprStates once in run-time partition pruning
Instead of doing ExecInitExpr every time a Param needs to be evaluated
in run-time partition pruning, do it once during run-time pruning
set-up and cache the exprstate in PartitionPruneContext, saving a lot of
work.

Author: David Rowley
Reviewed-by: Amit Langote, Álvaro Herrera
Discussion: https://postgr.es/m/CAKJS1f8-x+q-90QAPDu_okhQBV4DPEtPz8CJ=m0940GyT4DA4w@mail.gmail.com
2018-04-24 14:03:10 -03:00
Tom Lane
a66c03f698 Add missing "static" marker.
Per pademelon.
2018-04-21 11:21:18 -04:00
Alvaro Herrera
79b2e52615 Remove quick path in ExecInitPartitionInfo for equal tupdescs
I added this "optimization" on top of Amit Langote's 158b7bc6d779, but
the quick path is never taken because the partition uses a different
pg_type oid than its parent table (causing equalTupleDescs to return
false).  Changing that requires more analysis and is too considered
dangerous at this point in the cycle, so revert it.

We might make it work someday, but not for pg11.

Discussion: https://postgr.es/m/825031be-942c-8c24-6163-13c27f217a3d@lab.ntt.co.jp
2018-04-19 16:46:53 -03:00
Alvaro Herrera
b7e2cbc5b4 Update Append's idea of first_partial_plan
It turns out that after runtime partition pruning, Append's
first_partial_plan does not accurately represent partial plans to run,
if any of those got pruned.  This could limit participation of workers
in some partial subplans, if other subplans got pruned.  Fix it by
keeping an index of the first valid partial subplan in the state node,
determined at execnode Init time.

Author: David Rowley, with cosmetic changes by me.
Discussion: https://postgr.es/m/CAKJS1f8o2Yd=rOP=Et3A0FWgF+gSAOkFSU6eNhnGzTPV7nN8sQ@mail.gmail.com
2018-04-17 16:25:02 -03:00
Alvaro Herrera
158b7bc6d7 Ignore whole-rows in INSERT/CONFLICT with partitioned tables
We had an Assert() preventing whole-row expressions from being used in
the SET clause of INSERT ON CONFLICT, but it seems unnecessary, given
some tests, so remove it.  Add a new test to exercise the case.

Still at ExecInitPartitionInfo, we used map_partition_varattnos (which
constructs an attribute map, then calls map_variable_attnos) using
the same two relations many times in different expressions and with
different parameters.  Constructing the map over and over is a waste.
To avoid this repeated work, construct the map once, and use
map_variable_attnos() directly instead.

Author: Amit Langote, per comments by me (Álvaro)
Discussion: https://postgr.es/m/20180326142016.m4st5e34chrzrknk@alvherre.pgsql
2018-04-16 15:52:28 -03:00
Alvaro Herrera
da6f3e45dd Reorganize partitioning code
There's been a massive addition of partitioning code in PostgreSQL 11,
with little oversight on its placement, resulting in a
catalog/partition.c with poorly defined boundaries and responsibilities.
This commit tries to set a couple of distinct modules to separate things
a little bit.  There are no code changes here, only code movement.

There are three new files:
  src/backend/utils/cache/partcache.c
  src/include/partitioning/partdefs.h
  src/include/utils/partcache.h

The previous arrangement of #including catalog/partition.h almost
everywhere is no more.

Authors: Amit Langote and Álvaro Herrera
Discussion: https://postgr.es/m/98e8d509-790a-128c-be7f-e48a5b2d8d97@lab.ntt.co.jp
	https://postgr.es/m/11aa0c50-316b-18bb-722d-c23814f39059@lab.ntt.co.jp
	https://postgr.es/m/143ed9a4-6038-76d4-9a55-502035815e68@lab.ntt.co.jp
	https://postgr.es/m/20180413193503.nynq7bnmgh6vs5vm@alvherre.pgsql
2018-04-14 21:12:14 -03:00
Simon Riggs
08ea7a2291 Revert MERGE patch
This reverts commits d204ef63776b8a00ca220adec23979091564e465,
83454e3c2b28141c0db01c7d2027e01040df5249 and a few more commits thereafter
(complete list at the end) related to MERGE feature.

While the feature was fully functional, with sufficient test coverage and
necessary documentation, it was felt that some parts of the executor and
parse-analyzer can use a different design and it wasn't possible to do that in
the available time. So it was decided to revert the patch for PG11 and retry
again in the future.

Thanks again to all reviewers and bug reporters.

List of commits reverted, in reverse chronological order:

 f1464c5380 Improve parse representation for MERGE
 ddb4158579 MERGE syntax diagram correction
 530e69e59b Allow cpluspluscheck to pass by renaming variable
 01b88b4df5 MERGE minor errata
 3af7b2b0d4 MERGE fix variable warning in non-assert builds
 a5d86181ec MERGE INSERT allows only one VALUES clause
 4b2d44031f MERGE post-commit review
 4923550c20 Tab completion for MERGE
 aa3faa3c7a WITH support in MERGE
 83454e3c2b New files for MERGE
 d204ef6377 MERGE SQL Command following SQL:2016

Author: Pavan Deolasee
Reviewed-by: Michael Paquier
2018-04-12 11:22:56 +01:00
Alvaro Herrera
15a8f8caad Fix IndexOnlyScan counter for heap fetches in parallel mode
The HeapFetches counter was using a simple value in IndexOnlyScanState,
which fails to propagate values from parallel workers; so the counts are
wrong when IndexOnlyScan runs in parallel.  Move it to Instrumentation,
like all the other counters.

While at it, change INSERT ON CONFLICT conflicting tuple counter to use
the new ntuples2 instead of nfiltered2, which is a blatant misuse.

Discussion: https://postgr.es/m/20180409215851.idwc75ct2bzi6tea@alvherre.pgsql
2018-04-10 15:56:15 -03:00
Alvaro Herrera
468abb8f7a Fix incorrect logic for choosing the next Parallel Append subplan
In 499be013de support for pruning unneeded Append subnodes was added.
The logic in that commit was not correctly checking if the next subplan
was in fact a valid subplan. This could cause parallel workers processes
to be given a subplan to work on which didn't require any work.

Per code review following an otherwise unexplained regression failure in
buildfarm member Pademelon.  (We haven't been able to reproduce the
failure, so this is a bit of a blind fix in terms of whether it'll
actually fix it; but it is a clear bug nonetheless).

In passing, also add a comment to explain what first_partial_plan means.

Author: David Rowley
Discussion: https://postgr.es/m/CAKJS1f_E5r05hHUVG3UmCQJ49DGKKHtN=SHybD44LdzBn+CJng@mail.gmail.com
2018-04-09 17:23:49 -03:00
Alvaro Herrera
d7a95f06a1 Minor comment updates
Fix a couple of typos, and update a comment about why we set a BMS to
NULL.

Author: David Rowley
Discussion: http://postgr.es/m/CAKJS1f-tux=KdUz6ENJ9GHM_V2qgxysadYiOyQS9Ko9PTteVhQ@mail.gmail.com
2018-04-09 11:17:35 -03:00
Tom Lane
cefa387153 Merge catalog/pg_foo_fn.h headers back into pg_foo.h headers.
Traditionally, include/catalog/pg_foo.h contains extern declarations
for functions in backend/catalog/pg_foo.c, in addition to its function
as the authoritative definition of the pg_foo catalog's rowtype.
In some cases, we'd been forced to split out those extern declarations
into separate pg_foo_fn.h headers so that the catalog definitions
could be #include'd by frontend code.  That problem is gone as of
commit 9c0a0de4c, so let's undo the splits to make things less
confusing.

Discussion: https://postgr.es/m/23690.1523031777@sss.pgh.pa.us
2018-04-08 14:35:29 -04:00
Alvaro Herrera
499be013de Support partition pruning at execution time
Existing partition pruning is only able to work at plan time, for query
quals that appear in the parsed query.  This is good but limiting, as
there can be parameters that appear later that can be usefully used to
further prune partitions.

This commit adds support for pruning subnodes of Append which cannot
possibly contain any matching tuples, during execution, by evaluating
Params to determine the minimum set of subnodes that can possibly match.
We support more than just simple Params in WHERE clauses. Support
additionally includes:

1. Parameterized Nested Loop Joins: The parameter from the outer side of the
   join can be used to determine the minimum set of inner side partitions to
   scan.

2. Initplans: Once an initplan has been executed we can then determine which
   partitions match the value from the initplan.

Partition pruning is performed in two ways.  When Params external to the plan
are found to match the partition key we attempt to prune away unneeded Append
subplans during the initialization of the executor.  This allows us to bypass
the initialization of non-matching subplans meaning they won't appear in the
EXPLAIN or EXPLAIN ANALYZE output.

For parameters whose value is only known during the actual execution
then the pruning of these subplans must wait.  Subplans which are
eliminated during this stage of pruning are still visible in the EXPLAIN
output.  In order to determine if pruning has actually taken place, the
EXPLAIN ANALYZE must be viewed.  If a certain Append subplan was never
executed due to the elimination of the partition then the execution
timing area will state "(never executed)".  Whereas, if, for example in
the case of parameterized nested loops, the number of loops stated in
the EXPLAIN ANALYZE output for certain subplans may appear lower than
others due to the subplan having been scanned fewer times.  This is due
to the list of matching subnodes having to be evaluated whenever a
parameter which was found to match the partition key changes.

This commit required some additional infrastructure that permits the
building of a data structure which is able to perform the translation of
the matching partition IDs, as returned by get_matching_partitions, into
the list index of a subpaths list, as exist in node types such as
Append, MergeAppend and ModifyTable.  This allows us to translate a list
of clauses into a Bitmapset of all the subpath indexes which must be
included to satisfy the clause list.

Author: David Rowley, based on an earlier effort by Beena Emerson
Reviewers: Amit Langote, Robert Haas, Amul Sul, Rajkumar Raghuwanshi,
Jesper Pedersen
Discussion: https://postgr.es/m/CAOG9ApE16ac-_VVZVvv0gePSgkg_BwYEV1NBqZFqDR2bBE0X0A@mail.gmail.com
2018-04-07 17:54:39 -03:00
Andres Freund
f16241bef7 Raise error when affecting tuple moved into different partition.
When an update moves a row between partitions (supported since
2f178441044b), our normal logic for following update chains in READ
COMMITTED mode doesn't work anymore. Cross partition updates are
modeled as an delete from the old and insert into the new
partition. No ctid chain exists across partitions, and there's no
convenient space to introduce that link.

Not throwing an error in a partitioned context when one would have
been thrown without partitioning is obviously problematic. This commit
introduces infrastructure to detect when a tuple has been moved, not
just plainly deleted. That allows to throw an error when encountering
a deletion that's actually a move, while attempting to following a
ctid chain.

The row deleted as part of a cross partition update is marked by
pointing it's t_ctid to an invalid block, instead of self as a normal
update would.  That was deemed to be the least invasive and most
future proof way to represent the knowledge, given how few infomask
bits are there to be recycled (there's also some locking issues with
using infomask bits).

External code following ctid chains should be updated to check for
moved tuples. The most likely consequence of not doing so is a missed
error.

Author: Amul Sul, editorialized by me
Reviewed-By: Amit Kapila, Pavan Deolasee, Andres Freund, Robert Haas
Discussion: http://postgr.es/m/CAAJ_b95PkwojoYfz0bzXU8OokcTVGzN6vYGCNVUukeUDrnF3dw@mail.gmail.com
2018-04-07 13:24:27 -07:00
Teodor Sigaev
8224de4f42 Indexes with INCLUDE columns and their support in B-tree
This patch introduces INCLUDE clause to index definition.  This clause
specifies a list of columns which will be included as a non-key part in
the index.  The INCLUDE columns exist solely to allow more queries to
benefit from index-only scans.  Also, such columns don't need to have
appropriate operator classes.  Expressions are not supported as INCLUDE
columns since they cannot be used in index-only scans.

Index access methods supporting INCLUDE are indicated by amcaninclude flag
in IndexAmRoutine.  For now, only B-tree indexes support INCLUDE clause.

In B-tree indexes INCLUDE columns are truncated from pivot index tuples
(tuples located in non-leaf pages and high keys).  Therefore, B-tree indexes
now might have variable number of attributes.  This patch also provides
generic facility to support that: pivot tuples contain number of their
attributes in t_tid.ip_posid.  Free 13th bit of t_info is used for indicating
that.  This facility will simplify further support of index suffix truncation.
The changes of above are backward-compatible, pg_upgrade doesn't need special
handling of B-tree indexes for that.

Bump catalog version

Author: Anastasia Lubennikova with contribition by Alexander Korotkov and me
Reviewed by: Peter Geoghegan, Tomas Vondra, Antonin Houska, Jeff Janes,
			 David Rowley, Alexander Korotkov
Discussion: https://www.postgresql.org/message-id/flat/56168952.4010101@postgrespro.ru
2018-04-07 23:00:39 +03:00
Robert Haas
3d956d9562 Allow insert and update tuple routing and COPY for foreign tables.
Also enable this for postgres_fdw.

Etsuro Fujita, based on an earlier patch by Amit Langote. The larger
patch series of which this is a part has been reviewed by Amit
Langote, David Fetter, Maksim Milyutin, Álvaro Herrera, Stephen Frost,
and me.  Minor documentation changes to the final version by me.

Discussion: http://postgr.es/m/29906a26-da12-8c86-4fb9-d8f88442f2b9@lab.ntt.co.jp
2018-04-06 19:22:03 -04:00
Simon Riggs
01b88b4df5 MERGE minor errata 2018-04-05 13:19:13 +01:00
Simon Riggs
3af7b2b0d4 MERGE fix variable warning in non-assert builds
Author: Jesper Pedersen
2018-04-05 13:02:29 +01:00
Simon Riggs
4b2d44031f MERGE post-commit review
Review comments from Andres Freund

* Consolidate code into AfterTriggerGetTransitionTable()
* Rename nodeMerge.c to execMerge.c
* Rename nodeMerge.h to execMerge.h
* Move MERGE handling in ExecInitModifyTable()
  into a execMerge.c ExecInitMerge()
* Move mt_merge_subcommands flags into execMerge.h
* Rename opt_and_condition to opt_merge_when_and_condition
* Wordsmith various comments

Author: Pavan Deolasee
Reviewer: Simon Riggs
2018-04-05 09:54:07 +01:00
Bruce Momjian
242408dbef C comment: mention null handling in BuildTupleFromCStrings()
Discussion: https://postgr.es/m/CAFjFpRcF-wNbe0w-m3NpkEwr9shmOZ=GoESOzd2Wog9h55J8sA@mail.gmail.com

Author: Ashutosh Bapat
2018-04-03 14:01:14 -04:00
Simon Riggs
83454e3c2b New files for MERGE 2018-04-03 10:22:21 +01:00
Simon Riggs
d204ef6377 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewer: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-03 09:28:16 +01:00
Simon Riggs
aa5877bb26 Revert "MERGE SQL Command following SQL:2016"
This reverts commit e6597dc3533946b98acba7871bd4ca1f7a3d4c1d.
2018-04-02 21:36:38 +01:00
Simon Riggs
7cf8a5c302 Revert "Modified files for MERGE"
This reverts commit 354f13855e6381d288dfaa52bcd4f2cb0fd4a5eb.
2018-04-02 21:34:15 +01:00
Simon Riggs
354f13855e Modified files for MERGE 2018-04-02 21:12:47 +01:00
Simon Riggs
e6597dc353 MERGE SQL Command following SQL:2016
MERGE performs actions that modify rows in the target table
using a source table or query. MERGE provides a single SQL
statement that can conditionally INSERT/UPDATE/DELETE rows
a task that would other require multiple PL statements.
e.g.

MERGE INTO target AS t
USING source AS s
ON t.tid = s.sid
WHEN MATCHED AND t.balance > s.delta THEN
  UPDATE SET balance = t.balance - s.delta
WHEN MATCHED THEN
  DELETE
WHEN NOT MATCHED AND s.delta > 0 THEN
  INSERT VALUES (s.sid, s.delta)
WHEN NOT MATCHED THEN
  DO NOTHING;

MERGE works with regular and partitioned tables, including
column and row security enforcement, as well as support for
row, statement and transition triggers.

MERGE is optimized for OLTP and is parameterizable, though
also useful for large scale ETL/ELT. MERGE is not intended
to be used in preference to existing single SQL commands
for INSERT, UPDATE or DELETE since there is some overhead.
MERGE can be used statically from PL/pgSQL.

MERGE does not yet support inheritance, write rules,
RETURNING clauses, updatable views or foreign tables.
MERGE follows SQL Standard per the most recent SQL:2016.

Includes full tests and documentation, including full
isolation tests to demonstrate the concurrent behavior.

This version written from scratch in 2017 by Simon Riggs,
using docs and tests originally written in 2009. Later work
from Pavan Deolasee has been both complex and deep, leaving
the lead author credit now in his hands.
Extensive discussion of concurrency from Peter Geoghegan,
with thanks for the time and effort contributed.

Various issues reported via sqlsmith by Andreas Seltenreich

Authors: Pavan Deolasee, Simon Riggs
Reviewers: Peter Geoghegan, Amit Langote, Tomas Vondra, Simon Riggs

Discussion:
https://postgr.es/m/CANP8+jKitBSrB7oTgT9CY2i1ObfOt36z0XMraQc+Xrz8QB0nXA@mail.gmail.com
https://postgr.es/m/CAH2-WzkJdBuxj9PO=2QaO9-3h3xGbQPZ34kJH=HukRekwM-GZg@mail.gmail.com
2018-04-02 21:04:35 +01:00