A comment was left behind referencing the sample rate adjustment removed
in 8ad51b5f44, so clean that up. While at it, also remove the sample
rate clamping, which should not be necessary without the adjustment, and
just check the rate with an assert.
Reported-by: Tom Lane
Discussion: https://postgr.es/m/951485.1672461744%40sss.pgh.pa.us
When collecting an ANALYZE sample on foreign tables, postgres_fdw fetched
all rows and performed the sampling locally. For large tables this means
transferring and immediately discarding large amounts of data.
This commit allows the sampling to be performed on the remote server,
transferring only the much smaller sample. The sampling is performed
using the built-in TABLESAMPLE methods (system, bernoulli) or random()
function, depending on the remote server version.
Remote sampling can be enabled by the analyze_sampling option on the
foreign server
and/or foreign table, with supported values 'off', 'auto', 'system',
'bernoulli' and 'random'. The default value is 'auto' which uses either
'bernoulli' (TABLESAMPLE method) or 'random' (for remote servers without
TABLESAMPLE support).
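As a rough illustration (hypothetical variable names, not the actual
deparse code), the remote query might be built along these lines, with
sample_frac the desired sampling fraction in [0, 1]:

    StringInfoData buf;

    initStringInfo(&buf);
    if (remote_has_tablesample)     /* remote server supports TABLESAMPLE */
        appendStringInfo(&buf,
                         "SELECT * FROM %s TABLESAMPLE BERNOULLI (%f)",
                         relname, sample_frac * 100.0);
    else
        appendStringInfo(&buf,
                         "SELECT * FROM %s WHERE random() < %f",
                         relname, sample_frac);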
Commit 927f453a9 disallowed the batching added by commit b663a4136 from
being used for the inserts performed as part of cross-partition updates of
partitioned tables, mainly because the previous code in
nodeModifyTable.c couldn't handle pending inserts into foreign-table
partitions that are also UPDATE target partitions. But we don't have
such a limitation anymore (cf. commit ffbb7e65a), so let's allow for
this by removing from execPartition.c the restriction added by commit
927f453a9 that batching is only allowed if the query command type is
CMD_INSERT.
In postgres_fdw, since commit 86dc90056 effectively disabled
cross-partition updates in the case where a foreign-table partition
chosen to insert rows into is also an UPDATE target partition, allow
batching in the case where the foreign-table partition chosen to insert
rows into is *not* also an UPDATE target partition. This is enabled by
the "batch_size" option added by commit b663a4136, which is disabled by
default.
This patch also adjusts the test case added by commit 927f453a9 to
confirm that the inserts performed as part of a cross-partition update
of a partitioned table indeed use batching.
Amit Langote, reviewed and/or tested by Georgios Kokolatos, Zhihong Yu,
Bharath Rupireddy, Hou Zhijie, Vignesh C, and me.
Discussion: http://postgr.es/m/CA%2BHiwqH1Lz1yJmPs%3DaD-pzd_HLLynLHvq5iYeT9mB0bBV7oJ6w%40mail.gmail.com
Currently, information about the permissions to be checked on relations
mentioned in a query is stored in their range table entries. So the
executor must scan the entire range table looking for relations that
need to have permissions checked. This can make the permission checking
part of the executor initialization needlessly expensive when many
inheritance children are present in the range table. While the
permissions need not be checked on the individual child relations, the
executor still must visit every range table entry to filter them out.
This commit moves the permission checking information out of the range
table entries into a new node called RTEPermissionInfo. Every
top-level (inheritance "root") RTE_RELATION entry in the range table
gets one and a list of those is maintained alongside the range table.
This new list is initialized by the parser when initializing the range
table. The rewriter can add more entries to it as rules/views are
expanded. Finally, the planner combines the lists of the individual
subqueries into one flat list that is passed to the executor for
checking.
To make it quick to find the RTEPermissionInfo entry belonging to a
given relation, RangeTblEntry gets a new Index field 'perminfoindex'
that stores the index of the corresponding RTEPermissionInfo in the
query's list of those nodes.
ExecutorCheckPerms_hook has gained another List * argument; the
signature is now:
    typedef bool (*ExecutorCheckPerms_hook_type) (List *rangeTable,
                                                  List *rtePermInfos,
                                                  bool ereport_on_violation);
The first argument is no longer used by any in-core uses of the hook,
but we leave it in place because there may be other implementations that
do. Implementations should likely scan the rtePermInfos list to
determine which operations to allow or deny.
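For instance, a hook implementation might now look like this (a minimal
sketch, not code from this commit; it ignores the legacy rangeTable
argument):

    static bool
    my_check_perms(List *rangeTable, List *rtePermInfos,
                   bool ereport_on_violation)
    {
        ListCell   *lc;

        foreach(lc, rtePermInfos)
        {
            RTEPermissionInfo *perminfo = lfirst_node(RTEPermissionInfo, lc);

            /* e.g. deny any statement that requires INSERT privilege */
            if (perminfo->requiredPerms & ACL_INSERT)
            {
                if (ereport_on_violation)
                    aclcheck_error(ACLCHECK_NO_PRIV, OBJECT_TABLE,
                                   get_rel_name(perminfo->relid));
                return false;
            }
        }
        return true;
    }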
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGjJDmUhDSfv-U2qhKJjt9ST7Xh9JXC_irsAQ1TAUsJYg@mail.gmail.com
A future commit will move the checkAsUser field from RangeTblEntry
to a new node that, unlike RTEs, will only be created for tables
mentioned in the query but not for the inheritance child relations
added to the query by the planner. So, the checkAsUser value for a
given child relation will have to be obtained by referring to that of
its ancestor mentioned in the query.
In preparation, it seems better to expand the use of RelOptInfo.userid
during planning in place of rte->checkAsUser so that there will be
fewer places to adjust for the above change.
Given that the child-to-ancestor mapping is not available during the
execution of a given "child" ForeignScan node, add a checkAsUser
field to ForeignScan to carry the child relation's RelOptInfo.userid.
Author: Amit Langote <amitlangote09@gmail.com>
Discussion: https://postgr.es/m/CA+HiwqGFCs2uq7VRKi7g+FFKbP6Ea_2_HkgZb2HPhUfaAKT3ng@mail.gmail.com
Preprocessing of the HAVING clause will reduce havingQual to NIL
if the clause is constant-TRUE. This is one case where that
convention is rather unfortunate, because "HAVING TRUE" is not at all
the same as not having any HAVING clause at all. (Per the SQL spec,
it still forces the query to be grouped.) The planner deals with this
by having a boolean hasHavingQual that records whether havingQual was
originally nonempty; places that just want to check whether HAVING
was specified are supposed to consult that.
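To illustrate the convention (a sketch, not code from this commit):

    /*
     * Does this query require grouping?  Consult hasHavingQual, not
     * havingQual: the latter is NIL if "HAVING TRUE" was simplified away.
     */
    static bool
    query_is_grouped(PlannerInfo *root)
    {
        Query      *parse = root->parse;

        return (parse->groupClause != NIL ||
                parse->groupingSets != NIL ||
                parse->hasAggs ||
                root->hasHavingQual);
    }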
I found three places that got that wrong. Fortunately, these could
only affect cost estimates, not correctness. It'd be hard even
to demonstrate the errors; for example, the one in allpaths.c would
only matter in a query that has HAVING TRUE but no GROUP BY and no
aggregates, which would require a completely variable-free SELECT
list, making the case probably of only academic interest. Hence,
while these are worth fixing before someone copies the incorrect
coding somewhere more critical, they don't seem worth back-patching.
I didn't bother trying to devise regression tests, either.
Discussion: https://postgr.es/m/2503888.1666042643@sss.pgh.pa.us
Commit 3d956d956 allowed COPY into foreign tables, but it was done by
inserting individual rows into the foreign table, so it could be
inefficient due to the overhead caused by each round trip to the
foreign server. To improve performance of COPY in such cases, this
patch allows batch insertion, by extending the multi-insert machinery
in CopyFrom() to the foreign-table case so that we insert multiple rows
into the foreign table at once using the FDW callback routine added by
commit b663a4136. This patch also allows this for postgres_fdw. It is
enabled by the "batch_size" option added by commit b663a4136, which is
disabled by default.
When doing batch insertion, we update the progress of the COPY command
after performing the FDW callback routine, so as to count rows not
suppressed by the FDW as well as by a BEFORE ROW INSERT trigger. For
consistency, this patch also changes the timing of updating it for
plain tables: previously, we updated it immediately after adding each
row to the multi-insert buffer, but now we do so only after writing the
rows stored in the buffer out to the table using table_multi_insert().
I think this is consistent even with non-batching mode, because in that
mode we update it after writing each row out to the table using
table_tuple_insert().
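Schematically, the new timing for plain tables looks like this
(hypothetical variable names):

    /* flush the multi-insert buffer, then count the flushed rows */
    table_multi_insert(resultRelInfo->ri_RelationDesc, slots, nused,
                       mycid, ti_options, bistate);
    processed += nused;
    pgstat_progress_update_param(PROGRESS_COPY_TUPLES_PROCESSED, processed);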
Andrey Lepikhov, heavily revised by me, with review from Ian Barwick,
Andrey Lepikhov, and Zhihong Yu.
Discussion: https://postgr.es/m/bc489202-9855-7550-d64c-ad2d83c24867%40postgrespro.ru
Make sure that function declarations use names that exactly match the
corresponding names from function definitions in contrib code.
Like other recent commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.
Author: Peter Geoghegan <pg@bowt.ie>
Discussion: https://postgr.es/m/CAH2-WznJt9CMM9KJTMjJh_zbL5hD9oX44qdJ4aqZtjFi-zA3Tg@mail.gmail.com
The tlist of the EvalPlanQual outer plan for a ForeignScan node is
adjusted to produce a tuple whose descriptor matches the scan tuple slot
for the ForeignScan node. But in the case where the outer plan contains
an extra Sort node, if the new tlist contained columns required only for
evaluating PlaceHolderVars or columns required only for evaluating local
conditions, this would cause setrefs.c to fail with an error.
The cause of this is that when creating the outer plan by injecting the
Sort node into an alternative local join plan that could emit such extra
columns as well, we fail to arrange for the outer plan to propagate them
up through the Sort node, causing setrefs.c to fail to match them up in
the new tlist to what is available from the outer plan. Repair.
Per report from Alexander Pyhalov.
Richard Guo and Etsuro Fujita, reviewed by Alexander Pyhalov and Tom Lane.
Backpatch to all supported versions.
Discussion: http://postgr.es/m/cfb17bf6dfdf876467bd5ef533852d18%40postgrespro.ru
When inserting into a view referencing a foreign table that has WITH
CHECK OPTION constraints, postgres_fdw in single-insert mode retrieves
the data that was actually inserted on the remote side, so that the
WITH CHECK OPTION constraints are enforced locally with that data. In
batch-insert mode, however, it cannot currently retrieve that data
(except for the row first inserted through the view), so the
constraints are enforced with the data passed from the core instead
(except for the first-inserted row). This led to incorrect results
when a remote BEFORE ROW INSERT trigger changed the rows inserted
through the view so that they violated the view's WITH CHECK OPTION
constraint. Also, the query inserting into the view caused an
assertion failure in assert-enabled builds.
Fix these by disabling batch insertion when inserting into such a view.
Back-patch to v14 where batch insertion was added.
Discussion: https://postgr.es/m/CAPmGK17LpbTZs4m4a_6THP54UBeK9fHvX8aVVA%2BC6yEZDZwQcg%40mail.gmail.com
When postgres_fdw begins an asynchronous data fetch, it submits a FETCH
query using PQsendQuery(). If PQsendQuery() fails and returns 0,
postgres_fdw should report an error. But, previously, postgres_fdw
reported an error only when the return value was less than 0, even
though PQsendQuery() never returns values other than 0 and 1. Therefore
postgres_fdw could not handle the failure to send a FETCH query in an
asynchronous data fetch.
This commit fixes postgres_fdw so that it reports an error
when PQsendQuery() returns 0.
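That is (a sketch; pgfdw_report_error() is postgres_fdw's usual
error-reporting routine):

    if (PQsendQuery(conn, sql) == 0)
        pgfdw_report_error(ERROR, NULL, conn, false, sql);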
Back-patch to v14 where asynchronous execution was supported in postgres_fdw.
Author: Fujii Masao
Reviewed-by: Japin Li, Tom Lane
Discussion: https://postgr.es/m/b187a7cf-d4e3-5a32-4d01-8383677797f3@oss.nttdata.com
The motivation for this is to ensure successful transmission of the
values of constants of regconfig and other reg* types. The remote
will be reading them with search_path = 'pg_catalog', so schema
qualification is necessary when referencing objects in other schemas.
Per bug #17483 from Emmanuel Quincerot. Back-patch to all supported
versions. (There's some other stuff to do here, but it's less
back-patchable.)
Discussion: https://postgr.es/m/1423433.1652722406@sss.pgh.pa.us
This replaces all MemSet() calls with struct initialization where that
is easily and obviously possible. (For example, some cases have to
worry about padding bits, so I left those.)
(The same could be done with appropriate memset() calls, but this
patch is part of an effort to phase out MemSet(), so it doesn't touch
memset() calls.)
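The typical transformation looks like this (illustrative example):

    /* before */
    HASHCTL     hash_ctl;

    MemSet(&hash_ctl, 0, sizeof(hash_ctl));

    /* after */
    HASHCTL     hash_ctl = {0};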
Reviewed-by: Ranier Vilela <ranier.vf@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Discussion: https://www.postgresql.org/message-id/9847b13c-b785-f4e2-75c3-12ec77a3b05c@enterprisedb.com
Previously, we allowed batch insertion even when the foreign table had
BEFORE ROW INSERT triggers, but such triggers might query the table
being inserted into and act differently if the tuples that have already
been processed and prepared for insertion are not there yet, so disable
batching in such cases.
Back-patch to v14 where batch insert was added.
Discussion: https://postgr.es/m/CAPmGK16_uPqsmgK0-LpLSUk54_BoK13bPrhxhfjSoSTVz414hA%40mail.gmail.com
postgres_fdw would push ORDER BY clauses to the remote side without
verifying that the sort operator is safe to ship. Moreover, it failed
to print a suitable USING clause if the sort operator isn't default
for the sort expression's type. The net result of this is that the
remote sort might not have anywhere near the semantics we expect,
which'd be disastrous for locally-performed merge joins in particular.
We addressed similar issues in the context of ORDER BY within an
aggregate function call in commit 7012b132d, but failed to notice
that query-level ORDER BY was broken. Thus, much of the necessary
logic already existed, but it requires refactoring to be usable
in both cases.
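The USING part of the deparse logic now goes roughly like this (a
sketch with hypothetical variable names; buf is the output StringInfo,
sortop the sort operator, and exprtype the sort expression's type):

    TypeCacheEntry *typentry;

    typentry = lookup_type_cache(exprtype,
                                 TYPECACHE_LT_OPR | TYPECACHE_GT_OPR);
    if (sortop == typentry->lt_opr)
        appendStringInfoString(buf, " ASC");
    else if (sortop == typentry->gt_opr)
        appendStringInfoString(buf, " DESC");
    else
    {
        /* the real code schema-qualifies the operator name */
        appendStringInfo(buf, " USING OPERATOR(%s)", get_opname(sortop));
    }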
Back-patch to all supported branches. In HEAD only, remove the
core code's copy of find_em_expr_for_rel, which is no longer used
and really should never have been pushed into equivclass.c in the
first place.
Ronan Dunklau, per report from David Rowley;
reviews by David Rowley, Ranier Vilela, and myself
Discussion: https://postgr.es/m/CAApHDvr4OeC2DBVY--zVP83-K=bYrTD7F8SZDhN4g+pj2f2S-A@mail.gmail.com
Commit 27e1f1456 failed to process a pending asynchronous request made
for a given ForeignScan node in postgresReScanForeignScan() (if any) in
cases where we would only reset the next_tuple counter in that function,
contradicting the assumption that there should be no pending
asynchronous requests that have been made for async-capable subplans for
the parent Append node after ReScan. This led to an assert failure in
an assert-enabled build. I think this would also lead to mis-rewinding
the cursor in that function in the case where we have already fetched
one batch for the ForeignScan node and the asynchronous request has been
made for the second batch, because even in that case we would just reset
the counter when called from that function, so we would fail to execute
MOVE BACKWARD ALL.
To fix, modify that function to process the asynchronous request before
restarting the scan.
While at it, add a comment to a function to match other places.
Per bug #17344 from Alexander Lakhin. Back-patch to v14 where the
aforesaid commit came in.
Patch by me. Test case by Alexander Lakhin, adjusted by me. Reviewed
and tested by Alexander Lakhin and Dmitry Dolgov.
Discussion: https://postgr.es/m/17344-226b78b00de73a7e@postgresql.org
Before, SQL-level boolean constants were represented by a string with
a cast, and internal Boolean values in DDL commands were usually
represented by Integer nodes. The new Boolean node type takes the place
of both of these uses, making the intent clearer and adding some amount
of type safety.
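Usage is analogous to the other value nodes (illustrative; node and
istrue are hypothetical):

    Boolean    *b = makeBoolean(true);

    if (IsA(node, Boolean))
        istrue = boolVal(node);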
Reviewed-by: Pavel Stehule <pavel.stehule@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/8c1a2e37-c68d-703c-5a83-7a6077f4f997@enterprisedb.com
fsstate in process_pending_requests (in postgres_fdw.c) was added in
8998e3cafa2 as an assertion-only variable; 1ec7fca8592 started using
the variable outside of assertions.
rd_index in get_index_column_opclass (in lsyscache.c) was introduced
in 2a6368343ff, and then promptly used in the fix commit 7e041603904.
This removes the PG_USED_FOR_ASSERTS_ONLY variable decoration from
the above mentioned variables.
Reviewed-by: Greg Nancarrow <gregn4422@gmail.com>
Discussion: https://postgr.es/m/F959106C-0F21-43A5-B2AE-D007D51ACBEE@yesql.se
Standardize on xoroshiro128** as our basic PRNG algorithm, eliminating
a bunch of platform dependencies as well as fundamentally-obsolete PRNG
code. In addition, this API replacement will ease replacing the
algorithm again in future, should that become necessary.
xoroshiro128** is a few percent slower than the drand48 family,
but it can produce full-width 64-bit random values, not just 48-bit ones,
and it should be much more trustworthy. It's likely to be noticeably
faster than the platform's random(), depending on which platform you
are thinking about; and we can have non-global state vectors easily,
unlike with random(). It is not cryptographically strong, but neither
are the functions it replaces.
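The new API lives in common/pg_prng.h and is used like this:

    pg_prng_state state;

    pg_prng_seed(&state, 42);              /* any 64-bit seed */
    uint64 r = pg_prng_uint64(&state);     /* full-width random value */
    double d = pg_prng_double(&state);     /* uniform in [0, 1) */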
Fabien Coelho, reviewed by Dean Rasheed, Aleksander Alekseev, and myself
Discussion: https://postgr.es/m/alpine.DEB.2.22.394.2105241211230.165418@pseudo
Commit c7b7311f6 adjusted conversion_error_callback to always use
information from the query's rangetable, to avoid doing catalog lookups
in an already-failed transaction. However, as a result of the utterly
inadequate documentation for make_tuple_from_result_row, I failed to
realize that fsstate could be NULL in some contexts. That led to a
crash if we got a conversion error in such a context. Fix by falling
back to the previous coding when fsstate is NULL. Improve the
commentary, too.
Per report from Andrey Borodin. Back-patch to 9.6, like the previous
patch.
Discussion: https://postgr.es/m/08916396-55E4-4D68-AB3A-BD6066F9E5C0@yandex-team.ru
The Value node struct is a weird construct. It is its own node type,
but most of the time, it actually has a node type of Integer, Float,
String, or BitString. As a consequence, the struct name and the node
type don't match most of the time, and so it has to be treated
specially a lot. There doesn't seem to be any value in the special
construct. There is very little code that wants to accept all Value
variants but nothing else (and even if it did, this doesn't provide
any convenient way to check it), and most code wants either just one
particular node type (usually String), or it accepts a broader set of
node types besides just Value.
This change removes the Value struct and node type and replaces them
by separate Integer, Float, String, and BitString node types that are
proper node types and structs of their own and behave mostly like
normal node types.
Also, this removes the T_Null node tag, which was previously also a
possible variant of Value but wasn't actually used outside of the
Value contained in A_Const. Replace that by an isnull field in
A_Const.
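After the change, each variant is constructed and inspected as its own
node type (illustrative; name is hypothetical):

    String     *s = makeString(pstrdup("some_name"));
    Integer    *i = makeInteger(42);
    Float      *f = makeFloat(pstrdup("3.14"));

    Assert(IsA(s, String));
    name = strVal(s);       /* the accessor macros keep working */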
Reviewed-by: Dagfinn Ilmari Mannsåker <ilmari@ilmari.org>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://www.postgresql.org/message-id/flat/5ba6bc5b-3f95-04f2-2419-f8ddb4c046fb@enterprisedb.com
Avoid repeating fragments of the query we're building, along the
same lines as recent cleanup in pg_dump. I got annoyed about this
because aa769f80e broke my pending patch to change postgres_fdw's
collation handling, due to each of us having incompletely done
this same refactoring. Let's finish that job in hopes of having
a more stable base.
postgres_fdw imported generated columns from the remote tables as plain
columns, and caused failures like "ERROR: cannot insert a non-DEFAULT
value into column "foo"" when inserting into the foreign tables, as it
tried to insert values into the generated columns. To fix, we do the
following under the assumption that generated columns in a postgres_fdw
foreign table are defined so that they represent generated columns in
the underlying remote table:
* Send DEFAULT for the generated columns to the foreign server on insert
or update, not generated column values computed on the local server.
* Add to postgresImportForeignSchema() an option "import_generated" to
include column generated expressions in the definitions of foreign
tables imported from a foreign server. The option is true by default.
The assumption seems reasonable, because that would make a query of the
postgres_fdw foreign table return values for the generated columns that
are consistent with the generated expression.
While here, fix another issue in postgresImportForeignSchema(): it tried
to include column generated expressions as column default expressions in
the foreign table definitions when the import_default option was enabled.
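Schematically, the first change means the deparser now does something
like this per target column (a sketch; attr is the column's
Form_pg_attribute and deparse_expr is a stand-in for the actual
expression deparser):

    if (attr->attgenerated)
        appendStringInfoString(buf, "DEFAULT");
    else
        deparse_expr(buf, value_expr);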
Per bug #16631 from Daniel Cherniy. Back-patch to v12 where generated
columns were added.
Discussion: https://postgr.es/m/16631-e929fe9db0ffc7cf%40postgresql.org
I failed to account for the possibility that when
ExecAppendAsyncEventWait() notifies multiple async-capable nodes using
postgres_fdw, a preceding node might invoke process_pending_request() to
process a pending asynchronous request made by a succeeding node. In
that case the succeeding node should produce a tuple to return to the
parent Append node from tuples fetched by process_pending_request() when
notified. Repair.
Per buildfarm via Michael Paquier. Back-patch to v14, like the previous
commit.
Thanks to Tom Lane for testing.
Discussion: https://postgr.es/m/YQP0UPT8KmPiHTMs%40paquier.xyz
A pending asynchronous request is handled by process_pending_request(),
which previously not only processed an in-progress remote query but
performed ExecForeignScan() to produce a tuple to return to the local
server asynchronously from the result of the remote query. But that led
to a server crash when executing a query, or to an "InstrStartNode
called twice in a row" or "InstrEndLoop called on running node" failure
when doing EXPLAIN ANALYZE of it, in cases where the plan tree contained
multiple async-capable nodes accessing the same initplan/subplan, which
in turn contained multiple async-capable nodes scanning the same foreign
tables as the parent async-capable nodes, as reported by Andrey
Lepikhov. The reason is that the second step in
process_pending_request(), invoked when executing the initplan/subplan
for one of the parent async-capable nodes, caused recursive execution of
the initplan/subplan for another of the parent async-capable nodes.
To fix, split process_pending_request() into the two steps and postpone
the second step until ForeignAsyncConfigureWait() is called for each of
the pending asynchronous requests. Also, in ExecAppendAsyncEventWait()
we assumed that FDWs would register at least one wait event in a
WaitEventSet created there when they were called from
ForeignAsyncConfigureWait() in that function, but allow FDWs to register
zero wait events in the WaitEventSet; modify ExecAppendAsyncEventWait()
to just return in that case.
Oversight in commit 27e1f1456. Back-patch to v14 where that commit went
in.
Andrey Lepikhov and Etsuro Fujita
Discussion: https://postgr.es/m/fe5eaa19-1704-e4a4-76ee-3b9d37ade399@postgrespro.ru
This patch causes EXPLAIN output to refer to objects that are in
the current session's temp schema with the "pg_temp" schema alias
rather than that schema's actual name. This is useful for our own
testing purposes since it will stabilize EXPLAIN VERBOSE output
for such cases, allowing us to use that in regression tests.
It should be less confusing for end users too.
Since ruleutils.c needs to change behavior for this, the change
also leaks into a few other users of ruleutils.c, for example
pg_get_viewdef(). AFAICS that won't cause any problems.
We did find that aggressively trying to change this behavior
across-the-board would cause issues, but as long as "pg_temp"
only appears within generated SQL text, I think it'll be fine.
Along the way, make get_namespace_name_or_temp conform to the
same API as get_namespace_name, ie that it returns a palloc'd
string or NULL. The current behavior hasn't caused any bugs
since no callers attempt to pfree the result, but if it gets
more widespread usage that could become a problem.
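That is, callers can now treat it exactly like get_namespace_name
(illustrative):

    char       *nspname = get_namespace_name_or_temp(namespaceId);

    if (nspname == NULL)
        elog(ERROR, "cache lookup failed for namespace %u", namespaceId);
    /* ... use nspname ... */
    pfree(nspname);         /* safe: the result is palloc'd */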
Amul Sul, reviewed and extended by me
Discussion: https://postgr.es/m/CAAJ_b97W=QaGmag9AhWNbmx3uEYsNkXWL+OVW1_E1D3BtgWvtw@mail.gmail.com
In 86dc90056, I'd written find_modifytable_subplan with the assumption
that if the immediate child of a ModifyTable is a Result, it must be
a projecting Result with a subplan. However, if the UPDATE or DELETE
has a provably-constant-false WHERE clause, that's not so: we'll
generate a dummy subplan with a childless Result. Add the missing
null-check so we don't crash on such cases.
Per report from Alexander Pyhalov.
Discussion: https://postgr.es/m/b9a6f53549456b2f3e2fd150dcd79d72@postgrespro.ru
Previously, values such as '100$%$#$#' or '9,223,372,' were accepted and
treated as valid integers for the postgres_fdw options batch_size and
fetch_size, whereas the fdw_startup_cost and fdw_tuple_cost options
rejected such values with an error. This was because endptr was not
checked when converting the strings to integers using strtol.
This commit changes the logic to use the parse_int function instead of
strtol, as it serves the purpose by returning false when it is unable
to convert the string to an integer. Note that this function also
rounds off values, so '100.456' becomes 100 and '100.567' or '100.678'
become 101.
While at it, use parse_real for the fdw_startup_cost and fdw_tuple_cost
options. Since parse_int and parse_real are already used for reloptions
and GUCs, it is more appropriate to use them in postgres_fdw than to
call strtol and strtod directly.
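The validation now goes roughly like this (a sketch, not the exact
code; def is the option's DefElem):

    int         intval;

    if (!parse_int(defGetString(def), &intval, 0, NULL))
        ereport(ERROR,
                (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
                 errmsg("invalid value for integer option \"%s\": %s",
                        def->defname, defGetString(def))));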
Back-patch to v14.
Author: Bharath Rupireddy
Reviewed-by: Ashutosh Bapat, Tom Lane, Kyotaro Horiguchi, Fujii Masao
Discussion: https://postgr.es/m/CALj2ACVMO6wY5Pc4oe1OCgUOAtdjHuFsBDw8R5uoYR86eWFQDA@mail.gmail.com
As in 50371df26, this is a bad idea since the callback can't really
know what error is being thrown and thus whether or not it is safe
to attempt catalog accesses. Rather than pushing said accesses into
the mainline code where they'd usually be a waste of cycles, we can
look at the query's rangetable instead.
This change does mean that we'll be printing query aliases (if any
were used) rather than the table or column's true name. But that
doesn't seem like a bad thing: it's certainly a more useful definition
in self-join cases, for instance. In any case, it seems unlikely that
any applications would be depending on this detail, so it seems safe
to change.
Patch by me. Original complaint by Andres Freund; Bharath Rupireddy
noted the connection to conversion_error_callback.
Discussion: https://postgr.es/m/20210106020229.ne5xnuu6wlondjpe@alap3.anarazel.de
The FE/BE protocol identifies parameters with an Int16 index, which
limits the maximum number of parameters per query to 65535. With
batching added to postgres_fdw this limit is much easier to hit, as
the whole batch is essentially a single query.
The failures are a bit unpredictable, because hitting the limit also
depends on the number of columns in the query. So instead of just
failing, this patch tweaks the batch_size to not exceed the maximum
number of parameters.
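Schematically (hypothetical variable names):

    /* each row in a batch consumes one protocol parameter per column */
    if (batch_size * num_columns > 65535)
        batch_size = Max(1, 65535 / num_columns);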
Reported-by: Hou Zhijie <houzj.fnst@cn.fujitsu.com>
Reviewed-by: Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB571603973C0AC2874AD6BF2594299%40OS0PR01MB5716.jpnprd01.prod.outlook.com
Commit 86dc90056 expects that FDWs can cope with whole-row Vars for
their tables, even if the Vars are marked with vartype RECORDOID.
Previously, whole-row Vars generated by the planner had vartype equal
to the relevant table's rowtype OID. (The point behind this change is
to enable sharing of resjunk columns across inheritance child tables.)
It turns out that postgres_fdw fails to cope with this, though through
bad fortune none of its test cases exposed that. Things mostly work,
but when we try to read back a value of such a Var, the expected
rowtype is not available to record_in(). Fortunately, it's not
difficult to hack up the tupdesc that controls this process to
substitute the foreign table's rowtype for RECORDOID. Thus we can
solve the runtime problem while still sharing the resjunk column
with other tables.
Per report from Alexander Pyhalov.
Discussion: https://postgr.es/m/7817fb9ebd6661cdf9b67dec6e129a78@postgrespro.ru
Commits 27e1f1456 and 86dc90056, which were independently discussed,
cause a crash when executing an inherited foreign UPDATE/DELETE query
with asynchronous execution enabled, in the case where children of an
Append node that is the direct/indirect child of the ModifyTable node
are rewritten by postgresPlanDirectModify() so as to modify foreign
tables directly; in that case the direct modifications would be
executed asynchronously, which is not currently supported by
asynchronous execution. Fix by disabling asynchronous execution of the
direct modifications in that function.
Author: Etsuro Fujita
Reviewed-by: Amit Langote
Discussion: https://postgr.es/m/CAPmGK158e9sJOfuWxfn%2B0ynrspXQU3JhNjSCbaoeSzMvnga%2Bbw%40mail.gmail.com
Also "make reformat-dat-files".
The only change worthy of note is that pgindent messed up the formatting
of launcher.c's struct LogicalRepWorkerId, which led me to notice that
that struct wasn't used at all anymore, so I just took it out.
EXPLAIN ANALYZE for an async-capable ForeignScan node associated with
postgres_fdw is done just by using instrumentation for ExecProcNode()
called from the node's callbacks, causing the following problems:
1) If the remote table to scan is empty, the node is incorrectly
considered as "never executed" by the command even if the node is
executed, as ExecProcNode() isn't called from the node's callbacks at
all in that case.
2) The command fails to collect timings for things other than
ExecProcNode() done in the node, such as creating a cursor for the
node's remote query.
To fix these problems, add instrumentation for async-capable nodes, and
modify postgres_fdw accordingly.
My oversight in commit 27e1f1456.
While at it, update a comment for the AsyncRequest struct in execnodes.h
and the documentation for the ForeignAsyncRequest API in fdwhandler.sgml
to match the code in ExecAsyncAppendResponse() in nodeAppend.c, and fix
typos in comments in nodeAppend.c.
Per report from Andrey Lepikhov, though I didn't use his patch.
Reviewed-by: Andrey Lepikhov
Discussion: https://postgr.es/m/2eb662bb-105d-fc20-7412-2f027cc3ca72%40postgrespro.ru
When executing the INSERT with batching, we may need to rebuild the
query when the batch size changes, in which case we pfree the current
string. We must not release the original string, stored in fdw_private,
because that may be needed in EXPLAIN ANALYZE. So make a copy of the
SQL, but only for INSERT queries.
Reported-by: Pavel Stehule
Discussion: https://postgr.es/m/CAFj8pRCL_Rjw-MCR6J7VX9OF7MR6PA5K8qUbrMvprW_e-aHkfQ%40mail.gmail.com
Commit 8ff1c94649 allowed the TRUNCATE command to truncate foreign
tables. Previously, the information about "ONLY" options specified in
the TRUNCATE command was passed to the foreign data wrapper, and
postgres_fdw constructed the TRUNCATE command to issue to the remote
server, including "ONLY" options in it based on the passed information.
On the other hand, "ONLY" options specified in SELECT, UPDATE or DELETE
have no effect when accessing or modifying the remote table, i.e., they
are not passed to the foreign data wrapper. So it's inconsistent to
make only the TRUNCATE command pass the "ONLY" options to the foreign
data wrapper. Therefore this commit changes the TRUNCATE command so
that it doesn't pass the "ONLY" options to the foreign data wrapper,
for consistency with the other statements. This commit also changes
postgres_fdw so that it never includes "ONLY" options in the TRUNCATE
commands that it constructs.
Author: Fujii Masao
Reviewed-by: Bharath Rupireddy, Kyotaro Horiguchi, Justin Pryzby, Zhihong Yu
Discussion: https://postgr.es/m/551ed8c1-f531-818b-664a-2cecdab99cd8@oss.nttdata.com
This is cleanup for commit 27e1f1456:
* ExecAppendAsyncEventWait(), which was modified a bit further by commit
a8af856d3, duplicated the same nevents calculation. Simplify the code
a little bit to avoid the duplication. Update comments there.
* Add an assertion to ExecAppendAsyncRequest().
* Update a comment about merging the async_capable options from input
relations in merge_fdw_options(), per complaint from Kyotaro Horiguchi.
* Add a comment for fetch_more_data_begin().
Author: Etsuro Fujita
Discussion: https://postgr.es/m/CAPmGK1637W30Wx3MnrReewhafn6F_0J76mrJGoFXFnpPq4QfvA%40mail.gmail.com
This commit introduces a new foreign data wrapper API for TRUNCATE.
It extends the TRUNCATE command so that it accepts foreign tables as
targets to truncate and invokes that API. It also extends postgres_fdw
so that it can issue TRUNCATE commands to foreign servers, by adding a
new routine for that TRUNCATE API.
The information about options specified in the TRUNCATE command, e.g.,
ONLY, CASCADE, etc., is passed to the FDW via the API, as is the list
of foreign tables to truncate. The FDW truncates the foreign data
sources that the passed foreign tables specify, based on that
information. For example, postgres_fdw constructs a TRUNCATE command
using them and issues it to the foreign server.
For performance, the TRUNCATE command invokes the FDW routine for
TRUNCATE once per foreign server that the foreign tables to truncate
belong to.
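For reference, a bare-bones callback of the eventual shape (after the
"ONLY"-option cleanup described earlier in this log) might look like
the sketch below; it is not postgres_fdw's actual implementation, and
sending the built statement to the remote server is elided:

    static void
    my_ExecForeignTruncate(List *rels, DropBehavior behavior,
                           bool restart_seqs)
    {
        StringInfoData sql;
        ListCell   *lc;
        bool        first = true;

        initStringInfo(&sql);
        appendStringInfoString(&sql, "TRUNCATE ");
        foreach(lc, rels)
        {
            Relation    rel = (Relation) lfirst(lc);

            if (!first)
                appendStringInfoString(&sql, ", ");
            first = false;
            /* the real code also schema-qualifies the name */
            appendStringInfoString(&sql,
                                   quote_identifier(RelationGetRelationName(rel)));
        }
        appendStringInfoString(&sql, restart_seqs ?
                               " RESTART IDENTITY" : " CONTINUE IDENTITY");
        if (behavior == DROP_CASCADE)
            appendStringInfoString(&sql, " CASCADE");

        /* ... issue sql.data on the connection for this server ... */
    }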
Author: Kazutaka Onishi, Kohei KaiGai, slightly modified by Fujii Masao
Reviewed-by: Bharath Rupireddy, Michael Paquier, Zhihong Yu, Alvaro Herrera, Stephen Frost, Ashutosh Bapat, Amit Langote, Daniel Gustafsson, Ibrar Ahmed, Fujii Masao
Discussion: https://postgr.es/m/CAOP8fzb_gkReLput7OvOK+8NHgw-RKqNv59vem7=524krQTcWA@mail.gmail.com
Discussion: https://postgr.es/m/CAJuF6cMWDDqU-vn_knZgma+2GMaout68YUgn1uyDnexRhqqM5Q@mail.gmail.com
Commit f49bcd4ef3 disallowed postgres_fdw from importing table
partitions. Because all data can be accessed through the partitioned
table that is the root of the partitioning hierarchy, importing only
the partitioned table should allow access to all the data without
creating extra objects. This is a reasonable default when importing a
whole schema, but there may be cases where users want to explicitly
import one of a partitioned table's partitions.
For that use case, this commit allows postgres_fdw to import tables or
foreign tables that are partitions of some other table only when they
are explicitly specified in the LIMIT TO clause. It doesn't change the
behavior that any partitions not specified in LIMIT TO are
automatically excluded by the IMPORT FOREIGN SCHEMA command.
Author: Matthias van de Meent
Reviewed-by: Bernd Helmle, Amit Langote, Michael Paquier, Fujii Masao
Discussion: https://postgr.es/m/CAEze2Whwg4i=mzApMe+PXxCEfgoZmHGqdqQFW7J4bmj_5p6t1A@mail.gmail.com
This patch makes two closely related sets of changes:
1. For UPDATE, the subplan of the ModifyTable node now only delivers
the new values of the changed columns (i.e., the expressions computed
in the query's SET clause) plus row identity information such as CTID.
ModifyTable must re-fetch the original tuple to merge in the old
values of any unchanged columns. The core advantage of this is that
the changed columns are uniform across all tables of an inherited or
partitioned target relation, whereas the other columns might not be.
A secondary advantage, when the UPDATE involves joins, is that less
data needs to pass through the plan tree. The disadvantage of course
is an extra fetch of each tuple to be updated. However, that seems to
be very nearly free in context; even worst-case tests don't show it to
add more than a couple percent to the total query cost. At some point
it might be interesting to combine the re-fetch with the tuple access
that ModifyTable must do anyway to mark the old tuple dead; but that
would require a good deal of refactoring and it seems it wouldn't buy
all that much, so this patch doesn't attempt it.
2. For inherited UPDATE/DELETE, instead of generating a separate
subplan for each target relation, we now generate a single subplan
that is just exactly like a SELECT's plan, then stick ModifyTable
on top of that. To let ModifyTable know which target relation a
given incoming row refers to, a tableoid junk column is added to
the row identity information. This gets rid of the horrid hack
that was inheritance_planner(), eliminating O(N^2) planning cost
and memory consumption in cases where there were many unprunable
target relations.
Point 2 of course requires point 1, so that there is a uniform
definition of the non-junk columns to be returned by the subplan.
We can't insist on uniform definition of the row identity junk
columns however, if we want to keep the ability to have both
plain and foreign tables in a partitioning hierarchy. Since
it wouldn't scale very far to have every child table have its
own row identity column, this patch includes provisions to merge
similar row identity columns into one column of the subplan result.
In particular, we can merge the whole-row Vars typically used as
row identity by FDWs into one column by pretending they are type
RECORD. (It's still okay for the actual composite Datums to be
labeled with the table's rowtype OID, though.)
There is more that can be done to file down residual inefficiencies
in this patch, but it seems to be committable now.
FDW authors should note several API changes:
* The argument list for AddForeignUpdateTargets() has changed, and so
has the method it must use for adding junk columns to the query. Call
add_row_identity_var() instead of manipulating the parse tree directly
(see the sketch after this list).
You might want to reconsider exactly what you're adding, too.
* PlanDirectModify() must now work a little harder to find the
ForeignScan plan node; if the foreign table is part of a partitioning
hierarchy then the ForeignScan might not be the direct child of
ModifyTable. See postgres_fdw for sample code.
* To check whether a relation is a target relation, it's no
longer sufficient to compare its relid to root->parse->resultRelation.
Instead, check it against all_result_relids or leaf_result_relids,
as appropriate.
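For the first point, a minimal new-style callback might look like this
(a sketch modeled on postgres_fdw's use of ctid as its row identity;
the my_ prefix is a placeholder):

    static void
    my_AddForeignUpdateTargets(PlannerInfo *root, Index rtindex,
                               RangeTblEntry *target_rte,
                               Relation target_relation)
    {
        Var        *var = makeVar(rtindex,
                                  SelfItemPointerAttributeNumber,
                                  TIDOID, -1, InvalidOid, 0);

        /* register the junk column; don't touch the parse tree directly */
        add_row_identity_var(root, var, rtindex, "ctid");
    }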
Amit Langote and Tom Lane
Discussion: https://postgr.es/m/CA+HiwqHpHdqdDn48yCEhynnniahH78rwcrv1rEX65-fsZGBOLQ@mail.gmail.com
This implements asynchronous execution, which runs multiple parts of a
non-parallel-aware Append concurrently rather than serially to improve
performance when possible. Currently, the only node type that can be
run concurrently is a ForeignScan that is an immediate child of such an
Append. In the case where such ForeignScans access data on different
remote servers, this runs those ForeignScans concurrently, overlapping
the remote operations so that they are performed simultaneously; this
improves performance especially when the operations involve
time-consuming ones such as remote join and remote aggregation.
We may extend this to other node types such as joins or aggregates over
ForeignScans in the future.
This also adds asynchronous-execution support to postgres_fdw, enabled
by the table-level/server-level option "async_capable". The default is
false.
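On the FDW side, support amounts to providing the new FdwRoutine
callbacks (sketch; the my_ names are placeholders):

    routine->IsForeignPathAsyncCapable = my_IsForeignPathAsyncCapable;
    routine->ForeignAsyncRequest = my_ForeignAsyncRequest;
    routine->ForeignAsyncConfigureWait = my_ForeignAsyncConfigureWait;
    routine->ForeignAsyncNotify = my_ForeignAsyncNotify;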
Robert Haas, Kyotaro Horiguchi, Thomas Munro, and myself. This commit
is mostly based on the patch proposed by Robert Haas, but also uses
stuff from the patch proposed by Kyotaro Horiguchi and from the patch
proposed by Thomas Munro. Reviewed by Kyotaro Horiguchi, Konstantin
Knizhnik, Andrey Lepikhov, Movead Li, Thomas Munro, Justin Pryzby, and
others.
Discussion: https://postgr.es/m/CA%2BTgmoaXQEt4tZ03FtQhnzeDEMzBck%2BLrni0UWHVVgOTnA6C1w%40mail.gmail.com
Discussion: https://postgr.es/m/CA%2BhUKGLBRyu0rHrDCMC4%3DRn3252gogyp1SjOgG8SEKKZv%3DFwfQ%40mail.gmail.com
Discussion: https://postgr.es/m/20200228.170650.667613673625155850.horikyota.ntt%40gmail.com
Here we add a new output parameter to estimate_num_groups() to allow it to
inform the caller of additional, possibly useful information about the
estimation.
The new output parameter is a struct that currently contains just a single
field with a set of flags. This was done rather than having the flags as
an output parameter to allow future fields to be added without having to
change the signature of the function at a later date when we want to pass
back further information that might not be suitable to store in the flags
field.
It seems reasonable that one day in the future the planner would want
to know more about the estimation. For example, how many individual sets
of statistics was the estimation generated from? The planner may want to
take that into account if we ever want to consider risks as well as costs
when generating plans.
For now, there's only one flag we set in the flags field. This is to
indicate if the estimation fell back on using the hard-coded constants in
any part of the estimation. Callers may like to change their behavior if
this is set, and this gives them the ability to do so. Callers may pass
the flag pointer as NULL if they have no interest in obtaining any
additional information about the estimate.
We're not adding any actual usages of these flags here. Some follow-up
commits will make use of this feature. Additionally, we're also not
making any changes to add support for clauselist_selectivity() and
clauselist_selectivity_ext(). However, if this is required in the future
then the same struct being added here should be fine to use as a new
output argument for those functions too.
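A caller that wants the extra information does something like this
(sketch; callers not interested can pass NULL instead of &einfo):

    EstimationInfo einfo;
    double      ngroups;

    ngroups = estimate_num_groups(root, groupExprs, input_rows,
                                  NULL, &einfo);
    if (einfo.flags & SELFLAG_USED_DEFAULT)
    {
        /* the estimate fell back on hard-coded defaults; be wary */
    }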
Author: David Rowley
Discussion: https://postgr.es/m/CAApHDvqQqpk=1W-G_ds7A9CsXX3BggWj_7okinzkLVhDubQzjA@mail.gmail.com