mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Allow FDWs and custom scan providers to replace joins with scans.

Foreign data wrappers can use this capability for so-called "join
pushdown"; that is, instead of executing two separate foreign scans
and then joining the results locally, they can generate a path which
performs the join on the remote server and then is scanned locally.
This commit does not extend postgres_fdw to take advantage of this
capability; it just provides the infrastructure.

Custom scan providers can use this in a similar way.  Previously,
it was only possible for a custom scan provider to scan a single
relation.  Now, it can scan an entire join tree, provided of course
that it knows how to produce the same results that the join would
have produced if executed normally.

KaiGai Kohei, reviewed by Shigeru Hanada, Ashutosh Bapat, and me.
Robert Haas
2015-05-01 08:50:35 -04:00
parent 2b22795b32
commit e7cb7ee145
20 changed files with 448 additions and 71 deletions

@ -81,6 +81,28 @@ typedef struct CustomPath
detailed below.
</para>
<para>
A custom scan provider can also add join paths; in this case, the scan
must produce the same output as would normally be produced by the join
it replaces. To do this, the join provider should set the following hook.
This hook may be invoked repeatedly for the same pair of relations, with
different combinations of inner and outer relations; it is the
responsibility of the hook to minimize duplicated work.
<programlisting>
typedef void (*set_join_pathlist_hook_type) (PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
List *restrictlist,
JoinType jointype,
SpecialJoinInfo *sjinfo,
SemiAntiJoinFactors *semifactors,
Relids param_source_rels,
Relids extra_lateral_rels);
extern PGDLLIMPORT set_join_pathlist_hook_type set_join_pathlist_hook;
</programlisting>
</para>
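Because the hook can be called more than once for the same pair of input relations with the outer and inner roles swapped, a provider may want to remember which pairs it has already considered. The following is only a minimal, self-contained sketch of that idea, not PostgreSQL code: `relid_bits`, `SeenJoinPairs`, and `join_pair_is_new` are hypothetical names, and a plain bitmask stands in for the planner's `Relids` type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Relids: one bit per range table index. */
typedef uint64_t relid_bits;

#define MAX_SEEN_PAIRS 64

/* Hypothetical per-query record of input pairs already considered. */
typedef struct SeenJoinPairs
{
    relid_bits lo[MAX_SEEN_PAIRS];
    relid_bits hi[MAX_SEEN_PAIRS];
    size_t     count;
} SeenJoinPairs;

/*
 * Return true the first time a pair of input relations is seen,
 * no matter which side is passed as outer and which as inner.
 */
bool
join_pair_is_new(SeenJoinPairs *seen, relid_bits outer, relid_bits inner)
{
    /* Normalize so that (outer, inner) and (inner, outer) compare equal. */
    relid_bits lo = outer < inner ? outer : inner;
    relid_bits hi = outer < inner ? inner : outer;
    size_t     i;

    assert(seen != NULL);

    for (i = 0; i < seen->count; i++)
    {
        if (seen->lo[i] == lo && seen->hi[i] == hi)
            return false;
    }

    /* If the table is full the pair is not recorded, so it is reported as new again later. */
    if (seen->count < MAX_SEEN_PAIRS)
    {
        seen->lo[seen->count] = lo;
        seen->hi[seen->count] = hi;
        seen->count++;
    }
    return true;
}
```

A hook built along these lines would do its expensive costing only when `join_pair_is_new` returns true, and could still add cheap per-direction paths on every call.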
<sect2 id="custom-scan-path-callbacks">
<title>Custom Path Callbacks</title>
@ -124,7 +146,9 @@ typedef struct CustomScan
Scan scan;
uint32 flags;
List *custom_exprs;
List *custom_ps_tlist;
List *custom_private;
List *custom_relids;
const CustomScanMethods *methods;
} CustomScan;
</programlisting>
@ -141,11 +165,27 @@ typedef struct CustomScan
is only used by the custom scan provider itself. Plan trees must be able
to be duplicated using <function>copyObject</>, so all the data stored
within these two fields must consist of nodes that function can handle.
<literal>custom_relids</> is set by the core code to the set of relations
which this scan node must handle; except when this scan is replacing a
join, it will have only one member.
<structfield>methods</> must point to a (usually statically allocated)
object implementing the required custom scan methods, which are further
detailed below.
</para>
<para>
When a <structname>CustomScan</> scans a single relation,
<structfield>scan.scanrelid</> should be the range table index of the table
to be scanned, and <structfield>custom_ps_tlist</> should be
<literal>NULL</>. When it replaces a join, <structfield>scan.scanrelid</>
should be zero, and <structfield>custom_ps_tlist</> should be a list of
<structname>TargetEntry</> nodes. This is necessary because, when a join
is replaced, the target list cannot be constructed from the table
definition. At execution time, this list will be used to initialize the
tuple descriptor of the <structname>TupleTableSlot</>. It will also be
used by <command>EXPLAIN</>, when deparsing.
</para>
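The two cases described above can be summarized as a small consistency check. This is a hedged sketch using a mock structure, not the real `CustomScan` node: `MockCustomScan`, its `nrelids` field (standing in for the number of members of `custom_relids`), and `mock_scan_is_consistent` are all hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the fields discussed above. */
typedef struct MockCustomScan
{
    unsigned int scanrelid;        /* range table index, or 0 for a join */
    const void  *custom_ps_tlist;  /* NULL, or a list of target entries */
    unsigned int nrelids;          /* number of members in custom_relids */
} MockCustomScan;

/* Check that the fields agree on whether this scan replaces a join. */
bool
mock_scan_is_consistent(const MockCustomScan *scan)
{
    if (scan->scanrelid != 0)
    {
        /* Single relation: no pseudo-scan target list, exactly one relid. */
        return scan->custom_ps_tlist == NULL && scan->nrelids == 1;
    }

    /* Join replacement: a target list is required, and at least two relids. */
    return scan->custom_ps_tlist != NULL && scan->nrelids >= 2;
}
```

The check treats a zero `scanrelid` with no target list as an error, since in that case nothing describes the shape of the tuples the scan returns.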
<sect2 id="custom-scan-plan-callbacks">
<title>Custom Scan Callbacks</title>
<para>

@ -598,6 +598,42 @@ IsForeignRelUpdatable (Relation rel);
</sect2>
<sect2>
<title>FDW Routines for Remote Joins</title>
<para>
<programlisting>
void
GetForeignJoinPaths(PlannerInfo *root,
RelOptInfo *joinrel,
RelOptInfo *outerrel,
RelOptInfo *innerrel,
List *restrictlist,
JoinType jointype,
SpecialJoinInfo *sjinfo,
SemiAntiJoinFactors *semifactors,
Relids param_source_rels,
Relids extra_lateral_rels);
</programlisting>
Create possible access paths for a join of two foreign tables managed
by the same foreign data wrapper.
This optional function is called during query planning.
</para>
<para>
This function allows the FDW to add <structname>ForeignScan</> paths for the
supplied <literal>joinrel</>. Typically, the FDW will send the whole
join to the remote server as a single query, since performing the join
remotely is usually much more efficient than performing it locally.
</para>
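To illustrate what sending the whole join as a single query might look like, here is a minimal sketch that assembles one remote statement from its parts. It is an assumption-laden illustration rather than anything this commit provides: `build_remote_join_sql` and its parameters are hypothetical, only an inner join is handled, and no identifier quoting or escaping is attempted.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: format a single remote query that performs the
 * join on the remote side.  Returns false if the join condition is
 * empty or the result does not fit in the buffer.
 */
bool
build_remote_join_sql(char *buf, size_t buflen,
                      const char *select_list,
                      const char *outer_table,
                      const char *inner_table,
                      const char *join_cond)
{
    int n;

    /* An unconditional join is out of scope for this sketch. */
    if (strlen(join_cond) == 0)
        return false;

    n = snprintf(buf, buflen, "SELECT %s FROM %s INNER JOIN %s ON (%s)",
                 select_list, outer_table, inner_table, join_cond);

    /* snprintf reports the length it wanted; anything >= buflen was truncated. */
    return n >= 0 && (size_t) n < buflen;
}
```

In a real FDW the select list would have to line up with the target list the scan reports to the executor, so that the returned columns match the slot descriptor.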
<para>
Since we cannot construct the slot descriptor for a remote join from
the catalogs, the FDW should set the <structfield>scanrelid</> of the
<structname>ForeignScan</> to zero and <structfield>fdw_ps_tlist</>
to an appropriate list of <structname>TargetEntry</> nodes.
Junk entries will be ignored, but can be present for the benefit of
deparsing performed by <command>EXPLAIN</>.
</para>
</sect2>
<sect2 id="fdw-callbacks-explain">
<title>FDW Routines for <command>EXPLAIN</></title>