mirror of
				https://github.com/postgres/postgres.git
				synced 2025-11-03 09:13:20 +03:00 
			
		
		
		
	Add postgres_fdw contrib module.
There's still a lot of room for improvement, but it basically works, and we need this to be present before we can do anything much with the writable-foreign-tables patch. So let's commit it and get on with testing. Shigeru Hanada, reviewed by KaiGai Kohei and Tom Lane
This commit is contained in:
		
							
								
								
									
										325
									
								
								doc/src/sgml/postgres-fdw.sgml
									
									
									
									
									
										Normal file
									
								
							
							
						
						
									
										325
									
								
								doc/src/sgml/postgres-fdw.sgml
									
									
									
									
									
										Normal file
									
								
							@@ -0,0 +1,325 @@
 | 
			
		||||
<!-- doc/src/sgml/postgres-fdw.sgml -->
 | 
			
		||||
 | 
			
		||||
<sect1 id="postgres-fdw" xreflabel="postgres_fdw">
 | 
			
		||||
 <title>postgres_fdw</title>
 | 
			
		||||
 | 
			
		||||
 <indexterm zone="postgres-fdw">
 | 
			
		||||
  <primary>postgres_fdw</primary>
 | 
			
		||||
 </indexterm>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  The <filename>postgres_fdw</> module provides the foreign-data wrapper
 | 
			
		||||
  <literal>postgres_fdw</literal>, which can be used to access data
 | 
			
		||||
  stored in external <productname>PostgreSQL</productname> servers.
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  The functionality provided by this module overlaps substantially
 | 
			
		||||
  with the functionality of the older <xref linkend="dblink"> module.
 | 
			
		||||
  But <filename>postgres_fdw</> provides more transparent and
 | 
			
		||||
  standards-compliant syntax for accessing remote tables, and can give
 | 
			
		||||
  better performance in many cases.
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  To prepare for remote access using <filename>postgres_fdw</>:
 | 
			
		||||
  <orderedlist spacing="compact">
 | 
			
		||||
   <listitem>
 | 
			
		||||
    <para>
 | 
			
		||||
     Install the  <filename>postgres_fdw</> extension using <xref
 | 
			
		||||
     linkend="sql-createextension">.
 | 
			
		||||
    </para>
 | 
			
		||||
   </listitem>
 | 
			
		||||
   <listitem>
 | 
			
		||||
    <para>
 | 
			
		||||
     Create a foreign server object, using <xref linkend="sql-createserver">,
 | 
			
		||||
     to represent each remote database you want to connect to.
 | 
			
		||||
     Specify connection information, except <literal>user</literal> and
 | 
			
		||||
     <literal>password</literal>, as options of the server object.
 | 
			
		||||
    </para>
 | 
			
		||||
   </listitem>
 | 
			
		||||
   <listitem>
 | 
			
		||||
    <para>
 | 
			
		||||
     Create a user mapping, using <xref linkend="sql-createusermapping">, for
 | 
			
		||||
     each database user you want to allow to access each foreign server.
 | 
			
		||||
     Specify the remote user name and password to use as
 | 
			
		||||
     <literal>user</literal> and <literal>password</literal> options of the
 | 
			
		||||
     user mapping.
 | 
			
		||||
    </para>
 | 
			
		||||
   </listitem>
 | 
			
		||||
   <listitem>
 | 
			
		||||
    <para>
 | 
			
		||||
     Create a foreign table, using <xref linkend="sql-createforeigntable">,
 | 
			
		||||
     for each remote table you want to access.  The columns of the foreign
 | 
			
		||||
     table must match the referenced remote table.  You can, however, use
 | 
			
		||||
     table and/or column names different from the remote table's, if you
 | 
			
		||||
     specify the correct remote names as options of the foreign table object.
 | 
			
		||||
    </para>
 | 
			
		||||
   </listitem>
 | 
			
		||||
  </orderedlist>
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  Now you need only <command>SELECT</> from a foreign table to access
 | 
			
		||||
  the data stored in its underlying remote table.
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  It is generally recommended that the columns of a foreign table be declared
 | 
			
		||||
  with exactly the same data types, and collations if applicable, as the
 | 
			
		||||
  referenced columns of the remote table.  Although <filename>postgres_fdw</>
 | 
			
		||||
  is currently rather forgiving about performing data type conversions at
 | 
			
		||||
  need, surprising semantic anomalies may arise when types or collations do
 | 
			
		||||
  not match, due to the remote server interpreting <literal>WHERE</> clauses
 | 
			
		||||
  slightly differently from the local server.
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <para>
 | 
			
		||||
  Note that a foreign table can be declared with fewer columns, or with a
 | 
			
		||||
  different column order, than its underlying remote table has.  Matching
 | 
			
		||||
  of columns to the remote table is by name, not position.
 | 
			
		||||
 </para>
 | 
			
		||||
 | 
			
		||||
 <sect2>
 | 
			
		||||
  <title>FDW Options of postgres_fdw</title>
 | 
			
		||||
 | 
			
		||||
  <sect3>
 | 
			
		||||
   <title>Connection Options</title>
 | 
			
		||||
 | 
			
		||||
   <para>
 | 
			
		||||
    A foreign server using the <filename>postgres_fdw</> foreign data wrapper
 | 
			
		||||
    can have the same options that <application>libpq</> accepts in
 | 
			
		||||
    connection strings, as described in <xref linkend="libpq-paramkeywords">,
 | 
			
		||||
    except that these options are not allowed:
 | 
			
		||||
 | 
			
		||||
    <itemizedlist spacing="compact">
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       <literal>user</literal> and <literal>password</literal> (specify these
 | 
			
		||||
       for a user mapping, instead)
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       <literal>client_encoding</> (this is automatically set from the local
 | 
			
		||||
       server encoding)
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       <literal>fallback_application_name</> (always set to
 | 
			
		||||
       <literal>postgres_fdw</>)
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </itemizedlist>
 | 
			
		||||
   </para>
 | 
			
		||||
 | 
			
		||||
   <para>
 | 
			
		||||
    Only superusers may connect to foreign servers without password
 | 
			
		||||
    authentication, so always specify the <literal>password</literal> option
 | 
			
		||||
    for user mappings belonging to non-superusers.
 | 
			
		||||
   </para>
 | 
			
		||||
  </sect3>
 | 
			
		||||
 | 
			
		||||
  <sect3>
 | 
			
		||||
   <title>Object Name Options</title>
 | 
			
		||||
 | 
			
		||||
   <para>
 | 
			
		||||
    These options can be used to control the names used in SQL statements
 | 
			
		||||
    sent to the remote <productname>PostgreSQL</productname> server.  These
 | 
			
		||||
    options are needed when a foreign table is created with names different
 | 
			
		||||
    from the underlying remote table's names.
 | 
			
		||||
   </para>
 | 
			
		||||
 | 
			
		||||
   <variablelist>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>schema_name</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a foreign table, gives the
 | 
			
		||||
       schema name to use for the foreign table on the remote server.  If this
 | 
			
		||||
       option is omitted, the name of the foreign table's schema is used.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>table_name</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a foreign table, gives the
 | 
			
		||||
       table name to use for the foreign table on the remote server.  If this
 | 
			
		||||
       option is omitted, the foreign table's name is used.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>column_name</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a column of a foreign table,
 | 
			
		||||
       gives the column name to use for the column on the remote server.
 | 
			
		||||
       If this option is omitted, the column's name is used.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
   </variablelist>
 | 
			
		||||
 | 
			
		||||
  </sect3>
 | 
			
		||||
 | 
			
		||||
  <sect3>
 | 
			
		||||
   <title>Cost Estimation Options</title>
 | 
			
		||||
 | 
			
		||||
   <para>
 | 
			
		||||
    <filename>postgres_fdw</> retrieves remote data by executing queries
 | 
			
		||||
    against remote servers, so ideally the estimated cost of scanning a
 | 
			
		||||
    foreign table should be whatever it costs to be done on the remote
 | 
			
		||||
    server, plus some overhead for communication.  The most reliable way to
 | 
			
		||||
    get such an estimate is to ask the remote server and then add something
 | 
			
		||||
    for overhead — but for simple queries, it may not be worth the cost
 | 
			
		||||
    of an additional remote query to get a cost estimate.
 | 
			
		||||
    So <filename>postgres_fdw</> provides the following options to control
 | 
			
		||||
    how cost estimation is done:
 | 
			
		||||
   </para>
 | 
			
		||||
 | 
			
		||||
   <variablelist>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>use_remote_estimate</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a foreign table or a foreign
 | 
			
		||||
       server, controls whether <filename>postgres_fdw</> issues remote
 | 
			
		||||
       <command>EXPLAIN</command> commands to obtain cost estimates.
 | 
			
		||||
       A setting for a foreign table overrides any setting for its server,
 | 
			
		||||
       but only for that table.
 | 
			
		||||
       The default is <literal>false</literal>.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>fdw_startup_cost</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a foreign server, is a numeric
 | 
			
		||||
       value that is added to the estimated startup cost of any foreign-table
 | 
			
		||||
       scan on that server.  This represents the additional overhead of
 | 
			
		||||
       establishing a connection, parsing and planning the query on the
 | 
			
		||||
       remote side, etc.
 | 
			
		||||
       The default value is <literal>100</literal>.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
    <varlistentry>
 | 
			
		||||
     <term><literal>fdw_tuple_cost</literal></term>
 | 
			
		||||
     <listitem>
 | 
			
		||||
      <para>
 | 
			
		||||
       This option, which can be specified for a foreign server, is a numeric
 | 
			
		||||
       value that is used as extra cost per-tuple for foreign-table
 | 
			
		||||
       scans on that server.  This represents the additional overhead of
 | 
			
		||||
       data transfer between servers.  You might increase or decrease this
 | 
			
		||||
       number to reflect higher or lower network delay to the remote server.
 | 
			
		||||
       The default value is <literal>0.01</literal>.
 | 
			
		||||
      </para>
 | 
			
		||||
     </listitem>
 | 
			
		||||
    </varlistentry>
 | 
			
		||||
 | 
			
		||||
   </variablelist>
 | 
			
		||||
 | 
			
		||||
   <para>
 | 
			
		||||
    When <literal>use_remote_estimate</literal> is true,
 | 
			
		||||
    <filename>postgres_fdw</> obtains rowcount and cost estimates from the
 | 
			
		||||
    remote server and then adds <literal>fdw_startup_cost</literal> and
 | 
			
		||||
    <literal>fdw_tuple_cost</literal> to the cost estimates.  When
 | 
			
		||||
    <literal>use_remote_estimate</literal> is false,
 | 
			
		||||
    <filename>postgres_fdw</> performs local rowcount and cost estimation
 | 
			
		||||
    and then adds <literal>fdw_startup_cost</literal> and
 | 
			
		||||
    <literal>fdw_tuple_cost</literal> to the cost estimates.  This local
 | 
			
		||||
    estimation is unlikely to be very accurate unless local copies of the
 | 
			
		||||
    remote table's statistics are available.  Running
 | 
			
		||||
    <xref linkend="sql-analyze"> on the foreign table is the way to update
 | 
			
		||||
    the local statistics; this will perform a scan of the remote table and
 | 
			
		||||
    then calculate and store statistics just as though the table were local.
 | 
			
		||||
    Keeping local statistics can be a useful way to reduce per-query planning
 | 
			
		||||
    overhead for a remote table — but if the remote table is
 | 
			
		||||
    frequently updated, the local statistics will soon be obsolete.
 | 
			
		||||
   </para>
 | 
			
		||||
 | 
			
		||||
  </sect3>
 | 
			
		||||
 </sect2>
 | 
			
		||||
 | 
			
		||||
 <sect2>
 | 
			
		||||
  <title>Connection Management</title>
 | 
			
		||||
 | 
			
		||||
  <para>
 | 
			
		||||
   <filename>postgres_fdw</filename> establishes a connection to a
 | 
			
		||||
   foreign server during the first query that uses a foreign table
 | 
			
		||||
   associated with the foreign server.  This connection is kept and
 | 
			
		||||
   re-used for subsequent queries in the same session.  However, if
 | 
			
		||||
   multiple user identities (user mappings) are used to access the foreign
 | 
			
		||||
   server, a connection is established for each user mapping.
 | 
			
		||||
  </para>
 | 
			
		||||
 </sect2>
 | 
			
		||||
 | 
			
		||||
 <sect2>
 | 
			
		||||
  <title>Transaction Management</title>
 | 
			
		||||
 | 
			
		||||
  <para>
 | 
			
		||||
   During a query that references any remote tables on a foreign server,
 | 
			
		||||
   <filename>postgres_fdw</filename> opens a transaction on the
 | 
			
		||||
   remote server if one is not already open corresponding to the current
 | 
			
		||||
   local transaction.  The remote transaction is committed or aborted when
 | 
			
		||||
   the local transaction commits or aborts.  Savepoints are similarly
 | 
			
		||||
   managed by creating corresponding remote savepoints.
 | 
			
		||||
  </para>
 | 
			
		||||
 | 
			
		||||
  <para>
 | 
			
		||||
   The remote transaction uses <literal>SERIALIZABLE</>
 | 
			
		||||
   isolation level when the local transaction has <literal>SERIALIZABLE</>
 | 
			
		||||
   isolation level; otherwise it uses <literal>REPEATABLE READ</>
 | 
			
		||||
   isolation level.  This choice ensures that if a query performs multiple
 | 
			
		||||
   table scans on the remote server, it will get snapshot-consistent results
 | 
			
		||||
   for all the scans.  A consequence is that successive queries within a
 | 
			
		||||
   single transaction will see the same data from the remote server, even if
 | 
			
		||||
   concurrent updates are occurring on the remote server due to other
 | 
			
		||||
   activities.  That behavior would be expected anyway if the local
 | 
			
		||||
   transaction uses <literal>SERIALIZABLE</> or <literal>REPEATABLE READ</>
 | 
			
		||||
   isolation level, but it might be surprising for a <literal>READ
 | 
			
		||||
   COMMITTED</> local transaction.  A future
 | 
			
		||||
   <productname>PostgreSQL</productname> release might modify these rules.
 | 
			
		||||
  </para>
 | 
			
		||||
 </sect2>
 | 
			
		||||
 | 
			
		||||
 <sect2>
 | 
			
		||||
  <title>Remote Query Optimization</title>
 | 
			
		||||
 | 
			
		||||
  <para>
 | 
			
		||||
   <filename>postgres_fdw</> attempts to optimize remote queries to reduce
 | 
			
		||||
   the amount of data transferred from foreign servers.  This is done by
 | 
			
		||||
   sending query <literal>WHERE</> clauses to the remote server for
 | 
			
		||||
   execution, and by not retrieving table columns that are not needed for
 | 
			
		||||
   the current query.  To reduce the risk of misexecution of queries,
 | 
			
		||||
   <literal>WHERE</> clauses are not sent to the remote server unless they use
 | 
			
		||||
   only built-in data types, operators, and functions.  Operators and
 | 
			
		||||
   functions in the clauses must be <literal>IMMUTABLE</> as well.
 | 
			
		||||
  </para>
 | 
			
		||||
 | 
			
		||||
  <para>
 | 
			
		||||
   The query that is actually sent to the remote server for execution can
 | 
			
		||||
   be examined using <command>EXPLAIN VERBOSE</>.
 | 
			
		||||
  </para>
 | 
			
		||||
 </sect2>
 | 
			
		||||
 | 
			
		||||
 <sect2>
 | 
			
		||||
  <title>Author</title>
 | 
			
		||||
  <para>
 | 
			
		||||
   Shigeru Hanada <email>shigeru.hanada@gmail.com</email>
 | 
			
		||||
  </para>
 | 
			
		||||
 </sect2>
 | 
			
		||||
 | 
			
		||||
</sect1>
 | 
			
		||||
		Reference in New Issue
	
	Block a user