1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-25 13:17:41 +03:00
Commit Graph

84 Commits

Author SHA1 Message Date
Robert Haas
e4106b2528 postgres_fdw: Push down joins to remote servers.
If we've got a relatively straightforward join between two tables,
this pushes that join down to the remote server instead of fetching
the rows for each table and performing the join locally.  Some cases
are not handled yet, such as SEMI and ANTI joins.  Also, we don't
yet attempt to create presorted join paths or parameterized join
paths even though these options do get tried for a base relation
scan.  Nevertheless, this seems likely to be a very significant win
in many practical cases.

Shigeru Hanada and Ashutosh Bapat, reviewed by Robert Haas, with
additional review at various points by Tom Lane, Etsuro Fujita,
KaiGai Kohei, and Jeevan Chalke.
2016-02-09 14:00:50 -05:00
Robert Haas
37c84570b1 postgres_fdw: Avoid possible misbehavior when RETURNING tableoid column only.
deparseReturningList ended up adding up RETURNING NULL to the code, but
code elsewhere saw an empty list of attributes and concluded that it
should not expect tuples from the remote side.

Etsuro Fujita and Robert Haas, reviewed by Thom Brown
2016-02-04 22:27:13 -05:00
Robert Haas
dc203dc3ac postgres_fdw: Allow fetch_size to be set per-table or per-server.
The default fetch size of 100 rows might not be right in every
environment, so allow users to configure it.

Corey Huinker, reviewed by Kyotaro Horiguchi, Andres Freund, and me.
2016-02-03 09:07:35 -05:00
Tom Lane
7e22470471 Fix incorrect pattern-match processing in psql's \det command.
listForeignTables' invocation of processSQLNamePattern did not match up
with the other ones that handle potentially-schema-qualified names; it
failed to make use of pg_table_is_visible() and also passed the name
arguments in the wrong order.  Bug seems to have been aboriginal in commit
0d692a0dc9.  It accidentally sort of worked as long as you didn't
inquire too closely into the behavior, although the silliness was later
exposed by inconsistencies in the test queries added by 59efda3e50
(which I probably should have questioned at the time, but didn't).

Per bug #13899 from Reece Hart.  Patch by Reece Hart and Tom Lane.
Back-patch to all affected branches.
2016-01-29 10:28:02 +01:00
Robert Haas
ccd8f97922 postgres_fdw: Consider requesting sorted data so we can do a merge join.
When use_remote_estimate is enabled, consider adding ORDER BY to the
query we sending to the remote server so that we can use that ordered
data for a merge join.  Commit f18c944b61
arranges to push down the query pathkeys, which seems like the case
mostly likely to be a win, but testing shows this can sometimes win,
too.

For a regular table, we know which indexes are present and therefore
test whether the ordering provided by each such index is useful.  Here,
we take the opposite approach: guess what orderings would be useful if
they could be generated cheaply, and then ask the remote side what those
will cost.

Ashutosh Bapat, with very substantial cosmetic revisions by me.  Also
reviewed by Rushabh Lathia.
2015-12-22 13:46:40 -05:00
Tom Lane
b9f117d6cd Add regression tests for remote execution of extension operators/functions.
Rather than relying on other extensions to be available for installation,
let's just add some test objects to the postgres_fdw extension itself
within the regression script.
2015-11-04 12:03:30 -05:00
Robert Haas
f18c944b61 postgres_fdw: Add ORDER BY to some remote SQL queries.
If the join problem's entire ORDER BY clause can be pushed to the
remote server, consider a path that adds this ORDER BY clause.  If
use_remote_estimate is on, we cost this path using an additional
remote EXPLAIN.  If not, we just estimate that the path costs 20%
more, which is intended to be large enough that we won't request a
remote sort when it's not helpful, but small enough that we'll have
the remote side do the sort when in doubt.  In some cases, the remote
sort might actually be free, because the remote query plan might
happen to produce output that is ordered the way we need, but without
remote estimates we have no way of knowing that.

It might also be useful to request sorted output from the remote side
if it enables an efficient merge join, but this patch doesn't attempt
to handle that case.

Ashutosh Bapat with revisions by me.  Also reviewed by Fabrízio de Royes
Mello and Jeevan Chalke.
2015-11-03 13:04:42 -05:00
Tom Lane
76f965ff1f Improve handling of collations in contrib/postgres_fdw.
If we have a local Var of say varchar type with default collation, and
we apply a RelabelType to convert that to text with default collation, we
don't want to consider that as creating an FDW_COLLATE_UNSAFE situation.
It should be okay to compare that to a remote Var, so long as the remote
Var determines the comparison collation.  (When we actually ship such an
expression to the remote side, the local Var would become a Param with
default collation, meaning the remote Var would in fact control the
comparison collation, because non-default implicit collation overrides
default implicit collation in parse_collate.c.)  To fix, be more precise
about what FDW_COLLATE_NONE means: it applies either to a noncollatable
data type or to a collatable type with default collation, if that collation
can't be traced to a remote Var.  (When it can, FDW_COLLATE_SAFE is
appropriate.)  We were essentially using that interpretation already at
the Var/Const/Param level, but we weren't bubbling it up properly.

An alternative fix would be to introduce a separate FDW_COLLATE_DEFAULT
value to describe the second situation, but that would add more code
without changing the actual behavior, so it didn't seem worthwhile.

Also, since we're clarifying the rule to be that we care about whether
operator/function input collations match, there seems no need to fail
immediately upon seeing a Const/Param/non-foreign-Var with nondefault
collation.  We only have to reject if it appears in a collation-sensitive
context (for example, "var IS NOT NULL" is perfectly safe from a collation
standpoint, whatever collation the var has).  So just set the state to
UNSAFE rather than failing immediately.

Per report from Jeevan Chalke.  This essentially corrects some sloppy
thinking in commit ed3ddf918b, so back-patch
to 9.3 where that logic appeared.
2015-09-24 12:47:29 -04:00
Andres Freund
168d5805e4 Add support for INSERT ... ON CONFLICT DO NOTHING/UPDATE.
The newly added ON CONFLICT clause allows to specify an alternative to
raising a unique or exclusion constraint violation error when inserting.
ON CONFLICT refers to constraints that can either be specified using a
inference clause (by specifying the columns of a unique constraint) or
by naming a unique or exclusion constraint.  DO NOTHING avoids the
constraint violation, without touching the pre-existing row.  DO UPDATE
SET ... [WHERE ...] updates the pre-existing tuple, and has access to
both the tuple proposed for insertion and the existing tuple; the
optional WHERE clause can be used to prevent an update from being
executed.  The UPDATE SET and WHERE clauses have access to the tuple
proposed for insertion using the "magic" EXCLUDED alias, and to the
pre-existing tuple using the table name or its alias.

This feature is often referred to as upsert.

This is implemented using a new infrastructure called "speculative
insertion". It is an optimistic variant of regular insertion that first
does a pre-check for existing tuples and then attempts an insert.  If a
violating tuple was inserted concurrently, the speculatively inserted
tuple is deleted and a new attempt is made.  If the pre-check finds a
matching tuple the alternative DO NOTHING or DO UPDATE action is taken.
If the insertion succeeds without detecting a conflict, the tuple is
deemed inserted.

To handle the possible ambiguity between the excluded alias and a table
named excluded, and for convenience with long relation names, INSERT
INTO now can alias its target table.

Bumps catversion as stored rules change.

Author: Peter Geoghegan, with significant contributions from Heikki
    Linnakangas and Andres Freund. Testing infrastructure by Jeff Janes.
Reviewed-By: Heikki Linnakangas, Andres Freund, Robert Haas, Simon Riggs,
    Dean Rasheed, Stephen Frost and many others.
2015-05-08 05:43:10 +02:00
Tom Lane
cb1ca4d800 Allow foreign tables to participate in inheritance.
Foreign tables can now be inheritance children, or parents.  Much of the
system was already ready for this, but we had to fix a few things of
course, mostly in the area of planner and executor handling of row locks.

As side effects of this, allow foreign tables to have NOT VALID CHECK
constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to
accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS.  Continuing to
disallow these things would've required bizarre and inconsistent special
cases in inheritance behavior.  Since foreign tables don't enforce CHECK
constraints anyway, a NOT VALID one is a complete no-op, but that doesn't
mean we shouldn't allow it.  And it's possible that some FDWs might have
use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops
for most.

An additional change in support of this is that when a ModifyTable node
has multiple target tables, they will all now be explicitly identified
in EXPLAIN output, for example:

 Update on pt1  (cost=0.00..321.05 rows=3541 width=46)
   Update on pt1
   Foreign Update on ft1
   Foreign Update on ft2
   Update on child3
   ->  Seq Scan on pt1  (cost=0.00..0.00 rows=1 width=46)
   ->  Foreign Scan on ft1  (cost=100.00..148.03 rows=1170 width=46)
   ->  Foreign Scan on ft2  (cost=100.00..148.03 rows=1170 width=46)
   ->  Seq Scan on child3  (cost=0.00..25.00 rows=1200 width=46)

This was done mainly to provide an unambiguous place to attach "Remote SQL"
fields, but it is useful for inherited updates even when no foreign tables
are involved.

Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro
Horiguchi, some additional hacking by me
2015-03-22 13:53:21 -04:00
Tom Lane
fc2ac1fb41 Allow CHECK constraints to be placed on foreign tables.
As with NOT NULL constraints, we consider that such constraints are merely
reports of constraints that are being enforced by the remote server (or
other underlying storage mechanism).  Their only real use is to allow
planner optimizations, for example in constraint-exclusion checks.  Thus,
the code changes here amount to little more than removal of the error that
was formerly thrown for applying CHECK to a foreign table.

(In passing, do a bit of cleanup of the ALTER FOREIGN TABLE reference page,
which had accumulated some weird decisions about ordering etc.)

Shigeru Hanada and Etsuro Fujita, reviewed by Kyotaro Horiguchi and
Ashutosh Bapat.
2014-12-17 17:00:53 -05:00
Tom Lane
8ec8760fc8 Revert misguided change to postgres_fdw FOR UPDATE/SHARE code.
In commit 462bd95705, I changed postgres_fdw
to rely on get_plan_rowmark() instead of get_parse_rowmark().  I still
think that's a good idea in the long run, but as Etsuro Fujita pointed out,
it doesn't work today because planner.c forces PlanRowMarks to have
markType = ROW_MARK_COPY for all foreign tables.  There's no urgent reason
to change this in the back branches, so let's just revert that part of
yesterday's commit rather than trying to design a better solution under
time pressure.

Also, add a regression test case showing what postgres_fdw does with FOR
UPDATE/SHARE.  I'd blithely assumed there was one already, else I'd have
realized yesterday that this code didn't work.
2014-12-12 12:41:49 -05:00
Tom Lane
9c58101117 Fix mishandling of system columns in FDW queries.
postgres_fdw would send query conditions involving system columns to the
remote server, even though it makes no effort to ensure that system
columns other than CTID match what the remote side thinks.  tableoid,
in particular, probably won't match and might have some use in queries.
Hence, prevent sending conditions that include non-CTID system columns.

Also, create_foreignscan_plan neglected to check local restriction
conditions while determining whether to set fsSystemCol for a foreign
scan plan node.  This again would bollix the results for queries that
test a foreign table's tableoid.

Back-patch the first fix to 9.3 where postgres_fdw was introduced.
Back-patch the second to 9.2.  The code is probably broken in 9.1 as
well, but the patch doesn't apply cleanly there; given the weak state
of support for FDWs in 9.1, it doesn't seem worth fixing.

Etsuro Fujita, reviewed by Ashutosh Bapat, and somewhat modified by me
2014-11-22 16:01:05 -05:00
Andres Freund
57ca1d4f01 Specify the port in dblink and postgres_fdw tests.
That allows to run those tests against a postmaster listening on a
nonstandard port without requiring to export PGPORT in postmaster's
environment.

This still doesn't support connecting to a nondefault host without
configuring it in postmaster's environment. That's harder and less
frequently used though. So this is a useful step.
2014-08-26 12:28:08 +02:00
Andres Freund
ddc2504dbc Don't hardcode contrib_regression dbname in postgres_fdw and dblink tests.
That allows parallel installcheck to succeed inside contrib/. The
output is not particularly pretty unless make's -O option to
synchronize the output is used.

There's other tests, outside contrib, that use a hardcoded,
non-unique, database name. Those prohibit paralell installcheck to be
used across more directories; but that's something for a separate
patch.
2014-08-26 12:27:26 +02:00
Tom Lane
59efda3e50 Implement IMPORT FOREIGN SCHEMA.
This command provides an automated way to create foreign table definitions
that match remote tables, thereby reducing tedium and chances for error.
In this patch, we provide the necessary core-server infrastructure and
implement the feature fully in the postgres_fdw foreign-data wrapper.
Other wrappers will throw a "feature not supported" error until/unless
they are updated.

Ronan Dunklau and Michael Paquier, additional work by me
2014-07-10 15:01:43 -04:00
Tom Lane
5b68d81697 Fix contrib/postgres_fdw's remote-estimate representation of array Params.
We were emitting "(SELECT null::typename)", which is usually interpreted
as a scalar subselect, but not so much in the context "x = ANY(...)".
This led to remote-side parsing failures when remote_estimate is enabled.
A quick and ugly fix is to stick in an extra cast step,
"((SELECT null::typename)::typename)".  The cast will be thrown away as
redundant by parse analysis, but not before it's done its job of making
sure the grammar sees the ANY argument as an a_expr rather than a
select_with_parens.  Per an example from Hannu Krosing.
2014-04-16 17:21:57 -04:00
Noah Misch
b2b2491b06 Don't test xmin/xmax columns of a postgres_fdw foreign table.
Their values are unspecified and system-dependent.

Per buildfarm member kouprey.
2014-03-23 03:48:17 -04:00
Noah Misch
7cbe57c34d Offer triggers on foreign tables.
This covers all the SQL-standard trigger types supported for regular
tables; it does not cover constraint triggers.  The approach for
acquiring the old row mirrors that for view INSTEAD OF triggers.  For
AFTER ROW triggers, we spool the foreign tuples to a tuplestore.

This changes the FDW API contract; when deciding which columns to
populate in the slot returned from data modification callbacks, writable
FDWs will need to check for AFTER ROW triggers in addition to checking
for a RETURNING clause.

In support of the feature addition, refactor the TriggerFlags bits and
the assembly of old tuples in ModifyTable.

Ronan Dunklau, reviewed by KaiGai Kohei; some additional hacking by me.
2014-03-23 02:16:34 -04:00
Tom Lane
83204e100c Fix contrib/postgres_fdw to handle multiple join conditions properly.
The previous coding supposed that it could consider just a single join
condition in any one parameterized path for the foreign table.  But in
reality, the parameterized-path machinery forces all join clauses that are
"movable to" the foreign table to be evaluated at that node; including
clauses that we might not consider safe to send across.  Such cases would
result in an Assert failure in an assert-enabled build, and otherwise in
sending an unsafe clause to the foreign server, which might result in
errors or silently-wrong answers.  A lesser problem was that the
cost/rowcount estimates generated for the parameterized path failed to
account for any additional join quals that get assigned to the scan.

To fix, rewrite postgresGetForeignPaths so that it correctly collects all
the movable quals for any one outer relation when generating parameterized
paths; we'll now generate just one path per outer relation not one per join
qual.  Also fix bogus assumptions in postgresGetForeignPlan and
estimate_path_cost_size that only safe-to-send join quals will be
presented.

Based on complaint from Etsuro Fujita that the path costs were being
miscalculated, though this is significantly different from his proposed
patch.
2014-03-07 16:36:40 -05:00
Tom Lane
dc3eb56383 Improve updatability checking for views and foreign tables.
Extend the FDW API (which we already changed for 9.3) so that an FDW can
report whether specific foreign tables are insertable/updatable/deletable.
The default assumption continues to be that they're updatable if the
relevant executor callback function is supplied by the FDW, but finer
granularity is now possible.  As a test case, add an "updatable" option to
contrib/postgres_fdw.

This patch also fixes the information_schema views, which previously did
not think that foreign tables were ever updatable, and fixes
view_is_auto_updatable() so that a view on a foreign table can be
auto-updatable.

initdb forced due to changes in information_schema views and the functions
they rely on.  This is a bit unfortunate to do post-beta1, but if we don't
change this now then we'll have another API break for FDWs when we do
change it.

Dean Rasheed, somewhat editorialized on by Tom Lane
2013-06-12 17:53:33 -04:00
Tom Lane
e0b451e432 Tweak postgres_fdw regression test so autovacuum doesn't change results.
Autovacuum occurring while the test runs could allow some of the inserts to
go into recycled space, thus changing the output ordering of later queries.
While we could complicate those queries to force sorting of their output
rows, it doesn't seem like that would make the test better in any
meaningful way, and conceivably it could hide unexpected diffs.  Instead,
tweak the affected queries so that the inserted rows aren't updated by the
following UPDATE.  Per buildfarm.
2013-06-09 19:41:52 -04:00
Tom Lane
b142068622 Allow CREATE FOREIGN TABLE to include SERIAL columns.
The behavior is that the required sequence is created locally, which is
appropriate because the default expression will be evaluated locally.
Per gripe from Brad Nicholson that this case was refused with a confusing
error message.  We could have improved the error message but it seems
better to just allow the case.

Also, remove ALTER TABLE's arbitrary prohibition against being applied to
foreign tables, which was pretty inconsistent considering we allow it for
views, sequences, and other relation types that aren't even called tables.
This is needed to avoid breaking pg_dump, which sometimes emits column
defaults using separate ALTER TABLE commands.  (I think this can happen
even when the default is not associated with a sequence, so that was a
pre-existing bug once we allowed column defaults for foreign tables.)
2013-05-15 19:03:29 -04:00
Tom Lane
e690b95150 Avoid retrieving dummy NULL columns in postgres_fdw.
This should provide some marginal overall savings, since it surely takes
many more cycles for the remote server to deal with the NULL columns than
it takes for postgres_fdw not to emit them.  But really the reason is to
keep the emitted queries from looking quite so silly ...
2013-03-22 00:31:11 -04:00
Tom Lane
9cbc4b80dd Redo postgres_fdw's planner code so it can handle parameterized paths.
I wasn't going to ship this without having at least some example of how
to do that.  This version isn't terribly bright; in particular it won't
consider any combinations of multiple join clauses.  Given the cost of
executing a remote EXPLAIN, I'm not sure we want to be very aggressive
about doing that, anyway.

In support of this, refactor generate_implied_equalities_for_indexcol
so that it can be used to extract equivalence clauses that aren't
necessarily tied to an index.
2013-03-21 19:44:32 -04:00
Tom Lane
ed3ddf918b Introduce less-bogus handling of collations in contrib/postgres_fdw.
Treat expressions as being remotely executable only if all collations used
in them are determined by Vars of the foreign table.  This means that, if
the foreign server gets different answers than we do, it's the user's fault
for not having marked the foreign table columns with collations equivalent
to the remote table's.  This rule allows most simple expressions such as
"var < 'constant'" to be sent to the remote side, because the constant
isn't determining the collation (the Var's collation would win).  There's
still room for improvement, but it's hard to see how to do it without a
lot more knowledge and/or assumptions about what the remote side will do.
2013-03-13 19:46:31 -04:00
Tom Lane
50c19fc76f Fix contrib/postgres_fdw's handling of column defaults.
Adopt the position that only locally-defined defaults matter.  Any defaults
defined in the remote database do not affect insertions performed through
a foreign table (unless they are for columns not known to the foreign
table).  While it'd arguably be more useful to permit remote defaults to be
used, making that work in a consistent fashion requires far more work than
seems possible for 9.3.
2013-03-12 18:58:13 -04:00
Tom Lane
0247d43dd9 Avoid row-processing-order dependency in postgres_fdw regression test.
A test intended to provoke an error on the remote side was coded in such
a way that multiple rows should be updated, so the output would vary
depending on which one was processed first.  Per buildfarm.
2013-03-12 10:47:04 -04:00
Tom Lane
cc3f281ffb Fix postgres_fdw's issues with inconsistent interpretation of data values.
For datatypes whose output formatting depends on one or more GUC settings,
we have to worry about whether the other server will interpret the value
the same way it was meant.  pg_dump has been aware of this hazard for a
long time, but postgres_fdw needs to deal with it too.  To fix data
retrieval from the remote server, set the necessary remote GUC settings at
connection startup.  (We were already assuming that settings made then
would persist throughout the remote session.)  To fix data transmission to
the remote server, temporarily force the relevant GUCs to the right values
when we're about to convert any data values to text for transmission.

This is all pretty grotty, and not very cheap either.  It's tempting to
think of defining one uber-GUC that would override any settings that might
render printed data values unportable.  But of course, older remote servers
wouldn't know any such thing and would still need this logic.

While at it, revert commit f7951eef89, since
this provides a real fix.  (The timestamptz given in the error message
returned from the "remote" server will now reliably be shown in UTC.)
2013-03-11 21:31:28 -04:00
Tom Lane
f7951eef89 Band-aid for regression test expected-results problem with timestamptz.
We probably need to tell the remote server to use specific timezone and
datestyle settings, and maybe other things.  But for now let's just hack
the postgres_fdw regression test to not provoke failures when run in
non-EST5EDT environments.  Per buildfarm.
2013-03-10 15:07:38 -04:00
Tom Lane
21734d2fb8 Support writable foreign tables.
This patch adds the core-system infrastructure needed to support updates
on foreign tables, and extends contrib/postgres_fdw to allow updates
against remote Postgres servers.  There's still a great deal of room for
improvement in optimization of remote updates, but at least there's basic
functionality there now.

KaiGai Kohei, reviewed by Alexander Korotkov and Laurenz Albe, and rather
heavily revised by Tom Lane.
2013-03-10 14:16:02 -04:00
Tom Lane
09a7cd409e Rename postgres_fdw's use_remote_explain option to use_remote_estimate.
The new name was originally my typo, but per discussion it seems like a
better name anyway.  So make the code match the docs, not vice versa.
2013-02-23 12:20:48 -05:00
Tom Lane
6da378dbc9 Fix whole-row references in postgres_fdw.
The optimization to not retrieve unnecessary columns wasn't smart enough.
Noted by Thom Brown.
2013-02-22 09:21:50 -05:00
Tom Lane
d0d75c4022 Add postgres_fdw contrib module.
There's still a lot of room for improvement, but it basically works,
and we need this to be present before we can do anything much with the
writable-foreign-tables patch.  So let's commit it and get on with testing.

Shigeru Hanada, reviewed by KaiGai Kohei and Tom Lane
2013-02-21 05:27:16 -05:00