1
0
mirror of https://github.com/postgres/postgres.git synced 2025-04-20 00:42:27 +03:00
postgres/contrib/postgres_fdw/sql/query_cancel.sql
Tom Lane c431986de1 postgres_fdw: re-issue cancel requests a few times if necessary.
Despite the best efforts of commit 0e5c82380, we're still seeing
occasional failures of postgres_fdw's query_cancel test in the
buildfarm.  Investigation suggests that its 100ms timeout is
still not enough to reliably ensure that the remote side starts
the query before receiving the cancel request --- and if it
hasn't, it will just discard the request because it's idle.

We discussed allowing a cancel request to kill the next-received
query, but that would have wide and perhaps unpleasant side-effects.
What seems safer is to make postgres_fdw do what a human user would
likely do, which is issue another cancel request if the first one
didn't seem to do anything.  We'll keep the same overall 30 second
grace period before concluding things are broken, but issue additional
cancel requests after 1 second, then 2 more seconds, then 4, then 8.
(The next one in series is 16 seconds, but we'll hit the 30 second
timeout before that.)

Having done that, revert the timeout in query_cancel.sql to 10 ms.
That will still be enough on most machines, most of the time, for
the remote query to start; but now we're intentionally risking the
race condition occurring sometimes in the buildfarm, so that the
repeat-cancel code path will get some testing.

As before, back-patch to v17.  We might eventually contemplate
back-patching this further, and/or adding similar logic to dblink.
But given the lack of field complaints to date, this feels like
mostly an exercise in test case stabilization, so v17 is enough.

Discussion: https://postgr.es/m/colnv3lzzmc53iu5qoawynr6qq7etn47lmggqr65ddtpjliq5d@glkveb4m6nop
2024-12-23 15:14:30 -05:00

23 lines
938 B
PL/PgSQL

SELECT version() ~ 'cygwin' AS skip_test \gset
\if :skip_test
\quit
\endif
-- Let's test canceling a remote query. Use a table that does not have
-- remote_estimate enabled, else there will be multiple queries to the
-- remote and we might unluckily send the cancel in between two of them.
-- First let's confirm that the query is actually pushed down.
EXPLAIN (VERBOSE, COSTS OFF)
SELECT count(*) FROM ft1 a CROSS JOIN ft1 b CROSS JOIN ft1 c CROSS JOIN ft1 d;
BEGIN;
-- Make sure that connection is open and set up.
SELECT count(*) FROM ft1 a;
-- On most machines, 10ms will be enough to be sure that we've sent the slow
-- query. We may sometimes exercise the race condition where we send cancel
-- before the remote side starts the query, but that's fine too.
SET LOCAL statement_timeout = '10ms';
-- This would take very long if not canceled:
SELECT count(*) FROM ft1 a CROSS JOIN ft1 b CROSS JOIN ft1 c CROSS JOIN ft1 d;
COMMIT;