From 94e21021cdf25e50e28be3e4f3147ed56796433a Mon Sep 17 00:00:00 2001 From: Tom Lane Date: Sat, 12 Jun 2021 15:12:10 -0400 Subject: [PATCH] Restore robustness of TAP tests that wait for postmaster restart. Several TAP tests use poll_query_until() to wait for the postmaster to restart. They were checking to see if a trivial query (e.g. "SELECT 1") succeeds. However, that's problematic in the wake of commit 11e9caff8, because now that we feed said query to psql via stdin, we risk IPC::Run whining about a SIGPIPE failure if psql quits before reading the query. Hence, we can't use a nonempty query in cases where we need to wait for connection failures to stop happening. Per the precedent of commits c757a3da0 and 6d41dd045, we can pass "undef" as the query in such cases to ensure that IPC::Run has nothing to write. However, then we have to say that the expected output is empty, and this exposes a deficiency in poll_query_until: if psql fails altogether and returns empty stdout, poll_query_until will treat that as a success! That's because, contrary to its documentation, it makes no actual check for psql failure, looking neither at the exit status nor at stderr. To fix that, adjust poll_query_until to insist on empty stderr as well as a stdout match. (I experimented with checking exit status instead, but it seems that psql often does exit(1) in cases that we need to consider successes. That might be something to fix someday, but it would be a non-back-patchable behavior change.) Back-patch to v10. The test cases needing this exist only as far back as v11, but it seems wise to keep poll_query_until's behavior the same in v10, in case we back-patch another such test case in future. (9.6 does not currently need this change, because in that branch poll_query_until can't be told to accept empty stdout as a success case.) Per assorted buildfarm failures, mostly on hoverfly. Discussion: https://postgr.es/m/CAA4eK1+zM6L4QSA1XMvXY_qqWwdUmqkOS1+hWvL8QcYEBGA1Uw@mail.gmail.com --- src/test/perl/PostgresNode.pm | 6 +++--- src/test/recovery/t/013_crash_restart.pl | 10 +++------- 2 files changed, 6 insertions(+), 10 deletions(-) diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm index 68f0958ac35..52f552be85b 100644 --- a/src/test/perl/PostgresNode.pm +++ b/src/test/perl/PostgresNode.pm @@ -1668,8 +1668,10 @@ sub poll_query_until $stdout =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; chomp($stdout); + $stderr =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; + chomp($stderr); - if ($stdout eq $expected) + if ($stdout eq $expected && $stderr eq '') { return 1; } @@ -1682,8 +1684,6 @@ sub poll_query_until # The query result didn't change in 180 seconds. Give up. Print the # output from the last attempt, hopefully that's useful for debugging. - $stderr =~ s/\r\n/\n/g if $Config{osname} eq 'msys'; - chomp($stderr); diag qq(poll_query_until timed out executing this query: $query expecting this output: diff --git a/src/test/recovery/t/013_crash_restart.pl b/src/test/recovery/t/013_crash_restart.pl index 2c477978e7d..3e3d50b5dd8 100644 --- a/src/test/recovery/t/013_crash_restart.pl +++ b/src/test/recovery/t/013_crash_restart.pl @@ -134,12 +134,8 @@ ok( pump_until( $monitor->finish; # Wait till server restarts -is( $node->poll_query_until( - 'postgres', - 'SELECT $$restarted after sigquit$$;', - 'restarted after sigquit'), - "1", - "reconnected after SIGQUIT"); +is($node->poll_query_until('postgres', undef, ''), + "1", "reconnected after SIGQUIT"); # restart psql processes, now that the crash cycle finished @@ -214,7 +210,7 @@ ok( pump_until( $monitor->finish; # Wait till server restarts -is($node->poll_query_until('postgres', 'SELECT 1', '1'), +is($node->poll_query_until('postgres', undef, ''), "1", "reconnected after SIGKILL"); # Make sure the committed rows survived, in-progress ones not