mirror of
https://github.com/postgres/postgres.git
synced 2025-08-30 06:01:21 +03:00
xl_tot_len comes first in a WAL record. Usually we don't trust it to be
the true length until we've validated the record header. If the record
header was split across two pages, previously we wouldn't do the
validation until after we'd already tried to allocate enough memory to
hold the record, which was bad because it might actually be garbage
bytes from a recycled WAL file, so we could try to allocate a lot of
memory. Release 15 made it worse.
Since 70b4f82a4b
, we'd at least generate an end-of-WAL condition if the
garbage 4 byte value happened to be > 1GB, but we'd still try to
allocate up to 1GB of memory bogusly otherwise. That was an
improvement, but unfortunately release 15 tries to allocate another
object before that, so you could get a FATAL error and recovery could
fail.
We can fix both variants of the problem more fundamentally using
pre-existing page-level validation, if we just re-order some logic.
The new order of operations in the split-header case defers all memory
allocation based on xl_tot_len until we've read the following page. At
that point we know that its first few bytes are not recycled data, by
checking its xlp_pageaddr, and that its xlp_rem_len agrees with
xl_tot_len on the preceding page. That is strong evidence that
xl_tot_len was truly the start of a record that was logged.
This problem was most likely to occur on a standby, because
walreceiver.c recycles WAL files without zeroing out trailing regions of
each page. We could fix that too, but it wouldn't protect us from rare
crash scenarios where the trailing zeroes don't make it to disk.
With reliable xl_tot_len validation in place, the ancient policy of
considering malloc failure to indicate corruption at end-of-WAL seems
quite surprising, but changing that is left for later work.
Also included is a new TAP test to exercise various cases of end-of-WAL
detection by writing contrived data into the WAL from Perl.
Back-patch to 12. We decided not to put this change into the final
release of 11.
Author: Thomas Munro <thomas.munro@gmail.com>
Author: Michael Paquier <michael@paquier.xyz>
Reported-by: Alexander Lakhin <exclusion@gmail.com>
Reviewed-by: Noah Misch <noah@leadboat.com> (the idea, not the code)
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Sergei Kornilov <sk@zsrv.org>
Reviewed-by: Alexander Lakhin <exclusion@gmail.com>
Discussion: https://postgr.es/m/17928-aa92416a70ff44a2%40postgresql.org
PostgreSQL tests ================ This directory contains a variety of test infrastructure as well as some of the tests in PostgreSQL. Not all tests are here -- in particular, there are more in individual contrib/ modules and in src/bin. Not all these tests get run by "make check". Check src/test/Makefile to see which tests get run automatically. authentication/ Tests for authentication (but see also below) examples/ Demonstration programs for libpq that double as regression tests via "make check" isolation/ Tests for concurrent behavior at the SQL level kerberos/ Tests for Kerberos/GSSAPI authentication and encryption ldap/ Tests for LDAP-based authentication locale/ Sanity checks for locale data, encodings, etc mb/ Tests for multibyte encoding (UTF-8) support modules/ Extensions used only or mainly for test purposes, generally not suitable for installing in production databases perl/ Infrastructure for Perl-based TAP tests recovery/ Test suite for recovery and replication regress/ PostgreSQL's main regression test suite, pg_regress ssl/ Tests to exercise and verify SSL certificate handling subscription/ Tests for logical replication