1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-25 13:17:41 +03:00

Fix mis-attribution of checksum failure stats to the wrong database

Checksum failure stats could be attributed to the wrong database in two cases:

- when a read of a shared relation encountered a checksum error , it would be
  attributed to the current database, instead of the "database" representing
  shared relations

- when using CREATE DATABASE ... STRATEGY WAL_LOG checksum errors in the
  source database would be attributed to the current database

The checksum stats reporting via PageIsVerifiedExtended(PIV_REPORT_STAT) does
not have access to the information about what database a page belongs to.

This fixes the issue by removing PIV_REPORT_STAT and delegating the
responsibility to report stats to the caller, which now can learn about the
number of stats via a new optional argument.

As this changes the signature of PageIsVerifiedExtended() and all callers
should adapt to the new signature, use the occasion to rename the function to
PageIsVerified() and remove the compatibility macro.

We could instead have fixed this by adding information about the database to
the args of PageIsVerified(), but there are soon-to-be-applied patches that
need to separate the stats reporting from the PageIsVerified() call
anyway. Those patches also include testing for the failure paths, something we
inexplicably have not had.

As there is no caller of pgstat_report_checksum_failure() left, remove it.

It'd be possible, but awkward to fix this in the back branches. We considered
doing the work not quite worth it, as mis-attributed stats should still elicit
concern. The emitted error messages do allow to attribute the errors
correctly.

Discussion: https://postgr.es/m/5tyic6epvdlmd6eddgelv47syg2b5cpwffjam54axp25xyq2ga@ptwkinxqo3az
Discussion: https://postgr.es/m/mglpvvbhighzuwudjxzu4br65qqcxsnyvio3nl4fbog3qknwhg@e4gt7npsohuz
This commit is contained in:
Andres Freund
2025-03-29 13:38:35 -04:00
parent 68f97aeadb
commit dee8002468
6 changed files with 48 additions and 34 deletions

View File

@@ -465,31 +465,27 @@ do { \
#define PAI_OVERWRITE (1 << 0)
#define PAI_IS_HEAP (1 << 1)
/* flags for PageIsVerifiedExtended() */
/* flags for PageIsVerified() */
#define PIV_LOG_WARNING (1 << 0)
#define PIV_REPORT_STAT (1 << 1)
#define PageAddItem(page, item, size, offsetNumber, overwrite, is_heap) \
PageAddItemExtended(page, item, size, offsetNumber, \
((overwrite) ? PAI_OVERWRITE : 0) | \
((is_heap) ? PAI_IS_HEAP : 0))
#define PageIsVerified(page, blkno) \
PageIsVerifiedExtended(page, blkno, \
PIV_LOG_WARNING | PIV_REPORT_STAT)
/*
* Check that BLCKSZ is a multiple of sizeof(size_t). In
* PageIsVerifiedExtended(), it is much faster to check if a page is
* full of zeroes using the native word size. Note that this assertion
* is kept within a header to make sure that StaticAssertDecl() works
* across various combinations of platforms and compilers.
* Check that BLCKSZ is a multiple of sizeof(size_t). In PageIsVerified(), it
* is much faster to check if a page is full of zeroes using the native word
* size. Note that this assertion is kept within a header to make sure that
* StaticAssertDecl() works across various combinations of platforms and
* compilers.
*/
StaticAssertDecl(BLCKSZ == ((BLCKSZ / sizeof(size_t)) * sizeof(size_t)),
"BLCKSZ has to be a multiple of sizeof(size_t)");
extern void PageInit(Page page, Size pageSize, Size specialSize);
extern bool PageIsVerifiedExtended(PageData *page, BlockNumber blkno, int flags);
extern bool PageIsVerified(PageData *page, BlockNumber blkno, int flags,
bool *checksum_failure_p);
extern OffsetNumber PageAddItemExtended(Page page, Item item, Size size,
OffsetNumber offsetNumber, int flags);
extern Page PageGetTempPage(const PageData *page);