Mirror of https://github.com/postgres/postgres.git (synced 2025-08-21 10:42:50 +03:00)
Align blocks in incremental backups to BLCKSZ

Align blocks stored in incremental files to BLCKSZ, so that the incremental backups work well with CoW filesystems.

The header of the incremental file is padded with \0 to a multiple of BLCKSZ, so that the block data (also BLCKSZ) is aligned to BLCKSZ. The padding is added only to files containing block data, so files with just the header remain small. This adds a bit of extra space, but as the number of blocks increases the overhead gets negligible very quickly. And as the padding is \0 bytes, it does compress extremely well.

The alignment is important for CoW filesystems that usually require the blocks to be aligned to filesystem page size for features like block sharing, deduplication etc. to work well. With the variable sized header the blocks in the increments were not aligned at all, negating the benefits of the CoW filesystems.

This matters even for non-CoW filesystems, for example when placed on a RAID array. If the block is not aligned, it may easily span multiple devices, causing read and write amplification.

It might be better to align the blocks to the filesystem page, not BLCKSZ, but we have no good way to determine that. Even if we determine the page size at the time of taking the backup, the backup may move. For now the BLCKSZ seems sufficient - the filesystem page is usually 4K, so the default BLCKSZ (8K by default) is aligned to that.

Author: Tomas Vondra
Reviewed-by: Robert Haas, Jakub Wartak
Discussion: https://postgr.es/m/3024283a-7491-4240-80d0-421575f6bb23%40enterprisedb.com
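
The padding arithmetic described in the message can be sketched in isolation. This is a minimal illustration, not the PostgreSQL code: SKETCH_BLCKSZ, header_padding and padding_overhead are hypothetical names, and the block size is hard-coded to 8192 on the assumption of a default build. One deliberate difference: the expression in the diff yields a full BLCKSZ of padding when the header length is already a multiple of BLCKSZ, while this sketch returns 0 in that case.

```c
#include <stddef.h>

/* Assumed block size; BLCKSZ is a build-time constant (8192 by default). */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical helper: number of zero bytes to append after a header of
 * header_len bytes so that block data starts on a SKETCH_BLCKSZ boundary.
 * Files without blocks get no padding, so header-only files stay small.
 */
size_t
header_padding(size_t header_len, size_t num_blocks)
{
	size_t		rem = header_len % SKETCH_BLCKSZ;

	if (num_blocks == 0 || rem == 0)
		return 0;
	return SKETCH_BLCKSZ - rem;
}

/*
 * Hypothetical helper: fraction of the whole file taken up by padding.
 */
double
padding_overhead(size_t header_len, size_t num_blocks)
{
	size_t		pad = header_padding(header_len, num_blocks);
	size_t		total = header_len + pad + num_blocks * SKETCH_BLCKSZ;

	return total == 0 ? 0.0 : (double) pad / (double) total;
}
```

For example, under these assumptions a 24-byte header followed by at least one block would get 8168 bytes of padding, and with 1000 blocks that padding is under 0.1% of the file.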
@@ -1623,6 +1623,8 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 	{
 		unsigned	magic = INCREMENTAL_MAGIC;
 		size_t		header_bytes_done = 0;
+		char		padding[BLCKSZ];
+		size_t		paddinglen;

 		/* Emit header data. */
 		push_to_sink(sink, &checksum_ctx, &header_bytes_done,
@@ -1635,6 +1637,23 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 					 incremental_blocks,
 					 sizeof(BlockNumber) * num_incremental_blocks);

+		/*
+		 * Add padding to align header to a multiple of BLCKSZ, but only if
+		 * the incremental file has some blocks. If there are no blocks we
+		 * don't want make the file unnecessarily large, as that might make
+		 * some filesystem optimizations impossible.
+		 */
+		if (num_incremental_blocks > 0)
+		{
+			paddinglen = (BLCKSZ - (header_bytes_done % BLCKSZ));
+
+			memset(padding, 0, paddinglen);
+			bytes_done += paddinglen;
+
+			push_to_sink(sink, &checksum_ctx, &header_bytes_done,
+						 padding, paddinglen);
+		}
+
 		/* Flush out any data still in the buffer so it's again empty. */
 		if (header_bytes_done > 0)
 		{
@@ -1748,6 +1767,13 @@ sendFile(bbsink *sink, const char *readfilename, const char *tarfilename,
 		blkno += cnt / BLCKSZ;
 		bytes_done += cnt;

+		/*
+		 * Make sure incremental files with block data are properly aligned
+		 * (header is a multiple of BLCKSZ, blocks are BLCKSZ too).
+		 */
+		Assert(!((incremental_blocks != NULL && num_incremental_blocks > 0) &&
+				 (bytes_done % BLCKSZ != 0)));
+
 		/* Archive the data we just read. */
 		bbsink_archive_contents(sink, cnt);

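
The assertion in the last hunk has the shape !(A && B), which reads more naturally as an implication: if the file carries block data, the running byte count must sit on a block boundary. The sketch below restates that check and exercises it with a simulated send loop. It is an illustration only: SKETCH_BLCKSZ, alignment_invariant_holds and simulate_send are hypothetical names, the block size of 8192 is an assumption, and the loop adds one block per step, whereas the real code may read more than one block at a time.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed block size; BLCKSZ is a build-time constant (8192 by default). */
#define SKETCH_BLCKSZ 8192

/*
 * Hypothetical restatement of the assertion: a file without block data is
 * unconstrained; a file with block data must be at a block boundary.
 */
bool
alignment_invariant_holds(bool has_blocks, size_t bytes_done)
{
	return !has_blocks || (bytes_done % SKETCH_BLCKSZ == 0);
}

/*
 * Hypothetical simulation: start from the (padded) header length, then add
 * whole blocks one at a time, checking the invariant after each step.
 */
bool
simulate_send(size_t padded_header_len, size_t num_blocks)
{
	size_t		bytes_done = padded_header_len;

	for (size_t i = 0; i < num_blocks; i++)
	{
		bytes_done += SKETCH_BLCKSZ;
		if (!alignment_invariant_holds(num_blocks > 0, bytes_done))
			return false;
	}
	return true;
}
```

With a header padded to 8192 bytes the simulated loop keeps the invariant for any number of blocks; with an unpadded 24-byte header it fails on the first block, which is the situation the padding is meant to rule out.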