Make sure that function declarations use names that exactly match the
corresponding names from function definitions in a few places. These
inconsistencies were all introduced relatively recently, after the code
base had parameter name mismatches fixed in bulk (see commits starting
with commits 4274dc22 and 035ce1fe).
pg_bsd_indent still has a couple of similar inconsistencies, which I
(pgeoghegan) have left untouched for now.
Like all earlier commits that cleaned up function parameter names, this
commit was written with help from clang-tidy.
The primary bottlenecks for relation extension are:
1) The extension lock is held while acquiring a victim buffer for the new
page. Acquiring a victim buffer can require writing out the old page
contents including possibly needing to flush WAL.
2) When extending via ReadBuffer() et al, we write a zero page during the
extension, and then later write out the actual page contents. This can
nearly double the write rate.
3) The existing bulk relation extension infrastructure in hio.c just amortized
the cost of acquiring the relation extension lock, but none of the other
costs.
Unfortunately 1) cannot currently be addressed in a central manner as the
callers to ReadBuffer() need to acquire the extension lock. To address that,
this this commit moves the responsibility for acquiring the extension lock
into bufmgr.c functions. That allows to acquire the relation extension lock
for just the required time. This will also allow us to improve relation
extension further, without changing callers.
The reason we write all-zeroes pages during relation extension is that we hope
to get ENOSPC errors earlier that way (largely works, except for CoW
filesystems). It is easier to handle out-of-space errors gracefully if the
page doesn't yet contain actual tuples. This commit addresses 2), by using the
recently introduced smgrzeroextend(), which extends the relation, without
dirtying the kernel page cache for all the extended pages.
To address 3), this commit introduces a function to extend a relation by
multiple blocks at a time.
There are three new exposed functions: ExtendBufferedRel() for extending the
relation by a single block, ExtendBufferedRelBy() to extend a relation by
multiple blocks at once, and ExtendBufferedRelTo() for extending a relation up
to a certain size.
To avoid duplicating code between ReadBuffer(P_NEW) and the new functions,
ReadBuffer(P_NEW) now implements relation extension with
ExtendBufferedRel(), using a flag to tell ExtendBufferedRel() that the
relation lock is already held.
Note that this commit does not yet lead to a meaningful performance or
scalability improvement - for that uses of ReadBuffer(P_NEW) will need to be
converted to ExtendBuffered*(), which will be done in subsequent commits.
Reviewed-by: Heikki Linnakangas <hlinnaka@iki.fi>
Reviewed-by: Melanie Plageman <melanieplageman@gmail.com>
Discussion: https://postgr.es/m/20221029025420.eplyow6k7tgu6he3@awork3.anarazel.de
Some clang versions whine about comparing an enum variable to
a value outside the range of the enum, on the grounds that the
result must be constant. In the cases we fix here, the loops
will terminate only if the enum variable can in fact hold a
value one beyond its declared range. While that's very likely
to always be true for these enum types, it still seems like a
poor coding practice to assume it; so use "int" loop variables
instead to silence the warnings. (This matches what we've done
in other places, for example loops over the range of ForkNumber.)
While at it, let's drop the XXX_FIRST macros for these enums and just
write zeroes for the loop start values. The apparent flexibility
seems rather illusory given that iterating up to one-less-than-
the-number-of-values is only correct for a zero-based range.
Melanie Plageman
Discussion: https://postgr.es/m/20520.1677435600@sss.pgh.pa.us
This commit adds the infrastructure for more detailed IO statistics. The calls
to actually count IOs, a system view to access the new statistics,
documentation and tests will be added in subsequent commits, to make review
easier.
While we already had some IO statistics, e.g. in pg_stat_bgwriter and
pg_stat_database, they did not provide sufficient detail to understand what
the main sources of IO are, or whether configuration changes could avoid
IO. E.g., pg_stat_bgwriter.buffers_backend does contain the number of buffers
written out by a backend, but as that includes extending relations (always
done by backends) and writes triggered by the use of buffer access strategies,
it cannot easily be used to tune background writer or checkpointer. Similarly,
pg_stat_database.blks_read cannot easily be used to tune shared_buffers /
compute a cache hit ratio, as the use of buffer access strategies will often
prevent a large fraction of the read blocks to end up in shared_buffers.
The new IO statistics count IO operations (evict, extend, fsync, read, reuse,
and write), and are aggregated for each combination of backend type (backend,
autovacuum worker, bgwriter, etc), target object of the IO (relations, temp
relations) and context of the IO (normal, vacuum, bulkread, bulkwrite).
What is tracked in this series of patches, is sufficient to perform the
aforementioned analyses. Further details, e.g. tracking the number of buffer
hits, would make that even easier, but was left out for now, to keep the scope
of the already large patchset manageable.
Bumps PGSTAT_FILE_FORMAT_ID.
Author: Melanie Plageman <melanieplageman@gmail.com>
Reviewed-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Justin Pryzby <pryzby@telsasoft.com>
Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com>
Discussion: https://postgr.es/m/20200124195226.lth52iydq2n2uilq@alap3.anarazel.de