Allow per-tablespace effective_io_concurrency

Per discussion, nowadays it is possible to have tablespaces that have wildly different I/O characteristics from others. Setting different effective_io_concurrency parameters for those has been measured to improve performance. Author: Julien Rouhaud Reviewed by: Andres Freund
2025-10-21 02:52:47 +03:00 · 2015-09-08 12:51:42 -03:00
parent 665a00c9e2
commit 1aba62ec63
12 changed files with 145 additions and 63 deletions
--- a/src/backend/storage/buffer/bufmgr.c
+++ b/src/backend/storage/buffer/bufmgr.c
@@ -80,11 +80,14 @@ bool		zero_damaged_pages = false;
 int			bgwriter_lru_maxpages = 100;
 double		bgwriter_lru_multiplier = 2.0;
 bool		track_io_timing = false;
+int			effective_io_concurrency = 0;

 /*
 * How many buffers PrefetchBuffer callers should try to stay ahead of their
 * ReadBuffer calls by.  This is maintained by the assign hook for
- * effective_io_concurrency.  Zero means "never prefetch".
+ * effective_io_concurrency.  Zero means "never prefetch".  This value is
+ * only used for buffers not belonging to tablespaces that have their
+ * effective_io_concurrency parameter set.
 */
 int			target_prefetch_pages = 0;

@@ -415,6 +418,64 @@ static void CheckForBufferLeaks(void);
 static int	rnode_comparator(const void *p1, const void *p2);


+/*
+ * ComputeIoConcurrency -- get the number of pages to prefetch for a given
+ *		number of spindles.
+ */
+bool
+ComputeIoConcurrency(int io_concurrency, double *target)
+{
+	double		new_prefetch_pages = 0.0;
+	int			i;
+
+	/*
+	 * Make sure the io_concurrency value is within valid range; it may have
+	 * been forced with a manual pg_tablespace update.
+	 */
+	io_concurrency = Min(Max(io_concurrency, 0), MAX_IO_CONCURRENCY);
+
+	/*----------
+	 * The user-visible GUC parameter is the number of drives (spindles),
+	 * which we need to translate to a number-of-pages-to-prefetch target.
+	 * The target value is stashed in *extra and then assigned to the actual
+	 * variable by assign_effective_io_concurrency.
+	 *
+	 * The expected number of prefetch pages needed to keep N drives busy is:
+	 *
+	 * drives |   I/O requests
+	 * -------+----------------
+	 *		1 |   1
+	 *		2 |   2/1 + 2/2 = 3
+	 *		3 |   3/1 + 3/2 + 3/3 = 5 1/2
+	 *		4 |   4/1 + 4/2 + 4/3 + 4/4 = 8 1/3
+	 *		n |   n * H(n)
+	 *
+	 * This is called the "coupon collector problem" and H(n) is called the
+	 * harmonic series.  This could be approximated by n * ln(n), but for
+	 * reasonable numbers of drives we might as well just compute the series.
+	 *
+	 * Alternatively we could set the target to the number of pages necessary
+	 * so that the expected number of active spindles is some arbitrary
+	 * percentage of the total.  This sounds the same but is actually slightly
+	 * different.  The result ends up being ln(1-P)/ln((n-1)/n) where P is
+	 * that desired fraction.
+	 *
+	 * Experimental results show that both of these formulas aren't aggressive
+	 * enough, but we don't really have any better proposals.
+	 *
+	 * Note that if io_concurrency = 0 (disabled), we must set target = 0.
+	 *----------
+	 */
+
+	for (i = 1; i <= io_concurrency; i++)
+		new_prefetch_pages += (double) io_concurrency / (double) i;
+
+	*target = new_prefetch_pages;
+
+	/* This range check shouldn't fail, but let's be paranoid */
+	return (new_prefetch_pages > 0.0 && new_prefetch_pages < (double) INT_MAX);
+}
+
 /*
 * PrefetchBuffer -- initiate asynchronous read of a block of a relation
 *