mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00

Use quicksort, not replacement selection, for external sorting.

We still use replacement selection for the first run of the sort only
and only when the number of tuples is relatively small.  Otherwise,
the first run, and subsequent runs in all cases, are produced using
quicksort.  This tends to be faster except perhaps for very small
amounts of working memory.
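
As an illustration of the trade-off described above, here is a minimal sketch (not the tuplesort.c implementation) of how the two run-generation strategies behave. The function names and the `memory_slots` parameter are hypothetical, `memory_slots` is assumed to be at least 1, and Python's built-in `sorted()` stands in for quicksort.

```python
import heapq


def replacement_selection_runs(items, memory_slots):
    """Split items into sorted runs using a heap of at most memory_slots entries."""
    it = iter(items)
    heap = []
    # Fill the heap; every initial entry belongs to run 0.
    for _ in range(memory_slots):
        try:
            heap.append((0, next(it)))
        except StopIteration:
            break
    heapq.heapify(heap)

    runs = []
    while heap:
        run_no, value = heapq.heappop(heap)
        while len(runs) <= run_no:
            runs.append([])
        runs[run_no].append(value)
        try:
            incoming = next(it)
        except StopIteration:
            continue
        # A value smaller than the one just written cannot join the
        # current run, so it is tagged for the next run instead.
        next_run = run_no if incoming >= value else run_no + 1
        heapq.heappush(heap, (next_run, incoming))
    return runs


def quicksort_runs(items, memory_slots):
    """Sort each memory-sized batch independently (sorted() stands in for quicksort)."""
    items = list(items)
    return [sorted(items[i:i + memory_slots])
            for i in range(0, len(items), memory_slots)]
```

With already-sorted input of 100 values and 10 memory slots, the replacement selection sketch emits a single run that needs no merging, while the batch-sorting sketch emits 10 runs. With reverse-sorted input, both produce 10 runs, so replacement selection gains nothing there.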

Peter Geoghegan, reviewed by Tomas Vondra, Jeff Janes, Mithun Cy,
Greg Stark, and me.
Robert Haas
2016-04-08 02:36:26 -04:00
parent 719c84c1be
commit 0711803775
7 changed files with 431 additions and 92 deletions


@@ -1472,6 +1472,45 @@ include_dir 'conf.d'
</listitem>
</varlistentry>
<varlistentry id="guc-replacement-sort-tuples" xreflabel="replacement_sort_tuples">
<term><varname>replacement_sort_tuples</varname> (<type>integer</type>)
<indexterm>
<primary><varname>replacement_sort_tuples</> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
When the number of tuples to be sorted is smaller than this number,
a sort will produce its first output run using replacement selection
rather than quicksort. This may be useful in memory-constrained
environments where tuples that are input into larger sort operations
have a strong physical-to-logical correlation. Note that this does
not include input tuples with an <emphasis>inverse</emphasis>
correlation. It is possible for the replacement selection algorithm
to generate one long run that requires no merging, where use of the
default strategy would result in many runs that must be merged
to produce a final sorted output. This may allow sort
operations to complete sooner.
</para>
<para>
The default is 150,000 tuples. Note that higher values are typically
not much more effective, and may be counter-productive, since the
priority queue is sensitive to the size of available CPU cache, whereas
the default strategy sorts runs using a <firstterm>cache
oblivious</firstterm> algorithm. This property allows the default sort
strategy to automatically and transparently make effective use
of available CPU cache.
</para>
<para>
Setting <varname>maintenance_work_mem</varname> to its default
value usually prevents utility command external sorts (e.g.,
sorts used by <command>CREATE INDEX</> to build B-Tree
indexes) from ever using replacement selection sort, unless the
input tuples are quite wide.
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-autovacuum-work-mem" xreflabel="autovacuum_work_mem">
<term><varname>autovacuum_work_mem</varname> (<type>integer</type>)
<indexterm>