1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-31 22:04:40 +03:00

Make GEQO's planning deterministic by having it start from a predictable

random number seed each time.  This is how it used to work years ago, but
we got rid of the seed reset because it was resetting the main random()
sequence and thus having undesirable effects on the rest of the system.
To fix, establish a private random number state for each execution of
geqo(), and initialize the state using the new GUC variable geqo_seed.
People who want to experiment with different random searches can do so
by changing geqo_seed, but you'll always get the same plan for the same
value of geqo_seed (if holding all other planner inputs constant, of course).

The new state is kept in PlannerInfo by adding a "void *" field reserved
for use by join_search hooks.  Most of the rather bulky code changes in
this commit are just arranging to pass PlannerInfo around to all the GEQO
functions (many of which formerly didn't receive it).

Andres Freund, with some editorialization by Tom
This commit is contained in:
Tom Lane
2009-07-16 20:55:44 +00:00
parent c43feefa80
commit f5bc74192d
27 changed files with 305 additions and 202 deletions

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.221 2009/07/03 19:14:25 petere Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/config.sgml,v 1.222 2009/07/16 20:55:44 tgl Exp $ -->
<chapter Id="runtime-config">
<title>Server Configuration</title>
@ -2149,7 +2149,23 @@ archive_command = 'copy "%p" "C:\\server\\archivedir\\%f"' # Windows
</para>
</listitem>
</varlistentry>
<varlistentry id="guc-geqo-seed" xreflabel="geqo_seed">
<term><varname>geqo_seed</varname> (<type>floating point</type>)</term>
<indexterm>
<primary><varname>geqo_seed</> configuration parameter</primary>
</indexterm>
<listitem>
<para>
Controls the initial value of the random number generator used
by GEQO to select random paths through the join order search space.
The value can range from zero (the default) to one. Varying the
value changes the set of join paths explored, and may result in a
better or worse best path being found.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="runtime-config-query-other">

View File

@ -1,4 +1,4 @@
<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.40 2007/07/21 04:02:41 tgl Exp $ -->
<!-- $PostgreSQL: pgsql/doc/src/sgml/geqo.sgml,v 1.41 2009/07/16 20:55:44 tgl Exp $ -->
<chapter id="geqo">
<chapterinfo>
@ -49,7 +49,7 @@
methods</firstterm> (e.g., nested loop, hash join, merge join in
<productname>PostgreSQL</productname>) to process individual joins
and a diversity of <firstterm>indexes</firstterm> (e.g.,
B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
B-tree, hash, GiST and GIN in <productname>PostgreSQL</productname>) as
access paths for relations.
</para>
@ -88,8 +88,7 @@
<para>
The genetic algorithm (<acronym>GA</acronym>) is a heuristic optimization method which
operates through
nondeterministic, randomized search. The set of possible solutions for the
operates through randomized search. The set of possible solutions for the
optimization problem is considered as a
<firstterm>population</firstterm> of <firstterm>individuals</firstterm>.
The degree of adaptation of an individual to its environment is specified
@ -116,7 +115,7 @@
According to the <systemitem class="resource">comp.ai.genetic</> <acronym>FAQ</acronym> it cannot be stressed too
strongly that a <acronym>GA</acronym> is not a pure random search for a solution to a
problem. A <acronym>GA</acronym> uses stochastic processes, but the result is distinctly
non-random (better than random).
non-random (better than random).
</para>
<figure id="geqo-diagram">
@ -260,9 +259,13 @@
<para>
This process is inherently nondeterministic, because of the randomized
choices made during both the initial population selection and subsequent
<quote>mutation</> of the best candidates. Hence different plans may
be selected from one run to the next, resulting in varying run time
and varying output row order.
<quote>mutation</> of the best candidates. To avoid surprising changes
of the selected plan, each run of the GEQO algorithm restarts its
random number generator with the current <xref linkend="guc-geqo-seed">
parameter setting. As long as <varname>geqo_seed</> and the other
GEQO parameters are kept fixed, the same plan will be generated for a
given query (and other planner inputs such as statistics). To experiment
with different search paths, try changing <varname>geqo_seed</>.
</para>
</sect2>
@ -330,7 +333,7 @@
url="news://comp.ai.genetic"></ulink>)
</para>
</listitem>
<listitem>
<para>
<ulink url="http://www.red3d.com/cwr/evolve.html">