mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00

BRIN minmax-multi indexes

Adds BRIN opclasses similar to the existing minmax, except that instead
of summarizing the page range into a single [min,max] range, the summary
consists of multiple ranges and/or points, allowing gaps. This allows
more efficient handling of data with poor correlation to physical
location within the table and/or outlier values, for which the regular
minmax opclasses tend to work poorly.

The number of values kept for each page range can be specified; each
value represents either a single point or an interval boundary.

  CREATE TABLE t (a int);
  CREATE INDEX ON t
   USING brin (a int4_minmax_multi_ops(values_per_range=16));

When building the summary, the values are combined into intervals with
the goal of minimizing the "covering" (sum of interval lengths), using a
support procedure that computes the distance between two values.
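
The idea of combining values into intervals can be illustrated with a small Python sketch. This is not the C implementation added by the commit; the function name `summarize`, the greedy "merge the closest neighbours first" strategy, and the default `distance` callback are assumptions made purely for illustration. A point is counted as one stored value and an interval as two boundary values.

```python
def summarize(values, values_per_range=32, distance=lambda a, b: b - a):
    """Greedy sketch (hypothetical): merge closest neighbours until the summary fits."""
    # Start with every distinct value as a degenerate interval [v, v].
    ranges = [(v, v) for v in sorted(set(values))]

    def stored_values(rs):
        # A point costs one stored value, a real interval costs two boundaries.
        return sum(1 if lo == hi else 2 for lo, hi in rs)

    while len(ranges) > 1 and stored_values(ranges) > values_per_range:
        # Find the adjacent pair separated by the smallest gap.
        i = min(range(len(ranges) - 1),
                key=lambda k: distance(ranges[k][1], ranges[k + 1][0]))
        # Merge the pair into a single interval covering both.
        ranges[i:i + 2] = [(ranges[i][0], ranges[i + 1][1])]
    return ranges
```

With a budget of three stored values, `summarize([1, 2, 3, 4, 1000], values_per_range=3)` returns `[(1, 4), (1000, 1000)]`: the outlier stays a separate point instead of stretching a single [1, 1000] range, which is the gap-preserving behaviour described above.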

Bump catversion, due to various catalog changes.

Author: Tomas Vondra <tomas.vondra@postgresql.org>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Alexander Korotkov <aekorotkov@gmail.com>
Reviewed-by: Sokolov Yura <y.sokolov@postgrespro.ru>
Reviewed-by: John Naylor <john.naylor@enterprisedb.com>
Discussion: https://postgr.es/m/c1138ead-7668-f0e1-0638-c3be3237e812@2ndquadrant.com
Discussion: https://postgr.es/m/5d78b774-7e9c-c94e-12cf-fef51cc89b1a%402ndquadrant.com
Tomas Vondra
2021-03-26 13:54:29 +01:00
parent 77b88cd1bb
commit ab596105b5
19 changed files with 5286 additions and 26 deletions


@ -116,7 +116,10 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
in the indexed column within the range. The <firstterm>inclusion</firstterm>
operator classes store a value which includes the values in the indexed
column within the range. The <firstterm>bloom</firstterm> operator
classes build a Bloom filter for all values in the range.
classes build a Bloom filter for all values in the range. The
<firstterm>minmax-multi</firstterm> operator classes store multiple
minimum and maximum values, representing values appearing in the indexed
column within the range.
</para>
<table id="brin-builtin-opclasses-table">
@ -211,6 +214,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (date,date)</literal></entry></row>
<row><entry><literal>&gt;= (date,date)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>date_minmax_multi_ops</literal></entry>
<entry><literal>= (date,date)</literal></entry>
</row>
<row><entry><literal>&lt; (date,date)</literal></entry></row>
<row><entry><literal>&lt;= (date,date)</literal></entry></row>
<row><entry><literal>&gt; (date,date)</literal></entry></row>
<row><entry><literal>&gt;= (date,date)</literal></entry></row>
<row>
<entry valign="middle"><literal>float4_bloom_ops</literal></entry>
<entry><literal>= (float4,float4)</literal></entry>
@ -225,6 +237,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (float4,float4)</literal></entry></row>
<row><entry><literal>&gt;= (float4,float4)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>float4_minmax_multi_ops</literal></entry>
<entry><literal>= (float4,float4)</literal></entry>
</row>
<row><entry><literal>&lt; (float4,float4)</literal></entry></row>
<row><entry><literal>&gt; (float4,float4)</literal></entry></row>
<row><entry><literal>&lt;= (float4,float4)</literal></entry></row>
<row><entry><literal>&gt;= (float4,float4)</literal></entry></row>
<row>
<entry valign="middle"><literal>float8_bloom_ops</literal></entry>
<entry><literal>= (float8,float8)</literal></entry>
@ -239,6 +260,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (float8,float8)</literal></entry></row>
<row><entry><literal>&gt;= (float8,float8)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>float8_minmax_multi_ops</literal></entry>
<entry><literal>= (float8,float8)</literal></entry>
</row>
<row><entry><literal>&lt; (float8,float8)</literal></entry></row>
<row><entry><literal>&lt;= (float8,float8)</literal></entry></row>
<row><entry><literal>&gt; (float8,float8)</literal></entry></row>
<row><entry><literal>&gt;= (float8,float8)</literal></entry></row>
<row>
<entry valign="middle" morerows="5"><literal>inet_inclusion_ops</literal></entry>
<entry><literal>&lt;&lt; (inet,inet)</literal></entry>
@ -263,6 +293,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (inet,inet)</literal></entry></row>
<row><entry><literal>&gt;= (inet,inet)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>inet_minmax_multi_ops</literal></entry>
<entry><literal>= (inet,inet)</literal></entry>
</row>
<row><entry><literal>&lt; (inet,inet)</literal></entry></row>
<row><entry><literal>&lt;= (inet,inet)</literal></entry></row>
<row><entry><literal>&gt; (inet,inet)</literal></entry></row>
<row><entry><literal>&gt;= (inet,inet)</literal></entry></row>
<row>
<entry valign="middle"><literal>int2_bloom_ops</literal></entry>
<entry><literal>= (int2,int2)</literal></entry>
@ -277,6 +316,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (int2,int2)</literal></entry></row>
<row><entry><literal>&gt;= (int2,int2)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>int2_minmax_multi_ops</literal></entry>
<entry><literal>= (int2,int2)</literal></entry>
</row>
<row><entry><literal>&lt; (int2,int2)</literal></entry></row>
<row><entry><literal>&gt; (int2,int2)</literal></entry></row>
<row><entry><literal>&lt;= (int2,int2)</literal></entry></row>
<row><entry><literal>&gt;= (int2,int2)</literal></entry></row>
<row>
<entry valign="middle"><literal>int4_bloom_ops</literal></entry>
<entry><literal>= (int4,int4)</literal></entry>
@ -291,6 +339,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (int4,int4)</literal></entry></row>
<row><entry><literal>&gt;= (int4,int4)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>int4_minmax_multi_ops</literal></entry>
<entry><literal>= (int4,int4)</literal></entry>
</row>
<row><entry><literal>&lt; (int4,int4)</literal></entry></row>
<row><entry><literal>&gt; (int4,int4)</literal></entry></row>
<row><entry><literal>&lt;= (int4,int4)</literal></entry></row>
<row><entry><literal>&gt;= (int4,int4)</literal></entry></row>
<row>
<entry valign="middle"><literal>int8_bloom_ops</literal></entry>
<entry><literal>= (bigint,bigint)</literal></entry>
@ -305,6 +362,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (bigint,bigint)</literal></entry></row>
<row><entry><literal>&gt;= (bigint,bigint)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>int8_minmax_multi_ops</literal></entry>
<entry><literal>= (bigint,bigint)</literal></entry>
</row>
<row><entry><literal>&lt; (bigint,bigint)</literal></entry></row>
<row><entry><literal>&gt; (bigint,bigint)</literal></entry></row>
<row><entry><literal>&lt;= (bigint,bigint)</literal></entry></row>
<row><entry><literal>&gt;= (bigint,bigint)</literal></entry></row>
<row>
<entry valign="middle"><literal>interval_bloom_ops</literal></entry>
<entry><literal>= (interval,interval)</literal></entry>
@ -319,6 +385,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (interval,interval)</literal></entry></row>
<row><entry><literal>&gt;= (interval,interval)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>interval_minmax_multi_ops</literal></entry>
<entry><literal>= (interval,interval)</literal></entry>
</row>
<row><entry><literal>&lt; (interval,interval)</literal></entry></row>
<row><entry><literal>&lt;= (interval,interval)</literal></entry></row>
<row><entry><literal>&gt; (interval,interval)</literal></entry></row>
<row><entry><literal>&gt;= (interval,interval)</literal></entry></row>
<row>
<entry valign="middle"><literal>macaddr_bloom_ops</literal></entry>
<entry><literal>= (macaddr,macaddr)</literal></entry>
@ -333,6 +408,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (macaddr,macaddr)</literal></entry></row>
<row><entry><literal>&gt;= (macaddr,macaddr)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>macaddr_minmax_multi_ops</literal></entry>
<entry><literal>= (macaddr,macaddr)</literal></entry>
</row>
<row><entry><literal>&lt; (macaddr,macaddr)</literal></entry></row>
<row><entry><literal>&lt;= (macaddr,macaddr)</literal></entry></row>
<row><entry><literal>&gt; (macaddr,macaddr)</literal></entry></row>
<row><entry><literal>&gt;= (macaddr,macaddr)</literal></entry></row>
<row>
<entry valign="middle"><literal>macaddr8_bloom_ops</literal></entry>
<entry><literal>= (macaddr8,macaddr8)</literal></entry>
@ -347,6 +431,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (macaddr8,macaddr8)</literal></entry></row>
<row><entry><literal>&gt;= (macaddr8,macaddr8)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>macaddr8_minmax_multi_ops</literal></entry>
<entry><literal>= (macaddr8,macaddr8)</literal></entry>
</row>
<row><entry><literal>&lt; (macaddr8,macaddr8)</literal></entry></row>
<row><entry><literal>&lt;= (macaddr8,macaddr8)</literal></entry></row>
<row><entry><literal>&gt; (macaddr8,macaddr8)</literal></entry></row>
<row><entry><literal>&gt;= (macaddr8,macaddr8)</literal></entry></row>
<row>
<entry valign="middle"><literal>name_bloom_ops</literal></entry>
<entry><literal>= (name,name)</literal></entry>
@ -375,6 +468,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (numeric,numeric)</literal></entry></row>
<row><entry><literal>&gt;= (numeric,numeric)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>numeric_minmax_multi_ops</literal></entry>
<entry><literal>= (numeric,numeric)</literal></entry>
</row>
<row><entry><literal>&lt; (numeric,numeric)</literal></entry></row>
<row><entry><literal>&lt;= (numeric,numeric)</literal></entry></row>
<row><entry><literal>&gt; (numeric,numeric)</literal></entry></row>
<row><entry><literal>&gt;= (numeric,numeric)</literal></entry></row>
<row>
<entry valign="middle"><literal>oid_bloom_ops</literal></entry>
<entry><literal>= (oid,oid)</literal></entry>
@ -389,6 +491,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (oid,oid)</literal></entry></row>
<row><entry><literal>&gt;= (oid,oid)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>oid_minmax_multi_ops</literal></entry>
<entry><literal>= (oid,oid)</literal></entry>
</row>
<row><entry><literal>&lt; (oid,oid)</literal></entry></row>
<row><entry><literal>&gt; (oid,oid)</literal></entry></row>
<row><entry><literal>&lt;= (oid,oid)</literal></entry></row>
<row><entry><literal>&gt;= (oid,oid)</literal></entry></row>
<row>
<entry valign="middle"><literal>pg_lsn_bloom_ops</literal></entry>
<entry><literal>= (pg_lsn,pg_lsn)</literal></entry>
@ -403,6 +514,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (pg_lsn,pg_lsn)</literal></entry></row>
<row><entry><literal>&gt;= (pg_lsn,pg_lsn)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>pg_lsn_minmax_multi_ops</literal></entry>
<entry><literal>= (pg_lsn,pg_lsn)</literal></entry>
</row>
<row><entry><literal>&lt; (pg_lsn,pg_lsn)</literal></entry></row>
<row><entry><literal>&gt; (pg_lsn,pg_lsn)</literal></entry></row>
<row><entry><literal>&lt;= (pg_lsn,pg_lsn)</literal></entry></row>
<row><entry><literal>&gt;= (pg_lsn,pg_lsn)</literal></entry></row>
<row>
<entry valign="middle" morerows="13"><literal>range_inclusion_ops</literal></entry>
<entry><literal>= (anyrange,anyrange)</literal></entry>
@ -449,6 +569,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (tid,tid)</literal></entry></row>
<row><entry><literal>&gt;= (tid,tid)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>tid_minmax_multi_ops</literal></entry>
<entry><literal>= (tid,tid)</literal></entry>
</row>
<row><entry><literal>&lt; (tid,tid)</literal></entry></row>
<row><entry><literal>&gt; (tid,tid)</literal></entry></row>
<row><entry><literal>&lt;= (tid,tid)</literal></entry></row>
<row><entry><literal>&gt;= (tid,tid)</literal></entry></row>
<row>
<entry valign="middle"><literal>timestamp_bloom_ops</literal></entry>
<entry><literal>= (timestamp,timestamp)</literal></entry>
@ -463,6 +592,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (timestamp,timestamp)</literal></entry></row>
<row><entry><literal>&gt;= (timestamp,timestamp)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>timestamp_minmax_multi_ops</literal></entry>
<entry><literal>= (timestamp,timestamp)</literal></entry>
</row>
<row><entry><literal>&lt; (timestamp,timestamp)</literal></entry></row>
<row><entry><literal>&lt;= (timestamp,timestamp)</literal></entry></row>
<row><entry><literal>&gt; (timestamp,timestamp)</literal></entry></row>
<row><entry><literal>&gt;= (timestamp,timestamp)</literal></entry></row>
<row>
<entry valign="middle"><literal>timestamptz_bloom_ops</literal></entry>
<entry><literal>= (timestamptz,timestamptz)</literal></entry>
@ -477,6 +615,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (timestamptz,timestamptz)</literal></entry></row>
<row><entry><literal>&gt;= (timestamptz,timestamptz)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>timestamptz_minmax_multi_ops</literal></entry>
<entry><literal>= (timestamptz,timestamptz)</literal></entry>
</row>
<row><entry><literal>&lt; (timestamptz,timestamptz)</literal></entry></row>
<row><entry><literal>&lt;= (timestamptz,timestamptz)</literal></entry></row>
<row><entry><literal>&gt; (timestamptz,timestamptz)</literal></entry></row>
<row><entry><literal>&gt;= (timestamptz,timestamptz)</literal></entry></row>
<row>
<entry valign="middle"><literal>time_bloom_ops</literal></entry>
<entry><literal>= (time,time)</literal></entry>
@ -491,6 +638,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (time,time)</literal></entry></row>
<row><entry><literal>&gt;= (time,time)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>time_minmax_multi_ops</literal></entry>
<entry><literal>= (time,time)</literal></entry>
</row>
<row><entry><literal>&lt; (time,time)</literal></entry></row>
<row><entry><literal>&lt;= (time,time)</literal></entry></row>
<row><entry><literal>&gt; (time,time)</literal></entry></row>
<row><entry><literal>&gt;= (time,time)</literal></entry></row>
<row>
<entry valign="middle"><literal>timetz_bloom_ops</literal></entry>
<entry><literal>= (timetz,timetz)</literal></entry>
@ -505,6 +661,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&gt; (timetz,timetz)</literal></entry></row>
<row><entry><literal>&gt;= (timetz,timetz)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>timetz_minmax_multi_ops</literal></entry>
<entry><literal>= (timetz,timetz)</literal></entry>
</row>
<row><entry><literal>&lt; (timetz,timetz)</literal></entry></row>
<row><entry><literal>&lt;= (timetz,timetz)</literal></entry></row>
<row><entry><literal>&gt; (timetz,timetz)</literal></entry></row>
<row><entry><literal>&gt;= (timetz,timetz)</literal></entry></row>
<row>
<entry valign="middle"><literal>uuid_bloom_ops</literal></entry>
<entry><literal>= (uuid,uuid)</literal></entry>
@ -519,6 +684,15 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<row><entry><literal>&lt;= (uuid,uuid)</literal></entry></row>
<row><entry><literal>&gt;= (uuid,uuid)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>uuid_minmax_multi_ops</literal></entry>
<entry><literal>= (uuid,uuid)</literal></entry>
</row>
<row><entry><literal>&lt; (uuid,uuid)</literal></entry></row>
<row><entry><literal>&gt; (uuid,uuid)</literal></entry></row>
<row><entry><literal>&lt;= (uuid,uuid)</literal></entry></row>
<row><entry><literal>&gt;= (uuid,uuid)</literal></entry></row>
<row>
<entry valign="middle" morerows="4"><literal>varbit_minmax_ops</literal></entry>
<entry><literal>= (varbit,varbit)</literal></entry>
@ -537,8 +711,8 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
<para>
Some of the built-in operator classes allow specifying parameters affecting
behavior of the operator class. Each operator class has its own set of
allowed parameters. Only the <literal>bloom</literal> operator class
allows specifying parameters:
allowed parameters. Only the <literal>bloom</literal> and <literal>minmax-multi</literal>
operator classes allow specifying parameters:
</para>
<para>
@ -577,6 +751,25 @@ LOG: request for BRIN range summarization for index "brin_wi_idx" page 128 was
</varlistentry>
</variablelist>
<para>
<acronym>minmax-multi</acronym> operator classes accept these parameters:
</para>
<variablelist>
<varlistentry>
<term><literal>values_per_range</literal></term>
<listitem>
<para>
Defines the maximum number of values stored by <acronym>BRIN</acronym>
minmax-multi indexes to summarize a block range. Each value may represent
either a point, or a boundary of an interval. Values must be between
8 and 256, and the default value is 32.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
</sect1>
@ -715,13 +908,14 @@ typedef struct BrinOpcInfo
</varlistentry>
</variablelist>
The core distribution includes support for two types of operator classes:
minmax and inclusion. Operator class definitions using them are shipped for
in-core data types as appropriate. Additional operator classes can be
defined by the user for other data types using equivalent definitions,
without having to write any source code; appropriate catalog entries being
declared is enough. Note that assumptions about the semantics of operator
strategies are embedded in the support functions' source code.
The core distribution includes support for four types of operator classes:
minmax, minmax-multi, inclusion and bloom. Operator class definitions
using them are shipped for in-core data types as appropriate. Additional
operator classes can be defined by the user for other data types using
equivalent definitions, without having to write any source code;
appropriate catalog entries being declared is enough. Note that
assumptions about the semantics of operator strategies are embedded in the
support functions' source code.
</para>
<para>
@ -1018,6 +1212,72 @@ typedef struct BrinOpcInfo
and return a hash of the value.
</para>
<para>
The minmax-multi operator class is also intended for data types implementing
a totally ordered set, and may be seen as a simple extension of the minmax
operator class. While the minmax operator class summarizes values from each block
range into a single contiguous interval, minmax-multi allows summarization
into multiple smaller intervals to improve handling of outlier values.
It is possible to use the minmax-multi support procedures alongside the
corresponding operators, as shown in
<xref linkend="brin-extensibility-minmax-multi-table"/>.
All operator class members (procedures and operators) are mandatory.
</para>
<table id="brin-extensibility-minmax-multi-table">
<title>Procedure and Support Numbers for minmax-multi Operator Classes</title>
<tgroup cols="2">
<thead>
<row>
<entry>Operator class member</entry>
<entry>Object</entry>
</row>
</thead>
<tbody>
<row>
<entry>Support Procedure 1</entry>
<entry>internal function <function>brin_minmax_multi_opcinfo()</function></entry>
</row>
<row>
<entry>Support Procedure 2</entry>
<entry>internal function <function>brin_minmax_multi_add_value()</function></entry>
</row>
<row>
<entry>Support Procedure 3</entry>
<entry>internal function <function>brin_minmax_multi_consistent()</function></entry>
</row>
<row>
<entry>Support Procedure 4</entry>
<entry>internal function <function>brin_minmax_multi_union()</function></entry>
</row>
<row>
<entry>Support Procedure 11</entry>
<entry>function to compute distance between two values (length of a range)</entry>
</row>
<row>
<entry>Operator Strategy 1</entry>
<entry>operator less-than</entry>
</row>
<row>
<entry>Operator Strategy 2</entry>
<entry>operator less-than-or-equal-to</entry>
</row>
<row>
<entry>Operator Strategy 3</entry>
<entry>operator equal-to</entry>
</row>
<row>
<entry>Operator Strategy 4</entry>
<entry>operator greater-than-or-equal-to</entry>
</row>
<row>
<entry>Operator Strategy 5</entry>
<entry>operator greater-than</entry>
</row>
</tbody>
</tgroup>
</table>
<para>
Both minmax and inclusion operator classes support cross-data-type
operators, though with these the dependencies become more complicated.