1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Add hash partitioning.

Hash partitioning is useful when you want to partition a growing data
set evenly.  This can be useful to keep table sizes reasonable, which
makes maintenance operations such as VACUUM faster, or to enable
partition-wise join.

At present, we still depend on constraint exclusion for partitioning
pruning, and the shape of the partition constraints for hash
partitioning is such that that doesn't work.  Work is underway to fix
that, which should both improve performance and make partitioning
pruning work with hash partitioning.

Amul Sul, reviewed and tested by Dilip Kumar, Ashutosh Bapat, Yugo
Nagata, Rajkumar Raghuwanshi, Jesper Pedersen, and by me.  A few
final tweaks also by me.

Discussion: http://postgr.es/m/CAAJ_b96fhpJAP=ALbETmeLk1Uni_GFZD938zgenhF49qgDTjaQ@mail.gmail.com
This commit is contained in:
Robert Haas
2017-11-09 18:07:25 -05:00
parent e7397f015c
commit 1aba8e651a
30 changed files with 1421 additions and 121 deletions

View File

@ -2875,6 +2875,19 @@ VALUES ('Albany', NULL, NULL, 'NY');
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Hash Partitioning</term>
<listitem>
<para>
The table is partitioned by specifying a modulus and a remainder for
each partition. Each partition will hold the rows for which the hash
value of the partition key divided by the specified modulus will
produce the specified remainder.
</para>
</listitem>
</varlistentry>
</variablelist>
If your application needs to use other forms of partitioning not listed
@ -2901,9 +2914,8 @@ VALUES ('Albany', NULL, NULL, 'NY');
All rows inserted into a partitioned table will be routed to one of the
<firstterm>partitions</firstterm> based on the value of the partition
key. Each partition has a subset of the data defined by its
<firstterm>partition bounds</firstterm>. Currently supported
partitioning methods include range and list, where each partition is
assigned a range of keys and a list of keys, respectively.
<firstterm>partition bounds</firstterm>. The currently supported
partitioning methods are range, list, and hash.
</para>
<para>
@ -3328,11 +3340,11 @@ ALTER TABLE measurement ATTACH PARTITION measurement_y2008m02
<listitem>
<para>
Declarative partitioning only supports list and range partitioning,
whereas table inheritance allows data to be divided in a manner of
the user's choosing. (Note, however, that if constraint exclusion is
unable to prune partitions effectively, query performance will be very
poor.)
Declarative partitioning only supports range, list and hash
partitioning, whereas table inheritance allows data to be divided in a
manner of the user's choosing. (Note, however, that if constraint
exclusion is unable to prune partitions effectively, query performance
will be very poor.)
</para>
</listitem>

View File

@ -1431,6 +1431,13 @@ ALTER TABLE cities
ATTACH PARTITION cities_partdef DEFAULT;
</programlisting></para>
<para>
Attach a partition to hash partitioned table:
<programlisting>
ALTER TABLE orders
ATTACH PARTITION orders_p4 FOR VALUES WITH (MODULUS 4, REMAINDER 3);
</programlisting></para>
<para>
Detach a partition from partitioned table:
<programlisting>

View File

@ -28,7 +28,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
[, ... ]
] )
[ INHERITS ( <replaceable>parent_table</replaceable> [, ... ] ) ]
[ PARTITION BY { RANGE | LIST } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ PARTITION BY { RANGE | LIST | HASH } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]
[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
[ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ]
@ -39,7 +39,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
| <replaceable>table_constraint</replaceable> }
[, ... ]
) ]
[ PARTITION BY { RANGE | LIST } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ PARTITION BY { RANGE | LIST | HASH } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]
[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
[ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ]
@ -50,7 +50,7 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
| <replaceable>table_constraint</replaceable> }
[, ... ]
) ] { FOR VALUES <replaceable class="parameter">partition_bound_spec</replaceable> | DEFAULT }
[ PARTITION BY { RANGE | LIST } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ PARTITION BY { RANGE | LIST | HASH } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ COLLATE <replaceable class="parameter">collation</replaceable> ] [ <replaceable class="parameter">opclass</replaceable> ] [, ... ] ) ]
[ WITH ( <replaceable class="parameter">storage_parameter</replaceable> [= <replaceable class="parameter">value</replaceable>] [, ... ] ) | WITH OIDS | WITHOUT OIDS ]
[ ON COMMIT { PRESERVE ROWS | DELETE ROWS | DROP } ]
[ TABLESPACE <replaceable class="parameter">tablespace_name</replaceable> ]
@ -88,7 +88,8 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
IN ( { <replaceable class="parameter">numeric_literal</replaceable> | <replaceable class="parameter">string_literal</replaceable> | NULL } [, ...] ) |
FROM ( { <replaceable class="parameter">numeric_literal</replaceable> | <replaceable class="parameter">string_literal</replaceable> | MINVALUE | MAXVALUE } [, ...] )
TO ( { <replaceable class="parameter">numeric_literal</replaceable> | <replaceable class="parameter">string_literal</replaceable> | MINVALUE | MAXVALUE } [, ...] )
TO ( { <replaceable class="parameter">numeric_literal</replaceable> | <replaceable class="parameter">string_literal</replaceable> | MINVALUE | MAXVALUE } [, ...] ) |
WITH ( MODULUS <replaceable class="parameter">numeric_literal</replaceable>, REMAINDER <replaceable class="parameter">numeric_literal</replaceable> )
<phrase><replaceable class="parameter">index_parameters</replaceable> in <literal>UNIQUE</literal>, <literal>PRIMARY KEY</literal>, and <literal>EXCLUDE</literal> constraints are:</phrase>
@ -256,7 +257,8 @@ FROM ( { <replaceable class="parameter">numeric_literal</replaceable> | <replace
Creates the table as a <firstterm>partition</firstterm> of the specified
parent table. The table can be created either as a partition for specific
values using <literal>FOR VALUES</literal> or as a default partition
using <literal>DEFAULT</literal>.
using <literal>DEFAULT</literal>. This option is not available for
hash-partitioned tables.
</para>
<para>
@ -264,8 +266,9 @@ FROM ( { <replaceable class="parameter">numeric_literal</replaceable> | <replace
must correspond to the partitioning method and partition key of the
parent table, and must not overlap with any existing partition of that
parent. The form with <literal>IN</literal> is used for list partitioning,
while the form with <literal>FROM</literal> and <literal>TO</literal> is used for
range partitioning.
the form with <literal>FROM</literal> and <literal>TO</literal> is used
for range partitioning, and the form with <literal>WITH</literal> is used
for hash partitioning.
</para>
<para>
@ -363,6 +366,29 @@ FROM ( { <replaceable class="parameter">numeric_literal</replaceable> | <replace
partition.
</para>
<para>
When creating a hash partition, a modulus and remainder must be specified.
The modulus must be a positive integer, and the remainder must be a
non-negative integer less than the modulus. Typically, when initially
setting up a hash-partitioned table, you should choose a modulus equal to
the number of partitions and assign every table the same modulus and a
different remainder (see examples, below). However, it is not required
that every partition have the same modulus, only that every modulus which
occurs among the partitions of a hash-partitioned table is a factor of the
next larger modulus. This allows the number of partitions to be increased
incrementally without needing to move all the data at once. For example,
suppose you have a hash-partitioned table with 8 partitions, each of which
has modulus 8, but find it necessary to increase the number of partitions
to 16. You can detach one of the modulus-8 partitions, create two new
modulus-16 partitions covering the same portion of the key space (one with
a remainder equal to the remainder of the detached partition, and the
other with a remainder equal to that value plus 8), and repopulate them
with data. You can then repeat this -- perhaps at a later time -- for
each modulus-8 partition until none remain. While this may still involve
a large amount of data movement at each step, it is still better than
having to create a whole new table and move all the data at once.
</para>
<para>
A partition must have the same column names and types as the partitioned
table to which it belongs. If the parent is specified <literal>WITH
@ -486,20 +512,28 @@ FROM ( { <replaceable class="parameter">numeric_literal</replaceable> | <replace
</varlistentry>
<varlistentry>
<term><literal>PARTITION BY { RANGE | LIST } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ <replaceable class="parameter">opclass</replaceable> ] [, ...] ) </literal></term>
<term><literal>PARTITION BY { RANGE | LIST | HASH } ( { <replaceable class="parameter">column_name</replaceable> | ( <replaceable class="parameter">expression</replaceable> ) } [ <replaceable class="parameter">opclass</replaceable> ] [, ...] ) </literal></term>
<listitem>
<para>
The optional <literal>PARTITION BY</literal> clause specifies a strategy
of partitioning the table. The table thus created is called a
<firstterm>partitioned</firstterm> table. The parenthesized list of
columns or expressions forms the <firstterm>partition key</firstterm>
for the table. When using range partitioning, the partition key can
include multiple columns or expressions (up to 32, but this limit can be
altered when building <productname>PostgreSQL</productname>), but for
for the table. When using range or hash partitioning, the partition key
can include multiple columns or expressions (up to 32, but this limit can
be altered when building <productname>PostgreSQL</productname>), but for
list partitioning, the partition key must consist of a single column or
expression. If no B-tree operator class is specified when creating a
partitioned table, the default B-tree operator class for the datatype will
be used. If there is none, an error will be reported.
expression.
</para>
<para>
Range and list partitioning require a btree operator class, while hash
partitioning requires a hash operator class. If no operator class is
specified explicitly, the default operator class of the appropriate
type will be used; if no default operator class exists, an error will
be raised. When hash partitioning is used, the operator class used
must implement support function 2 (see <xref linkend="xindex-support">
for details).
</para>
<para>
@ -1647,6 +1681,16 @@ CREATE TABLE cities (
name text not null,
population bigint
) PARTITION BY LIST (left(lower(name), 1));
</programlisting></para>
<para>
Create a hash partitioned table:
<programlisting>
CREATE TABLE orders (
order_id bigint not null,
cust_id bigint not null,
status text
) PARTITION BY HASH (order_id);
</programlisting></para>
<para>
@ -1701,6 +1745,19 @@ CREATE TABLE cities_ab_10000_to_100000
PARTITION OF cities_ab FOR VALUES FROM (10000) TO (100000);
</programlisting></para>
<para>
Create partitions of a hash partitioned table:
<programlisting>
CREATE TABLE orders_p1 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 0);
CREATE TABLE orders_p2 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 1);
CREATE TABLE orders_p3 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 2);
CREATE TABLE orders_p4 PARTITION OF orders
FOR VALUES WITH (MODULUS 4, REMAINDER 3);
</programlisting></para>
<para>
Create a default partition:
<programlisting>