Mirror of https://github.com/postgres/postgres.git (synced 2025-04-27 22:56:53 +03:00)
Minor editorialization on storage.sgml's documentation of free space maps.
parent 2d6e2323a4
commit 03a5ff0d1a
@@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/storage.sgml,v 1.27 2009/04/23 10:20:27 heikki Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/storage.sgml,v 1.28 2009/05/16 22:03:53 tgl Exp $ -->
 
 <chapter id="storage">
 
@@ -33,7 +33,7 @@ these required items, the cluster configuration files
 <filename>postgresql.conf</filename>, <filename>pg_hba.conf</filename>, and
 <filename>pg_ident.conf</filename> are traditionally stored in
 <varname>PGDATA</> (although in <productname>PostgreSQL</productname> 8.0 and
 later, it is possible to keep them elsewhere).
 </para>
 
 <table tocentry="1" id="pgdata-contents-table">
@@ -74,7 +74,7 @@ Item
 <row>
 <entry><filename>pg_multixact</></entry>
 <entry>Subdirectory containing multitransaction status data
 (used for shared row locks)</entry>
 </row>
 
 <row>
@@ -131,12 +131,12 @@ there.
 Each table and index is stored in a separate file, named after the table
 or index's <firstterm>filenode</> number, which can be found in
 <structname>pg_class</>.<structfield>relfilenode</>. In addition to the
-main file (aka. main fork), a <firstterm>free space map</> (see
-<xref linkend="storage-fsm">) that stores information about free space
-available in the relation, is stored in a file named after the filenode
-number, with the <literal>_fsm</> suffix. Tables also have a visibility map
-fork, with the <literal>_vm</> suffix, to track which pages are known to have
-no dead tuples and therefore need no vacuuming.
+main file (a/k/a main fork), each table and index has a <firstterm>free space
+map</> (see <xref linkend="storage-fsm">), which stores information about free
+space available in the relation. The free space map is stored in a file named
+with the filenode number plus the suffix <literal>_fsm</>. Tables also have a
+visibility map fork, with the suffix <literal>_vm</>, to track which pages are
+known to have no dead tuples and therefore need no vacuuming.
 </para>
 
 <caution>
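The fork naming described in the hunk above can be illustrated with a minimal Python sketch. The function name `fork_file_names` and its `is_table` flag are hypothetical, introduced only for this illustration; only the `_fsm` and `_vm` suffixes come from the documentation text.

```python
def fork_file_names(filenode: int, is_table: bool = True) -> dict:
    """Return the on-disk file names of a relation's forks.

    Hypothetical helper: the main fork is named after the filenode number,
    the free space map adds the suffix "_fsm", and tables additionally
    have a visibility map fork with the suffix "_vm".
    """
    names = {"main": str(filenode), "fsm": f"{filenode}_fsm"}
    if is_table:
        # Only tables have a visibility map fork, per the text above.
        names["vm"] = f"{filenode}_vm"
    return names


table_forks = fork_file_names(12345)
index_forks = fork_file_names(12345, is_table=False)
```

The sketch ignores the exception for hash indexes, which a later hunk says have no free space map.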
@@ -157,6 +157,8 @@ This arrangement avoids problems on platforms that have file size limitations.
 (Actually, 1 GB is just the default segment size. The segment size can be
 adjusted using the configuration option <option>--with-segsize</option>
 when building <productname>PostgreSQL</>.)
+In principle, free space map and visibility map forks could require multiple
+segments as well, though this is unlikely to happen in practice.
 The contents of tables and indexes are discussed further in
 <xref linkend="storage-page-layout">.
 </para>
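As a rough illustration of segmenting, the following Python sketch maps a block number to a segment file and a byte offset. It assumes the default 1 GB segment size and the common 8 kB page size, and it assumes the segment naming used elsewhere in the storage chapter (the first segment is named after the filenode, later ones get a `.1`, `.2`, ... suffix). The function name `segment_for_block` is hypothetical.

```python
BLOCK_SIZE = 8192        # assumed page size (8 kB is common, not guaranteed)
SEGMENT_SIZE = 1 << 30   # assumed default segment size of 1 GB


def segment_for_block(filenode: int, block_number: int) -> tuple:
    """Return (segment file name, byte offset within that segment)."""
    blocks_per_segment = SEGMENT_SIZE // BLOCK_SIZE
    segno, block_in_segment = divmod(block_number, blocks_per_segment)
    # Segment 0 carries no numeric suffix; later segments do.
    name = str(filenode) if segno == 0 else f"{filenode}.{segno}"
    return name, block_in_segment * BLOCK_SIZE
```

With these assumptions a segment holds 131072 blocks, so a fork needs a second segment only once it exceeds that many pages, which is why the text calls multi-segment free space map and visibility map forks unlikely in practice.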
@@ -193,7 +195,7 @@ if a tablespace other than <literal>pg_default</> is specified for them.
 The name of a temporary file has the form
 <filename>pgsql_tmp<replaceable>PPP</>.<replaceable>NNN</></filename>,
 where <replaceable>PPP</> is the PID of the owning backend and
-<replaceable>NNN</> distinguishes different files of that backend.
+<replaceable>NNN</> distinguishes different temporary files of that backend.
 </para>
 
 </sect1>
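The temporary file naming pattern in the hunk above can be parsed with a short Python sketch. The helper name `parse_temp_file_name` is hypothetical, and the sketch assumes that both PPP and NNN are plain decimal numbers.

```python
import re

# pgsql_tmpPPP.NNN: PPP is the owning backend's PID, NNN distinguishes
# the different temporary files of that backend.
TEMP_FILE_RE = re.compile(r"^pgsql_tmp(\d+)\.(\d+)$")


def parse_temp_file_name(name: str):
    """Return (backend_pid, file_number), or None if the name does not match."""
    match = TEMP_FILE_RE.match(name)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

A name such as `12345_fsm` does not match the pattern, so the helper returns None for relation files.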
@@ -215,10 +217,10 @@ Oversized-Attribute Storage Technique).
 <para>
 <productname>PostgreSQL</productname> uses a fixed page size (commonly
 8 kB), and does not allow tuples to span multiple pages. Therefore, it is
 not possible to store very large field values directly. To overcome
 this limitation, large field values are compressed and/or broken up into
 multiple physical rows. This happens transparently to the user, with only
 small impact on most of the backend code. The technique is affectionately
 known as <acronym>TOAST</> (or <quote>the best thing since sliced bread</>).
 </para>
 
@@ -377,24 +379,24 @@ comparison table, in which all the HTML pages were cut down to 7 kB to fit.
 
 <title>Free Space Map</title>
 
 <indexterm>
 <primary>Free Space Map</primary>
 </indexterm>
 <indexterm><primary>FSM</><see>Free Space Map</></indexterm>
 
 <para>
-A Free Space Map is stored with every heap and index relation, except for
-hash indexes, to keep track of available space in the relation. It's stored
-along the main relation data, in a separate FSM relation fork, named after
-relfilenode of the relation, but with a <literal>_fsm</> suffix. For example,
-if the relfilenode of a relation is 12345, the FSM is stored in a file called
+Each heap and index relation, except for hash indexes, has a Free Space Map
+(FSM) to keep track of available space in the relation. It's stored
+alongside the main relation data in a separate relation fork, named after the
+filenode number of the relation, plus a <literal>_fsm</> suffix. For example,
+if the filenode of a relation is 12345, the FSM is stored in a file called
 <filename>12345_fsm</>, in the same directory as the main relation file.
 </para>
 
 <para>
 The Free Space Map is organized as a tree of <acronym>FSM</> pages. The
-bottom level <acronym>FSM</> pages stores the free space available on every
-heap (or index) page, using one byte to represent each heap page. The upper
+bottom level <acronym>FSM</> pages store the free space available on each
+heap (or index) page, using one byte to represent each such page. The upper
 levels aggregate information from the lower levels.
 </para>
 
@@ -409,8 +411,8 @@ at the root.
 <para>
 See <filename>src/backend/storage/freespace/README</> for more details on
 how the <acronym>FSM</> is structured, and how it's updated and searched.
-<xref linkend="pgfreespacemap"> contrib module can be used to view the
-information stored in free space maps.
+The <filename>contrib/pg_freespacemap</> module can be used to examine the
+information stored in free space maps (see <xref linkend="pgfreespacemap">).
 </para>
 
 </sect1>
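The idea of a tree whose bottom level holds one byte per page and whose upper levels aggregate the lower ones can be sketched in Python as follows. This is a simplified illustration, not the on-disk FSM layout: it assumes an 8 kB page, assumes each byte counts free space in steps of 1/256 of a page, and assumes the aggregate is the maximum of the children. All function names are hypothetical.

```python
BLOCK_SIZE = 8192
CAT_STEP = BLOCK_SIZE // 256   # assumed granularity: one category per 32 bytes


def free_category(free_bytes: int) -> int:
    """Round the free space on a page down to a one-byte category."""
    return min(free_bytes // CAT_STEP, 255)


def needed_category(needed_bytes: int) -> int:
    """Round a request up, so a matching page really has enough room."""
    return min((needed_bytes + CAT_STEP - 1) // CAT_STEP, 255)


def build_levels(leaf_categories):
    """Return the tree levels, bottom first; each parent is the max of its children."""
    levels = [list(leaf_categories)]
    while len(levels[-1]) > 1:
        below = levels[-1]
        levels.append([max(below[i:i + 2]) for i in range(0, len(below), 2)])
    return levels


def find_page(levels, needed_bytes: int):
    """Descend from the root to a page with enough free space, or return None."""
    want = needed_category(needed_bytes)
    if not levels[-1] or levels[-1][0] < want:
        return None                      # the root says no page is large enough
    index = 0
    for level in reversed(levels[:-1]):
        left = 2 * index
        # Go left if that subtree can satisfy the request, otherwise right.
        index = left if level[left] >= want else left + 1
    return index


levels = build_levels([free_category(n) for n in (100, 4000, 0, 8000)])
```

Because the root holds the largest category in the whole map, a request that cannot be satisfied is rejected after a single comparison, without visiting any leaf.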
@@ -515,7 +517,7 @@ data. Empty in ordinary tables.</entry>
 and <structfield>pd_special</structfield>). These contain byte offsets
 from the page start to the start
 of unallocated space, to the end of unallocated space, and to the start of
 the special space.
 The next 2 bytes of the page header,
 <structfield>pd_pagesize_version</structfield>, store both the page size
 and a version indicator. Beginning with
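The page header fields mentioned in the hunk above can be illustrated with a small Python sketch. It assumes that `pd_pagesize_version` keeps the page size in its high byte and the layout version in its low byte, and that the two offsets marking the start and end of unallocated space are the fields named `pd_lower` and `pd_upper`; these details are not shown in the hunk itself, and the helper names are hypothetical.

```python
def decode_pagesize_version(value: int) -> tuple:
    """Split a 16-bit pd_pagesize_version value into (page size, version).

    Assumption: the page size is always a multiple of 256, so the low
    byte is free to carry the layout version number.
    """
    return value & 0xFF00, value & 0x00FF


def unallocated_bytes(pd_lower: int, pd_upper: int) -> int:
    """Free space between the start and the end of unallocated space."""
    return pd_upper - pd_lower


page_size, version = decode_pagesize_version(8192 | 4)
```

Packing both values into 2 bytes works only because of the multiple-of-256 assumption stated in the docstring.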
@@ -530,15 +532,15 @@ data. Empty in ordinary tables.</entry>
 more than one page size in an installation.
 The last field is a hint that shows whether pruning the page is likely
 to be profitable: it tracks the oldest un-pruned XMAX on the page.
 
 </para>
 
 <table tocentry="1" id="pageheaderdata-table">
 <title>PageHeaderData Layout</title>
 <titleabbrev>PageHeaderData Layout</titleabbrev>
 <tgroup cols="4">
 <thead>
 <row>
 <entry>Field</entry>
 <entry>Type</entry>
 <entry>Length</entry>
@@ -627,25 +629,25 @@ data. Empty in ordinary tables.</entry>
 </para>
 
 <para>
 
 The items themselves are stored in space allocated backwards from the end
 of unallocated space. The exact structure varies depending on what the
 table is to contain. Tables and sequences both use a structure named
 <type>HeapTupleHeaderData</type>, described below.
 
 </para>
 
 <para>
 
 The final section is the <quote>special section</quote> which can
 contain anything the access method wishes to store. For example,
 b-tree indexes store links to the page's left and right siblings,
 as well as some other data relevant to the index structure.
 Ordinary tables do not use a special section at all (indicated by setting
 <structfield>pd_special</> to equal the page size).
 
 </para>
 
 <para>
 
 All table rows are structured in the same way. There is a fixed-size
@@ -669,15 +671,15 @@ data. Empty in ordinary tables.</entry>
 <structfield>t_hoff</> a MAXALIGN multiple will appear between the null
 bitmap and the object ID. (This in turn ensures that the object ID is
 suitably aligned.)
 
 </para>
 
 <table tocentry="1" id="heaptupleheaderdata-table">
 <title>HeapTupleHeaderData Layout</title>
 <titleabbrev>HeapTupleHeaderData Layout</titleabbrev>
 <tgroup cols="4">
 <thead>
 <row>
 <entry>Field</entry>
 <entry>Type</entry>
 <entry>Length</entry>
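The rounding of <structfield>t_hoff</> to a MAXALIGN multiple can be worked through with a short Python sketch. It assumes a MAXALIGN distance of 8 bytes and a fixed header of 23 bytes, which holds on most machines but not all; the 4-byte object ID size and the helper names are likewise assumptions made for illustration.

```python
MAXIMUM_ALIGNOF = 8       # assumed MAXALIGN distance for this platform
FIXED_HEADER_SIZE = 23    # assumed size of the fixed part of the tuple header
OID_SIZE = 4              # assumed size of the optional object ID


def maxalign(size: int) -> int:
    """Round size up to the next multiple of MAXIMUM_ALIGNOF."""
    return (size + MAXIMUM_ALIGNOF - 1) & ~(MAXIMUM_ALIGNOF - 1)


def tuple_header_offset(natts: int, has_nulls: bool, has_oid: bool = False) -> int:
    """Sketch of t_hoff: header + null bitmap + object ID, padded to MAXALIGN."""
    size = FIXED_HEADER_SIZE
    if has_nulls:
        size += (natts + 7) // 8   # null bitmap: one bit per column
    if has_oid:
        size += OID_SIZE
    return maxalign(size)
```

Under these assumptions a tuple with no nulls and no object ID has its user data starting at byte 24, and adding a 2-byte null bitmap for ten columns pushes the start to byte 32.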
@@ -743,7 +745,7 @@ data. Empty in ordinary tables.</entry>
 </para>
 
 <para>
 
 Interpreting the actual data can only be done with information obtained
 from other tables, mostly <structname>pg_attribute</structname>. The
 key values needed to identify field locations are
@@ -753,7 +755,7 @@ data. Empty in ordinary tables.</entry>
 null values. All this trickery is wrapped up in the functions
 <firstterm>heap_getattr</firstterm>, <firstterm>fastgetattr</firstterm>
 and <firstterm>heap_getsysattr</firstterm>.
 
 </para>
 <para>
 
@@ -767,7 +769,7 @@ data. Empty in ordinary tables.</entry>
 value and some flag bits. Depending on the flags, the data can be either
 inline or in a <acronym>TOAST</> table;
 it might be compressed, too (see <xref linkend="storage-toast">).
 
 </para>
 </sect1>
 