mirror of
https://github.com/postgres/postgres.git
synced 2025-07-28 23:42:10 +03:00
Teach pageinspect about nbtree deduplication.
Add a new bt_metap() column to display the metapage's allequalimage
field. Also add three new columns to contrib/pageinspect's
bt_page_items() function:
* Add a boolean column ("dead") that displays the LP_DEAD bit value for
each non-pivot tuple.
* Add a TID column ("htid") that displays a single heap TID value for
each tuple. This is the TID that is returned by BTreeTupleGetHeapTID(),
so comparable values are shown for pivot tuples, plain non-pivot tuples,
and posting list tuples.
* Add a TID array column ("tids") that displays TIDs from each tuple's
posting list, if any. This works just like the "tids" column from
pageinspect's gin_leafpage_items() function.
No version bump for the pageinspect extension, since there hasn't been a
stable Postgres release since the last version bump (the last bump was
part of commit 58b4cb30
).
Author: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-WzmSMmU2eNvY9+a4MNP+z02h6sa-uxZvN3un6jY02ZVBSw@mail.gmail.com
This commit is contained in:
@ -300,13 +300,14 @@ test=# SELECT t_ctid, raw_flags, combined_flags
|
||||
test=# SELECT * FROM bt_metap('pg_cast_oid_index');
|
||||
-[ RECORD 1 ]-----------+-------
|
||||
magic | 340322
|
||||
version | 3
|
||||
version | 4
|
||||
root | 1
|
||||
level | 0
|
||||
fastroot | 1
|
||||
fastlevel | 0
|
||||
oldest_xact | 582
|
||||
last_cleanup_num_tuples | 1000
|
||||
allequalimage | f
|
||||
</screen>
|
||||
</para>
|
||||
</listitem>
|
||||
@ -329,11 +330,11 @@ test=# SELECT * FROM bt_page_stats('pg_cast_oid_index', 1);
|
||||
-[ RECORD 1 ]-+-----
|
||||
blkno | 1
|
||||
type | l
|
||||
live_items | 256
|
||||
live_items | 224
|
||||
dead_items | 0
|
||||
avg_item_size | 12
|
||||
avg_item_size | 16
|
||||
page_size | 8192
|
||||
free_size | 4056
|
||||
free_size | 3668
|
||||
btpo_prev | 0
|
||||
btpo_next | 0
|
||||
btpo | 0
|
||||
@ -356,33 +357,75 @@ btpo_flags | 3
|
||||
<function>bt_page_items</function> returns detailed information about
|
||||
all of the items on a B-tree index page. For example:
|
||||
<screen>
|
||||
test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
|
||||
itemoffset | ctid | itemlen | nulls | vars | data
|
||||
------------+---------+---------+-------+------+-------------
|
||||
1 | (0,1) | 12 | f | f | 23 27 00 00
|
||||
2 | (0,2) | 12 | f | f | 24 27 00 00
|
||||
3 | (0,3) | 12 | f | f | 25 27 00 00
|
||||
4 | (0,4) | 12 | f | f | 26 27 00 00
|
||||
5 | (0,5) | 12 | f | f | 27 27 00 00
|
||||
6 | (0,6) | 12 | f | f | 28 27 00 00
|
||||
7 | (0,7) | 12 | f | f | 29 27 00 00
|
||||
8 | (0,8) | 12 | f | f | 2a 27 00 00
|
||||
test=# SELECT itemoffset, ctid, itemlen, nulls, vars, data, dead, htid, tids[0:2] AS some_tids
|
||||
FROM bt_page_items(get_raw_page('tenk2_hundred', 5));
|
||||
itemoffset | ctid | itemlen | nulls | vars | data | dead | htid | some_tids
|
||||
------------+-----------+---------+-------+------+-------------------------+------+--------+---------------------
|
||||
1 | (16,1) | 16 | f | f | 30 00 00 00 00 00 00 00 | | |
|
||||
2 | (16,8292) | 616 | f | f | 24 00 00 00 00 00 00 00 | f | (1,6) | {"(1,6)","(10,22)"}
|
||||
3 | (16,8292) | 616 | f | f | 25 00 00 00 00 00 00 00 | f | (1,18) | {"(1,18)","(4,22)"}
|
||||
4 | (16,8292) | 616 | f | f | 26 00 00 00 00 00 00 00 | f | (4,18) | {"(4,18)","(6,17)"}
|
||||
5 | (16,8292) | 616 | f | f | 27 00 00 00 00 00 00 00 | f | (1,2) | {"(1,2)","(1,19)"}
|
||||
6 | (16,8292) | 616 | f | f | 28 00 00 00 00 00 00 00 | f | (2,24) | {"(2,24)","(4,11)"}
|
||||
7 | (16,8292) | 616 | f | f | 29 00 00 00 00 00 00 00 | f | (2,17) | {"(2,17)","(11,2)"}
|
||||
8 | (16,8292) | 616 | f | f | 2a 00 00 00 00 00 00 00 | f | (0,25) | {"(0,25)","(3,20)"}
|
||||
9 | (16,8292) | 616 | f | f | 2b 00 00 00 00 00 00 00 | f | (0,10) | {"(0,10)","(0,14)"}
|
||||
10 | (16,8292) | 616 | f | f | 2c 00 00 00 00 00 00 00 | f | (1,3) | {"(1,3)","(3,9)"}
|
||||
11 | (16,8292) | 616 | f | f | 2d 00 00 00 00 00 00 00 | f | (6,28) | {"(6,28)","(11,1)"}
|
||||
12 | (16,8292) | 616 | f | f | 2e 00 00 00 00 00 00 00 | f | (0,27) | {"(0,27)","(1,13)"}
|
||||
13 | (16,8292) | 616 | f | f | 2f 00 00 00 00 00 00 00 | f | (4,17) | {"(4,17)","(4,21)"}
|
||||
(13 rows)
|
||||
</screen>
|
||||
In a B-tree leaf page, <structfield>ctid</structfield> points to a heap tuple.
|
||||
In an internal page, the block number part of <structfield>ctid</structfield>
|
||||
points to another page in the index itself, while the offset part
|
||||
(the second number) is ignored and is usually 1.
|
||||
This is a B-tree leaf page. All tuples that point to the table
|
||||
happen to be posting list tuples (all of which store a total of
|
||||
100 6 byte TIDs). There is also a <quote>high key</quote> tuple
|
||||
at <literal>itemoffset</literal> number 1.
|
||||
<structfield>ctid</structfield> is used to store encoded
|
||||
information about each tuple in this example, though leaf page
|
||||
tuples often store a heap TID directly in the
|
||||
<structfield>ctid</structfield> field instead.
|
||||
<structfield>tids</structfield> is the list of TIDs stored as a
|
||||
posting list.
|
||||
</para>
|
||||
<para>
|
||||
In an internal page (not shown), the block number part of
|
||||
<structfield>ctid</structfield> is a <quote>downlink</quote>,
|
||||
which is a block number of another page in the index itself.
|
||||
The offset part (the second number) of
|
||||
<structfield>ctid</structfield> stores encoded information about
|
||||
the tuple, such as the number of columns present (suffix
|
||||
truncation may have removed unneeded suffix columns). Truncated
|
||||
columns are treated as having the value <quote>minus
|
||||
infinity</quote>.
|
||||
</para>
|
||||
<para>
|
||||
<structfield>htid</structfield> shows a heap TID for the tuple,
|
||||
regardless of the underlying tuple representation. This value
|
||||
may match <structfield>ctid</structfield>, or may be decoded
|
||||
from the alternative representations used by posting list tuples
|
||||
and tuples from internal pages. Tuples in internal pages
|
||||
usually have the implementation level heap TID column truncated
|
||||
away, which is represented as a NULL
|
||||
<structfield>htid</structfield> value.
|
||||
</para>
|
||||
<para>
|
||||
Note that the first item on any non-rightmost page (any page with
|
||||
a non-zero value in the <structfield>btpo_next</structfield> field) is the
|
||||
page's <quote>high key</quote>, meaning its <structfield>data</structfield>
|
||||
serves as an upper bound on all items appearing on the page, while
|
||||
its <structfield>ctid</structfield> field is meaningless. Also, on non-leaf
|
||||
pages, the first real data item (the first item that is not a high
|
||||
key) is a <quote>minus infinity</quote> item, with no actual value
|
||||
in its <structfield>data</structfield> field. Such an item does have a valid
|
||||
downlink in its <structfield>ctid</structfield> field, however.
|
||||
its <structfield>ctid</structfield> field does not point to
|
||||
another block. Also, on internal pages, the first real data
|
||||
item (the first item that is not a high key) reliably has every
|
||||
column truncated away, leaving no actual value in its
|
||||
<structfield>data</structfield> field. Such an item does have a
|
||||
valid downlink in its <structfield>ctid</structfield> field,
|
||||
however.
|
||||
</para>
|
||||
<para>
|
||||
For more details about the structure of B-tree indexes, see
|
||||
<xref linkend="btree-structure"/>. For more details about
|
||||
deduplication and posting lists, see <xref
|
||||
linkend="btree-deduplication"/>.
|
||||
</para>
|
||||
</listitem>
|
||||
</varlistentry>
|
||||
@ -402,17 +445,24 @@ test=# SELECT * FROM bt_page_items('pg_cast_oid_index', 1);
|
||||
with <function>get_raw_page</function> should be passed as argument. So
|
||||
the last example could also be rewritten like this:
|
||||
<screen>
|
||||
test=# SELECT * FROM bt_page_items(get_raw_page('pg_cast_oid_index', 1));
|
||||
itemoffset | ctid | itemlen | nulls | vars | data
|
||||
------------+---------+---------+-------+------+-------------
|
||||
1 | (0,1) | 12 | f | f | 23 27 00 00
|
||||
2 | (0,2) | 12 | f | f | 24 27 00 00
|
||||
3 | (0,3) | 12 | f | f | 25 27 00 00
|
||||
4 | (0,4) | 12 | f | f | 26 27 00 00
|
||||
5 | (0,5) | 12 | f | f | 27 27 00 00
|
||||
6 | (0,6) | 12 | f | f | 28 27 00 00
|
||||
7 | (0,7) | 12 | f | f | 29 27 00 00
|
||||
8 | (0,8) | 12 | f | f | 2a 27 00 00
|
||||
test=# SELECT itemoffset, ctid, itemlen, nulls, vars, data, dead, htid, tids[0:2] AS some_tids
|
||||
FROM bt_page_items(get_raw_page('tenk2_hundred', 5));
|
||||
itemoffset | ctid | itemlen | nulls | vars | data | dead | htid | some_tids
|
||||
------------+-----------+---------+-------+------+-------------------------+------+--------+---------------------
|
||||
1 | (16,1) | 16 | f | f | 30 00 00 00 00 00 00 00 | | |
|
||||
2 | (16,8292) | 616 | f | f | 24 00 00 00 00 00 00 00 | f | (1,6) | {"(1,6)","(10,22)"}
|
||||
3 | (16,8292) | 616 | f | f | 25 00 00 00 00 00 00 00 | f | (1,18) | {"(1,18)","(4,22)"}
|
||||
4 | (16,8292) | 616 | f | f | 26 00 00 00 00 00 00 00 | f | (4,18) | {"(4,18)","(6,17)"}
|
||||
5 | (16,8292) | 616 | f | f | 27 00 00 00 00 00 00 00 | f | (1,2) | {"(1,2)","(1,19)"}
|
||||
6 | (16,8292) | 616 | f | f | 28 00 00 00 00 00 00 00 | f | (2,24) | {"(2,24)","(4,11)"}
|
||||
7 | (16,8292) | 616 | f | f | 29 00 00 00 00 00 00 00 | f | (2,17) | {"(2,17)","(11,2)"}
|
||||
8 | (16,8292) | 616 | f | f | 2a 00 00 00 00 00 00 00 | f | (0,25) | {"(0,25)","(3,20)"}
|
||||
9 | (16,8292) | 616 | f | f | 2b 00 00 00 00 00 00 00 | f | (0,10) | {"(0,10)","(0,14)"}
|
||||
10 | (16,8292) | 616 | f | f | 2c 00 00 00 00 00 00 00 | f | (1,3) | {"(1,3)","(3,9)"}
|
||||
11 | (16,8292) | 616 | f | f | 2d 00 00 00 00 00 00 00 | f | (6,28) | {"(6,28)","(11,1)"}
|
||||
12 | (16,8292) | 616 | f | f | 2e 00 00 00 00 00 00 00 | f | (0,27) | {"(0,27)","(1,13)"}
|
||||
13 | (16,8292) | 616 | f | f | 2f 00 00 00 00 00 00 00 | f | (4,17) | {"(4,17)","(4,21)"}
|
||||
(13 rows)
|
||||
</screen>
|
||||
All the other details are the same as explained in the previous item.
|
||||
</para>
|
||||
|
Reference in New Issue
Block a user