1
0
mirror of https://github.com/postgres/postgres.git synced 2025-10-25 13:17:41 +03:00
Commit Graph

31 Commits

Author SHA1 Message Date
Michael Paquier
cd4868a570 pageinspect: Fix handling of all-zero pages
Getting from get_raw_page() an all-zero page is considered as a valid
case by the buffer manager and it can happen for example when finding a
corrupted page with zero_damaged_pages enabled (using zero_damaged_pages
to look at corrupted pages happens), or after a crash when a relation
file is extended before any WAL for its new data is generated (before a
vacuum or autovacuum job comes in to do some cleanup).

However, all the functions of pageinspect, as of the index AMs (except
hash that has its own idea of new pages), heap, the FSM or the page
header have never worked with all-zero pages, causing various crashes
when going through the page internals.

This commit changes all the pageinspect functions to be compliant with
all-zero pages, where the choice is made to return NULL or no rows for
SRFs when finding a new page.  get_raw_page() still works the same way,
returning a batch of zeros in the bytea of the page retrieved.  A hard
error could be used but NULL, while more invasive, is useful when
scanning relation files in full to get a batch of results for a single
relation in one query.  Tests are added for all the code paths
impacted.

Reported-by: Daria Lepikhova
Author: Michael Paquier
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
2022-04-14 15:08:03 +09:00
Michael Paquier
291e517a4d pageinspect: Add more sanity checks to prevent out-of-bound reads
A couple of code paths use the special area on the page passed by the
function caller, expecting to find some data in it.  However, feeding
an incorrect page can lead to out-of-bound reads when trying to access
the page special area (like a heap page that has no special area,
leading PageGetSpecialPointer() to grab a pointer outside the allocated
page).

The functions used for hash and btree indexes have some protection
already against that, while some other functions using a relation OID
as argument would make sure that the access method involved is correct,
but functions taking in input a raw page without knowing the relation
the page is attached to would run into problems.

This commit improves the set of checks used in the code paths of BRIN,
btree (including one check if a leaf page is found with a non-zero
level), GIN and GiST to verify that the page given in input has a
special area size that fits with each access method, which is done
though PageGetSpecialSize(), becore calling PageGetSpecialPointer().

The scope of the checks done is limited to work with pages that one
would pass after getting a block with get_raw_page(), as it is possible
to craft byteas that could bypass existing code paths.  Having too many
checks would also impact the usability of pageinspect, as the existing
code is very useful to look at the content details in a corrupted page,
so the focus is really to avoid out-of-bound reads as this is never a
good thing even with functions whose execution is limited to
superusers.

The safest approach could be to rework the functions so as these fetch a
block using a relation OID and a block number, but there are also cases
where using a raw page is useful.

Tests are added to cover all the code paths that needed such checks, and
an error message for hash indexes is reworded to fit better with what
this commit adds.

Reported-By: Alexander Lakhin
Author: Julien Rouhaud, Michael Paquier
Discussion: https://postgr.es/m/16527-ef7606186f0610a1@postgresql.org
Discussion: https://postgr.es/m/561e187b-3549-c8d5-03f5-525c14e65bd0@postgrespro.ru
Backpatch-through: 10
2022-03-27 17:53:40 +09:00
Michael Paquier
076f4d9539 pageinspect: Fix handling of page sizes and AM types
This commit fixes a set of issues related to the use of the SQL
functions in this module when the caller is able to pass down raw page
data as input argument:
- The page size check was fuzzy in a couple of places, sometimes
looking after only a sub-range, but what we are looking for is an exact
match on BLCKSZ.  After considering a few options here, I have settled
down to do a generalization of get_page_from_raw().  Most of the SQL
functions already used that, and this is not strictly required if not
accessing an 8-byte-wide value from a raw page, but this feels safer in
the long run for alignment-picky environment, particularly if a code
path begins to access such values.  This also reduces the number of
strings that need to be translated.
- The BRIN function brin_page_items() uses a Relation but it did not
check the access method of the opened index, potentially leading to
crashes.  All the other functions in need of a Relation already did
that.
- Some code paths could fail on elog(), but we should to use ereport()
for failures that can be triggered by the user.

Tests are added to stress all the cases that are fixed as of this
commit, with some junk raw pages (\set VERBOSITY ensures that this works
across all page sizes) and unexpected index types when functions open
relations.

Author: Michael Paquier, Justin Prysby
Discussion: https://postgr.es/m/20220218030020.GA1137@telsasoft.com
Backpatch-through: 10
2022-03-16 11:19:39 +09:00
Michael Paquier
6bdf1a1400 Fix collection of typos in the code and the documentation
Some words were duplicated while other places were grammatically
incorrect, including one variable name in the code.

Author: Otto Kekalainen, Justin Pryzby
Discussion: https://postgr.es/m/7DDBEFC5-09B6-4325-B942-B563D1A24BDC@amazon.com
2022-03-15 11:29:35 +09:00
Michael Paquier
127404fbe2 pageinspect: Improve page_header() for pages of 32kB
ld_upper, ld_lower, pd_special and the page size have been using
smallint as return type, which could cause those fields to return
negative values in certain cases for builds configures with a page size
of 32kB.

Bump pageinspect to 1.10.  page_header() is able to handle the correct
return type of those fields at runtime when using an older version of
the extension, with some tests are added to cover that.

Author: Quan Zongliang
Reviewed-by: Michael Paquier, Bharath Rupireddy
Discussion: https://postgr.es/m/8b8ec36e-61fe-14f9-005d-07bc85aa4eed@yeah.net
2021-07-12 11:05:27 +09:00
Tom Lane
c2dc1a7976 Disable vacuum page skipping in selected test cases.
By default VACUUM will skip pages that it can't immediately get
exclusive access to, which means that even activities as harmless
and unpredictable as checkpoint buffer writes might prevent a page
from being processed.  Ordinarily this is no big deal, but we have
a small number of test cases that examine the results of VACUUM's
processing and therefore will fail if the page of interest is skipped.
This seems to be the explanation for some rare buildfarm failures.
To fix, add the DISABLE_PAGE_SKIPPING option to the VACUUM commands
in tests where this could be an issue.

In passing, remove a duplicated query in pageinspect/sql/page.sql.

Back-patch as necessary (some of these cases are as old as v10).

Discussion: https://postgr.es/m/413923.1611006484@sss.pgh.pa.us
2021-01-20 11:49:29 -05:00
Peter Eisentraut
f18aa1b203 pageinspect: Change block number arguments to bigint
Block numbers are 32-bit unsigned integers.  Therefore, the smallest
SQL integer type that they can fit in is bigint.  However, in the
pageinspect module, most input and output parameters dealing with
block numbers were declared as int.  The behavior with block numbers
larger than a signed 32-bit integer was therefore dubious.  Change
these arguments to type bigint and add some more explicit error
checking on the block range.

(Other contrib modules appear to do this correctly already.)

Since we are changing argument types of existing functions, in order
to not misbehave if the binary is updated before the extension is
updated, we need to create new C symbols for the entry points, similar
to how it's done in other extensions as well.

Reported-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Reviewed-by: Alvaro Herrera <alvherre@alvh.no-ip.org>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/d8f6bdd536df403b9b33816e9f7e0b9d@G08CNEXMBPEKD05.g08.fujitsu.local
2021-01-19 11:03:38 +01:00
Heikki Linnakangas
5abca4b1cd Fix test failure with wal_level=minimal.
The newly-added gist pageinspect test prints the LSNs of GiST pages,
expecting them all to be 1 (GistBuildLSN). But with wal_level=minimal,
they got updated by the whole-relation WAL-logging at commit. Fix by
wrapping the problematic tests in the same transaction with the CREATE
INDEX.

Per buildfarm failure on thorntail.

Discussion: https://www.postgresql.org/message-id/3B4F97E5-40FB-4142-8CAA-B301CDFBF982%40iki.fi
2021-01-13 20:58:51 +02:00
Heikki Linnakangas
6ecaaf810b Fix portability issues in the new gist pageinspect test.
1. The raw bytea representation of the point-type keys used in the test
   depends on endianess. Remove the raw key_data column from the test.

2. The items stored on non-leftmost gist page depends on how many items
   git on the other pages. This showed up as a failure on 32-bit i386
   systems. To fix, only test the gist_page_items() function on the
   leftmost leaf page.

Per Andrey Borodin and the buildfarm.

Discussion: https://www.postgresql.org/message-id/9FCEC1DC-86FB-4A57-88EF-DD13663B36AF%40yandex-team.ru
2021-01-13 12:32:54 +02:00
Heikki Linnakangas
756ab29124 Add functions to 'pageinspect' to inspect GiST indexes.
Author: Andrey Borodin and me
Discussion: https://www.postgresql.org/message-id/3E4F9093-A1B5-4DF8-A292-0B48692E3954%40yandex-team.ru
2021-01-13 10:33:33 +02:00
Tom Lane
38ce06c37e Add an explicit test to catch changes in checksumming calculations.
Seems like a good idea in view of 006517432 and addd034ae.

Michael Paquier, Tom Lane

Discussion: https://postgr.es/m/20200306075230.GA118430@paquier.xyz
2020-03-08 15:09:14 -04:00
Michael Paquier
58b4cb30a5 Redesign pageinspect function printing infomask bits
After more discussion, the new function added by ddbd5d8 could have been
designed in a better way.  Based on an idea from Álvaro, instead of
returning one column which includes both the raw and combined flags, use
two columns, with one for the raw flags and one for the combined flags.

This also takes care of some issues with HEAP_LOCKED_UPGRADED and
HEAP_XMAX_IS_LOCKED_ONLY which are not really combined flags as they
depend on conditions defined by other raw bits, as mentioned by Amit.

While on it, fix an extra issue with combined flags.  A combined flag
was returned if at least one of its bits was set, but all its bits need
to be set to include it in the result.

Author: Michael Paquier
Reviewed-by: Álvaro Herrera, Amit Kapila
Discussion: https://postgr.es/m/20190913114950.GA3824@alvherre.pgsql
2019-09-19 11:01:52 +09:00
Michael Paquier
ddbd5d8731 Add to pageinspect function to make t_infomask/t_infomask2 human-readable
Flags of t_infomask and t_infomask2 for each tuple are already included
in the information returned by heap_page_items as integers, and we
lacked a way to make that information human-readable.

Per discussion, the function includes an option which controls if
combined flags should be decomposed or not.  The default is false, to
not decompose combined flags.

The module is bumped to version 1.8.

Author: Craig Ringer, Sawada Masahiko
Reviewed-by: Peter Geoghegan, Robert Haas, Álvaro Herrera, Moon Insung,
Amit Kapila, Michael Paquier, Tomas Vondra
Discussion: https://postgr.es/m/CAMsr+YEY7jeaXOb+oX+RhDyOFuTMdmHjGsBxL=igCm03J0go9Q@mail.gmail.com
2019-09-12 15:06:00 +09:00
Amit Kapila
7db0cde6b5 Revert "Avoid the creation of the free space map for small heap relations".
This feature was using a process local map to track the first few blocks
in the relation.  The map was reset each time we get the block with enough
freespace.  It was discussed that it would be better to track this map on
a per-relation basis in relcache and then invalidate the same whenever
vacuum frees up some space in the page or when FSM is created.  The new
design would be better both in terms of API design and performance.

List of commits reverted, in reverse chronological order:

06c8a5090e  Improve code comments in b0eaa4c51b.
13e8643bfc  During pg_upgrade, conditionally skip transfer of FSMs.
6f918159a9  Add more tests for FSM.
9c32e4c350  Clear the local map when not used.
29d108cdec  Update the documentation for FSM behavior..
08ecdfe7e5  Make FSM test portable.
b0eaa4c51b  Avoid creation of the free space map for small heap relations.

Discussion: https://postgr.es/m/20190416180452.3pm6uegx54iitbt5@alap3.anarazel.de
2019-05-07 09:30:24 +05:30
Amit Kapila
08ecdfe7e5 Make FSM test portable.
In b0eaa4c51b, we allow FSM to be created only after 4 pages.  One of the
tests check the FSM contents and to do that it populates many tuples in
the relation.  The FSM contents depend on the availability of freespace in
the page and it could vary because of the alignment of tuples.

This commit removes the dependency on FSM contents.

Author: Amit Kapila
Discussion: https://postgr.es/m/CAA4eK1KADF6K1bagr0--mGv3dMcZ%3DH_Z-Qtvdfbp5PjaC6PJJA%40mail.gmail.com
2019-02-04 10:08:29 +05:30
Amit Kapila
b0eaa4c51b Avoid creation of the free space map for small heap relations, take 2.
Previously, all heaps had FSMs. For very small tables, this means that the
FSM took up more space than the heap did. This is wasteful, so now we
refrain from creating the FSM for heaps with 4 pages or fewer. If the last
known target block has insufficient space, we still try to insert into some
other page before giving up and extending the relation, since doing
otherwise leads to table bloat. Testing showed that trying every page
penalized performance slightly, so we compromise and try every other page.
This way, we visit at most two pages. Any pages with wasted free space
become visible at next relation extension, so we still control table bloat.
As a bonus, directly attempting one or two pages can even be faster than
consulting the FSM would have been.

Once the FSM is created for a heap we don't remove it even if somebody
deletes all the rows from the corresponding relation.  We don't think it is
a useful optimization as it is quite likely that relation will again grow
to the same size.

Author: John Naylor, Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Mithun C Y
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
2019-02-04 07:49:15 +05:30
Amit Kapila
a23676503b Revert "Avoid creation of the free space map for small heap relations."
This reverts commit ac88d2962a.
2019-01-28 11:31:44 +05:30
Amit Kapila
ac88d2962a Avoid creation of the free space map for small heap relations.
Previously, all heaps had FSMs. For very small tables, this means that the
FSM took up more space than the heap did. This is wasteful, so now we
refrain from creating the FSM for heaps with 4 pages or fewer. If the last
known target block has insufficient space, we still try to insert into some
other page before giving up and extending the relation, since doing
otherwise leads to table bloat. Testing showed that trying every page
penalized performance slightly, so we compromise and try every other page.
This way, we visit at most two pages. Any pages with wasted free space
become visible at next relation extension, so we still control table bloat.
As a bonus, directly attempting one or two pages can even be faster than
consulting the FSM would have been.

Once the FSM is created for a heap we don't remove it even if somebody
deletes all the rows from the corresponding relation.  We don't think it is
a useful optimization as it is quite likely that relation will again grow
to the same size.

Author: John Naylor with design inputs and some code contribution by Amit Kapila
Reviewed-by: Amit Kapila
Tested-by: Mithun C Y
Discussion: https://www.postgresql.org/message-id/CAJVSVGWvB13PzpbLEecFuGFc5V2fsO736BsdTakPiPAcdMM5tQ@mail.gmail.com
2019-01-28 08:14:06 +05:30
Alvaro Herrera
bef5fcc36b pgstatindex, pageinspect: handle partitioned indexes
Commit 8b08f7d482 failed to update these modules to at least give
non-broken error messages for partitioned indexes.  Add appropriate
error support to them.

Peter G. was complaining about a problem of unfriendly error messages;
while we haven't fixed that yet, subsequent discussion let to discovery
of these unhandled cases.

Author: Michaël Paquier
Reported-by: Peter Geoghegan
Discussion: https://postgr.es/m/CAH2-WzkOKptQiE51Bh4_xeEHhaBwHkZkGtKizrFMgEkfUuRRQg@mail.gmail.com
2018-05-09 14:22:59 -03:00
Tom Lane
18869e202b Fix new test case to not be endian-dependent.
Per buildfarm.

Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com
2018-01-04 16:00:21 -05:00
Tom Lane
39cfe86195 Fix incorrect computations of length of null bitmap in pageinspect.
Instead of using our standard macro for this calculation, this code
did it itself ... and got it wrong, leading to incorrect display of
the null bitmap in some cases.  Noted and fixed by Maksim Milyutin.

In passing, remove a uselessly duplicative error check.

Errors were introduced in commit d6061f83a; back-patch to 9.6
where that came in.

Maksim Milyutin, reviewed by Andrey Borodin

Discussion: https://postgr.es/m/ec295792-a69f-350f-6287-25a20e8f31d5@gmail.com
2018-01-04 14:59:00 -05:00
Peter Eisentraut
193f5f9e91 pageinspect: Add bt_page_items function with bytea argument
Author: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
2017-04-04 23:52:55 -04:00
Peter Eisentraut
fef2bcdcba pageinspect: Add page_checksum function
Author: Tomas Vondra <tomas.vondra@2ndquadrant.com>
Reviewed-by: Ashutosh Sharma <ashu.coek88@gmail.com>
2017-03-17 10:55:17 -04:00
Peter Eisentraut
a02731cb10 pageinspect: Add test for page_header function 2017-03-17 09:23:39 -04:00
Stephen Frost
c08d82f38e Add relkind checks to certain contrib modules
The contrib extensions pageinspect, pg_visibility and pgstattuple only
work against regular relations which have storage.  They don't work
against foreign tables, partitioned (parent) tables, views, et al.

Add checks to the user-callable functions to return a useful error
message to the user if they mistakenly pass an invalid relation to a
function which doesn't accept that kind of relation.

In passing, improve some of the existing checks to use ereport() instead
of elog(), add a function to consolidate common checks where
appropriate, and add some regression tests.

Author: Amit Langote, with various changes by me
Reviewed by: Michael Paquier and Corey Huinker
Discussion: https://postgr.es/m/ab91fd9d-4751-ee77-c87b-4dd704c1e59c@lab.ntt.co.jp
2017-03-09 16:34:25 -05:00
Robert Haas
29e312bc13 pageinspect: Remove platform-dependent values from hash tests.
Per a report from Tom Lane, the ffactor reported by hash_metapage_info
and the free_size reported by hash_page_stats vary by platform.

Ashutosh Sharma and Robert Haas
2017-02-03 11:06:41 -05:00
Robert Haas
08bf6e5295 pageinspect: Support hash indexes.
Patch by Jesper Pedersen and Ashutosh Sharma, with some error handling
improvements by me.  Tests from Peter Eisentraut.  Reviewed by Álvaro
Herrera, Michael Paquier, Jesper Pedersen, Jeff Janes, Peter
Eisentraut, Amit Kapila, Mithun Cy, and me.

Discussion: http://postgr.es/m/e2ac6c58-b93f-9dd9-f4e6-d6d30add7fdf@redhat.com
2017-02-02 14:19:32 -05:00
Tom Lane
367b99bbb1 Fix gin_leafpage_items().
On closer inspection, commit 84ad68d64 broke gin_leafpage_items(),
because the aligned copy of the page got palloc'd in a short-lived
context whereas it needs to be in the SRF's multi_call_memory_ctx.
This was not exposed by the regression test, because the regression
test doesn't actually exercise the function in a meaningful way.
Fix the code bug, and extend the test in what I hope is a portable
fashion.
2016-11-04 12:11:54 -04:00
Peter Eisentraut
00a86856c1 pageinspect: Make page test more portable
Choose test data that makes the output independent of endianness.
2016-11-02 08:45:17 -04:00
Peter Eisentraut
f7c9a6e083 pageinspect: Make btree test more portable
Choose test data that makes the output independent of endianness and
alignment.
2016-11-01 22:02:39 -04:00
Peter Eisentraut
adfb81d9e1 pageinspect: Add tests 2016-11-01 14:02:16 -04:00