1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-27 12:41:57 +03:00

Redefine pg_class.reltuples to be -1 before the first VACUUM or ANALYZE.

Historically, we've considered the state with relpages and reltuples
both zero as indicating that we do not know the table's tuple density.
This is problematic because it's impossible to distinguish "never yet
vacuumed" from "vacuumed and seen to be empty".  In particular, a user
cannot use VACUUM or ANALYZE to override the planner's normal heuristic
that an empty table should not be believed to be empty because it is
probably about to get populated.  That heuristic is a good safety
measure, so I don't care to abandon it, but there should be a way to
override it if the table is indeed intended to stay empty.

Hence, represent the initial state of ignorance by setting reltuples
to -1 (relpages is still set to zero), and apply the minimum-ten-pages
heuristic only when reltuples is still -1.  If the table is empty,
VACUUM or ANALYZE (but not CREATE INDEX) will override that to
reltuples = relpages = 0, and then we'll plan on that basis.

This requires a bunch of fiddly little changes, but we can get rid of
some ugly kluges that were formerly needed to maintain the old definition.

One notable point is that FDWs' GetForeignRelSize methods will see
baserel->tuples = -1 when no ANALYZE has been done on the foreign table.
That seems like a net improvement, since those methods were formerly
also in the dark about what baserel->tuples = 0 really meant.  Still,
it is an API change.

I bumped catversion because code predating this change would get confused
by seeing reltuples = -1.

Discussion: https://postgr.es/m/F02298E0-6EF4-49A1-BCB6-C484794D9ACC@thebuild.com
This commit is contained in:
Tom Lane
2020-08-30 12:21:51 -04:00
parent 9511fb37ac
commit 3d351d916b
20 changed files with 77 additions and 69 deletions

View File

@ -208,7 +208,8 @@ typedef struct LVShared
* live tuples in the index vacuum case or the new live tuples in the
* index cleanup case.
*
* estimated_count is true if reltuples is an estimated value.
* estimated_count is true if reltuples is an estimated value. (Note that
* reltuples could be -1 in this case, indicating we have no idea.)
*/
double reltuples;
bool estimated_count;
@ -567,31 +568,19 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
/*
* Update statistics in pg_class.
*
* A corner case here is that if we scanned no pages at all because every
* page is all-visible, we should not update relpages/reltuples, because
* we have no new information to contribute. In particular this keeps us
* from replacing relpages=reltuples=0 (which means "unknown tuple
* density") with nonzero relpages and reltuples=0 (which means "zero
* tuple density") unless there's some actual evidence for the latter.
* In principle new_live_tuples could be -1 indicating that we (still)
* don't know the tuple count. In practice that probably can't happen,
* since we'd surely have scanned some pages if the table is new and
* nonempty.
*
* It's important that we use tupcount_pages and not scanned_pages for the
* check described above; scanned_pages counts pages where we could not
* get cleanup lock, and which were processed only for frozenxid purposes.
*
* We do update relallvisible even in the corner case, since if the table
* is all-visible we'd definitely like to know that. But clamp the value
* to be not more than what we're setting relpages to.
* For safety, clamp relallvisible to be not more than what we're setting
* relpages to.
*
* Also, don't change relfrozenxid/relminmxid if we skipped any pages,
* since then we don't know for certain that all tuples have a newer xmin.
*/
new_rel_pages = vacrelstats->rel_pages;
new_live_tuples = vacrelstats->new_live_tuples;
if (vacrelstats->tupcount_pages == 0 && new_rel_pages > 0)
{
new_rel_pages = vacrelstats->old_rel_pages;
new_live_tuples = vacrelstats->old_live_tuples;
}
visibilitymap_count(onerel, &new_rel_allvisible, NULL);
if (new_rel_allvisible > new_rel_pages)
@ -612,7 +601,7 @@ heap_vacuum_rel(Relation onerel, VacuumParams *params,
/* report results to the stats collector, too */
pgstat_report_vacuum(RelationGetRelid(onerel),
onerel->rd_rel->relisshared,
new_live_tuples,
Max(new_live_tuples, 0),
vacrelstats->new_dead_tuples);
pgstat_progress_end_command();
@ -1695,9 +1684,12 @@ lazy_scan_heap(Relation onerel, VacuumParams *params, LVRelStats *vacrelstats,
vacrelstats->tupcount_pages,
live_tuples);
/* also compute total number of surviving heap entries */
/*
* Also compute the total number of surviving heap entries. In the
* (unlikely) scenario that new_live_tuples is -1, take it as zero.
*/
vacrelstats->new_rel_tuples =
vacrelstats->new_live_tuples + vacrelstats->new_dead_tuples;
Max(vacrelstats->new_live_tuples, 0) + vacrelstats->new_dead_tuples;
/*
* Release any remaining pin on visibility map page.
@ -2434,7 +2426,7 @@ lazy_cleanup_all_indexes(Relation *Irel, IndexBulkDeleteResult **stats,
* dead_tuples, and update running statistics.
*
* reltuples is the number of heap tuples to be passed to the
* bulkdelete callback.
* bulkdelete callback. It's always assumed to be estimated.
*/
static void
lazy_vacuum_index(Relation indrel, IndexBulkDeleteResult **stats,