mirror of
https://github.com/postgres/postgres.git
Fix several recently introduced issues around handling new relation forks.
Most of these stem from d25f519107 "tableam: relation creation, VACUUM
FULL/CLUSTER, SET TABLESPACE.".
1) To pass data to the relation_set_new_filenode() callback,
RelationSetNewRelfilenode() was made to update RelationData.rd_rel
directly. That's not OK, however, as it makes the relcache entries
temporarily inconsistent, which among other scenarios is a problem
if a REINDEX targets an index on pg_class, because of the
CatalogTupleUpdate() in RelationSetNewRelfilenode(). Presumably
that was introduced because other places in the code do so. While
those aren't "good practice", they don't appear to be actively
buggy (e.g. because system tables may not be targeted).
I (Andres) should have caught this while reviewing and significantly
evolving the code in that commit, mea culpa.
Fix that by instead passing in the new RelFileNode as a separate
argument to relation_set_new_filenode() and relying on the relcache to
update the catalog entry. Also revert the change that made the
RelationMapUpdateMap() call immediate, and undo some other
unnecessary changes.
2) Document that the relation_set_new_filenode callback cannot rely on
the whole relcache entry being valid. It might be worthwhile to
refactor the code to never have to rely on that, but given the way
heap_create() is currently coded, that'd be a large change.
3) ATExecSetTableSpace() shouldn't do FlushRelationBuffers() itself. A
table AM might not use shared buffers at all. Move to
index_copy_data() and heapam_relation_copy_data().
4) heapam_relation_set_new_filenode() previously sometimes accessed
rel->rd_rel->relpersistence rather than the `persistence`
argument. Code movement mistake.
5) Previously heapam_relation_set_new_filenode() re-opened the smgr
relation to create the init fork, if necessary. Instead have
RelationCreateStorage() return the SMgrRelation and use it to
create the init fork.
6) Add a note about the danger of modifying the relcache directly to
ATExecSetTableSpace() - it's currently not a bug because there's a
check ERRORing for catalog tables.
Regression tests and assertion improvements that together trigger the
bug described in 1) will be added in a later commit, as there is a
related bug on all branches.
Reported-By: Michael Paquier
Diagnosed-By: Tom Lane and Andres Freund
Author: Andres Freund
Reviewed-By: Tom Lane
Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz
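
The calling convention described in 1) can be sketched in isolation. The
following is a minimal illustration only, not PostgreSQL code: the Demo*
types and demo_* functions are hypothetical stand-ins, and the final direct
assignment merely stands in for the catalog update and relcache refresh that
the real code relies on. It shows the callback receiving the new filenode as
an argument, so the cached entry stays consistent while the callback runs.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-ins; these are NOT the PostgreSQL definitions. */
typedef struct DemoRelFileNode
{
	uint32_t	spcNode;
	uint32_t	dbNode;
	uint32_t	relNode;
} DemoRelFileNode;

typedef struct DemoRelation
{
	DemoRelFileNode cached_node;		/* plays the role of the relcache entry */
	uint32_t	storage_created_for;	/* relNode the "AM" created storage for */
} DemoRelation;

/*
 * Hypothetical callback in the style of relation_set_new_filenode(): the
 * target filenode arrives as an argument, so the cached entry is left
 * untouched by the callback.
 */
static void
demo_set_new_filenode(DemoRelation *rel, const DemoRelFileNode *newrnode)
{
	rel->storage_created_for = newrnode->relNode;
}

/*
 * Hypothetical caller: create storage first, then switch the cached entry
 * in a single step.  Returns the relNode the cache held right after the
 * callback ran, i.e. before the switch.
 */
static uint32_t
demo_assign_new_filenode(DemoRelation *rel, uint32_t new_relnode)
{
	DemoRelFileNode newrnode = rel->cached_node;
	uint32_t	seen_after_callback;

	newrnode.relNode = new_relnode;
	demo_set_new_filenode(rel, &newrnode);
	seen_after_callback = rel->cached_node.relNode;

	/* Stand-in for the catalog update plus relcache refresh. */
	rel->cached_node = newrnode;
	return seen_after_callback;
}
```

Calling demo_assign_new_filenode() on an entry whose cached relNode is 100
with a new value of 200 returns 100, since the cache still held the old value
when the callback finished, and leaves the entry pointing at 200 afterwards.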
@@ -416,7 +416,12 @@ typedef struct TableAmRoutine
 	 * This callback needs to create a new relation filenode for `rel`, with
 	 * appropriate durability behaviour for `persistence`.
 	 *
-	 * On output *freezeXid, *minmulti must be set to the values appropriate
+	 * Note that only the subset of the relcache filled by
+	 * RelationBuildLocalRelation() can be relied upon and that the relation's
+	 * catalog entries either will either not yet exist (new relation), or
+	 * will still reference the old relfilenode.
+	 *
+	 * As output *freezeXid, *minmulti must be set to the values appropriate
 	 * for pg_class.{relfrozenxid, relminmxid}. For AMs that don't need those
 	 * fields to be filled they can be set to InvalidTransactionId and
 	 * InvalidMultiXactId, respectively.
@@ -424,6 +429,7 @@ typedef struct TableAmRoutine
 	 * See also table_relation_set_new_filenode().
 	 */
 	void		(*relation_set_new_filenode) (Relation rel,
+											  const RelFileNode *newrnode,
 											  char persistence,
 											  TransactionId *freezeXid,
 											  MultiXactId *minmulti);
@@ -444,7 +450,8 @@ typedef struct TableAmRoutine
 	 * This can typically be implemented by directly copying the underlying
 	 * storage, unless it contains references to the tablespace internally.
 	 */
-	void		(*relation_copy_data) (Relation rel, RelFileNode newrnode);
+	void		(*relation_copy_data) (Relation rel,
+									   const RelFileNode *newrnode);
 
 	/* See table_relation_copy_for_cluster() */
 	void		(*relation_copy_for_cluster) (Relation NewHeap,
@@ -1251,21 +1258,25 @@ table_finish_bulk_insert(Relation rel, int options)
  */
 
 /*
- * Create a new relation filenode for `rel`, with persistence set to
+ * Create storage for `rel` in `newrode`, with persistence set to
  * `persistence`.
  *
  * This is used both during relation creation and various DDL operations to
- * create a new relfilenode that can be filled from scratch.
+ * create a new relfilenode that can be filled from scratch. When creating
+ * new storage for an existing relfilenode, this should be called before the
+ * relcache entry has been updated.
  *
  * *freezeXid, *minmulti are set to the xid / multixact horizon for the table
  * that pg_class.{relfrozenxid, relminmxid} have to be set to.
  */
 static inline void
-table_relation_set_new_filenode(Relation rel, char persistence,
+table_relation_set_new_filenode(Relation rel,
+								const RelFileNode *newrnode,
+								char persistence,
 								TransactionId *freezeXid,
 								MultiXactId *minmulti)
 {
-	rel->rd_tableam->relation_set_new_filenode(rel, persistence,
+	rel->rd_tableam->relation_set_new_filenode(rel, newrnode, persistence,
 											   freezeXid, minmulti);
 }
 
@@ -1288,7 +1299,7 @@ table_relation_nontransactional_truncate(Relation rel)
  * changing a relation's tablespace.
  */
 static inline void
-table_relation_copy_data(Relation rel, RelFileNode newrnode)
+table_relation_copy_data(Relation rel, const RelFileNode *newrnode)
 {
 	rel->rd_tableam->relation_copy_data(rel, newrnode);
 }
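
The reasoning behind 3) in the commit message, that buffer flushing belongs
in the access method's copy callback because a table AM might not use shared
buffers at all, can be illustrated with a cut-down dispatch table. This is a
sketch under stated assumptions, not the actual implementation: DemoRel,
DemoAmRoutine and the demo_*/buffered_*/unbuffered_* functions are
hypothetical names, and the "flush" is only a counter.

```c
#include <assert.h>

typedef struct DemoRel DemoRel;

/* Hypothetical, cut-down analogue of a table AM callback table. */
typedef struct DemoAmRoutine
{
	void		(*relation_copy_data) (DemoRel *rel, int new_tablespace);
} DemoAmRoutine;

struct DemoRel
{
	const DemoAmRoutine *am;
	int			tablespace;
	int			buffer_flushes;	/* counts simulated buffer flushes */
};

/* AM that keeps pages in a buffer pool: flush before copying. */
static void
buffered_copy_data(DemoRel *rel, int new_tablespace)
{
	rel->buffer_flushes++;		/* stand-in for flushing dirty buffers */
	rel->tablespace = new_tablespace;
}

/* AM without a buffer pool: there is nothing to flush. */
static void
unbuffered_copy_data(DemoRel *rel, int new_tablespace)
{
	rel->tablespace = new_tablespace;
}

static const DemoAmRoutine buffered_am = {buffered_copy_data};
static const DemoAmRoutine unbuffered_am = {unbuffered_copy_data};

/* Generic caller: dispatches to the AM and never flushes on its own. */
static void
demo_set_tablespace(DemoRel *rel, int new_tablespace)
{
	rel->am->relation_copy_data(rel, new_tablespace);
}
```

With this split, the generic caller needs no knowledge of how an AM stores
its data; only the buffered variant records a flush when
demo_set_tablespace() is called.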