1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-05 07:21:24 +03:00

Fix old-fd issues using global barriers everywhere.

Commits 4eb21763 and b74e94dc introduced a way to force every backend to
close all relation files, to fix an ancient Windows-only bug.

This commit extends that behavior to all operating systems and adds
a couple of extra barrier points, to fix a totally different class of
bug: the reuse of relfilenodes in scenarios that have no other kind of
cache invalidation to prevent file descriptor mix-ups.

In all releases, data corruption could occur when you moved a database
to another tablespace and then back again.  Despite that, no back-patch
for now as the infrastructure required is too new and invasive.  In
master only, since commit aa010514, it could also happen when using
CREATE DATABASE with a user-supplied OID or via pg_upgrade.

Author: Andres Freund <andres@anarazel.de>
Reviewed-by: Robert Haas <robertmhaas@gmail.com>
Reviewed-by: Thomas Munro <thomas.munro@gmail.com>
Discussion: https://postgr.es/m/20220209220004.kb3dgtn2x2k2gtdm%40alap3.anarazel.de
This commit is contained in:
Thomas Munro
2022-05-07 15:19:52 +12:00
parent b74e94dc27
commit e2f65f4255
5 changed files with 241 additions and 25 deletions

View File

@ -548,11 +548,10 @@ DropTableSpace(DropTableSpaceStmt *stmt)
* use a global barrier to ask all backends to close all files, and
* wait until they're finished.
*/
#if defined(USE_BARRIER_SMGRRELEASE)
LWLockRelease(TablespaceCreateLock);
WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
LWLockAcquire(TablespaceCreateLock, LW_EXCLUSIVE);
#endif
/* And now try again. */
if (!destroy_tablespace_directories(tablespaceoid, false))
{
@ -1574,6 +1573,9 @@ tblspc_redo(XLogReaderState *record)
{
xl_tblspc_drop_rec *xlrec = (xl_tblspc_drop_rec *) XLogRecGetData(record);
/* Close all smgr fds in all backends. */
WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
/*
* If we issued a WAL record for a drop tablespace it implies that
* there were no files in it at all when the DROP was done. That means
@ -1591,11 +1593,6 @@ tblspc_redo(XLogReaderState *record)
*/
if (!destroy_tablespace_directories(xlrec->ts_id, true))
{
#if defined(USE_BARRIER_SMGRRELEASE)
/* Close all smgr fds in all backends. */
WaitForProcSignalBarrier(EmitProcSignalBarrier(PROCSIGNAL_BARRIER_SMGRRELEASE));
#endif
ResolveRecoveryConflictWithTablespace(xlrec->ts_id);
/*