	Fix creation of partition descriptor during concurrent detach+drop
If a partition undergoes DETACH CONCURRENTLY immediately followed by DROP, this could cause a problem for a concurrent transaction recomputing the partition descriptor when running a prepared statement, because it tries to dereference a pointer to a tuple that's not found in a catalog scan.

The existing retry logic added in commit dbca3469eb is sufficient to cope with the overall problem, provided we don't try to dereference a non-existent heap tuple.

Arguably, the code in RelationBuildPartitionDesc() has been wrong all along, since no check was added in commit 898e5e3290 against receiving a NULL tuple from the catalog scan; that bug has only become user-visible with DETACH CONCURRENTLY, which was added in branch 14. Therefore, even though there's no known mechanism to cause a crash because of this, backpatch the addition of such a check to all supported branches.

In branches prior to 14, this would cause the code to fail with a "missing relpartbound for relation XYZ" error instead of crashing; that's okay, because there are no reports of such behavior anyway.

Author: Kuntal Ghosh <kuntalghosh.2007@gmail.com>
Reviewed-by: Junwang Zhao <zhjwpku@gmail.com>
Reviewed-by: Tender Wang <tndrwang@gmail.com>
Discussion: https://postgr.es/m/18559-b48286d2eacd9a4e@postgresql.org
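For illustration, the following is a sketch of the interleaving the message describes, using hypothetical table names (parted, parted_1). As noted above there is no known way to actually force a crash, so this shows only the sequence of events, not a confirmed reproducer.

-- Hypothetical names; any list-partitioned table with one partition will do.
CREATE TABLE parted (a int) PARTITION BY LIST (a);
CREATE TABLE parted_1 PARTITION OF parted FOR VALUES IN (1);

-- Session 1: cache a plan that depends on parted's partition descriptor.
PREPARE q AS SELECT * FROM parted;
EXECUTE q;

-- Session 2: detach the partition concurrently, then drop it right away.
ALTER TABLE parted DETACH PARTITION parted_1 CONCURRENTLY;
DROP TABLE parted_1;

-- Session 1: re-running the prepared statement recomputes the partition
-- descriptor; if the pg_class scan finds no tuple for the dropped partition,
-- the unpatched code would dereference an invalid tuple.
EXECUTE q;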
@@ -210,6 +210,10 @@ retry:
 	 * shared queue.  We solve this problem by reading pg_class directly
 	 * for the desired tuple.
 	 *
+	 * If the partition recently detached is also dropped, we get no tuple
+	 * from the scan.  In that case, we also retry, and next time through
+	 * here, we don't see that partition anymore.
+	 *
 	 * The other problem is that DETACH CONCURRENTLY is in the process of
 	 * removing a partition, which happens in two steps: first it marks it
 	 * as "detach pending", commits, then unsets relpartbound.  If
@@ -224,8 +228,6 @@ retry:
 		Relation	pg_class;
 		SysScanDesc scan;
 		ScanKeyData key[1];
-		Datum		datum;
-		bool		isnull;
 
 		pg_class = table_open(RelationRelationId, AccessShareLock);
 		ScanKeyInit(&key[0],
@@ -234,17 +236,29 @@ retry:
 					ObjectIdGetDatum(inhrelid));
 		scan = systable_beginscan(pg_class, ClassOidIndexId, true,
 								  NULL, 1, key);
+
+		/*
+		 * We could get one tuple from the scan (the normal case), or zero
+		 * tuples if the table has been dropped meanwhile.
+		 */
 		tuple = systable_getnext(scan);
-		datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
-							 RelationGetDescr(pg_class), &isnull);
-		if (!isnull)
-			boundspec = stringToNode(TextDatumGetCString(datum));
+		if (HeapTupleIsValid(tuple))
+		{
+			Datum		datum;
+			bool		isnull;
+
+			datum = heap_getattr(tuple, Anum_pg_class_relpartbound,
+								 RelationGetDescr(pg_class), &isnull);
+			if (!isnull)
+				boundspec = stringToNode(TextDatumGetCString(datum));
+		}
 		systable_endscan(scan);
 		table_close(pg_class, AccessShareLock);
 
 		/*
-		 * If we still don't get a relpartbound value, then it must be
-		 * because of DETACH CONCURRENTLY.  Restart from the top, as
+		 * If we still don't get a relpartbound value (either because
+		 * boundspec is null or because there was no tuple), then it must
+		 * be because of DETACH CONCURRENTLY.  Restart from the top, as
 		 * explained above.  We only do this once, for two reasons: first,
 		 * only one DETACH CONCURRENTLY session could affect us at a time,
 		 * since each of them would have to wait for the snapshot under