mirror of
https://github.com/postgres/postgres.git
synced 2025-07-28 23:42:10 +03:00
Fix failure to detoast fields in composite elements of structured types.
If we have an array of records stored on disk, the individual record fields cannot contain out-of-line TOAST pointers: the tuptoaster.c mechanisms are only prepared to deal with TOAST pointers appearing in top-level fields of a stored row. The same applies for ranges over composite types, nested composites, etc. However, the existing code only took care of expanding sub-field TOAST pointers for the case of nested composites, not for other structured types containing composites. For example, given a command such as

    UPDATE tab SET arraycol = ARRAY[ROW(x,42)::mycompositetype] ...

where x is a direct reference to a field of an on-disk tuple, if that field is long enough to be toasted out-of-line then the TOAST pointer would be inserted as-is into the array column. If the source record for x is later deleted, the array field value would become a dangling pointer, leading to errors along the lines of "missing chunk number 0 for toast value ..." when the value is referenced. A reproducible test case for this was provided by Jan Pecek, but it seems likely that some of the "missing chunk number" reports we've heard in the past were caused by similar issues.

Code-wise, the problem is that PG_DETOAST_DATUM() is not adequate to produce a self-contained Datum value if the Datum is of composite type. Seen in this light, the problem is not just confined to arrays and ranges, but could also affect some other places where detoasting is done in that way, for example form_index_tuple().

I tried teaching the array code to apply toast_flatten_tuple_attribute() along with PG_DETOAST_DATUM() when the array element type is composite, but this was messy and imposed extra cache lookup costs whether or not any TOAST pointers were present, indeed sometimes when the array element type isn't even composite (since sometimes it takes a typcache lookup to find that out). The idea of extending that approach to all the places that currently use PG_DETOAST_DATUM() wasn't attractive at all.

This patch instead solves the problem by decreeing that composite Datum values must not contain any out-of-line TOAST pointers in the first place; that is, we expand out-of-line fields at the point of constructing a composite Datum, not at the point where we're about to insert it into a larger tuple. This rule is applied only to true composite Datums, not to tuples that are being passed around the system as tuples, so it's not as invasive as it might sound at first. With this approach, the amount of code that has to be touched for a full solution is greatly reduced, and added cache lookup costs are avoided except when there actually is a TOAST pointer that needs to be inlined.

The main drawback of this approach is that we might sometimes dereference a TOAST pointer that will never actually be used by the query, imposing a rather large cost that wasn't there before. On the other side of the coin, if the field value is used multiple times then we'll come out ahead by avoiding repeat detoastings. Experimentation suggests that common SQL coding patterns are unaffected either way, though. Applications that are very negatively affected could be advised to modify their code to not fetch columns they won't be using.

In future, we might consider reverting this solution in favor of detoasting only at the point where data is about to be stored to disk, using some method that can drill down into multiple levels of nested structured types. That will require defining new APIs for structured types, though, so it doesn't seem feasible as a back-patchable fix.

Note that this patch changes HeapTupleGetDatum() from a macro to a function call; this means that any third-party code using that macro will not get protection against creating TOAST-pointer-containing Datums until it's recompiled. The same applies to any uses of PG_RETURN_HEAPTUPLEHEADER(). It seems likely that this is not a big problem in practice: most of the tuple-returning functions in core and contrib produce outputs that could not possibly be toasted anyway, and the same probably holds for third-party extensions.

This bug has existed since TOAST was invented, so back-patch to all supported branches.
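The dangling-pointer failure and the "flatten when the composite Datum is built" rule described in the message can be illustrated with a toy model. The sketch below is not PostgreSQL code: `ToyField`, `toy_store`, and every `toy_*` function are hypothetical names invented for illustration, and a small in-memory array stands in for a TOAST table. It only shows why copying an out-of-line reference as-is breaks once the source value is deleted, while inlining the value up front does not.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a TOAST table: slot id -> out-of-line value. */
#define TOY_STORE_SLOTS 8
static char *toy_store[TOY_STORE_SLOTS];

/* A toy field is either inline bytes or a reference into toy_store. */
typedef struct ToyField
{
    bool external;        /* true: the value lives in toy_store[id] */
    int  id;              /* slot number, meaningful only if external */
    char inline_val[32];  /* value bytes, meaningful only if !external */
} ToyField;

/* Store a value out of line and return a field that merely points at it. */
static ToyField toy_store_value(int id, const char *val)
{
    ToyField f;

    memset(&f, 0, sizeof f);
    assert(id >= 0 && id < TOY_STORE_SLOTS);
    toy_store[id] = malloc(strlen(val) + 1);
    assert(toy_store[id] != NULL);
    strcpy(toy_store[id], val);
    f.external = true;
    f.id = id;
    return f;
}

/* Models deleting the source row: its out-of-line value disappears. */
static void toy_delete(int id)
{
    free(toy_store[id]);
    toy_store[id] = NULL;
}

/* Read a field; NULL models the "missing chunk number 0" error. */
static const char *toy_read(const ToyField *f)
{
    return f->external ? toy_store[f->id] : f->inline_val;
}

/* Flatten: replace an external reference by an inline copy of the value. */
static ToyField toy_flatten(ToyField f)
{
    if (f.external && toy_store[f.id] != NULL)
    {
        strncpy(f.inline_val, toy_store[f.id], sizeof f.inline_val - 1);
        f.inline_val[sizeof f.inline_val - 1] = '\0';
        f.external = false;
    }
    return f;
}
```

In this model, a field copied as-is into a "container" reads back as NULL after `toy_delete()`, whereas a field passed through `toy_flatten()` before the delete still yields its value. That mirrors the message's rule of expanding out-of-line fields when a composite Datum is constructed.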
@@ -991,6 +991,9 @@ toast_insert_or_update(Relation rel, HeapTuple newtup, HeapTuple oldtup,
  *
  *    "Flatten" a tuple to contain no out-of-line toasted fields.
  *    (This does not eliminate compressed or short-header datums.)
+ *
+ *    Note: we expect the caller already checked HeapTupleHasExternal(tup),
+ *    so there is no need for a short-circuit path.
  * ----------
  */
 HeapTuple
@@ -1068,59 +1071,61 @@ toast_flatten_tuple(HeapTuple tup, TupleDesc tupleDesc)


 /* ----------
- * toast_flatten_tuple_attribute -
+ * toast_flatten_tuple_to_datum -
  *
- *    If a Datum is of composite type, "flatten" it to contain no toasted fields.
- *    This must be invoked on any potentially-composite field that is to be
- *    inserted into a tuple. Doing this preserves the invariant that toasting
- *    goes only one level deep in a tuple.
+ *    "Flatten" a tuple containing out-of-line toasted fields into a Datum.
+ *    The result is always palloc'd in the current memory context.
  *
- *    Note that flattening does not mean expansion of short-header varlenas,
- *    so in one sense toasting is allowed within composite datums.
+ *    We have a general rule that Datums of container types (rows, arrays,
+ *    ranges, etc) must not contain any external TOAST pointers. Without
+ *    this rule, we'd have to look inside each Datum when preparing a tuple
+ *    for storage, which would be expensive and would fail to extend cleanly
+ *    to new sorts of container types.
+ *
+ *    However, we don't want to say that tuples represented as HeapTuples
+ *    can't contain toasted fields, so instead this routine should be called
+ *    when such a HeapTuple is being converted into a Datum.
+ *
+ *    While we're at it, we decompress any compressed fields too. This is not
+ *    necessary for correctness, but reflects an expectation that compression
+ *    will be more effective if applied to the whole tuple not individual
+ *    fields. We are not so concerned about that that we want to deconstruct
+ *    and reconstruct tuples just to get rid of compressed fields, however.
+ *    So callers typically won't call this unless they see that the tuple has
+ *    at least one external field.
+ *
+ *    On the other hand, in-line short-header varlena fields are left alone.
+ *    If we "untoasted" them here, they'd just get changed back to short-header
+ *    format anyway within heap_fill_tuple.
  * ----------
  */
 Datum
-toast_flatten_tuple_attribute(Datum value,
-                              Oid typeId, int32 typeMod)
+toast_flatten_tuple_to_datum(HeapTupleHeader tup,
+                             uint32 tup_len,
+                             TupleDesc tupleDesc)
 {
-    TupleDesc tupleDesc;
-    HeapTupleHeader olddata;
     HeapTupleHeader new_data;
     int32 new_header_len;
     int32 new_data_len;
     int32 new_tuple_len;
     HeapTupleData tmptup;
-    Form_pg_attribute *att;
-    int numAttrs;
+    Form_pg_attribute *att = tupleDesc->attrs;
+    int numAttrs = tupleDesc->natts;
     int i;
-    bool need_change = false;
     bool has_nulls = false;
     Datum toast_values[MaxTupleAttributeNumber];
     bool toast_isnull[MaxTupleAttributeNumber];
     bool toast_free[MaxTupleAttributeNumber];

-    /*
-     * See if it's a composite type, and get the tupdesc if so.
-     */
-    tupleDesc = lookup_rowtype_tupdesc_noerror(typeId, typeMod, true);
-    if (tupleDesc == NULL)
-        return value; /* not a composite type */
-
-    att = tupleDesc->attrs;
-    numAttrs = tupleDesc->natts;
+    /* Build a temporary HeapTuple control structure */
+    tmptup.t_len = tup_len;
+    ItemPointerSetInvalid(&(tmptup.t_self));
+    tmptup.t_tableOid = InvalidOid;
+    tmptup.t_data = tup;

     /*
      * Break down the tuple into fields.
      */
-    olddata = DatumGetHeapTupleHeader(value);
-    Assert(typeId == HeapTupleHeaderGetTypeId(olddata));
-    Assert(typeMod == HeapTupleHeaderGetTypMod(olddata));
-    /* Build a temporary HeapTuple control structure */
-    tmptup.t_len = HeapTupleHeaderGetDatumLength(olddata);
-    ItemPointerSetInvalid(&(tmptup.t_self));
-    tmptup.t_tableOid = InvalidOid;
-    tmptup.t_data = olddata;
-
     Assert(numAttrs <= MaxTupleAttributeNumber);
     heap_deform_tuple(&tmptup, tupleDesc, toast_values, toast_isnull);

@@ -1144,20 +1149,10 @@ toast_flatten_tuple_attribute(Datum value,
                 new_value = heap_tuple_untoast_attr(new_value);
                 toast_values[i] = PointerGetDatum(new_value);
                 toast_free[i] = true;
-                need_change = true;
             }
         }
     }

-    /*
-     * If nothing to untoast, just return the original tuple.
-     */
-    if (!need_change)
-    {
-        ReleaseTupleDesc(tupleDesc);
-        return value;
-    }
-
     /*
      * Calculate the new size of the tuple.
      *
@@ -1166,7 +1161,7 @@ toast_flatten_tuple_attribute(Datum value,
     new_header_len = offsetof(HeapTupleHeaderData, t_bits);
     if (has_nulls)
         new_header_len += BITMAPLEN(numAttrs);
-    if (olddata->t_infomask & HEAP_HASOID)
+    if (tup->t_infomask & HEAP_HASOID)
         new_header_len += sizeof(Oid);
     new_header_len = MAXALIGN(new_header_len);
     new_data_len = heap_compute_data_size(tupleDesc,
@@ -1178,14 +1173,16 @@ toast_flatten_tuple_attribute(Datum value,
     /*
      * Copy the existing tuple header, but adjust natts and t_hoff.
      */
-    memcpy(new_data, olddata, offsetof(HeapTupleHeaderData, t_bits));
+    memcpy(new_data, tup, offsetof(HeapTupleHeaderData, t_bits));
     HeapTupleHeaderSetNatts(new_data, numAttrs);
     new_data->t_hoff = new_header_len;
-    if (olddata->t_infomask & HEAP_HASOID)
-        HeapTupleHeaderSetOid(new_data, HeapTupleHeaderGetOid(olddata));
+    if (tup->t_infomask & HEAP_HASOID)
+        HeapTupleHeaderSetOid(new_data, HeapTupleHeaderGetOid(tup));

-    /* Reset the datum length field, too */
+    /* Set the composite-Datum header fields correctly */
     HeapTupleHeaderSetDatumLength(new_data, new_tuple_len);
+    HeapTupleHeaderSetTypeId(new_data, tupleDesc->tdtypeid);
+    HeapTupleHeaderSetTypMod(new_data, tupleDesc->tdtypmod);

     /* Copy over the data, and fill the null bitmap if needed */
     heap_fill_tuple(tupleDesc,
@@ -1202,7 +1199,6 @@ toast_flatten_tuple_attribute(Datum value,
     for (i = 0; i < numAttrs; i++)
         if (toast_free[i])
             pfree(DatumGetPointer(toast_values[i]));
-    ReleaseTupleDesc(tupleDesc);

     return PointerGetDatum(new_data);
 }
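
The header-length arithmetic in the diff (fixed header, optional null bitmap, optional OID, then MAXALIGN) can be worked through with a small standalone sketch. This is not PostgreSQL code: the `toy_*` names are hypothetical, and the constants are assumptions chosen to resemble a typical 64-bit build (23 bytes for `offsetof(HeapTupleHeaderData, t_bits)`, 4 bytes for `sizeof(Oid)`, 8-byte maximum alignment). Real values come from the server headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed values for illustration only; the real ones come from PostgreSQL. */
#define TOY_FIXED_HEADER 23   /* assumed offsetof(HeapTupleHeaderData, t_bits) */
#define TOY_OID_SIZE      4   /* assumed sizeof(Oid) */
#define TOY_MAX_ALIGN     8   /* assumed maximum alignment */

/* One bit per attribute, rounded up to whole bytes (mirrors BITMAPLEN). */
static size_t toy_bitmaplen(int natts)
{
    return (size_t) (natts + 7) / 8;
}

/* Round up to the next multiple of TOY_MAX_ALIGN (mirrors MAXALIGN). */
static size_t toy_maxalign(size_t len)
{
    return (len + TOY_MAX_ALIGN - 1) & ~((size_t) TOY_MAX_ALIGN - 1);
}

/* Same sequence of steps as the new_header_len computation in the diff. */
static size_t toy_header_len(int natts, bool has_nulls, bool has_oid)
{
    size_t len = TOY_FIXED_HEADER;

    if (has_nulls)
        len += toy_bitmaplen(natts);
    if (has_oid)
        len += TOY_OID_SIZE;
    return toy_maxalign(len);
}
```

Under these assumptions, a three-column tuple with no nulls and no OID gets a 24-byte header, and adding a null bitmap for up to eight columns still fits in 24 bytes because the bitmap occupies what would otherwise be alignment padding.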