mirror of
https://github.com/postgres/postgres.git
synced 2025-06-22 02:52:08 +03:00
Redesign get_attstatsslot()/free_attstatsslot() for more safety and speed.
The mess cleaned up in commit da0759600
is clear evidence that it's a
bug hazard to expect the caller of get_attstatsslot()/free_attstatsslot()
to provide the correct type OID for the array elements in the slot.
Moreover, we weren't even getting any performance benefit from that,
since get_attstatsslot() was extracting the real type OID from the array
anyway. So we ought to get rid of that requirement; indeed, it would
make more sense for get_attstatsslot() to pass back the type OID it found,
in case the caller isn't sure what to expect, which is likely in binary-
compatible-operator cases.
Another problem with the current implementation is that if the stats array
element type is pass-by-reference, we incur a palloc/memcpy/pfree cycle
for each element. That seemed acceptable when the code was written because
we were targeting O(10) array sizes --- but these days, stats arrays are
almost always bigger than that, sometimes much bigger. We can save a
significant number of cycles by doing one palloc/memcpy/pfree of the whole
array. Indeed, in the now-probably-common case where the array is toasted,
that happens anyway so this method is basically free. (Note: although the
catcache code will inline any out-of-line toasted values, it doesn't
decompress them. At the other end of the size range, it doesn't expand
short-header datums either. In either case, DatumGetArrayTypeP would have
to make a copy. We do end up using an extra array copy step if the element
type is pass-by-value and the array length is neither small enough for a
short header nor large enough to have suffered compression. But that
seems like a very acceptable price for winning in pass-by-ref cases.)
Hence, redesign to take these insights into account. While at it,
convert to an API in which we fill a struct rather than passing a bunch
of pointers to individual output arguments. That will make it less
painful if we ever want further expansion of what get_attstatsslot can
pass back.
It's certainly arguable that this is new development and not something to
push post-feature-freeze. However, I view it as primarily bug-proofing
and therefore something that's better to have sooner not later. Since
we aren't quite at beta phase yet, let's put it in.
Discussion: https://postgr.es/m/16364.1494520862@sss.pgh.pa.us
This commit is contained in:
@ -1283,10 +1283,7 @@ static void
|
||||
ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node, int mcvsToUse)
|
||||
{
|
||||
HeapTupleData *statsTuple;
|
||||
Datum *values;
|
||||
int nvalues;
|
||||
float4 *numbers;
|
||||
int nnumbers;
|
||||
AttStatsSlot sslot;
|
||||
|
||||
/* Do nothing if planner didn't identify the outer relation's join key */
|
||||
if (!OidIsValid(node->skewTable))
|
||||
@ -1305,19 +1302,17 @@ ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node, int mcvsToUse)
|
||||
if (!HeapTupleIsValid(statsTuple))
|
||||
return;
|
||||
|
||||
if (get_attstatsslot(statsTuple, node->skewColType, node->skewColTypmod,
|
||||
if (get_attstatsslot(&sslot, statsTuple,
|
||||
STATISTIC_KIND_MCV, InvalidOid,
|
||||
NULL,
|
||||
&values, &nvalues,
|
||||
&numbers, &nnumbers))
|
||||
ATTSTATSSLOT_VALUES | ATTSTATSSLOT_NUMBERS))
|
||||
{
|
||||
double frac;
|
||||
int nbuckets;
|
||||
FmgrInfo *hashfunctions;
|
||||
int i;
|
||||
|
||||
if (mcvsToUse > nvalues)
|
||||
mcvsToUse = nvalues;
|
||||
if (mcvsToUse > sslot.nvalues)
|
||||
mcvsToUse = sslot.nvalues;
|
||||
|
||||
/*
|
||||
* Calculate the expected fraction of outer relation that will
|
||||
@ -1326,11 +1321,10 @@ ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node, int mcvsToUse)
|
||||
*/
|
||||
frac = 0;
|
||||
for (i = 0; i < mcvsToUse; i++)
|
||||
frac += numbers[i];
|
||||
frac += sslot.numbers[i];
|
||||
if (frac < SKEW_MIN_OUTER_FRACTION)
|
||||
{
|
||||
free_attstatsslot(node->skewColType,
|
||||
values, nvalues, numbers, nnumbers);
|
||||
free_attstatsslot(&sslot);
|
||||
ReleaseSysCache(statsTuple);
|
||||
return;
|
||||
}
|
||||
@ -1392,7 +1386,7 @@ ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node, int mcvsToUse)
|
||||
int bucket;
|
||||
|
||||
hashvalue = DatumGetUInt32(FunctionCall1(&hashfunctions[0],
|
||||
values[i]));
|
||||
sslot.values[i]));
|
||||
|
||||
/*
|
||||
* While we have not hit a hole in the hashtable and have not hit
|
||||
@ -1426,8 +1420,7 @@ ExecHashBuildSkewHash(HashJoinTable hashtable, Hash *node, int mcvsToUse)
|
||||
hashtable->spacePeak = hashtable->spaceUsed;
|
||||
}
|
||||
|
||||
free_attstatsslot(node->skewColType,
|
||||
values, nvalues, numbers, nnumbers);
|
||||
free_attstatsslot(&sslot);
|
||||
}
|
||||
|
||||
ReleaseSysCache(statsTuple);
|
||||
|
Reference in New Issue
Block a user