1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-09 22:41:56 +03:00

Improve sys/catcache performance.

The following are the individual improvements:
1) Avoidance of FunctionCallInfo based function calls, replaced by
   more efficient functions with a native C argument interface.
2) Don't extract columns from a cache entry's tuple whenever matching
   entries - instead store them as a Datum array. This also allows to
   get rid of having to build dummy tuples for negative & list
   entries, and of a hack for dealing with cstring vs. text weirdness.
3) Reorder members of catcache.h struct, so imortant entries are more
   likely to be on one cacheline.
4) Allowing the compiler to specialize critical SearchCatCache for a
   specific number of attributes allows to unroll loops and avoid
   other nkeys dependant initialization.
5) Only initializing the ScanKey when necessary, i.e. catcache misses,
   greatly reduces cache unnecessary cpu cache misses.
6) Split of the cache-miss case from the hash lookup, reducing stack
   allocations etc in the common case.
7) CatCTup and their corresponding heaptuple are allocated in one
   piece.

This results in making cache lookups themselves roughly three times as
fast - full-system benchmarks obviously improve less than that.

I've also evaluated further techniques:
- replace open coded hash with simplehash - the list walk right now
  shows up in profiles. Unfortunately it's not easy to do so safely as
  an entry's memory location can change at various times, which
  doesn't work well with the refcounting and cache invalidation.
- Cacheline-aligning CatCTup entries - helps some with performance,
  but the win isn't big and the code for it is ugly, because the
  tuples have to be freed as well.
- add more proper functions, rather than macros for
  SearchSysCacheCopyN etc., but right now they don't show up in
  profiles.

The reason the macro wrapper for syscache.c/h have to be changed,
rather than just catcache, is that doing otherwise would require
exposing the SysCache array to the outside.  That might be a good idea
anyway, but it's for another day.

Author: Andres Freund
Reviewed-By: Robert Haas
Discussion: https://postgr.es/m/20170914061207.zxotvyopetm7lrrp@alap3.anarazel.de
This commit is contained in:
Andres Freund
2017-10-13 13:16:50 -07:00
parent a0247e7a11
commit 141fd1b66c
5 changed files with 611 additions and 283 deletions

File diff suppressed because it is too large Load Diff

View File

@ -1102,13 +1102,56 @@ SearchSysCache(int cacheId,
Datum key3,
Datum key4)
{
if (cacheId < 0 || cacheId >= SysCacheSize ||
!PointerIsValid(SysCache[cacheId]))
elog(ERROR, "invalid cache ID: %d", cacheId);
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
return SearchCatCache(SysCache[cacheId], key1, key2, key3, key4);
}
HeapTuple
SearchSysCache1(int cacheId,
Datum key1)
{
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
Assert(SysCache[cacheId]->cc_nkeys == 1);
return SearchCatCache1(SysCache[cacheId], key1);
}
HeapTuple
SearchSysCache2(int cacheId,
Datum key1, Datum key2)
{
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
Assert(SysCache[cacheId]->cc_nkeys == 2);
return SearchCatCache2(SysCache[cacheId], key1, key2);
}
HeapTuple
SearchSysCache3(int cacheId,
Datum key1, Datum key2, Datum key3)
{
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
Assert(SysCache[cacheId]->cc_nkeys == 3);
return SearchCatCache3(SysCache[cacheId], key1, key2, key3);
}
HeapTuple
SearchSysCache4(int cacheId,
Datum key1, Datum key2, Datum key3, Datum key4)
{
Assert(cacheId >= 0 && cacheId < SysCacheSize &&
PointerIsValid(SysCache[cacheId]));
Assert(SysCache[cacheId]->cc_nkeys == 4);
return SearchCatCache4(SysCache[cacheId], key1, key2, key3, key4);
}
/*
* ReleaseSysCache
* Release previously grabbed reference count on a tuple