mirror of
https://github.com/postgres/postgres.git
Improve JIT docs.

Author: John Naylor and Andres Freund
Discussion: https://postgr.es/m/CAJVSVGUs-VcwSY7-Kx-GQe__8hvWuA4Uhyf3gxoMXeiZqebE9g@mail.gmail.com
parent c1de1a3a8b
commit fb60478011
@ -15945,8 +15945,8 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
|
|||||||
<row>
|
<row>
|
||||||
<entry><literal><function>pg_jit_available()</function></literal></entry>
|
<entry><literal><function>pg_jit_available()</function></literal></entry>
|
||||||
<entry><type>boolean</type></entry>
|
<entry><type>boolean</type></entry>
|
||||||
<entry>is <acronym>JIT</acronym> available in this session (see <xref
|
<entry>is <acronym>JIT</acronym> compilation available in this session
|
||||||
linkend="jit"/>)? Returns <literal>false</literal> if <xref
|
(see <xref linkend="jit"/>)? Returns <literal>false</literal> if <xref
|
||||||
linkend="guc-jit"/> is set to false.</entry>
|
linkend="guc-jit"/> is set to false.</entry>
|
||||||
</row>
|
</row>
|
||||||
|
|
||||||
|
@ -18,7 +18,7 @@
|
|||||||
</para>
|
</para>
|
||||||
|
|
||||||
<sect1 id="jit-reason">
|
<sect1 id="jit-reason">
|
||||||
<title>What is <acronym>JIT</acronym>?</title>
|
<title>What is <acronym>JIT</acronym> compilation?</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
Just-in-time compilation (<acronym>JIT</acronym>) is the process of turning
|
Just-in-time compilation (<acronym>JIT</acronym>) is the process of turning
|
||||||
@ -33,7 +33,7 @@
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
<productname>PostgreSQL</productname> has builtin support to perform
|
<productname>PostgreSQL</productname> has builtin support to perform
|
||||||
<acronym>JIT</acronym> using <ulink
|
<acronym>JIT</acronym> compilation using <ulink
|
||||||
url="https://llvm.org/"><productname>LLVM</productname></ulink> when
|
url="https://llvm.org/"><productname>LLVM</productname></ulink> when
|
||||||
<productname>PostgreSQL</productname> was built with
|
<productname>PostgreSQL</productname> was built with
|
||||||
<literal>--with-llvm</literal> (see <xref linkend="configure-with-llvm"/>).
|
<literal>--with-llvm</literal> (see <xref linkend="configure-with-llvm"/>).
|
||||||
@ -97,15 +97,15 @@
|
|||||||
<title>When to <acronym>JIT</acronym>?</title>
|
<title>When to <acronym>JIT</acronym>?</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<acronym>JIT</acronym> is beneficial primarily for long-running CPU bound
|
<acronym>JIT</acronym> compilation is beneficial primarily for long-running
|
||||||
queries. Frequently these will be analytical queries. For short queries
|
CPU bound queries. Frequently these will be analytical queries. For short
|
||||||
the overhead of performing <acronym>JIT</acronym> will often be higher than
|
queries the added overhead of performing <acronym>JIT</acronym> compilation
|
||||||
the time it can save.
|
will often be higher than the time it can save.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
To determine whether <acronym>JIT</acronym> is used, the total cost of a
|
To determine whether <acronym>JIT</acronym> compilation is used, the total
|
||||||
query (see <xref linkend="planner-stats-details"/> and <xref
|
cost of a query (see <xref linkend="planner-stats-details"/> and <xref
|
||||||
linkend="runtime-config-query-constants"/>) is used.
|
linkend="runtime-config-query-constants"/>) is used.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
@ -117,9 +117,9 @@
|
|||||||
|
|
||||||
<para>
|
<para>
|
||||||
If the planner, based on the above criterion, decided that
|
If the planner, based on the above criterion, decided that
|
||||||
<acronym>JIT</acronym> is beneficial, two further decisions are
|
<acronym>JIT</acronym> compilation is beneficial, two further decisions are
|
||||||
made. Firstly, if the query is more costly than the <xref
|
made. Firstly, if the query is more costly than the <xref
|
||||||
linkend="guc-jit-optimize-above-cost"/>, GUC expensive optimizations are
|
linkend="guc-jit-optimize-above-cost"/> GUC, expensive optimizations are
|
||||||
used to improve the generated code. Secondly, if the query is more costly
|
used to improve the generated code. Secondly, if the query is more costly
|
||||||
than the <xref linkend="guc-jit-inline-above-cost"/> GUC, short functions
|
than the <xref linkend="guc-jit-inline-above-cost"/> GUC, short functions
|
||||||
and operators used in the query will be inlined. Both of these operations
|
and operators used in the query will be inlined. Both of these operations
|
||||||
@ -187,8 +187,9 @@ SET
|
|||||||
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
|
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
|
||||||
</programlisting>
|
</programlisting>
|
||||||
As visible here, <acronym>JIT</acronym> was used, but inlining and
|
As visible here, <acronym>JIT</acronym> was used, but inlining and
|
||||||
optimization were not. If <xref linkend="guc-jit-optimize-above-cost"/>,
|
expensive optimization were not. If <xref
|
||||||
<xref linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref
|
linkend="guc-jit-optimize-above-cost"/>, <xref
|
||||||
|
linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref
|
||||||
linkend="guc-jit-above-cost"/>, that would change.
|
linkend="guc-jit-above-cost"/>, that would change.
|
||||||
</para>
|
</para>
|
||||||
</sect1>
|
</sect1>
|
||||||
@ -197,8 +198,8 @@ SET
|
|||||||
<title>Configuration</title>
|
<title>Configuration</title>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
<xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym> is
|
<xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym>
|
||||||
enabled or disabled.
|
compilation is enabled or disabled.
|
||||||
</para>
|
</para>
|
||||||
|
|
||||||
<para>
|
<para>
|
||||||
|
@ -13,12 +13,12 @@ the CPU that just handles that expression, yielding a speedup.
|
|||||||
That this is done at query execution time, possibly even only in cases
|
That this is done at query execution time, possibly even only in cases
|
||||||
the relevant task is done a number of times, makes it JIT, rather than
|
the relevant task is done a number of times, makes it JIT, rather than
|
||||||
ahead-of-time (AOT). Given the way JIT compilation is used in
|
ahead-of-time (AOT). Given the way JIT compilation is used in
|
||||||
postgres, the lines between interpretation, AOT and JIT are somewhat
|
PostgreSQL, the lines between interpretation, AOT and JIT are somewhat
|
||||||
blurry.
|
blurry.
|
||||||
|
|
||||||
Note that the interpreted program turned into a native program does
|
Note that the interpreted program turned into a native program does
|
||||||
not necessarily have to be a program in the classical sense. E.g. it
|
not necessarily have to be a program in the classical sense. E.g. it
|
||||||
is highly beneficial JIT compile tuple deforming into a native
|
is highly beneficial to JIT compile tuple deforming into a native
|
||||||
function just handling a specific type of table, despite tuple
|
function just handling a specific type of table, despite tuple
|
||||||
deforming not commonly being understood as a "program".
|
deforming not commonly being understood as a "program".
|
||||||
|
|
||||||
@ -26,7 +26,7 @@ deforming not commonly being understood as a "program".
|
|||||||
Why JIT?
|
Why JIT?
|
||||||
========
|
========
|
||||||
|
|
||||||
Parts of postgres are commonly bottlenecked by comparatively small
|
Parts of PostgreSQL are commonly bottlenecked by comparatively small
|
||||||
pieces of CPU intensive code. In a number of cases that is because the
|
pieces of CPU intensive code. In a number of cases that is because the
|
||||||
relevant code has to be very generic (e.g. handling arbitrary SQL
|
relevant code has to be very generic (e.g. handling arbitrary SQL
|
||||||
level expressions, over arbitrary tables, with arbitrary extensions
|
level expressions, over arbitrary tables, with arbitrary extensions
|
||||||
@ -49,11 +49,11 @@ particularly beneficial for removing branches during tuple deforming.
|
|||||||
How to JIT
|
How to JIT
|
||||||
==========
|
==========
|
||||||
|
|
||||||
Postgres, by default, uses LLVM to perform JIT. LLVM was chosen
|
PostgreSQL, by default, uses LLVM to perform JIT. LLVM was chosen
|
||||||
because it is developed by several large corporations and therefore
|
because it is developed by several large corporations and therefore
|
||||||
unlikely to be discontinued, because it has a license compatible with
|
unlikely to be discontinued, because it has a license compatible with
|
||||||
PostgreSQL, and because its LLVM IR can be generated from C
|
PostgreSQL, and because its IR can be generated from C using the Clang
|
||||||
using the clang compiler.
|
compiler.
|
||||||
|
|
||||||
|
|
||||||
Shared Library Separation
|
Shared Library Separation
|
||||||
@ -68,13 +68,13 @@ An additional benefit of doing so is that it is relatively easy to
|
|||||||
evaluate JIT compilation that does not use LLVM, by changing out the
|
evaluate JIT compilation that does not use LLVM, by changing out the
|
||||||
shared library used to provide JIT compilation.
|
shared library used to provide JIT compilation.
|
||||||
|
|
||||||
To achieve this code, e.g. expression evaluation, intending to perform
|
To achieve this, code intending to perform JIT (e.g. expression evaluation)
|
||||||
JIT, calls a LLVM independent wrapper located in jit.c to do so. If
|
calls an LLVM independent wrapper located in jit.c to do so. If the
|
||||||
the shared library providing JIT support can be loaded (i.e. postgres
|
shared library providing JIT support can be loaded (i.e. PostgreSQL was
|
||||||
was compiled with LLVM support and the shared library is installed),
|
compiled with LLVM support and the shared library is installed), the task
|
||||||
the task of JIT compiling an expression gets handed of to shared
|
of JIT compiling an expression gets handed off to the shared library. This
|
||||||
library. This obviously requires that the function in jit.c is allowed
|
obviously requires that the function in jit.c is allowed to fail in case
|
||||||
to fail in case no JIT provider can be loaded.
|
no JIT provider can be loaded.
|
||||||
|
|
||||||
Which shared library is loaded is determined by the jit_provider GUC,
|
Which shared library is loaded is determined by the jit_provider GUC,
|
||||||
defaulting to "llvmjit".
|
defaulting to "llvmjit".
|
||||||
@ -82,8 +82,8 @@ defaulting to "llvmjit".
|
|||||||
Cloistering code performing JIT into a shared library unfortunately
|
Cloistering code performing JIT into a shared library unfortunately
|
||||||
also means that code doing JIT compilation for various parts of code
|
also means that code doing JIT compilation for various parts of code
|
||||||
has to be located separately from the code doing so without
|
has to be located separately from the code doing so without
|
||||||
JIT. E.g. the JITed version of execExprInterp.c is located in
|
JIT. E.g. the JIT version of execExprInterp.c is located in jit/llvm/
|
||||||
jit/llvm/ rather than executor/.
|
rather than executor/.
|
||||||
|
|
||||||
|
|
||||||
JIT Context
|
JIT Context
|
||||||
@ -105,9 +105,9 @@ implementations.
|
|||||||
|
|
||||||
Emitting individual functions separately is more expensive than
|
Emitting individual functions separately is more expensive than
|
||||||
emitting several functions at once, and emitting them together can
|
emitting several functions at once, and emitting them together can
|
||||||
provide additional optimization opportunities. To facilitate that the
|
provide additional optimization opportunities. To facilitate that, the
|
||||||
LLVM provider separates function definition from emitting them in an
|
LLVM provider separates defining functions from optimizing and
|
||||||
executable way.
|
emitting functions in an executable manner.
|
||||||
|
|
||||||
Creating functions into the current mutable module (a module
|
Creating functions into the current mutable module (a module
|
||||||
essentially is LLVM's equivalent of a translation unit in C) is done
|
essentially is LLVM's equivalent of a translation unit in C) is done
|
||||||
@ -127,7 +127,7 @@ used.
|
|||||||
Error Handling
|
Error Handling
|
||||||
--------------
|
--------------
|
||||||
|
|
||||||
There are two aspects to error handling. Firstly, generated (LLVM IR)
|
There are two aspects of error handling. Firstly, generated (LLVM IR)
|
||||||
and emitted functions (mmap()ed segments) need to be cleaned up both
|
and emitted functions (mmap()ed segments) need to be cleaned up both
|
||||||
after a successful query execution and after an error. This is done by
|
after a successful query execution and after an error. This is done by
|
||||||
registering each created JITContext with the current resource owner,
|
registering each created JITContext with the current resource owner,
|
||||||
@ -140,12 +140,12 @@ cleaning up emitted code upon ERROR, but there's also the chance that
|
|||||||
LLVM itself runs out of memory. LLVM by default does *not* use any C++
|
LLVM itself runs out of memory. LLVM by default does *not* use any C++
|
||||||
exceptions. Its allocations are primarily funneled through the
|
exceptions. Its allocations are primarily funneled through the
|
||||||
standard "new" handlers, and some direct use of malloc() and
|
standard "new" handlers, and some direct use of malloc() and
|
||||||
mmap(). For the former a 'new handler' exists
|
mmap(). For the former a 'new handler' exists:
|
||||||
http://en.cppreference.com/w/cpp/memory/new/set_new_handler for the
|
http://en.cppreference.com/w/cpp/memory/new/set_new_handler
|
||||||
latter LLVM provides callback that get called upon failure
|
For the latter LLVM provides callbacks that get called upon failure
|
||||||
(unfortunately mmap() failures are treated as fatal rather than OOM
|
(unfortunately mmap() failures are treated as fatal rather than OOM errors).
|
||||||
errors). What we've, for now, chosen to do, is to have two functions
|
What we've chosen to do for now is have two functions that LLVM using code
|
||||||
that LLVM using code must use:
|
must use:
|
||||||
extern void llvm_enter_fatal_on_oom(void);
|
extern void llvm_enter_fatal_on_oom(void);
|
||||||
extern void llvm_leave_fatal_on_oom(void);
|
extern void llvm_leave_fatal_on_oom(void);
|
||||||
before interacting with LLVM code.
|
before interacting with LLVM code.
|
||||||
@ -160,7 +160,7 @@ the handlers instead are reset on toplevel sigsetjmp() level.
|
|||||||
|
|
||||||
Using a relatively small enter/leave protected section of code, rather
|
Using a relatively small enter/leave protected section of code, rather
|
||||||
than setting up these handlers globally, avoids negative interactions
|
than setting up these handlers globally, avoids negative interactions
|
||||||
with extensions that might use C++ like e.g. postgis. As LLVM code
|
with extensions that might use C++ such as PostGIS. As LLVM code
|
||||||
generation should never execute arbitrary code, just setting these
|
generation should never execute arbitrary code, just setting these
|
||||||
handlers temporarily ought to suffice.
|
handlers temporarily ought to suffice.
|
||||||
|
|
||||||
@ -168,9 +168,9 @@ handlers temporarily ought to suffice.
|
|||||||
Type Synchronization
|
Type Synchronization
|
||||||
--------------------
|
--------------------
|
||||||
|
|
||||||
To able to generate code performing tasks that are done in "interpreted"
|
To be able to generate code that can perform tasks done by "interpreted"
|
||||||
postgres, it obviously is required that code generation knows about at
|
PostgreSQL, it obviously is required that code generation knows about at
|
||||||
least a few postgres types. While it is possible to inform LLVM about
|
least a few PostgreSQL types. While it is possible to inform LLVM about
|
||||||
type definitions by recreating them manually in C code, that is failure
|
type definitions by recreating them manually in C code, that is failure
|
||||||
prone and labor intensive.
|
prone and labor intensive.
|
||||||
|
|
||||||
@ -178,13 +178,13 @@ Instead there is one small file (llvmjit_types.c) which references each of
|
|||||||
the types required for JITing. That file is translated to bitcode at
|
the types required for JITing. That file is translated to bitcode at
|
||||||
compile time, and loaded when LLVM is initialized in a backend.
|
compile time, and loaded when LLVM is initialized in a backend.
|
||||||
|
|
||||||
That works very well to synchronize the type definition, unfortunately
|
That works very well to synchronize the type definition, but unfortunately
|
||||||
it does *not* synchronize offsets as the IR level representation doesn't
|
it does *not* synchronize offsets as the IR level representation doesn't
|
||||||
know field names. Instead required offsets are maintained as defines in
|
know field names. Instead, required offsets are maintained as defines in
|
||||||
the original struct definition. E.g.
|
the original struct definition, like so:
|
||||||
#define FIELDNO_TUPLETABLESLOT_NVALID 9
|
#define FIELDNO_TUPLETABLESLOT_NVALID 9
|
||||||
int tts_nvalid; /* # of valid values in tts_values */
|
int tts_nvalid; /* # of valid values in tts_values */
|
||||||
while that still needs to be defined, it's only required for a
|
While that still needs to be defined, it's only required for a
|
||||||
relatively small number of fields, and it's bunched together with the
|
relatively small number of fields, and it's bunched together with the
|
||||||
struct definition, so it's easily kept synchronized.
|
struct definition, so it's easily kept synchronized.
|
||||||
|
|
||||||
@ -193,12 +193,12 @@ Inlining
|
|||||||
--------
|
--------
|
||||||
|
|
||||||
One big advantage of JITing expressions is that it can significantly
|
One big advantage of JITing expressions is that it can significantly
|
||||||
reduce the overhead of postgres's extensible function/operator
|
reduce the overhead of PostgreSQL's extensible function/operator
|
||||||
mechanism, by inlining the body of called functions / operators.
|
mechanism, by inlining the body of called functions/operators.
|
||||||
|
|
||||||
It obviously is undesirable to maintain a second implementation of
|
It obviously is undesirable to maintain a second implementation of
|
||||||
commonly used functions, just for inlining purposes. Instead we take
|
commonly used functions, just for inlining purposes. Instead we take
|
||||||
advantage of the fact that the clang compiler can emit LLVM IR.
|
advantage of the fact that the Clang compiler can emit LLVM IR.
|
||||||
|
|
||||||
The ability to do so allows us to get the LLVM IR for all operators
|
The ability to do so allows us to get the LLVM IR for all operators
|
||||||
(e.g. int8eq, float8pl etc), without maintaining two copies. These
|
(e.g. int8eq, float8pl etc), without maintaining two copies. These
|
||||||
@ -225,7 +225,7 @@ Caching
|
|||||||
Currently it is not yet possible to cache generated functions, even
|
Currently it is not yet possible to cache generated functions, even
|
||||||
though that'd be desirable from a performance point of view. The
|
though that'd be desirable from a performance point of view. The
|
||||||
problem is that the generated functions commonly contain pointers into
|
problem is that the generated functions commonly contain pointers into
|
||||||
per-execution memory. The expression evaluation functionality needs to
|
per-execution memory. The expression evaluation machinery needs to
|
||||||
be redesigned a bit to avoid that. Basically all per-execution memory
|
be redesigned a bit to avoid that. Basically all per-execution memory
|
||||||
needs to be referenced as an offset to one block of memory stored in
|
needs to be referenced as an offset to one block of memory stored in
|
||||||
an ExprState, rather than absolute pointers into memory.
|
an ExprState, rather than absolute pointers into memory.
|
||||||
@ -278,7 +278,7 @@ Currently there are a number of GUCs that influence JITing:
|
|||||||
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
|
- jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has
|
||||||
higher cost.
|
higher cost.
|
||||||
|
|
||||||
whenever a query's total cost is above these limits, JITing is
|
Whenever a query's total cost is above these limits, JITing is
|
||||||
performed.
|
performed.
|
||||||
|
|
||||||
Alternative costing models, e.g. by generating separate paths for
|
Alternative costing models, e.g. by generating separate paths for
|
||||||
@ -291,5 +291,5 @@ individual expressions.
|
|||||||
The obvious seeming approach of JITing expressions individually after
|
The obvious seeming approach of JITing expressions individually after
|
||||||
a number of execution turns out not to work too well. Primarily
|
a number of execution turns out not to work too well. Primarily
|
||||||
because emitting many small functions individually has significant
|
because emitting many small functions individually has significant
|
||||||
overhead. Secondarily because the time till JITing occurs causes
|
overhead. Secondarily because the time until JITing occurs causes
|
||||||
relative slowdowns that eat into the gain of JIT compilation.
|
relative slowdowns that eat into the gain of JIT compilation.
|
||||||
|
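Reviewer note: the cost-based decision described in the documentation hunks above (JIT compilation above jit_above_cost, then expensive optimization and inlining above their own thresholds) can be illustrated with a small C sketch. This is not the planner's actual code. The names SketchJitSettings, sketch_cost_exceeds, sketch_choose_jit_flags and the SKETCH_JIT_* flags are hypothetical, and the sketch assumes that a threshold of -1 disables the corresponding step and that "above" means strictly greater than.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits for this sketch; not PostgreSQL's own identifiers. */
enum
{
	SKETCH_JIT_PERFORM = 1 << 0,	/* JIT-compile the query at all */
	SKETCH_JIT_OPTIMIZE = 1 << 1,	/* apply expensive optimizations */
	SKETCH_JIT_INLINE = 1 << 2		/* inline short functions/operators */
};

/* Hypothetical container mirroring the GUCs named in the text. */
typedef struct SketchJitSettings
{
	bool		jit;						/* the "jit" GUC */
	double		jit_above_cost;				/* -1 assumed to disable */
	double		jit_optimize_above_cost;	/* -1 assumed to disable */
	double		jit_inline_above_cost;		/* -1 assumed to disable */
} SketchJitSettings;

/* True if the threshold is enabled (non-negative) and the cost exceeds it. */
bool
sketch_cost_exceeds(double total_cost, double threshold)
{
	return threshold >= 0.0 && total_cost > threshold;
}

/* Derive the set of JIT steps from a query's total estimated cost. */
int
sketch_choose_jit_flags(const SketchJitSettings *s, double total_cost)
{
	int			flags = 0;

	assert(s != NULL);

	/* Without the basic criterion, the two further decisions are not made. */
	if (!s->jit || !sketch_cost_exceeds(total_cost, s->jit_above_cost))
		return 0;
	flags |= SKETCH_JIT_PERFORM;

	/* The two further decisions are independent of each other. */
	if (sketch_cost_exceeds(total_cost, s->jit_optimize_above_cost))
		flags |= SKETCH_JIT_OPTIMIZE;
	if (sketch_cost_exceeds(total_cost, s->jit_inline_above_cost))
		flags |= SKETCH_JIT_INLINE;

	return flags;
}
```

With illustrative thresholds of 100000 for jit_above_cost and 500000 for the other two, a query costed at 200000 would get only SKETCH_JIT_PERFORM in this sketch, which corresponds to the situation the SGML example describes: JIT was used, but inlining and expensive optimization were not.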