mirror of
				https://github.com/postgres/postgres.git
				synced 2025-10-25 13:17:41 +03:00 
			
		
		
		
	Improve JIT docs.
Author: John Naylor and Andres Freund
Discussion: https://postgr.es/m/CAJVSVGUs-VcwSY7-Kx-GQe__8hvWuA4Uhyf3gxoMXeiZqebE9g@mail.gmail.com
		| @@ -15945,8 +15945,8 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n); | ||||
|       <row> | ||||
|        <entry><literal><function>pg_jit_available()</function></literal></entry> | ||||
|        <entry><type>boolean</type></entry> | ||||
|        <entry>is <acronym>JIT</acronym> available in this session (see <xref | ||||
|        linkend="jit"/>)? Returns <literal>false</literal> if <xref | ||||
|        <entry>is <acronym>JIT</acronym> compilation available in this session | ||||
|        (see <xref linkend="jit"/>)? Returns <literal>false</literal> if <xref | ||||
|        linkend="guc-jit"/> is set to false.</entry> | ||||
|       </row> | ||||
|  | ||||
|   | ||||
| @@ -18,7 +18,7 @@ | ||||
|  </para> | ||||
|  | ||||
|  <sect1 id="jit-reason"> | ||||
|   <title>What is <acronym>JIT</acronym>?</title> | ||||
|   <title>What is <acronym>JIT</acronym> compilation?</title> | ||||
|  | ||||
|   <para> | ||||
|    Just-in-time compilation (<acronym>JIT</acronym>) is the process of turning | ||||
| @@ -33,7 +33,7 @@ | ||||
|  | ||||
|   <para> | ||||
|    <productname>PostgreSQL</productname> has builtin support to perform | ||||
|    <acronym>JIT</acronym> using <ulink | ||||
|    <acronym>JIT</acronym> compilation using <ulink | ||||
|    url="https://llvm.org/"><productname>LLVM</productname></ulink> when | ||||
|    <productname>PostgreSQL</productname> was built with | ||||
|    <literal>--with-llvm</literal> (see <xref linkend="configure-with-llvm"/>). | ||||
| @@ -97,15 +97,15 @@ | ||||
|   <title>When to <acronym>JIT</acronym>?</title> | ||||
|  | ||||
|   <para> | ||||
|    <acronym>JIT</acronym> is beneficial primarily for long-running CPU bound | ||||
|    queries. Frequently these will be analytical queries.  For short queries | ||||
|    the overhead of performing <acronym>JIT</acronym> will often be higher than | ||||
|    the time it can save. | ||||
|    <acronym>JIT</acronym> compilation is beneficial primarily for long-running | ||||
|    CPU bound queries. Frequently these will be analytical queries.  For short | ||||
|    queries the added overhead of performing <acronym>JIT</acronym> compilation | ||||
|    will often be higher than the time it can save. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|    To determine whether <acronym>JIT</acronym> is used, the total cost of a | ||||
|    query (see <xref linkend="planner-stats-details"/> and <xref | ||||
|    To determine whether <acronym>JIT</acronym> compilation is used, the total | ||||
|    cost of a query (see <xref linkend="planner-stats-details"/> and <xref | ||||
|    linkend="runtime-config-query-constants"/>) is used. | ||||
|   </para> | ||||
|  | ||||
| @@ -117,9 +117,9 @@ | ||||
|  | ||||
|   <para> | ||||
|    If the planner, based on the above criterion, decided that | ||||
|    <acronym>JIT</acronym> is beneficial, two further decisions are | ||||
|    <acronym>JIT</acronym> compilation is beneficial, two further decisions are | ||||
|    made. Firstly, if the query is more costly than the <xref | ||||
|    linkend="guc-jit-optimize-above-cost"/>, GUC expensive optimizations are | ||||
|    linkend="guc-jit-optimize-above-cost"/> GUC, expensive optimizations are | ||||
|    used to improve the generated code. Secondly, if the query is more costly | ||||
|    than the <xref linkend="guc-jit-inline-above-cost"/> GUC, short functions | ||||
|    and operators used in the query will be inlined.  Both of these operations | ||||
| @@ -187,8 +187,9 @@ SET | ||||
| └─────────────────────────────────────────────────────────────────────────────────────────────────────────────┘ | ||||
|    </programlisting> | ||||
|    As visible here, <acronym>JIT</acronym> was used, but inlining and | ||||
|    optimization were not. If <xref linkend="guc-jit-optimize-above-cost"/>, | ||||
|    <xref linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref | ||||
|    expensive optimization were not. If <xref | ||||
|    linkend="guc-jit-optimize-above-cost"/>, <xref | ||||
|    linkend="guc-jit-inline-above-cost"/> were lowered, just like <xref | ||||
|    linkend="guc-jit-above-cost"/>, that would change. | ||||
|   </para> | ||||
|  </sect1> | ||||
| @@ -197,8 +198,8 @@ SET | ||||
|   <title>Configuration</title> | ||||
|  | ||||
|   <para> | ||||
|    <xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym> is | ||||
|    enabled or disabled. | ||||
|    <xref linkend="guc-jit"/> determines whether <acronym>JIT</acronym> | ||||
|    compilation is enabled or disabled. | ||||
|   </para> | ||||
|  | ||||
|   <para> | ||||
|   | ||||
| @@ -13,12 +13,12 @@ the CPU that just handles that expression, yielding a speedup. | ||||
| That this is done at query execution time, possibly even only in cases | ||||
| the relevant task is done a number of times, makes it JIT, rather than | ||||
| ahead-of-time (AOT). Given the way JIT compilation is used in | ||||
| postgres, the lines between interpretation, AOT and JIT are somewhat | ||||
| PostgreSQL, the lines between interpretation, AOT and JIT are somewhat | ||||
| blurry. | ||||
|  | ||||
| Note that the interpreted program turned into a native program does | ||||
| not necessarily have to be a program in the classical sense. E.g. it | ||||
| is highly beneficial JIT compile tuple deforming into a native | ||||
| is highly beneficial to JIT compile tuple deforming into a native | ||||
| function just handling a specific type of table, despite tuple | ||||
| deforming not commonly being understood as a "program". | ||||
|  | ||||
| @@ -26,7 +26,7 @@ deforming not commonly being understood as a "program". | ||||
| Why JIT? | ||||
| ======== | ||||
|  | ||||
| Parts of postgres are commonly bottlenecked by comparatively small | ||||
| Parts of PostgreSQL are commonly bottlenecked by comparatively small | ||||
| pieces of CPU intensive code. In a number of cases that is because the | ||||
| relevant code has to be very generic (e.g. handling arbitrary SQL | ||||
| level expressions, over arbitrary tables, with arbitrary extensions | ||||
| @@ -49,11 +49,11 @@ particularly beneficial for removing branches during tuple deforming. | ||||
| How to JIT | ||||
| ========== | ||||
|  | ||||
| Postgres, by default, uses LLVM to perform JIT. LLVM was chosen | ||||
| PostgreSQL, by default, uses LLVM to perform JIT. LLVM was chosen | ||||
| because it is developed by several large corporations and therefore | ||||
| unlikely to be discontinued, because it has a license compatible with | ||||
| PostgreSQL, and because its LLVM IR can be generated from C | ||||
| using the clang compiler. | ||||
| PostgreSQL, and because its IR can be generated from C using the Clang | ||||
| compiler. | ||||
|  | ||||
|  | ||||
| Shared Library Separation | ||||
| @@ -68,13 +68,13 @@ An additional benefit of doing so is that it is relatively easy to | ||||
| evaluate JIT compilation that does not use LLVM, by changing out the | ||||
| shared library used to provide JIT compilation. | ||||
|  | ||||
| To achieve this code, e.g. expression evaluation, intending to perform | ||||
| JIT, calls a LLVM independent wrapper located in jit.c to do so. If | ||||
| the shared library providing JIT support can be loaded (i.e. postgres | ||||
| was compiled with LLVM support and the shared library is installed), | ||||
| the task of JIT compiling an expression gets handed of to shared | ||||
| library. This obviously requires that the function in jit.c is allowed | ||||
| to fail in case no JIT provider can be loaded. | ||||
| To achieve this, code intending to perform JIT (e.g. expression evaluation) | ||||
| calls an LLVM independent wrapper located in jit.c to do so. If the | ||||
| shared library providing JIT support can be loaded (i.e. PostgreSQL was | ||||
| compiled with LLVM support and the shared library is installed), the task | ||||
| of JIT compiling an expression gets handed off to the shared library. This | ||||
| obviously requires that the function in jit.c is allowed to fail in case | ||||
| no JIT provider can be loaded. | ||||
|  | ||||
| Which shared library is loaded is determined by the jit_provider GUC, | ||||
| defaulting to "llvmjit". | ||||
| @@ -82,8 +82,8 @@ defaulting to "llvmjit". | ||||
| Cloistering code performing JIT into a shared library unfortunately | ||||
| also means that code doing JIT compilation for various parts of code | ||||
| has to be located separately from the code doing so without | ||||
| JIT. E.g. the JITed version of execExprInterp.c is located in | ||||
| jit/llvm/ rather than executor/. | ||||
| JIT. E.g. the JIT version of execExprInterp.c is located in jit/llvm/ | ||||
| rather than executor/. | ||||
|  | ||||
|  | ||||
| JIT Context | ||||
| @@ -105,9 +105,9 @@ implementations. | ||||
|  | ||||
| Emitting individual functions separately is more expensive than | ||||
| emitting several functions at once, and emitting them together can | ||||
| provide additional optimization opportunities. To facilitate that the | ||||
| LLVM provider separates function definition from emitting them in an | ||||
| executable way. | ||||
| provide additional optimization opportunities. To facilitate that, the | ||||
| LLVM provider separates defining functions from optimizing and | ||||
| emitting functions in an executable manner. | ||||
|  | ||||
| Creating functions into the current mutable module (a module | ||||
| essentially is LLVM's equivalent of a translation unit in C) is done | ||||
| @@ -127,7 +127,7 @@ used. | ||||
| Error Handling | ||||
| -------------- | ||||
|  | ||||
| There are two aspects to error handling.  Firstly, generated (LLVM IR) | ||||
| There are two aspects of error handling.  Firstly, generated (LLVM IR) | ||||
| and emitted functions (mmap()ed segments) need to be cleaned up both | ||||
| after a successful query execution and after an error. This is done by | ||||
| registering each created JITContext with the current resource owner, | ||||
| @@ -140,12 +140,12 @@ cleaning up emitted code upon ERROR, but there's also the chance that | ||||
| LLVM itself runs out of memory. LLVM by default does *not* use any C++ | ||||
| exceptions. Its allocations are primarily funneled through the | ||||
| standard "new" handlers, and some direct use of malloc() and | ||||
| mmap(). For the former a 'new handler' exists | ||||
| http://en.cppreference.com/w/cpp/memory/new/set_new_handler for the | ||||
| latter LLVM provides callback that get called upon failure | ||||
| (unfortunately mmap() failures are treated as fatal rather than OOM | ||||
| errors).  What we've, for now, chosen to do, is to have two functions | ||||
| that LLVM using code must use: | ||||
| mmap(). For the former a 'new handler' exists: | ||||
| http://en.cppreference.com/w/cpp/memory/new/set_new_handler | ||||
| For the latter LLVM provides callbacks that get called upon failure | ||||
| (unfortunately mmap() failures are treated as fatal rather than OOM errors). | ||||
| What we've chosen to do for now is have two functions that LLVM using code | ||||
| must use: | ||||
| extern void llvm_enter_fatal_on_oom(void); | ||||
| extern void llvm_leave_fatal_on_oom(void); | ||||
| before interacting with LLVM code. | ||||
| @@ -160,7 +160,7 @@ the handlers instead are reset on toplevel sigsetjmp() level. | ||||
|  | ||||
| Using a relatively small enter/leave protected section of code, rather | ||||
| than setting up these handlers globally, avoids negative interactions | ||||
| with extensions that might use C++ like e.g. postgis. As LLVM code | ||||
| with extensions that might use C++ such as PostGIS. As LLVM code | ||||
| generation should never execute arbitrary code, just setting these | ||||
| handlers temporarily ought to suffice. | ||||
|  | ||||
| @@ -168,9 +168,9 @@ handlers temporarily ought to suffice. | ||||
| Type Synchronization | ||||
| -------------------- | ||||
|  | ||||
| To able to generate code performing tasks that are done in "interpreted" | ||||
| postgres, it obviously is required that code generation knows about at | ||||
| least a few postgres types.  While it is possible to inform LLVM about | ||||
| To be able to generate code that can perform tasks done by "interpreted" | ||||
| PostgreSQL, it obviously is required that code generation knows about at | ||||
| least a few PostgreSQL types.  While it is possible to inform LLVM about | ||||
| type definitions by recreating them manually in C code, that is failure | ||||
| prone and labor intensive. | ||||
|  | ||||
| @@ -178,13 +178,13 @@ Instead there is one small file (llvmjit_types.c) which references each of | ||||
| the types required for JITing. That file is translated to bitcode at | ||||
| compile time, and loaded when LLVM is initialized in a backend. | ||||
|  | ||||
| That works very well to synchronize the type definition, unfortunately | ||||
| That works very well to synchronize the type definition, but unfortunately | ||||
| it does *not* synchronize offsets as the IR level representation doesn't | ||||
| know field names.  Instead required offsets are maintained as defines in | ||||
| the original struct definition. E.g. | ||||
| know field names.  Instead, required offsets are maintained as defines in | ||||
| the original struct definition, like so: | ||||
| #define FIELDNO_TUPLETABLESLOT_NVALID 9 | ||||
|         int                     tts_nvalid;             /* # of valid values in tts_values */ | ||||
| while that still needs to be defined, it's only required for a | ||||
| While that still needs to be defined, it's only required for a | ||||
| relatively small number of fields, and it's bunched together with the | ||||
| struct definition, so it's easily kept synchronized. | ||||
|  | ||||
| @@ -193,12 +193,12 @@ Inlining | ||||
| -------- | ||||
|  | ||||
| One big advantage of JITing expressions is that it can significantly | ||||
| reduce the overhead of postgres's extensible function/operator | ||||
| mechanism, by inlining the body of called functions / operators. | ||||
| reduce the overhead of PostgreSQL's extensible function/operator | ||||
| mechanism, by inlining the body of called functions/operators. | ||||
|  | ||||
| It obviously is undesirable to maintain a second implementation of | ||||
| commonly used functions, just for inlining purposes. Instead we take | ||||
| advantage of the fact that the clang compiler can emit LLVM IR. | ||||
| advantage of the fact that the Clang compiler can emit LLVM IR. | ||||
|  | ||||
| The ability to do so allows us to get the LLVM IR for all operators | ||||
| (e.g. int8eq, float8pl etc), without maintaining two copies.  These | ||||
| @@ -225,7 +225,7 @@ Caching | ||||
| Currently it is not yet possible to cache generated functions, even | ||||
| though that'd be desirable from a performance point of view. The | ||||
| problem is that the generated functions commonly contain pointers into | ||||
| per-execution memory. The expression evaluation functionality needs to | ||||
| per-execution memory. The expression evaluation machinery needs to | ||||
| be redesigned a bit to avoid that. Basically all per-execution memory | ||||
| needs to be referenced as an offset to one block of memory stored in | ||||
| an ExprState, rather than absolute pointers into memory. | ||||
| @@ -278,7 +278,7 @@ Currently there are a number of GUCs that influence JITing: | ||||
| - jit_inline_above_cost = -1, 0-DBL_MAX - inlining is tried if query has | ||||
|   higher cost. | ||||
|  | ||||
| whenever a query's total cost is above these limits, JITing is | ||||
| Whenever a query's total cost is above these limits, JITing is | ||||
| performed. | ||||
|  | ||||
| Alternative costing models, e.g. by generating separate paths for | ||||
| @@ -291,5 +291,5 @@ individual expressions. | ||||
| The obvious seeming approach of JITing expressions individually after | ||||
| a number of execution turns out not to work too well. Primarily | ||||
| because emitting many small functions individually has significant | ||||
| overhead. Secondarily because the time till JITing occurs causes | ||||
| overhead. Secondarily because the time until JITing occurs causes | ||||
| relative slowdowns that eat into the gain of JIT compilation. | ||||
|   | ||||
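Note on the cost thresholds touched by this diff: the documented decision (JIT compilation is performed when the query's total cost exceeds jit_above_cost; expensive optimization and inlining are added when it also exceeds jit_optimize_above_cost and jit_inline_above_cost) can be sketched roughly as below. This is a minimal illustration, not the planner's actual code. The names SketchJitSettings, sketch_cost_exceeds, sketch_jit_flags and the SKETCH_JIT_* flags are hypothetical, and the sketch assumes that a threshold of -1 disables the corresponding step.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch; not PostgreSQL's own definitions. */
enum
{
    SKETCH_JIT_NONE = 0,
    SKETCH_JIT_PERFORM = 1 << 0,    /* JIT compile at all */
    SKETCH_JIT_OPTIMIZE = 1 << 1,   /* apply expensive optimizations */
    SKETCH_JIT_INLINE = 1 << 2      /* inline short functions/operators */
};

/* Hypothetical container for the GUC values named in the docs. */
typedef struct SketchJitSettings
{
    bool        jit;                        /* jit */
    double      jit_above_cost;             /* assumed: -1 disables */
    double      jit_optimize_above_cost;    /* assumed: -1 disables */
    double      jit_inline_above_cost;      /* assumed: -1 disables */
} SketchJitSettings;

/* A negative limit means "never"; otherwise the cost must be strictly higher. */
bool
sketch_cost_exceeds(double total_cost, double limit)
{
    return limit >= 0 && total_cost > limit;
}

/*
 * Decide which JIT steps apply to a query with the given total cost.
 * Optimization and inlining are only considered once JIT itself applies.
 */
int
sketch_jit_flags(const SketchJitSettings *settings, double total_cost)
{
    int         flags = SKETCH_JIT_NONE;

    if (!settings->jit ||
        !sketch_cost_exceeds(total_cost, settings->jit_above_cost))
        return flags;

    flags |= SKETCH_JIT_PERFORM;

    if (sketch_cost_exceeds(total_cost, settings->jit_optimize_above_cost))
        flags |= SKETCH_JIT_OPTIMIZE;

    if (sketch_cost_exceeds(total_cost, settings->jit_inline_above_cost))
        flags |= SKETCH_JIT_INLINE;

    return flags;
}
```

With illustrative thresholds of 100000 for jit_above_cost and 500000 for the other two, a query costing 200000 would get only SKETCH_JIT_PERFORM, which matches the EXPLAIN example in the diff where JIT was used but inlining and expensive optimization were not.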