1
0
mirror of https://github.com/postgres/postgres.git synced 2025-05-02 11:44:50 +03:00
postgres/doc/src/sgml/oauth-validators.sgml
Thomas Munro 873c0fd678 oauth: Improve validator docs on interruptibility
Andres pointed out that EINTR handling is inadequate for real-world use
cases. Direct module writers to our wait APIs instead.

Author: Jacob Champion <jacob.champion@enterprisedb.com>
Discussion: https://postgr.es/m/p4bd7mn6dxr2zdak74abocyltpfdxif4pxqzixqpxpetjwt34h%40qc6jgfmoddvq
2025-03-19 16:58:06 +13:00

417 lines
19 KiB
Plaintext

<!-- doc/src/sgml/oauth-validators.sgml -->
<chapter id="oauth-validators">
<title>OAuth Validator Modules</title>
<indexterm zone="oauth-validators">
<primary>OAuth Validators</primary>
</indexterm>
<para>
<productname>PostgreSQL</productname> provides infrastructure for creating
custom modules to perform server-side validation of OAuth bearer tokens.
Because OAuth implementations vary so wildly, and bearer token validation is
heavily dependent on the issuing party, the server cannot check the token
itself; validator modules provide the integration layer between the server
and the OAuth provider in use.
</para>
<para>
OAuth validator modules must at least consist of an initialization function
(see <xref linkend="oauth-validator-init"/>) and the required callback for
performing validation (see <xref linkend="oauth-validator-callback-validate"/>).
</para>
<warning>
<para>
Since a misbehaving validator might let unauthorized users into the database,
correct implementation is crucial for server safety. See
<xref linkend="oauth-validator-design"/> for design considerations.
</para>
</warning>
<sect1 id="oauth-validator-design">
<title>Safely Designing a Validator Module</title>
<warning>
<para>
Read and understand the entirety of this section before implementing a
validator module. A malfunctioning validator is potentially worse than no
authentication at all, both because of the false sense of security it
provides, and because it may contribute to attacks against other pieces of
an OAuth ecosystem.
</para>
</warning>
<sect2 id="oauth-validator-design-responsibilities">
<title>Validator Responsibilities</title>
<para>
Although different modules may take very different approaches to token
validation, implementations generally need to perform three separate
actions:
</para>
<variablelist>
<varlistentry>
<term>Validate the Token</term>
<listitem>
<para>
The validator must first ensure that the presented token is in fact a
valid Bearer token for use in client authentication. The correct way to
do this depends on the provider, but it generally involves either
cryptographic operations to prove that the token was created by a trusted
party (offline validation), or the presentation of the token to that
trusted party so that it can perform validation for you (online
validation).
</para>
<para>
Online validation, usually implemented via
<ulink url="https://datatracker.ietf.org/doc/html/rfc7662">OAuth Token
Introspection</ulink>, requires fewer steps of a validator module and
allows central revocation of a token in the event that it is stolen
or misissued. However, it does require the module to make at least one
network call per authentication attempt (all of which must complete
within the configured <xref linkend="guc-authentication-timeout"/>).
Additionally, your provider may not provide introspection endpoints for
use by external resource servers.
</para>
<para>
Offline validation is much more involved, typically requiring a validator
to maintain a list of trusted signing keys for a provider and then
check the token's cryptographic signature along with its contents.
Implementations must follow the provider's instructions to the letter,
including any verification of issuer ("where is this token from?"),
audience ("who is this token for?"), and validity period ("when can this
token be used?"). Since there is no communication between the module and
the provider, tokens cannot be centrally revoked using this method;
offline validator implementations may wish to place restrictions on the
maximum length of a token's validity period.
</para>
<para>
If the token cannot be validated, the module should immediately fail.
Further authentication/authorization is pointless if the bearer token
wasn't issued by a trusted party.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Authorize the Client</term>
<listitem>
<para>
Next the validator must ensure that the end user has given the client
permission to access the server on their behalf. This generally involves
checking the scopes that have been assigned to the token, to make sure
that they cover database access for the current HBA parameters.
</para>
<para>
The purpose of this step is to prevent an OAuth client from obtaining a
token under false pretenses. If the validator requires all tokens to
carry scopes that cover database access, the provider should then loudly
prompt the user to grant that access during the flow. This gives them the
opportunity to reject the request if the client isn't supposed to be
using their credentials to connect to databases.
</para>
<para>
While it is possible to establish client authorization without explicit
scopes by using out-of-band knowledge of the deployed architecture, doing
so removes the user from the loop, which prevents them from catching
deployment mistakes and allows any such mistakes to be exploited
silently. Access to the database must be tightly restricted to only
trusted clients
<footnote>
<para>
That is, "trusted" in the sense that the OAuth client and the
<productname>PostgreSQL</productname> server are controlled by the same
entity. Notably, the Device Authorization client flow supported by
libpq does not usually meet this bar, since it's designed for use by
public/untrusted clients.
</para>
</footnote>
if users are not prompted for additional scopes.
</para>
<para>
Even if authorization fails, a module may choose to continue to pull
authentication information from the token for use in auditing and
debugging.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Authenticate the End User</term>
<listitem>
<para>
Finally, the validator should determine a user identifier for the token,
either by asking the provider for this information or by extracting it
from the token itself, and return that identifier to the server (which
will then make a final authorization decision using the HBA
configuration). This identifier will be available within the session via
<link linkend="functions-info-session-table"><function>system_user</function></link>
and recorded in the server logs if <xref linkend="guc-log-connections"/>
is enabled.
</para>
<para>
Different providers may record a variety of different authentication
information for an end user, typically referred to as
<emphasis>claims</emphasis>. Providers usually document which of these
claims are trustworthy enough to use for authorization decisions and
which are not. (For instance, it would probably not be wise to use an
end user's full name as the identifier for authentication, since many
providers allow users to change their display names arbitrarily.)
Ultimately, the choice of which claim (or combination of claims) to use
comes down to the provider implementation and application requirements.
</para>
<para>
Note that anonymous/pseudonymous login is possible as well, by enabling
usermap delegation; see
<xref linkend="oauth-validator-design-usermap-delegation"/>.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="oauth-validator-design-guidelines">
<title>General Coding Guidelines</title>
<para>
Developers should keep the following in mind when implementing token
validation:
</para>
<variablelist>
<varlistentry>
<term>Token Confidentiality</term>
<listitem>
<para>
Modules should not write tokens, or pieces of tokens, into the server
log. This is true even if the module considers the token invalid; an
attacker who confuses a client into communicating with the wrong provider
should not be able to retrieve that (otherwise valid) token from the
disk.
</para>
<para>
Implementations that send tokens over the network (for example, to
perform online token validation with a provider) must authenticate the
peer and ensure that strong transport security is in use.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Logging</term>
<listitem>
<para>
Modules may use the same <link linkend="error-message-reporting">logging
facilities</link> as standard extensions; however, the rules for emitting
log entries to the client are subtly different during the authentication
phase of the connection. Generally speaking, modules should log
verification problems at the <symbol>COMMERROR</symbol> level and return
normally, instead of using <symbol>ERROR</symbol>/<symbol>FATAL</symbol>
to unwind the stack, to avoid leaking information to unauthenticated
clients.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Interruptibility</term>
<listitem>
<para>
Modules must remain interruptible by signals so that the server can
correctly handle authentication timeouts and shutdown signals from
<application>pg_ctl</application>. For example, blocking calls on sockets
should generally be replaced with code that handles both socket events
and interrupts without races (see <function>WaitLatchOrSocket()</function>,
<function>WaitEventSetWait()</function>, et al), and long-running loops
should periodically call <function>CHECK_FOR_INTERRUPTS()</function>.
Failure to follow this guidance may result in unresponsive backend
sessions.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Testing</term>
<listitem>
<para>
The breadth of testing an OAuth system is well beyond the scope of this
documentation, but at minimum, negative testing should be considered
mandatory. It's trivial to design a module that lets authorized users in;
the whole point of the system is to keep unauthorized users out.
</para>
</listitem>
</varlistentry>
<varlistentry>
<term>Documentation</term>
<listitem>
<para>
Validator implementations should document the contents and format of the
authenticated ID that is reported to the server for each end user, since
DBAs may need to use this information to construct pg_ident maps. (For
instance, is it an email address? an organizational ID number? a UUID?)
They should also document whether or not it is safe to use the module in
<symbol>delegate_ident_mapping=1</symbol> mode, and what additional
configuration is required in order to do so.
</para>
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2 id="oauth-validator-design-usermap-delegation">
<title>Authorizing Users (Usermap Delegation)</title>
<para>
The standard deliverable of a validation module is the user identifier,
which the server will then compare to any configured
<link linkend="auth-username-maps"><filename>pg_ident.conf</filename>
mappings</link> and determine whether the end user is authorized to connect.
However, OAuth is itself an authorization framework, and tokens may carry
information about user privileges. For example, a token may be associated
with the organizational groups that a user belongs to, or list the roles
that a user may assume, and duplicating that knowledge into local usermaps
for every server may not be desirable.
</para>
<para>
To bypass username mapping entirely, and have the validator module assume
the additional responsibility of authorizing user connections, the HBA may
be configured with <xref linkend="auth-oauth-delegate-ident-mapping"/>.
The module may then use token scopes or an equivalent method to decide
whether the user is allowed to connect under their desired role. The user
identifier will still be recorded by the server, but it plays no part in
determining whether to continue the connection.
</para>
<para>
Using this scheme, authentication itself is optional. As long as the module
reports that the connection is authorized, login will continue even if there
is no recorded user identifier at all. This makes it possible to implement
anonymous or pseudonymous access to the database, where the third-party
provider performs all necessary authentication but does not provide any
user-identifying information to the server. (Some providers may create an
anonymized ID number that can be recorded instead, for later auditing.)
</para>
<para>
Usermap delegation provides the most architectural flexibility, but it turns
the validator module into a single point of failure for connection
authorization. Use with caution.
</para>
</sect2>
</sect1>
<sect1 id="oauth-validator-init">
<title>Initialization Functions</title>
<indexterm zone="oauth-validator-init">
<primary>_PG_oauth_validator_module_init</primary>
</indexterm>
<para>
OAuth validator modules are dynamically loaded from the shared
libraries listed in <xref linkend="guc-oauth-validator-libraries"/>.
Modules are loaded on demand when requested from a login in progress.
The normal library search path is used to locate the library. To
provide the validator callbacks and to indicate that the library is an OAuth
validator module a function named
<function>_PG_oauth_validator_module_init</function> must be provided. The
return value of the function must be a pointer to a struct of type
<structname>OAuthValidatorCallbacks</structname>, which contains a magic
number and pointers to the module's token validation functions. The returned
pointer must be of server lifetime, which is typically achieved by defining
it as a <literal>static const</literal> variable in global scope.
<programlisting>
typedef struct OAuthValidatorCallbacks
{
uint32 magic; /* must be set to PG_OAUTH_VALIDATOR_MAGIC */
ValidatorStartupCB startup_cb;
ValidatorShutdownCB shutdown_cb;
ValidatorValidateCB validate_cb;
} OAuthValidatorCallbacks;
typedef const OAuthValidatorCallbacks *(*OAuthValidatorModuleInit) (void);
</programlisting>
Only the <function>validate_cb</function> callback is required, the others
are optional.
</para>
</sect1>
<sect1 id="oauth-validator-callbacks">
<title>OAuth Validator Callbacks</title>
<para>
OAuth validator modules implement their functionality by defining a set of
callbacks. The server will call them as required to process the
authentication request from the user.
</para>
<sect2 id="oauth-validator-callback-startup">
<title>Startup Callback</title>
<para>
The <function>startup_cb</function> callback is executed directly after
loading the module. This callback can be used to set up local state and
perform additional initialization if required. If the validator module
has state it can use <structfield>state->private_data</structfield> to
store it.
<programlisting>
typedef void (*ValidatorStartupCB) (ValidatorModuleState *state);
</programlisting>
</para>
</sect2>
<sect2 id="oauth-validator-callback-validate">
<title>Validate Callback</title>
<para>
The <function>validate_cb</function> callback is executed during the OAuth
exchange when a user attempts to authenticate using OAuth. Any state set in
previous calls will be available in <structfield>state->private_data</structfield>.
<programlisting>
typedef bool (*ValidatorValidateCB) (const ValidatorModuleState *state,
const char *token, const char *role,
ValidatorModuleResult *result);
</programlisting>
<replaceable>token</replaceable> will contain the bearer token to validate.
<application>PostgreSQL</application> has ensured that the token is well-formed syntactically, but no
other validation has been performed. <replaceable>role</replaceable> will
contain the role the user has requested to log in as. The callback must
set output parameters in the <literal>result</literal> struct, which is
defined as below:
<programlisting>
typedef struct ValidatorModuleResult
{
bool authorized;
char *authn_id;
} ValidatorModuleResult;
</programlisting>
The connection will only proceed if the module sets
<structfield>result->authorized</structfield> to <literal>true</literal>. To
authenticate the user, the authenticated user name (as determined using the
token) shall be palloc'd and returned in the <structfield>result->authn_id</structfield>
field. Alternatively, <structfield>result->authn_id</structfield> may be set to
NULL if the token is valid but the associated user identity cannot be
determined.
</para>
<para>
A validator may return <literal>false</literal> to signal an internal error,
in which case any result parameters are ignored and the connection fails.
Otherwise the validator should return <literal>true</literal> to indicate
that it has processed the token and made an authorization decision.
</para>
<para>
The behavior after <function>validate_cb</function> returns depends on the
specific HBA setup. Normally, the <structfield>result->authn_id</structfield> user
name must exactly match the role that the user is logging in as. (This
behavior may be modified with a usermap.) But when authenticating against
an HBA rule with <literal>delegate_ident_mapping</literal> turned on,
<productname>PostgreSQL</productname> will not perform any checks on the value of
<structfield>result->authn_id</structfield> at all; in this case it is up to the
validator to ensure that the token carries enough privileges for the user to
log in under the indicated <replaceable>role</replaceable>.
</para>
</sect2>
<sect2 id="oauth-validator-callback-shutdown">
<title>Shutdown Callback</title>
<para>
The <function>shutdown_cb</function> callback is executed when the backend
process associated with the connection exits. If the validator module has
any allocated state, this callback should free it to avoid resource leaks.
<programlisting>
typedef void (*ValidatorShutdownCB) (ValidatorModuleState *state);
</programlisting>
</para>
</sect2>
</sect1>
</chapter>