mirror of
https://github.com/postgres/postgres.git
synced 2025-05-02 11:44:50 +03:00
1036 lines
37 KiB
Plaintext
<!-- doc/src/sgml/sources.sgml -->
|
|
|
|
<chapter id="source">
|
|
<title>PostgreSQL Coding Conventions</title>
|
|
|
|
<sect1 id="source-format">
|
|
<title>Formatting</title>
|
|
|
|
<para>
|
|
Source code formatting uses 4-column tab spacing, with
tabs preserved (i.e., tabs are not expanded to spaces).
Each logical indentation level is one additional tab stop.
|
|
</para>
|
|
|
|
<para>
|
|
Layout rules (brace positioning, etc.) follow BSD conventions. In
|
|
particular, curly braces for the controlled blocks of <literal>if</literal>,
|
|
<literal>while</literal>, <literal>switch</literal>, etc. go on their own lines.
|
|
</para>
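<para>
As a minimal sketch of this layout, the following function places each
brace of a controlled block on its own line and adds one tab per nesting
level. The function <function>count_positive</function> is hypothetical and
exists only to illustrate the layout.
</para>

```c
#include <stddef.h>

/*
 * count_positive
 *		Hypothetical helper, shown only to illustrate brace placement
 *		and tab indentation.
 */
int
count_positive(const int *values, size_t nvalues)
{
	int			count = 0;
	size_t		i;

	for (i = 0; i < nvalues; i++)
	{
		if (values[i] > 0)
		{
			count++;
		}
	}

	return count;
}
```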
|
|
|
|
<para>
|
|
Limit line lengths so that the code is readable in an 80-column window.
|
|
(This doesn't mean that you must never go past 80 columns. For instance,
|
|
breaking a long error message string in arbitrary places just to keep the
|
|
code within 80 columns is probably not a net gain in readability.)
|
|
</para>
|
|
|
|
<para>
|
|
To maintain a consistent coding style, do not use C++ style comments
|
|
(<literal>//</literal> comments). <application>pgindent</application>
|
|
will replace them with <literal>/* ... */</literal>.
|
|
</para>
|
|
|
|
<para>
|
|
The preferred style for multi-line comment blocks is
|
|
<programlisting>
/*
 * comment text begins here
 * and continues here
 */
</programlisting>
|
|
Note that comment blocks that begin in column 1 will be preserved as-is
|
|
by <application>pgindent</application>, but it will re-flow indented comment blocks
|
|
as though they were plain text. If you want to preserve the line breaks
|
|
in an indented block, add dashes like this:
|
|
<programlisting>
    /*----------
     * comment text begins here
     * and continues here
     *----------
     */
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
While submitted patches do not absolutely have to follow these formatting
|
|
rules, it's a good idea to do so. Your code will get run through
|
|
<application>pgindent</application> before the next release, so there's no point in
|
|
making it look nice under some other set of formatting conventions.
|
|
A good rule of thumb for patches is <quote>make the new code look like
the existing code around it</quote>.
|
|
</para>
|
|
|
|
<para>
|
|
The <filename>src/tools/editors</filename> directory contains sample settings
|
|
files that can be used with the <productname>Emacs</productname>,
|
|
<productname>xemacs</productname> or <productname>vim</productname>
|
|
editors to help ensure that they format code according to these
|
|
conventions.
|
|
</para>
|
|
|
|
<para>
|
|
If you'd like to run <application>pgindent</application> locally
|
|
to help make your code match project style, see
|
|
the <filename>src/tools/pgindent</filename> directory.
|
|
</para>
|
|
|
|
<para>
|
|
The text browsing tools <application>more</application> and
|
|
<application>less</application> can be invoked as:
|
|
<programlisting>
more -x4
less -x4
</programlisting>
|
|
to make them show tabs appropriately.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="error-message-reporting">
|
|
<title>Reporting Errors Within the Server</title>
|
|
|
|
<indexterm>
|
|
<primary>ereport</primary>
|
|
</indexterm>
|
|
<indexterm>
|
|
<primary>elog</primary>
|
|
</indexterm>
|
|
|
|
<para>
|
|
Error, warning, and log messages generated within the server code
should be created using <function>ereport</function>, or its older cousin
<function>elog</function>. Using <function>ereport</function> is complex enough to
require some explanation.
|
|
</para>
|
|
|
|
<para>
|
|
There are two required elements for every message: a severity level
|
|
(ranging from <literal>DEBUG</literal> to <literal>PANIC</literal>) and a primary
|
|
message text. In addition there are optional elements, the most
|
|
common of which is an error identifier code that follows the SQL spec's
|
|
SQLSTATE conventions.
|
|
<function>ereport</function> itself is just a shell macro that exists
|
|
mainly for the syntactic convenience of making message generation
|
|
look like a single function call in the C source code. The only parameter
|
|
accepted directly by <function>ereport</function> is the severity level.
|
|
The primary message text and any optional message elements are
|
|
generated by calling auxiliary functions, such as <function>errmsg</function>,
|
|
within the <function>ereport</function> call.
|
|
</para>
|
|
|
|
<para>
|
|
A typical call to <function>ereport</function> might look like this:
|
|
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_DIVISION_BY_ZERO),
        errmsg("division by zero"));
</programlisting>
|
|
This specifies error severity level <literal>ERROR</literal> (a run-of-the-mill
|
|
error). The <function>errcode</function> call specifies the SQLSTATE error code
|
|
using a macro defined in <filename>src/include/utils/errcodes.h</filename>. The
|
|
<function>errmsg</function> call provides the primary message text.
|
|
</para>
|
|
|
|
<para>
|
|
You will also frequently see this older style, with an extra set of
|
|
parentheses surrounding the auxiliary function calls:
|
|
<programlisting>
ereport(ERROR,
        (errcode(ERRCODE_DIVISION_BY_ZERO),
         errmsg("division by zero")));
</programlisting>
|
|
The extra parentheses were required
|
|
before <productname>PostgreSQL</productname> version 12, but are now
|
|
optional.
|
|
</para>
|
|
|
|
<para>
|
|
Here is a more complex example:
|
|
<programlisting>
ereport(ERROR,
        errcode(ERRCODE_AMBIGUOUS_FUNCTION),
        errmsg("function %s is not unique",
               func_signature_string(funcname, nargs,
                                     NIL, actual_arg_types)),
        errhint("Unable to choose a best candidate function. "
                "You might need to add explicit typecasts."));
</programlisting>
|
|
This illustrates the use of format codes to embed run-time values into
|
|
a message text. Also, an optional <quote>hint</quote> message is provided.
|
|
The auxiliary function calls can be written in any order, but
|
|
conventionally <function>errcode</function>
|
|
and <function>errmsg</function> appear first.
|
|
</para>
|
|
|
|
<para>
|
|
If the severity level is <literal>ERROR</literal> or higher,
|
|
<function>ereport</function> aborts execution of the current query
|
|
and does not return to the caller. If the severity level is
|
|
lower than <literal>ERROR</literal>, <function>ereport</function> returns normally.
|
|
</para>
|
|
|
|
<para>
|
|
The available auxiliary routines for <function>ereport</function> are:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
<function>errcode(sqlerrcode)</function> specifies the SQLSTATE error identifier
|
|
code for the condition. If this routine is not called, the error
|
|
identifier defaults to
|
|
<literal>ERRCODE_INTERNAL_ERROR</literal> when the error severity level is
|
|
<literal>ERROR</literal> or higher, <literal>ERRCODE_WARNING</literal> when the
|
|
error level is <literal>WARNING</literal>, otherwise (for <literal>NOTICE</literal>
|
|
and below) <literal>ERRCODE_SUCCESSFUL_COMPLETION</literal>.
|
|
While these defaults are often convenient, always consider whether they
are appropriate before omitting the <function>errcode()</function> call.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errmsg(const char *msg, ...)</function> specifies the primary error
|
|
message text, and possibly run-time values to insert into it. Insertions
|
|
are specified by <function>sprintf</function>-style format codes. In addition to
|
|
the standard format codes accepted by <function>sprintf</function>, the format
|
|
code <literal>%m</literal> can be used to insert the error message returned
|
|
by <function>strerror</function> for the current value of <literal>errno</literal>.
|
|
<footnote>
|
|
<para>
|
|
That is, the value that was current when the <function>ereport</function> call
|
|
was reached; changes of <literal>errno</literal> within the auxiliary reporting
|
|
routines will not affect it. That would not be true if you were to
|
|
write <literal>strerror(errno)</literal> explicitly in <function>errmsg</function>'s
|
|
parameter list; accordingly, do not do so.
|
|
</para>
|
|
</footnote>
|
|
<literal>%m</literal> does not require any
|
|
corresponding entry in the parameter list for <function>errmsg</function>.
|
|
Note that the message string will be run through <function>gettext</function>
|
|
for possible localization before format codes are processed.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errmsg_internal(const char *msg, ...)</function> is the same as
|
|
<function>errmsg</function>, except that the message string will not be
|
|
translated nor included in the internationalization message dictionary.
|
|
This should be used for <quote>cannot happen</quote> cases that are probably
|
|
not worth expending translation effort on.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errmsg_plural(const char *fmt_singular, const char *fmt_plural,
unsigned long n, ...)</function> is like <function>errmsg</function>, but with
|
|
support for various plural forms of the message.
|
|
<replaceable>fmt_singular</replaceable> is the English singular format,
|
|
<replaceable>fmt_plural</replaceable> is the English plural format,
|
|
<replaceable>n</replaceable> is the integer value that determines which plural
|
|
form is needed, and the remaining arguments are formatted according
|
|
to the selected format string. For more information see
|
|
<xref linkend="nls-guidelines"/>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdetail(const char *msg, ...)</function> supplies an optional
|
|
<quote>detail</quote> message; this is to be used when there is additional
|
|
information that seems inappropriate to put in the primary message.
|
|
The message string is processed in just the same way as for
|
|
<function>errmsg</function>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdetail_internal(const char *msg, ...)</function> is the same
|
|
as <function>errdetail</function>, except that the message string will not be
|
|
translated nor included in the internationalization message dictionary.
|
|
This should be used for detail messages that are not worth expending
|
|
translation effort on, for instance because they are too technical to be
|
|
useful to most users.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdetail_plural(const char *fmt_singular, const char *fmt_plural,
unsigned long n, ...)</function> is like <function>errdetail</function>, but with
|
|
support for various plural forms of the message.
|
|
For more information see <xref linkend="nls-guidelines"/>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdetail_log(const char *msg, ...)</function> is the same as
|
|
<function>errdetail</function> except that this string goes only to the server
|
|
log, never to the client. If both <function>errdetail</function> (or one of
|
|
its equivalents above) and
|
|
<function>errdetail_log</function> are used then one string goes to the client
|
|
and the other to the log. This is useful for error details that are
|
|
too security-sensitive or too bulky to include in the report
|
|
sent to the client.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdetail_log_plural(const char *fmt_singular, const char
*fmt_plural, unsigned long n, ...)</function> is like
|
|
<function>errdetail_log</function>, but with support for various plural forms of
|
|
the message.
|
|
For more information see <xref linkend="nls-guidelines"/>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errhint(const char *msg, ...)</function> supplies an optional
|
|
<quote>hint</quote> message; this is to be used when offering suggestions
|
|
about how to fix the problem, as opposed to factual details about
|
|
what went wrong.
|
|
The message string is processed in just the same way as for
|
|
<function>errmsg</function>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errhint_plural(const char *fmt_singular, const char *fmt_plural,
unsigned long n, ...)</function> is like <function>errhint</function>, but with
|
|
support for various plural forms of the message.
|
|
For more information see <xref linkend="nls-guidelines"/>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errcontext(const char *msg, ...)</function> is not normally called
|
|
directly from an <function>ereport</function> message site; rather it is used
|
|
in <literal>error_context_stack</literal> callback functions to provide
|
|
information about the context in which an error occurred, such as the
|
|
current location in a PL function.
|
|
The message string is processed in just the same way as for
|
|
<function>errmsg</function>. Unlike the other auxiliary functions, this can
|
|
be called more than once per <function>ereport</function> call; the successive
|
|
strings thus supplied are concatenated with separating newlines.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errposition(int cursorpos)</function> specifies the textual location
|
|
of an error within a query string. Currently it is only useful for
|
|
errors detected in the lexical and syntactic analysis phases of
|
|
query processing.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errtable(Relation rel)</function> specifies a relation whose
|
|
name and schema name should be included as auxiliary fields in the error
|
|
report.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errtablecol(Relation rel, int attnum)</function> specifies
|
|
a column whose name, table name, and schema name should be included as
|
|
auxiliary fields in the error report.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errtableconstraint(Relation rel, const char *conname)</function>
|
|
specifies a table constraint whose name, table name, and schema name
|
|
should be included as auxiliary fields in the error report. Indexes
|
|
should be considered to be constraints for this purpose, whether or
|
|
not they have an associated <structname>pg_constraint</structname> entry. Be
|
|
careful to pass the underlying heap relation, not the index itself, as
|
|
<literal>rel</literal>.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdatatype(Oid datatypeOid)</function> specifies a data
|
|
type whose name and schema name should be included as auxiliary fields
|
|
in the error report.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errdomainconstraint(Oid datatypeOid, const char *conname)</function>
|
|
specifies a domain constraint whose name, domain name, and schema name
|
|
should be included as auxiliary fields in the error report.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errcode_for_file_access()</function> is a convenience function that
|
|
selects an appropriate SQLSTATE error identifier for a failure in a
|
|
file-access-related system call. It uses the saved
|
|
<literal>errno</literal> to determine which error code to generate.
|
|
Usually this should be used in combination with <literal>%m</literal> in the
|
|
primary error message text.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errcode_for_socket_access()</function> is a convenience function that
|
|
selects an appropriate SQLSTATE error identifier for a failure in a
|
|
socket-related system call.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errhidestmt(bool hide_stmt)</function> can be called to specify
|
|
suppression of the <literal>STATEMENT:</literal> portion of a message in the
|
|
postmaster log. Generally this is appropriate if the message text
|
|
includes the current statement already.
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
<function>errhidecontext(bool hide_ctx)</function> can be called to
|
|
specify suppression of the <literal>CONTEXT:</literal> portion of a message in
|
|
the postmaster log. This should only be used for verbose debugging
|
|
messages where the repeated inclusion of context would bloat the log
|
|
too much.
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
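<para>
The selection step performed by the plural-aware routines listed above can
be approximated in plain C. The sketch below is an English-only stand-in,
not the server's implementation: <function>format_row_count</function> and
the message strings used with it are hypothetical, and a real translation
catalog may define more than two plural forms. Note that both format
strings contain the number, so the same argument list works for either.
</para>

```c
#include <stdio.h>

/*
 * format_row_count (hypothetical)
 *		English-only stand-in for plural selection: pick the singular
 *		format when n is 1 and the plural format otherwise, then format
 *		n with the chosen string.
 */
void
format_row_count(char *buf, size_t buflen,
				 const char *fmt_singular, const char *fmt_plural,
				 unsigned long n)
{
	const char *fmt = (n == 1) ? fmt_singular : fmt_plural;

	snprintf(buf, buflen, fmt, n);
}
```

<para>
For instance, calling it with the formats <literal>"deleted %lu row"</literal>
and <literal>"deleted %lu rows"</literal> selects the first string only when
<replaceable>n</replaceable> is 1.
</para>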
|
|
|
|
<note>
|
|
<para>
|
|
At most one of the functions <function>errtable</function>,
|
|
<function>errtablecol</function>, <function>errtableconstraint</function>,
|
|
<function>errdatatype</function>, or <function>errdomainconstraint</function> should
|
|
be used in an <function>ereport</function> call. These functions exist to
|
|
allow applications to extract the name of a database object associated
|
|
with the error condition without having to examine the
|
|
potentially-localized error message text.
|
|
These functions should be used in error reports for which it's likely
|
|
that applications would wish to have automatic error handling. As of
|
|
<productname>PostgreSQL</productname> 9.3, complete coverage exists only for
|
|
errors in SQLSTATE class 23 (integrity constraint violation), but this
is likely to be expanded in the future.
|
|
</para>
|
|
</note>
|
|
|
|
<para>
|
|
There is an older function <function>elog</function> that is still heavily used.
|
|
An <function>elog</function> call:
|
|
<programlisting>
elog(level, "format string", ...);
</programlisting>
is exactly equivalent to:
<programlisting>
ereport(level, errmsg_internal("format string", ...));
</programlisting>
|
|
Notice that the SQLSTATE error code is always defaulted, and the message
|
|
string is not subject to translation.
|
|
Therefore, <function>elog</function> should be used only for internal errors and
|
|
low-level debug logging. Any message that is likely to be of interest to
|
|
ordinary users should go through <function>ereport</function>. Nonetheless,
|
|
there are enough internal <quote>cannot happen</quote> error checks in the
|
|
system that <function>elog</function> is still widely used; it is preferred for
|
|
those messages for its notational simplicity.
|
|
</para>
|
|
|
|
<para>
|
|
Advice about writing good error messages can be found in
|
|
<xref linkend="error-style-guide"/>.
|
|
</para>
|
|
</sect1>
|
|
|
|
<sect1 id="error-style-guide">
|
|
<title>Error Message Style Guide</title>
|
|
|
|
<para>
|
|
This style guide is offered in the hope of maintaining a consistent,
|
|
user-friendly style throughout all the messages generated by
|
|
<productname>PostgreSQL</productname>.
|
|
</para>
|
|
|
|
<simplesect>
|
|
<title>What Goes Where</title>
|
|
|
|
<para>
|
|
The primary message should be short, factual, and avoid reference to
|
|
implementation details such as specific function names.
|
|
<quote>Short</quote> means <quote>should fit on one line under normal
|
|
conditions</quote>. Use a detail message if needed to keep the primary
|
|
message short, or if you feel a need to mention implementation details
|
|
such as the particular system call that failed. Both primary and detail
|
|
messages should be factual. Use a hint message for suggestions about what
|
|
to do to fix the problem, especially if the suggestion might not always be
|
|
applicable.
|
|
</para>
|
|
|
|
<para>
|
|
For example, instead of:
|
|
<programlisting>
IpcMemoryCreate: shmget(key=%d, size=%u, 0%o) failed: %m
(plus a long addendum that is basically a hint)
</programlisting>
write:
<programlisting>
Primary:    could not create shared memory segment: %m
Detail:     Failed syscall was shmget(key=%d, size=%u, 0%o).
Hint:       the addendum
</programlisting>
|
|
</para>
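<para>
The three-way split can be sketched outside the server as follows. This is
an illustration under stated assumptions, not backend code: the
<literal>MessageParts</literal> structure and
<function>build_shmget_failure</function> are hypothetical, the reason
string stands in for <literal>%m</literal>, and the hint text is supplied
by the caller.
</para>

```c
#include <stdio.h>

/* Hypothetical container for the three message parts; not a PostgreSQL type. */
typedef struct MessageParts
{
	char		primary[128];
	char		detail[128];
	char		hint[128];
} MessageParts;

/*
 * build_shmget_failure (hypothetical)
 *		Keep the primary message short and factual, move the system call
 *		and its arguments into the detail, and keep advice in the hint.
 */
void
build_shmget_failure(MessageParts *parts, int key, unsigned int size,
					 unsigned int flags, const char *reason,
					 const char *hint)
{
	snprintf(parts->primary, sizeof(parts->primary),
			 "could not create shared memory segment: %s", reason);
	snprintf(parts->detail, sizeof(parts->detail),
			 "Failed syscall was shmget(key=%d, size=%u, 0%o).",
			 key, size, flags);
	snprintf(parts->hint, sizeof(parts->hint), "%s", hint);
}
```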
|
|
|
|
<para>
|
|
Rationale: keeping the primary message short helps keep it to the point,
|
|
and lets clients lay out screen space on the assumption that one line is
|
|
enough for error messages. Detail and hint messages can be relegated to a
|
|
verbose mode, or perhaps a pop-up error-details window. Also, details and
|
|
hints would normally be suppressed from the server log to save
|
|
space. Reference to implementation details is best avoided since users
|
|
aren't expected to know the details.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Formatting</title>
|
|
|
|
<para>
|
|
Don't put any specific assumptions about formatting into the message
|
|
texts. Expect clients and the server log to wrap lines to fit their own
|
|
needs. In long messages, newline characters (\n) can be used to indicate
|
|
suggested paragraph breaks. Don't end a message with a newline. Don't
|
|
use tabs or other formatting characters. (In error context displays,
|
|
newlines are automatically added to separate levels of context such as
|
|
function calls.)
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Messages are not necessarily displayed on terminal-type
|
|
displays. In GUI displays or browsers these formatting instructions are
|
|
at best ignored.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Quotation Marks</title>
|
|
|
|
<para>
|
|
English text should use double quotes when quoting is appropriate.
Text in other languages should use one kind of quotation mark throughout,
chosen to match publishing customs and the computer output of other programs.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: The choice of double quotes over single quotes is somewhat
|
|
arbitrary, but tends to be the preferred use. Some have suggested
|
|
choosing the kind of quotes depending on the type of object according to
|
|
SQL conventions (namely, strings single quoted, identifiers double
|
|
quoted). But this is a language-internal technical issue that many users
|
|
aren't even familiar with, it won't scale to other kinds of quoted terms,
|
|
it doesn't translate to other languages, and it's pretty pointless, too.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Use of Quotes</title>
|
|
|
|
<para>
|
|
Always use quotes to delimit file names, user-supplied identifiers, and
|
|
other variables that might contain words. Do not use them to mark up
|
|
variables that will not contain words (for example, operator names).
|
|
</para>
|
|
|
|
<para>
|
|
There are functions in the backend that will double-quote their own output
|
|
as needed (for example, <function>format_type_be()</function>). Do not put
|
|
additional quotes around the output of such functions.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Objects can have names that create ambiguity when embedded in a
|
|
message. Be consistent about denoting where a plugged-in name starts and
|
|
ends. But don't clutter messages with unnecessary or duplicate quote
|
|
marks.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Grammar and Punctuation</title>
|
|
|
|
<para>
|
|
The rules are different for primary error messages and for detail/hint
|
|
messages:
|
|
</para>
|
|
|
|
<para>
|
|
Primary error messages: Do not capitalize the first letter. Do not end a
|
|
message with a period. Do not even think about ending a message with an
|
|
exclamation point.
|
|
</para>
|
|
|
|
<para>
|
|
Detail and hint messages: Use complete sentences, and end each with
|
|
a period. Capitalize the first word of sentences. Put two spaces after
|
|
the period if another sentence follows (for English text; might be
|
|
inappropriate in other languages).
|
|
</para>
|
|
|
|
<para>
|
|
Error context strings: Do not capitalize the first letter and do
|
|
not end the string with a period. Context strings should normally
|
|
not be complete sentences.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Avoiding punctuation makes it easier for client applications to
|
|
embed the message into a variety of grammatical contexts. Often, primary
|
|
messages are not grammatically complete sentences anyway. (And if they're
|
|
long enough to be more than one sentence, they should be split into
|
|
primary and detail parts.) However, detail and hint messages are longer
|
|
and might need to include multiple sentences. For consistency, they should
|
|
follow complete-sentence style even when there's only one sentence.
|
|
</para>
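<para>
A few of the rules for primary messages are mechanical enough to check in
code. The following sketch assumes a hypothetical helper,
<function>primary_message_style_ok</function>, which is not part of
<productname>PostgreSQL</productname>. It covers only the first-letter and
trailing-punctuation rules, and it does not check detail, hint, or context
strings.
</para>

```c
#include <ctype.h>
#include <stdbool.h>
#include <string.h>

/*
 * primary_message_style_ok (hypothetical)
 *		Returns true if msg passes three mechanical checks for a primary
 *		message: no capitalized first word, no trailing period, and no
 *		trailing exclamation point.
 */
bool
primary_message_style_ok(const char *msg)
{
	size_t		len = strlen(msg);

	if (len == 0)
		return false;

	/*
	 * Reject "Could not ..." but allow an upper-case key word such as
	 * "CREATE" in first position: flag an upper-case first letter only
	 * when the next character is lower case.
	 */
	if (isupper((unsigned char) msg[0]) &&
		len > 1 && islower((unsigned char) msg[1]))
		return false;

	if (msg[len - 1] == '.' || msg[len - 1] == '!')
		return false;

	return true;
}
```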
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Upper Case vs. Lower Case</title>
|
|
|
|
<para>
|
|
Use lower case for message wording, including the first letter of a
|
|
primary error message. Use upper case for SQL commands and key words if
|
|
they appear in the message.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: It's easier to make everything look more consistent this
|
|
way, since some messages are complete sentences and some not.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Avoid Passive Voice</title>
|
|
|
|
<para>
|
|
Use the active voice. Use complete sentences when there is an acting
|
|
subject (<quote>A could not do B</quote>). Use telegram style without
|
|
subject if the subject would be the program itself; do not use
|
|
<quote>I</quote> for the program.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: The program is not human. Don't pretend otherwise.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Present vs. Past Tense</title>
|
|
|
|
<para>
|
|
Use past tense if an attempt to do something failed, but could perhaps
|
|
succeed next time (perhaps after fixing some problem). Use present tense
|
|
if the failure is certainly permanent.
|
|
</para>
|
|
|
|
<para>
|
|
There is a nontrivial semantic difference between sentences of the form:
|
|
<programlisting>
could not open file "%s": %m
</programlisting>
and:
<programlisting>
cannot open file "%s"
</programlisting>
|
|
The first one means that the attempt to open the file failed. The
|
|
message should give a reason, such as <quote>disk full</quote> or
|
|
<quote>file doesn't exist</quote>. The past tense is appropriate because
|
|
next time the disk might not be full anymore or the file in question might
|
|
exist.
|
|
</para>
|
|
|
|
<para>
|
|
The second form indicates that the functionality of opening the named file
|
|
does not exist at all in the program, or that it's conceptually
|
|
impossible. The present tense is appropriate because the condition will
|
|
persist indefinitely.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Granted, the average user will not be able to draw great
|
|
conclusions merely from the tense of the message, but since the language
|
|
provides us with a grammar we should use it correctly.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Type of the Object</title>
|
|
|
|
<para>
|
|
When citing the name of an object, state what kind of object it is.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Otherwise no one will know what <quote>foo.bar.baz</quote>
|
|
refers to.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Brackets</title>
|
|
|
|
<para>
|
|
Square brackets are only to be used (1) in command synopses to denote
|
|
optional arguments, or (2) to denote an array subscript.
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Anything else does not correspond to widely known customary
usage and will confuse people.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Assembling Error Messages</title>
|
|
|
|
<para>
|
|
When a message includes text that is generated elsewhere, embed it in
|
|
this style:
|
|
<programlisting>
could not open file %s: %m
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: It would be difficult to account for all possible error codes
|
|
to paste this into a single smooth sentence, so some sort of punctuation
|
|
is needed. Putting the embedded text in parentheses has also been
|
|
suggested, but it's unnatural if the embedded text is likely to be the
|
|
most important part of the message, as is often the case.
|
|
</para>
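<para>
The <quote>what was attempted, then a colon, then the reason</quote> layout
can be sketched in plain C, with <function>strerror</function> standing in
for <literal>%m</literal>. Both functions below are hypothetical
illustrations rather than backend code. The sketch saves
<literal>errno</literal> immediately after the failed call so that later
calls cannot overwrite it before the message is built.
</para>

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * format_open_failure (hypothetical)
 *		Build "could not open file "NAME": REASON" from an errno value
 *		that the caller captured right after the failed call.
 */
void
format_open_failure(char *buf, size_t buflen, const char *path,
					int saved_errno)
{
	snprintf(buf, buflen, "could not open file \"%s\": %s",
			 path, strerror(saved_errno));
}

/*
 * try_open (hypothetical)
 *		Returns 0 on success; on failure fills buf and returns -1.
 */
int
try_open(const char *path, char *buf, size_t buflen)
{
	FILE	   *fp = fopen(path, "r");
	int			saved_errno;

	if (fp == NULL)
	{
		/* Capture errno before any other call can overwrite it. */
		saved_errno = errno;
		format_open_failure(buf, buflen, path, saved_errno);
		return -1;
	}

	fclose(fp);
	return 0;
}
```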
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Reasons for Errors</title>
|
|
|
|
<para>
|
|
Messages should always state the reason why an error occurred.
|
|
For example:
|
|
<programlisting>
BAD:    could not open file %s
BETTER: could not open file %s (I/O failure)
</programlisting>
If no reason is known, you had better fix the code.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Function Names</title>
|
|
|
|
<para>
|
|
Don't include the name of the reporting routine in the error text. We have
|
|
other mechanisms for finding that out when needed, and for most users it's
|
|
not helpful information. If the error text doesn't make as much sense
|
|
without the function name, reword it.
|
|
<programlisting>
BAD:    pg_strtoint32: error in "z": cannot parse "z"
BETTER: invalid input syntax for type integer: "z"
</programlisting>
|
|
</para>
|
|
|
|
<para>
|
|
Avoid mentioning called function names, either; instead say what the code
|
|
was trying to do:
|
|
<programlisting>
BAD:    open() failed: %m
BETTER: could not open file %s: %m
</programlisting>
|
|
If it really seems necessary, mention the system call in the detail
|
|
message. (In some cases, providing the actual values passed to the
|
|
system call might be appropriate information for the detail message.)
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: Users don't know what all those functions do.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Tricky Words to Avoid</title>
|
|
|
|
<formalpara>
|
|
<title>Unable</title>
|
|
<para>
|
|
<quote>Unable</quote> is nearly the passive voice. It is better to use
<quote>cannot</quote> or <quote>could not</quote>, as appropriate.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>Bad</title>
|
|
<para>
|
|
Error messages like <quote>bad result</quote> are really hard to interpret
|
|
intelligently. It's better to write why the result is <quote>bad</quote>,
|
|
e.g., <quote>invalid format</quote>.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>Illegal</title>
|
|
<para>
|
|
<quote>Illegal</quote> stands for a violation of the law; everything else is
<quote>invalid</quote>. Better yet, say why it is invalid.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>Unknown</title>
|
|
<para>
|
|
Try to avoid <quote>unknown</quote>. Consider <quote>error: unknown
|
|
response</quote>. If you don't know what the response is, how do you know
|
|
it's erroneous? <quote>Unrecognized</quote> is often a better choice.
|
|
Also, be sure to include the value being complained of.
|
|
<programlisting>
BAD:    unknown node type
BETTER: unrecognized node type: 42
</programlisting>
|
|
</para>
|
|
</formalpara>
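<para>
A minimal sketch of the <quote>unrecognized</quote> pattern follows. The
enum <literal>ToyNodeTag</literal> and the function
<function>describe_node</function> are hypothetical and unrelated to the
server's real node types. The point is only that the default branch says
<quote>unrecognized</quote> and reports the offending value.
</para>

```c
#include <stdio.h>

/* Hypothetical tags, unrelated to the server's real node types. */
typedef enum ToyNodeTag
{
	TOY_NODE_SCAN = 1,
	TOY_NODE_JOIN = 2
} ToyNodeTag;

/*
 * describe_node (hypothetical)
 *		Returns a name for a known tag; otherwise writes a message that
 *		includes the unexpected value into buf and returns buf.
 */
const char *
describe_node(int tag, char *buf, size_t buflen)
{
	switch (tag)
	{
		case TOY_NODE_SCAN:
			return "scan";
		case TOY_NODE_JOIN:
			return "join";
		default:
			/* Say "unrecognized", and include the value complained of. */
			snprintf(buf, buflen, "unrecognized node type: %d", tag);
			return buf;
	}
}
```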
|
|
|
|
<formalpara>
|
|
<title>Find vs. Exists</title>
|
|
<para>
|
|
If the program uses a nontrivial algorithm to locate a resource (e.g., a
|
|
path search) and that algorithm fails, it is fair to say that the program
|
|
couldn't <quote>find</quote> the resource. If, on the other hand, the
|
|
expected location of the resource is known but the program cannot access
|
|
it there then say that the resource doesn't <quote>exist</quote>. Using
|
|
<quote>find</quote> in this case sounds weak and confuses the issue.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>May vs. Can vs. Might</title>
|
|
<para>
|
|
<quote>May</quote> suggests permission (e.g., "You may borrow my rake."),
|
|
and has little use in documentation or error messages.
|
|
<quote>Can</quote> suggests ability (e.g., "I can lift that log."),
|
|
and <quote>might</quote> suggests possibility (e.g., "It might rain
|
|
today."). Using the proper word clarifies meaning and assists
|
|
translation.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>Contractions</title>
|
|
<para>
|
|
Avoid contractions, like <quote>can't</quote>; use
|
|
<quote>cannot</quote> instead.
|
|
</para>
|
|
</formalpara>
|
|
|
|
<formalpara>
|
|
<title>Non-negative</title>
|
|
<para>
|
|
Avoid <quote>non-negative</quote> as it is ambiguous
|
|
about whether it accepts zero. It's better to use
|
|
<quote>greater than zero</quote> or
|
|
<quote>greater than or equal to zero</quote>.
|
|
</para>
|
|
</formalpara>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Proper Spelling</title>
|
|
|
|
<para>
|
|
Spell out words in full. For instance, avoid:
|
|
<itemizedlist>
|
|
<listitem>
|
|
<para>
|
|
spec
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
stats
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
parens
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
auth
|
|
</para>
|
|
</listitem>
|
|
<listitem>
|
|
<para>
|
|
xact
|
|
</para>
|
|
</listitem>
|
|
</itemizedlist>
|
|
</para>
|
|
|
|
<para>
|
|
Rationale: This will improve consistency.
|
|
</para>
|
|
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Localization</title>
|
|
|
|
<para>
|
|
Keep in mind that error message texts need to be translated into other
|
|
languages. Follow the guidelines in <xref linkend="nls-guidelines"/>
|
|
to avoid making life difficult for translators.
|
|
</para>
|
|
</simplesect>
|
|
|
|
</sect1>
|
|
|
|
<sect1 id="source-conventions">
|
|
<title>Miscellaneous Coding Conventions</title>
|
|
|
|
<simplesect>
|
|
<title>C Standard</title>
|
|
<para>
|
|
Code in <productname>PostgreSQL</productname> should only rely on language
|
|
features available in the C99 standard. That means a conforming
|
|
C99 compiler has to be able to compile postgres, at least aside
|
|
from a few platform-dependent pieces.
|
|
</para>
|
|
<para>
|
|
A few features included in the C99 standard are, at this time, not
|
|
permitted to be used in core <productname>PostgreSQL</productname>
|
|
code. This currently includes variable length arrays, intermingled
|
|
declarations and code, <literal>//</literal> comments, and universal
|
|
character names. Reasons for that include portability and historical
|
|
practices.
|
|
</para>
|
|
<para>
|
|
Features from later revisions of the C standard or compiler-specific
features can be used if a fallback is provided.
|
|
</para>
|
|
<para>
|
|
For example, <literal>_Static_assert()</literal> and
<literal>__builtin_constant_p</literal> are currently used, even though
they are from a newer revision of the C standard and a
<productname>GCC</productname> extension, respectively. If they are not
available, we fall back, respectively, to a C99-compatible replacement that
performs the same checks but emits rather cryptic messages, and to not
using <literal>__builtin_constant_p</literal>.
|
|
</para>
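<para>
The fallback idea can be sketched as follows. This is an illustration under
stated assumptions, not the definition used in the tree: the macro name
<literal>MY_STATIC_ASSERT_STMT</literal>, the structure
<literal>demo_header</literal>, and the test on
<literal>__STDC_VERSION__</literal> are all hypothetical (a real build would
more likely rely on a configure-time check). The C99 branch depends on a
negative bit-field width being a constraint violation, which is why the
resulting compiler message tends to be cryptic.
</para>

```c
#include <stddef.h>

/*
 * Hypothetical macro: use _Static_assert where the compiler advertises
 * C11 or later, otherwise fall back to a C99-compatible construct that
 * fails to compile when the condition is false.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
#define MY_STATIC_ASSERT_STMT(condition, message) \
    do { _Static_assert(condition, message); } while (0)
#else
#define MY_STATIC_ASSERT_STMT(condition, message) \
    ((void) sizeof(struct { int static_assert_failure : (condition) ? 1 : -1; }))
#endif

/* Hypothetical structure used only to have something to check. */
struct demo_header
{
    int         id;
    char        tag[4];
};

static size_t
demo_header_tag_size(void)
{
    struct demo_header h;

    /* Checked at compile time in both branches; no code is generated. */
    MY_STATIC_ASSERT_STMT(sizeof(h.tag) == 4, "tag must be 4 bytes");
    return sizeof(h.tag);
}
```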
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Function-Like Macros and Inline Functions</title>
|
|
<para>
|
|
Both macros with arguments and <literal>static inline</literal>
functions may be used. The latter are preferable if there are
multiple-evaluation hazards when written as a macro, as is the case,
e.g., with
|
|
<programlisting>
|
|
#define Max(x, y) ((x) > (y) ? (x) : (y))
|
|
</programlisting>
|
|
or when the macro would be very long. In other cases it is only
possible, or at least easier, to use macros, for example because
expressions of various types need to be passed to the macro.
|
|
</para>
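<para>
The multiple-evaluation hazard can be made visible with a small sketch. The
names <literal>max_int()</literal>, <literal>next_value()</literal>,
<literal>calls_with_macro()</literal>, and
<literal>calls_with_inline()</literal> are hypothetical and exist only for
this illustration; the macro is the one quoted above.
</para>

```c
/* The macro from the text: an argument can be evaluated twice. */
#define Max(x, y)       ((x) > (y) ? (x) : (y))

/* Hypothetical inline equivalent: each argument is evaluated exactly once. */
static inline int
max_int(int x, int y)
{
    return x > y ? x : y;
}

static int  calls = 0;

/* Hypothetical helper with a side effect: counts how often it is called. */
static int
next_value(void)
{
    calls++;
    return 10;
}

/* Returns how many times next_value() ran when passed to the macro. */
static int
calls_with_macro(void)
{
    calls = 0;
    (void) Max(next_value(), 5);    /* next_value() appears twice after expansion */
    return calls;
}

/* Returns how many times next_value() ran when passed to the function. */
static int
calls_with_inline(void)
{
    calls = 0;
    (void) max_int(next_value(), 5);
    return calls;
}
```

<para>
With the macro, <literal>next_value()</literal> runs once for the comparison
and once more to produce the result; with the inline function it runs once.
</para>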
|
|
<para>
|
|
When the definition of an inline function references symbols
|
|
(i.e., variables, functions) that are only available as part of the
|
|
backend, the function must not be visible when included from frontend
|
|
code.
|
|
<programlisting>
#ifndef FRONTEND
static inline MemoryContext
MemoryContextSwitchTo(MemoryContext context)
{
    MemoryContext old = CurrentMemoryContext;

    CurrentMemoryContext = context;
    return old;
}
#endif   /* FRONTEND */
</programlisting>
|
|
In this example <literal>CurrentMemoryContext</literal>, which is only
|
|
available in the backend, is referenced and the function is thus
hidden with an <literal>#ifndef FRONTEND</literal>. This rule
|
|
exists because some compilers emit references to symbols
|
|
contained in inline functions even if the function is not used.
|
|
</para>
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Writing Signal Handlers</title>
|
|
<para>
|
|
To be suitable to run inside a signal handler, code has to be
written very carefully. The fundamental problem is that, unless
blocked, a signal handler can interrupt code at any time. If code
inside the signal handler uses the same state as code outside,
chaos may ensue. As an example, consider what happens if a signal
handler tries to acquire a lock that's already held in the
interrupted code.
|
|
</para>
|
|
<para>
|
|
Barring special arrangements, code in signal handlers may only
call async-signal-safe functions (as defined in POSIX) and access
|
|
variables of type <literal>volatile sig_atomic_t</literal>. A few
|
|
functions in <command>postgres</command> are also deemed signal safe, importantly
|
|
<function>SetLatch()</function>.
|
|
</para>
|
|
<para>
|
|
In most cases, signal handlers should do nothing more than note
|
|
that a signal has arrived, and wake up code running outside of
|
|
the handler using a latch. An example of such a handler is the
|
|
following:
|
|
<programlisting>
static void
handle_sighup(SIGNAL_ARGS)
{
    int         save_errno = errno;

    got_SIGHUP = true;
    SetLatch(MyLatch);

    errno = save_errno;
}
</programlisting>
|
|
<varname>errno</varname> is saved and restored because
|
|
<function>SetLatch()</function> might change it. If that were not done,
|
|
interrupted code that's currently inspecting <varname>errno</varname> might see the wrong
|
|
value.
|
|
</para>
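<para>
The same pattern can be tried outside the backend with a standalone sketch.
It assumes only ISO C <function>signal()</function>; the names
<literal>got_signal</literal>, <literal>handle_signal()</literal>,
<literal>install_handler()</literal>, and
<literal>consume_pending_signal()</literal> are hypothetical, and polling a
flag stands in for the latch, which this sketch deliberately omits.
</para>

```c
#include <errno.h>
#include <signal.h>

/* Flag shared with the handler; volatile sig_atomic_t, as the text requires. */
static volatile sig_atomic_t got_signal = 0;

/* The handler only notes that the signal arrived and preserves errno. */
static void
handle_signal(int signo)
{
    int         save_errno = errno;

    (void) signo;
    got_signal = 1;
    /* Backend code would call SetLatch(MyLatch) here; omitted in this sketch. */

    errno = save_errno;
}

/* Install the handler; returns 0 on success, -1 on failure. */
static int
install_handler(int signo)
{
    return (signal(signo, handle_signal) == SIG_ERR) ? -1 : 0;
}

/*
 * Called from ordinary (non-handler) code: report whether a signal was
 * noted since the last call, and reset the flag.
 */
static int
consume_pending_signal(void)
{
    if (got_signal)
    {
        got_signal = 0;
        return 1;
    }
    return 0;
}
```

<para>
All real work happens in the code that calls
<literal>consume_pending_signal()</literal>, outside the handler, so the
handler itself never touches state that the interrupted code might be using.
</para>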
|
|
</simplesect>
|
|
|
|
<simplesect>
|
|
<title>Calling Function Pointers</title>
|
|
|
|
<para>
|
|
For clarity, it is preferred to explicitly dereference a function pointer
|
|
when calling the pointed-to function if the pointer is a simple variable,
|
|
for example:
|
|
<programlisting>
|
|
(*emit_log_hook) (edata);
|
|
</programlisting>
|
|
(even though <literal>emit_log_hook(edata)</literal> would also work).
|
|
When the function pointer is part of a structure, then the extra
|
|
punctuation can and usually should be omitted, for example:
|
|
<programlisting>
|
|
paramInfo->paramFetch(paramInfo, paramId);
|
|
</programlisting>
|
|
</para>
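<para>
Both call styles can be seen side by side in the following sketch. All names
in it (<literal>my_log_hook</literal>, <literal>Fetcher</literal>,
<literal>count_message()</literal>, <literal>fetch_plus_base()</literal>,
<literal>demo_calls()</literal>) are hypothetical; they only mirror the shape
of the examples above.
</para>

```c
#include <stddef.h>

typedef void (*log_hook_type) (const char *message);

/* Hypothetical hook variable, similar in shape to emit_log_hook. */
static log_hook_type my_log_hook = NULL;

/* Hypothetical structure that carries a callback as a member. */
typedef struct Fetcher
{
    int         (*fetch) (struct Fetcher *self, int id);
    int         base;
} Fetcher;

static int  messages_seen = 0;

static void
count_message(const char *message)
{
    (void) message;
    messages_seen++;
}

static int
fetch_plus_base(Fetcher *self, int id)
{
    return self->base + id;
}

static int
demo_calls(void)
{
    Fetcher     f;

    f.fetch = fetch_plus_base;
    f.base = 100;

    my_log_hook = count_message;
    if (my_log_hook)
        (*my_log_hook) ("hello");   /* simple variable: explicit dereference */

    return f.fetch(&f, 7);          /* structure member: no extra punctuation */
}
```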
|
|
</simplesect>
|
|
</sect1>
|
|
</chapter>
|