Use "a" and "an" correctly, mostly in comments. Two error messages were
also fixed (they were just elogs, so no translation work required). Two
function comments in pg_proc.h were also fixed. Etsuro Fujita reported one
of these, but I found a lot more with grep.
Also fix a few other typos spotted while grepping for the a/an typos.
For example, "consists out of ..." -> "consists of ...". Plus a "though"/
"through" mixup reported by Euler Taveira.
Many of these typos were in old code, and backpatching the fixes would make
future backpatching easier. But much of the affected code was new, and I
didn't feel like crafting separate patches for each branch. So no backpatching.
This reverts commit 16304a013432931e61e623c8d85e9fe24709d9ba, except
for its changes in src/port/snprintf.c; as well as commit
cac18a76bb6b08f1ecc2a85e46c9d2ab82dd9d23 which is no longer needed.
Fujii Masao reported that the previous commit caused failures in psql on
OS X, since if one exits the pager program early while viewing a query
result, psql sees an EPIPE error from fprintf --- and the wrapper function
thought that was reason to panic. (It's a bit surprising that the same
does not happen on Linux.) Further discussion among the security list
concluded that the risk of other such failures was far too great, and
that the one-size-fits-all approach to error handling embodied in the
previous patch is unlikely to be workable.
This leaves us again exposed to the possibility of the type of failure
envisioned in CVE-2015-3166. However, that failure mode is strictly
hypothetical at this point: there is no concrete reason to believe that
an attacker could trigger information disclosure through the supposed
mechanism. In the first place, the attack surface is fairly limited,
since so much of what the backend does with format strings goes through
stringinfo.c or psprintf(), and those already had adequate defenses.
In the second place, even granting that an unprivileged attacker could
control the occurrence of ENOMEM with some precision, it's a stretch to
believe that he could induce it just where the target buffer contains some
valuable information. So we concluded that the risk of non-hypothetical
problems induced by the patch greatly outweighs the security risks.
We will therefore revert, and instead undertake closer analysis to
identify specific calls that may need hardening, rather than attempt a
universal solution.
We have kept the portion of the previous patch that improved snprintf.c's
handling of errors when it calls the platform's sprintf(). That seems to
be an unalloyed improvement.
Security: CVE-2015-3166
All known standard library implementations of these functions can fail
with ENOMEM. A caller neglecting to check for failure would experience
missing output, information exposure, or a crash. Check return values
within wrappers and code, currently just snprintf.c, that bypasses the
wrappers. The wrappers do not return after an error, so their callers
need not check. Back-patch to 9.0 (all supported versions).
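As a rough sketch of the wrapper idea (not the actual committed code; the
function name is invented), the point is that the wrapper itself checks the
result and never returns after a failure:

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * Hypothetical sketch: if the underlying vfprintf() reports an error
     * (e.g. due to ENOMEM), abort instead of returning, so callers never
     * need to check the result.  The real wrappers differ in detail.
     */
    static int
    pg_fprintf_checked(FILE *stream, const char *fmt, ...)
    {
        va_list     args;
        int         count;

        va_start(args, fmt);
        count = vfprintf(stream, fmt, args);
        va_end(args);

        if (count < 0)
        {
            fputs("output error\n", stderr);
            abort();            /* do not return after an error */
        }
        return count;
    }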
Popular free software standard library implementations do take pains to
bypass malloc() in simple cases, but they risk ENOMEM for floating point
numbers, positional arguments, large field widths, and large precisions.
No specification demands such caution, so this commit regards every call
to a printf family function as a potential threat.
Injecting the wrappers implicitly is a compromise between patch scope
and design goals. I would prefer to edit each call site to name a
wrapper explicitly. libpq and the ECPG libraries would, ideally, convey
errors to the caller rather than abort(). All that would be painfully
invasive for a back-patched security fix, hence this compromise.
Security: CVE-2015-3166
The "check" target no longer needs to depend on "all", because it now
runs "install" directly, which in turn depends on "all". Doing both
would cause problems with parallel make, because two builds of "all"
would run concurrently and interfere with each other.
Also remove the redirection of the temp-install output into a log file.
That was appropriate when the installation was run from within pg_regress,
but now it is just a regular make run, and with the above changes it
effectively takes the place of running the "all" target before the test
suites.
problem report by Jeff Janes, patch in part by Michael Paquier
This provides a mechanism for specifying conversions between SQL data
types and procedural languages. As examples, there are transforms
for hstore and ltree for PL/Perl and PL/Python.
reviews by Pavel Stěhule and Andres Freund
Each of the libraries incorporates src/port files, which often check
FRONTEND. The build systems disagreed on whether to build libpgtypes with
FRONTEND defined. Only libecpg incorporates files that rely on it today.
Back-patch to 9.0 (all supported versions) to forestall surprises.
Before, make check-world would create a new temporary installation for
each test suite, which is slow and wasteful. Instead, we now create one
test installation that is used by all test suites that are part of a
make run.
The management of the temporary installation is removed from pg_regress
and handled in the makefiles. This allows for better control, and
unifies the code with that of test suites not run through pg_regress.
review and msvc support by Michael Paquier <michael.paquier@gmail.com>
more review by Fabien Coelho <coelho@cri.ensmp.fr>
If someone else already set the callbacks, don't overwrite them with
ours. When unsetting the callbacks, only unset them if they point to
ours.
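The callbacks in question are presumably OpenSSL's pre-1.1.0 thread
callbacks; a minimal sketch of the intended behaviour (not the committed
libpq code, names invented):

    #include <openssl/crypto.h>

    /* our callback; the real one lives in libpq */
    static void
    pq_lockingcallback(int mode, int n, const char *file, int line)
    {
        /* acquire or release lock number n; body omitted in this sketch */
        (void) mode; (void) n; (void) file; (void) line;
    }

    static void
    init_ssl_callbacks(void)
    {
        /* If someone else already set a callback, leave it alone. */
        if (CRYPTO_get_locking_callback() == NULL)
            CRYPTO_set_locking_callback(pq_lockingcallback);
    }

    static void
    destroy_ssl_callbacks(void)
    {
        /* Only unset the callback if it still points to ours. */
        if (CRYPTO_get_locking_callback() == pq_lockingcallback)
            CRYPTO_set_locking_callback(NULL);
    }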
Author: Jan Urbański <wulczer@wulczer.org>
This is the second try at this, after fcef1617295 failed miserably and
had to be reverted: as it turns out, libpq cannot depend on libpgcommon
after all. Instead of shuffling code in the master branch, make that one
just like 9.4 and accept the duplication. (This was all my own mistake,
not the patch submitter's).
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, to find no
server at all because of a completely wrong combination of parameters.
Fix by explicitly checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mixing it with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames", and document that
conninfos are accepted as the first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
psql was already accepting conninfo strings as the first parameter in
\connect, but the way it worked wasn't sane; some of the other
parameters would get the previous connection's values, causing it to
connect to a completely unexpected server or, more likely, to find no
server at all because of a completely wrong combination of parameters.
Fix by explicitly checking for a conninfo-looking parameter in the
dbname position; if one is found, use its complete specification rather
than mixing it with the other arguments. Also, change tab-completion to not
try to complete conninfo/URI-looking "dbnames", and document that
conninfos are accepted as the first argument.
There was a weak consensus to backpatch this, because while the behavior
of using the dbname as a conninfo is nowhere documented for \connect, it
is reasonable to expect that it works because it does work in many other
contexts. Therefore this is backpatched all the way back to 9.0.
To implement this, routines previously private to libpq have been
duplicated so that psql can decide what looks like a conninfo/URI
string. In back branches, just duplicate the same code all the way back
to 9.2, where URIs were introduced; 9.0 and 9.1 have a simpler version.
In master, the routines are moved to src/common and renamed.
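A much-simplified sketch of the kind of test involved (hypothetical code,
not the duplicated routines themselves):

    #include <stdbool.h>
    #include <string.h>

    /*
     * Hypothetical, simplified version of the "does this look like a
     * conninfo or URI rather than a plain database name?" test.  The real
     * code is more careful; this only conveys the idea.
     */
    static bool
    looks_like_connection_string(const char *s)
    {
        /* connection URIs start with one of these prefixes */
        if (strncmp(s, "postgresql://", strlen("postgresql://")) == 0 ||
            strncmp(s, "postgres://", strlen("postgres://")) == 0)
            return true;

        /* keyword=value conninfo strings contain an '=' sign */
        if (strchr(s, '=') != NULL)
            return true;

        return false;
    }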
Author: David Fetter, Andrew Dunstan. Some editorialization by me
(probably earning a Gierth's "Sloppy" badge in the process.)
Reviewers: Andrew Gierth, Erik Rijkers, Pavel Stěhule, Stephen Frost,
Robert Haas, Andrew Dunstan.
This improves on commit bbfd7edae5aa5ad5553d3c7e102f2e450d4380d4 by
making two simple changes:
* pg_attribute_noreturn now takes parentheses, i.e. pg_attribute_noreturn().
Likewise pg_attribute_unused(), pg_attribute_packed(). This reduces
pgindent's tendency to misformat declarations involving them (see the
sketch after this list).
* attributes are now always attached to function declarations, not
definitions. Previously some places were taking creative shortcuts,
which were not merely candidates for bad misformatting by pgindent
but often were outright wrong anyway. (It does little good to put a
noreturn annotation where callers can't see it.) In any case, if
we would like to believe that these macros can be used with non-gcc
compilers, we should avoid gratuitous variance in usage patterns.
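A small sketch of the resulting declaration style (function names invented;
the real macro lives in c.h):

    #include <stdio.h>
    #include <stdlib.h>

    /* stand-in for the c.h definition, shown here only for the sketch */
    #if defined(__GNUC__)
    #define pg_attribute_noreturn() __attribute__((noreturn))
    #else
    #define pg_attribute_noreturn()
    #endif

    /* attribute attached to the declaration, where callers can see it */
    extern void bail_out(const char *msg) pg_attribute_noreturn();

    void
    bail_out(const char *msg)
    {
        fprintf(stderr, "%s\n", msg);
        exit(1);
    }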
I also went through and manually improved the formatting of a lot of
declarations, and got rid of excessively repetitive (and now obsolete
anyway) comments informing the reader what pg_attribute_printf is for.
While the SQL standard is pretty vague on the overall topic of operator
precedence (because it never presents a unified BNF for all expressions),
it does seem reasonable to conclude from the spec for <boolean value
expression> that OR has the lowest precedence, then AND, then NOT, then IS
tests, then the six standard comparison operators, then everything else
(since any non-boolean operator in a WHERE clause would need to be an
argument of one of these).
We were only sort of on board with that: most notably, while "<" ">" and
"=" had properly low precedence, "<=" ">=" and "<>" were treated as generic
operators and so had significantly higher precedence. And "IS" tests were
even higher precedence than those, which is very clearly wrong per spec.
Another problem was that "foo NOT SOMETHING bar" constructs, such as
"x NOT LIKE y", were treated inconsistently because of a bison
implementation artifact: they had the documented precedence with respect
to operators to their right, but behaved like NOT (i.e., very low priority)
with respect to operators to their left.
Fixing the precedence issues is just a small matter of rearranging the
precedence declarations in gram.y, except for the NOT problem, which
requires adding an additional lookahead case in base_yylex() so that we
can attach a different token precedence to NOT LIKE and allied two-word
operators.
The bulk of this patch is not the bug fix per se, but adding logic to
parse_expr.c to allow giving warnings if an expression has changed meaning
because of these precedence changes. These warnings are off by default
and are enabled by the new GUC operator_precedence_warning. It's believed
that very few applications will be affected by these changes, but it was
agreed that a warning mechanism is essential to help debug any that are.
Until now __attribute__() was defined to be empty for all compilers but
gcc. That's problematic because it prevents using it in other compilers,
which is necessary e.g. for atomics portability. It's also just
generally dubious to do so in a header as widely included as c.h.
Instead add pg_attribute_format_arg, pg_attribute_printf,
pg_attribute_noreturn macros which are implemented in the compilers that
understand them. Also add pg_attribute_aligned and pg_attribute_packed,
but don't provide fallbacks, since they can affect functionality.
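Roughly, the macros follow this pattern (a sketch, not the exact c.h
definitions):

    /*
     * Macros with safe fallbacks expand to nothing on compilers that lack
     * support, since omitting them only loses warnings; macros that affect
     * functionality get no fallback.
     */
    #if defined(__GNUC__)
    #define pg_attribute_printf(f, a) __attribute__((format(printf, f, a)))
    #define pg_attribute_packed() __attribute__((packed))
    #else
    #define pg_attribute_printf(f, a)   /* harmless to omit */
    /* no fallback for pg_attribute_packed(): it changes struct layout */
    #endif

    /* example use: warn about mismatched format arguments */
    extern int emit_log(const char *fmt, ...) pg_attribute_printf(1, 2);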
This means that external code that, possibly unwittingly, relied on
__attribute__ being defined to be empty on !gcc compilers may now run into
warnings or errors on those compilers. But there shouldn't be many
occurrences of that, and it's hard to work around...
Discussion: 54B58BA3.8040302@ohmu.fi
Author: Oskari Saarenmaa, with some minor changes by me.
Commit 865f14a2d31af23a05bbf2df04c274629c5d5c4d was quite a few bricks
shy of a load: psql, ecpg, and plpgsql were all left out-of-step with
the core lexer. Of these only the last was likely to be a fatal
problem; but still, a minimal amount of grepping, or even just reading
the comments adjacent to the places that were changed, would have found
the other places that needed to be changed.
This is a possibly-vain effort to silence a Coverity warning about
bogus endianness dependency. The code's fine, because it takes care
of endianness issues for itself, but Coverity sees an int64 being
passed to an int* argument and not unreasonably suspects something's
wrong. I'm not sure if putting the void* cast in the way will shut it
up; but it can't hurt and seems better from a documentation standpoint
anyway, since the pointer is not used as an int* in this code path.
Just for a bit of additional safety, verify that the result length
is 8 bytes as expected.
Back-patch to 9.3 where the code in question was added.
The SGML docs claimed that 1-byte integers could be sent or received with
the "isint" options, but no such behavior has ever been implemented in
pqGetInt() or pqPutInt(). The in-code documentation header for PQfn() was
even less in tune with reality, and the code itself used parameter names
matching neither the SGML docs nor its libpq-fe.h declaration. Do a bit
of additional wordsmithing on the SGML docs while at it.
Since the business about 1-byte integers is a clear documentation bug,
back-patch to all supported branches.
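For reference, a hypothetical fast-path call passing and returning 4-byte
integers could look like this (the function OID is a placeholder and error
checking is omitted):

    #include <libpq-fe.h>

    /*
     * Sketch of a PQfn() fast-path call.  "isint" arguments and results
     * may be 2 or 4 bytes long; 1-byte integers are not supported, despite
     * what the docs used to claim.
     */
    static int
    call_int_function(PGconn *conn, int arg)
    {
        PQArgBlock  args[1];
        int         result = 0;
        int         result_len = 0;
        PGresult   *res;

        args[0].isint = 1;
        args[0].len = 4;            /* 2 or 4 only; 1 is not accepted */
        args[0].u.integer = arg;

        res = PQfn(conn, 12345 /* placeholder function OID */,
                   &result, &result_len, 1 /* result is an int */,
                   args, 1);
        PQclear(res);               /* status checking omitted */
        return result;
    }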
There are a couple of places in our grammar that fail to be strict LALR(1),
by requiring more than a single token of lookahead to decide what to do.
Up to now we've dealt with that by using a filter between the lexer and
parser that merges adjacent tokens into one in the places where two tokens
of lookahead are necessary. But that creates a number of user-visible
anomalies, for instance that you can't name a CTE "ordinality" because
"WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one
token. I realized that there's a better way.
In this patch, we still do the lookahead basically as before, but we never
merge the second token into the first; we replace just the first token by
a special lookahead symbol when one of the lookahead pairs is seen.
This requires a couple extra productions in the grammar, but it involves
fewer special tokens, so that the grammar tables come out a bit smaller
than before. The filter logic is no slower than before, perhaps a bit
faster.
I also fixed the filter logic so that when backing up after a lookahead,
the current token's terminator is correctly restored; this eliminates some
weird behavior in error message issuance, as is shown by the one change in
existing regression test outputs.
I believe that this patch entirely eliminates odd behaviors caused by
lookahead for WITH. It doesn't really improve the situation for NULLS
followed by FIRST/LAST unfortunately: those sequences still act like a
reserved word, even though there are cases where they should be seen as two
ordinary identifiers, eg "SELECT nulls first FROM ...". I experimented
with additional grammar hacks but couldn't find any simple solution for
that. Still, this is better than before, and it seems much more likely
that we *could* somehow solve the NULLS case on the basis of this filter
behavior than the previous one.
If the libpq output buffer is full, pqSendSome() tries to drain any
incoming data. This avoids a deadlock in case the server, e.g., sends a lot
of NOTICE messages and blocks until we read them. However, pqSendSome() only
did that in blocking mode. In non-blocking mode, the deadlock could still
happen.
To fix, take a two-pronged approach:
1. Change the documentation to instruct that when PQflush() returns 1, you
should wait for both read- and write-ready, and call PQconsumeInput() if it
becomes read-ready (a sketch of that loop follows below). That fixes the
deadlock, but applications are not going to change overnight.
2. In pqSendSome(), drain the input buffer before returning 1. This
alleviates the problem for applications that only wait for write-ready. In
particular, a slow but steady stream of NOTICE messages during COPY FROM
STDIN will no longer cause a deadlock. The risk remains that the server
attempts to send a large burst of data and fills its output buffer, and at
the same time the client also sends enough data to fill its output buffer.
The application will deadlock if it goes to sleep, waiting for the socket
to become write-ready, before the server's data arrives. In practice,
NOTICE messages and such that the server might be sending are usually
short, so it's highly unlikely that the server would fill its output buffer
so quickly.
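A minimal sketch of the documented pattern for non-blocking clients (error
handling trimmed):

    #include <sys/select.h>
    #include <libpq-fe.h>

    /*
     * After PQflush() returns 1, wait until the socket is read-ready or
     * write-ready; on read-ready, call PQconsumeInput() to drain server
     * data, then try flushing again.
     */
    static void
    flush_nonblocking(PGconn *conn)
    {
        while (PQflush(conn) == 1)
        {
            int         sock = PQsocket(conn);
            fd_set      readfds,
                        writefds;

            FD_ZERO(&readfds);
            FD_ZERO(&writefds);
            FD_SET(sock, &readfds);
            FD_SET(sock, &writefds);

            if (select(sock + 1, &readfds, &writefds, NULL, NULL) < 0)
                return;         /* error handling omitted */

            if (FD_ISSET(sock, &readfds))
                (void) PQconsumeInput(conn);    /* drain NOTICEs etc. */
            /* write-readiness lets the next PQflush() make progress */
        }
    }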
Backpatch to all supported versions.
After finding an "=" character, the pointer was advanced twice when it
should only advance once. This is harmless as long as the value after "="
has at least one character; but if it doesn't, we'd miss the terminator
character and include too much in the value.
In principle this could lead to reading off the end of memory. It does not
seem worth treating as a security issue though, because it would happen on
the client side, and besides, client logic that's taking conninfo strings
from untrusted sources has much worse security problems than this.
Report and patch received off-list from Thomas Fanghaenel.
Back-patch to 9.2 where the faulty code was introduced.
When ecpg was rewritten for the new protocol version, not all variable types
were corrected. This patch rewrites the code for those types to fix that. It
also fixes the documentation to correctly describe the status of array
handling.
Actually check the returned pointer, which is allocated with malloc() and
could therefore be NULL.
Issue noted by Coverity, fixed by Michael Paquier <michael@otacoo.com>
All the other new SSL information functions had dummy versions in
be-secure.c, but I missed PQsslAttributes(). Oops. Surprisingly, even though
the function is exported, the linker did not complain about it being missing
on most platforms represented in the buildfarm, except for a few Windows
systems.
This makes it possible to query for things like the SSL version and cipher
used, without depending on OpenSSL functions or macros. That is a good
thing if we ever get another SSL implementation.
PQgetssl() still works, but it should be considered deprecated, as it only
works with OpenSSL. In particular, PQsslInUse() should be used to check
whether a connection uses SSL, because as soon as we have another
implementation, PQgetssl() will return NULL even if SSL is in use.
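A short usage sketch of the new functions (attribute names such as
"protocol" and "cipher" are the ones the OpenSSL implementation reports):

    #include <stdio.h>
    #include <libpq-fe.h>

    /*
     * Report SSL status without touching OpenSSL directly.
     * PQsslAttribute() returns NULL for unknown attributes or when the
     * connection does not use SSL.
     */
    static void
    print_ssl_info(PGconn *conn)
    {
        const char *protocol;
        const char *cipher;

        if (!PQsslInUse(conn))
        {
            printf("connection is not encrypted\n");
            return;
        }
        protocol = PQsslAttribute(conn, "protocol");
        cipher = PQsslAttribute(conn, "cipher");
        printf("SSL protocol: %s\n", protocol ? protocol : "unknown");
        printf("SSL cipher:   %s\n", cipher ? cipher : "unknown");
    }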
strncpy() has a well-deserved reputation for being unsafe, so make an
effort to get rid of nearly all occurrences in HEAD.
A large fraction of the remaining uses were passing length less than or
equal to the known strlen() of the source, in which case no null-padding
can occur and the behavior is equivalent to memcpy(), though doubtless
slower and certainly harder to reason about. So just use memcpy() in
these cases.
In other cases, use either StrNCpy() or strlcpy() as appropriate (depending
on whether padding to the full length of the destination buffer seems
useful).
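The two replacement patterns, as a rough sketch (names and the size check
are invented for illustration; strlcpy() comes from the OS or from src/port
on platforms lacking it):

    #include <string.h>

    static void
    copy_examples(char *dst, size_t dstsize, const char *src)
    {
        size_t      srclen = strlen(src);

        if (srclen < dstsize)
        {
            /*
             * Length known to be <= strlen(src): strncpy() could never
             * null-pad here, so memcpy() plus explicit termination says
             * the same thing more honestly.
             */
            memcpy(dst, src, srclen);
            dst[srclen] = '\0';
        }
        else
        {
            /*
             * Fixed-size destination where truncation must be safe:
             * strlcpy() always null-terminates, unlike strncpy().
             */
            strlcpy(dst, src, dstsize);
        }
    }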
I left a few strncpy() calls alone in the src/timezone/ code, to keep it
in sync with upstream (the IANA tzcode distribution). There are also a
few such calls in ecpg that could possibly do with more analysis.
AFAICT, none of these changes are more than cosmetic, except for the four
occurrences in fe-secure-openssl.c, which are in fact buggy: an overlength
source leads to a non-null-terminated destination buffer and ensuing
misbehavior. These don't seem like security issues, first because no stack
clobber is possible and second because if your values of sslcert etc are
coming from untrusted sources then you've got problems way worse than this.
Still, it's undesirable to have unpredictable behavior for overlength
inputs, so back-patch those four changes to all active branches.
Some users run their applications in chroot environments that lack an
/etc/passwd file. This means that the current UID's user name and home
directory are not obtainable. libpq used to be all right with that,
so long as the database role name to use was specified explicitly.
But commit a4c8f14364c27508233f8a31ac4b10a4c90235a9 broke such cases by
causing any failure of pg_fe_getauthname() to be treated as a hard error.
In any case it did little to advance its nominal goal of causing errors
in pg_fe_getauthname() to be reported better. So revert that and instead
put some real error-reporting code in place. This requires changes to the
APIs of pg_fe_getauthname() and pqGetpwuid(), since the latter had
departed from the POSIX-specified API of getpwuid_r() in a way that made
it impossible to distinguish actual lookup errors from "no such user".
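A standalone sketch of the POSIX getpwuid_r() behaviour being relied on
(this is not libpq's pqGetpwuid()):

    #include <pwd.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /*
     * Distinguish "no such user" from a real lookup error: a zero return
     * with a NULL result pointer means the entry simply doesn't exist
     * (e.g. no /etc/passwd in a chroot), while a nonzero return is an
     * errno-style error code.
     */
    static void
    report_current_user(void)
    {
        struct passwd   pwbuf;
        struct passwd  *pw = NULL;
        char            buf[1024];
        int             rc;

        rc = getpwuid_r(geteuid(), &pwbuf, buf, sizeof(buf), &pw);
        if (rc != 0)
            fprintf(stderr, "user lookup failed: %s\n", strerror(rc));
        else if (pw == NULL)
            fprintf(stderr, "no entry for the current user ID\n");
        else
            printf("user name: %s\n", pw->pw_name);
    }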
To allow such failures to be reported, while not failing if the caller
supplies a role name, add a second call of pg_fe_getauthname() in
connectOptions2(). This is a tad ugly, and could perhaps be avoided with
some refactoring of PQsetdbLogin(), but I'll leave that idea for later.
(Note that the complained-of misbehavior only occurs in PQsetdbLogin,
not when using the PQconnect functions, because in the latter we will
never bother to call pg_fe_getauthname() if the user gives a role name.)
In passing also clean up the Windows-side usage of GetUserName(): the
recommended buffer size is 257 bytes, the passed buffer length should be
the buffer size, not the buffer size less 1, and any error is reported
by GetLastError(), not errno.
Per report from Christoph Berg. Back-patch to 9.4 where the chroot
failure case was introduced. The generally poor reporting of errors
here is of very long standing, of course, but given the lack of field
complaints about it we won't risk changing these APIs further back
(even though they're theoretically internal to libpq).