conversion of basic ASCII letters. Remove all uses of strcasecmp and
strncasecmp in favor of new functions pg_strcasecmp and pg_strncasecmp;
remove most but not all direct uses of toupper and tolower in favor of
pg_toupper and pg_tolower. These functions use the same notions of
case folding already developed for identifier case conversion. I left
the straight locale-based folding in place for situations where we are
just manipulating user data and not trying to match it to built-in
strings --- for example, the SQL upper() function is still locale
dependent. Perhaps this will prove not to be what's wanted, but at
the moment we can initdb and pass regression tests in Turkish locale.
behavior of malloc and realloc when request size is 0. Fix escape
sequence recognizer so that only valid 3-digit octal sequences are
treated as escape sequences ... isdigit() is not a correct test.
will downcase the supplied field name unless it is double-quoted. Also,
upgrade the routine's handling of double quotes to match the backend,
in particular support doubled double quotes within quoted identifiers.
Per pgsql-interfaces discussion a couple weeks ago.
via extended query protocol, because it sends Sync right after Execute
without realizing that the command to be executed is COPY. There seems
to be no reasonable way for it to realize that, either, so the best fix
seems to be to make the backend ignore Sync during copy-in mode. Bit of
a wart on the protocol, but little alternative. Also, libpq must send
another Sync after terminating the COPY, if the command was issued via
Execute.
libpq users to perform Bind/Execute of previously prepared statements.
Per yesterday's discussion, this offers enough performance improvement
to justify bending the 'no new features during beta' rule.
minute and a half to decode a 500Kb on a fairly fast machine. I think the
culprit is sscanf.
I attach a patch that replaces the function with one used to perform the same
task in pyPgSQL (a Python interface to PostgreSQL). This code was written by
Billy Allie, author of pyPgSQL. I've changed a few variable names to match
those in the original code and removed a bit of Pythonness.
Billy has kindly looked at the code and points out that it is slightly
stricter than the original implementation and if it encounters an invalid
bytea such as '\12C' it drops the unescape '\' and outputs '12C'.
The code is licensed by the author under a BSD license.
I've performed limited testing of the function by putting JPEGs into
PostgreSQL, extracting them using them using the new function and diffing
against the original files.
The new function is significantly faster on my machine with the JPEGs being
decoded in less than a second. I attach a modified libpq example program that
I used for my testing.
Ben Lamb.
protocol 3, then falls back to 2 if postmaster rejects the startup packet
with an old-format error message. A side benefit of the rewrite is that
SSL-encrypted connections can now be made without blocking. (I think,
anyway, but do not have a good way to test.)
handle multiple 'formats' for data I/O. Restructure CommandDest and
DestReceiver stuff one more time (it's finally starting to look a bit
clean though). Code now matches latest 3.0 protocol document as far
as message formats go --- but there is no support for binary I/O yet.
for tableID/columnID in RowDescription. (The latter isn't really
implemented yet though --- the backend always sends zeroes, and libpq
just throws away the data.)
initial values and runtime changes in selected parameters. This gets
rid of the need for an initial 'select pg_client_encoding()' query in
libpq, bringing us back to one message transmitted in each direction
for a standard connection startup. To allow server version to be sent
using the same GUC mechanism that handles other parameters, invent the
concept of a never-settable GUC parameter: you can 'show server_version'
but it's not settable by any GUC input source. Create 'lc_collate' and
'lc_ctype' never-settable parameters so that people can find out these
settings without need for pg_controldata. (These side ideas were all
discussed some time ago in pgsql-hackers, but not yet implemented.)
rewritten and the protocol is changed, but most elog calls are still
elog calls. Also, we need to contemplate mechanisms for controlling
all this functionality --- eg, how much stuff should appear in the
postmaster log? And what API should libpq expose for it?
have length words. COPY OUT reimplemented per new protocol: it doesn't
need \. anymore, thank goodness. COPY BINARY to/from frontend works,
at least as far as the backend is concerned --- libpq's PQgetline API
is not up to snuff, and will have to be replaced with something that is
null-safe. libpq uses message length words for performance improvement
(no cycles wasted rescanning long messages), but not yet for error
recovery.
value '-2' is used to indicate a variable-width type whose width is
computed as strlen(datum)+1. Everything that looks at typlen is updated
except for array support, which Joe Conway is working on; at the moment
it wouldn't work to try to create an array of cstring.
This is necessary for mulibyte character sequences.
See "[HACKERS] PQescapeBytea is not multibyte aware" thread posted around
2002/04/05 for more details.
o Change all current CVS messages of NOTICE to WARNING. We were going
to do this just before 7.3 beta but it has to be done now, as you will
see below.
o Change current INFO messages that should be controlled by
client_min_messages to NOTICE.
o Force remaining INFO messages, like from EXPLAIN, VACUUM VERBOSE, etc.
to always go to the client.
o Remove INFO from the client_min_messages options and add NOTICE.
Seems we do need three non-ERROR elog levels to handle the various
behaviors we need for these messages.
Regression passed.
queries over non-blocking connections with libpq. "Larger" here
basically means that it doesn't fit into the output buffer.
The basic strategy is to fix pqFlush and pqPutBytes.
The problem with pqFlush as it stands now is that it returns EOF when an
error occurs or when not all data could be sent. The latter case is
clearly not an error for a non-blocking connection but the caller can't
distringuish it from an error very well.
The first part of the fix is therefore to fix pqFlush. This is done by
to renaming it to pqSendSome which only differs from pqFlush in its
return values to allow the caller to make the above distinction and a
new pqFlush which is implemented in terms of pqSendSome and behaves
exactly like the old pqFlush.
The second part of the fix modifies pqPutBytes to use pqSendSome instead
of pqFlush and to either send all the data or if not all data can be
sent on a non-blocking connection to at least put all data into the
output buffer, enlarging it if necessary. The callers of pqPutBytes
don't have to be changed because from their point of view pqPutBytes
behaves like before. It either succeeds in queueing all output data or
fails with an error.
I've also added a new API function PQsendSome which analogously to
PQflush just calls pqSendSome. Programs using non-blocking queries
should use this new function. The main difference is that this function
will have to be called repeatedly (calling select() properly in between)
until all data has been written.
AFAICT, the code in CVS HEAD hasn't changed with respect to non-blocking
queries and this fix should work there, too, but I haven't tested that
yet.
Bernhard Herzog
>
> 1. Now outputs '\\' instead of '\134' when using encode(bytea, 'escape')
> Note that I ended up leaving \0 as \000 so that there are no ambiguities
> when decoding something like, for example, \0123.
>
> 2. Fixed bug in byteain which allowed input values which were not valid
> octals (e.g. \789), to be parsed as if they were octals.
>
> Joe
>
Here's rev 2 of the bytea string support patch. Changes:
1. Added missing declaration for MatchBytea function
2. Added PQescapeBytea to fe-exec.c
3. Applies cleanly on cvs tip from this afternoon
I'm hoping that someone can review/approve/apply this before beta starts, so
I guess I'd vote (not that it counts for much) to delay beta a few days :-)
Joe Conway
> null bytes to be literally '\0', the following can happen:
> 1. User inputs string value as "<null byte>##" where ## are digits in the
> range of 0 to 7.
> 2. PQescapeString converts this to "\0##"
> 3. Escaped string is used in a context that causes "\0##" to be evaluated as
> an octal escape sequence.
I agree that this is a problem, though it is not possible to do
anything harmful with it. In addition, it only occurs if there are
any NUL characters in its input, which is very unlikely if you are
using C strings.
The patch below addresses the issue by removing escaping of \0
characters entirely.
> If the goal is to "safely" encode null bytes, and preserve the rest of the
> string as it was entered, I think the null bytes should be escaped as \\000
> (note that if you simply use \000 the same string truncation problem
> occurs).
We can't do that, this would require 4n + 1 bytes of storage for the
result, breaking the interface.
Florian Weimer
discussion on pgsql-hackers (especially the frightening memory dump in
<12273.999562219@sss.pgh.pa.us>), we decided that it is best not to
use identifiers from an untrusted source at all. Therefore, all
claims of the suitability of PQescapeString() for identifiers have
been removed.
Florian Weimer