"val AS name" to "name := val", as per recent discussion.
This patch catches everything in the original named-parameters patch,
but I'm not certain that no other dependencies snuck in later (grepping
the source tree for all uses of AS soon proved unworkable).
In passing I note that we've dropped the ball at least once on keeping
ecpg's lexer (as opposed to parser) in sync with the backend. It would
be a good idea to go through all of pgc.l and see if it's in sync now.
I didn't attempt that at the moment.
directly. This was a lot of trouble, but should be worth it in terms of
not having to keep the plpgsql lexer in step with core anymore. In addition
the handling of keywords is significantly better-structured, allowing us to
de-reserve a number of words that plpgsql formerly treated as reserved.
of different parsers having different YYSTYPE unions that they want to use
with it. I defined a new union core_YYSTYPE that is just the (very short)
list of semantic values returned by the core scanner. I had originally
worried that this would require an extra interface layer, but actually we can
have parser.c's base_yylex (formerly filtered_base_yylex) take care of that at
no extra cost. Names associated with the core scanner are now "core_yy_foo",
with "base_yy_foo" being used in the core Bison parser and the parser.c
interface layer.
This solves the last serious stumbling block to eliminating plpgsql's separate
lexer. One restriction that will still be present is that plpgsql and the
core will have to agree on the token numbers assigned to tokens that can be
returned by the core lexer. Since Bison doesn't seem willing to accept
external assignments of those numbers, we'll have to live with decreeing that
core and plpgsql grammars declare these tokens first and in the same order.
Changes:
Pass in the keyword lookup array instead of having it be hardwired.
(This incidentally allows elimination of some duplicate coding in ecpg.)
Re-order the token declarations in gram.y so that non-keyword tokens have
numbers that won't change when keywords are added or removed.
Add ".." and ":=" to the set of tokens recognized by scan.l. (Since these
combinations are nowhere legal in core SQL, this does not change anything
except the precise wording of the error you get when you write this.)
size_t arguments, the emitted scanner actually prototypes them with
type yy_size_t, which is sometimes not the same thing depending on
flex version and platform. Easiest fix seems to be to use yy_size_t.
Per buildfarm results.
of features added to flex and bison since this code was originally written.
This change doesn't in itself offer any new capability, but it's needed
infrastructure for planned improvements in plpgsql.
Another feature now available in flex is the ability to make it use palloc
instead of malloc, so do that to avoid possible memory leaks. (We should
at some point change the other lexers likewise, but this commit doesn't
touch them.)
distinction between the external API (parser.h) and declarations that only
need to be visible within the raw parser code (gramparse.h, which now is only
included by parser.c, gram.y, scan.l, and keywords.c). This is in preparation
for the upcoming change to a reentrant lexer, which will require referencing
YYSTYPE in the declarations of base_yylex and filtered_base_yylex, hence
gram.h will have to be included by gramparse.h. We don't want any more files
than absolutely necessary to depend on gram.h, so some cleanup is called for.
select u&42 from table-with-a-u-column;
Also fix missing SET_YYLLOC() in the {dolqfailed} production that I suppose
this was based on. The latter is a pre-existing bug, but the only effect
is to misplace the error cursor by one token, so probably not worth
backpatching.
likewise increase the initial size of the scanner's literal buffer to 1024
(from 128). Instrumentation of the regression tests suggests that this
saves a useful amount of repalloc() traffic --- the number of calls occurring
during one set of tests drops from about 6900 to about 3900. The old sizes
were chosen in the late 90's with an eye to machines much smaller than
are common today.
return true for exactly the characters treated as whitespace by their flex
scanners. Per report from Victor Snezhko and subsequent investigation.
Also fix a passel of unsafe usages of <ctype.h> functions, that is, ye olde
char-vs-unsigned-char issue. I won't miss <ctype.h> when we are finally
able to stop using it.
parser will allow "\'" to be used to represent a literal quote mark. The
"\'" representation has been deprecated for some time in favor of the
SQL-standard representation "''" (two single quote marks), but it has been
used often enough that just disallowing it immediately won't do. Hence
backslash_quote allows the settings "on", "off", and "safe_encoding",
the last meaning to allow "\'" only if client_encoding is a valid server
encoding. That is now the default, and the reason is that in encodings
such as SJIS that allow 0x5c (ASCII backslash) to be the last byte of a
multibyte character, accepting "\'" allows SQL-injection attacks as per
CVE-2006-2314 (further details will be published after release). The
"on" setting is available for backward compatibility, but it must not be
used with clients that are exposed to untrusted input.
Thanks to Akio Ishida and Yasuo Ohgaki for identifying this security issue.
throw warnings for 100%-SQL-standard constructs, clean up some minor
infelicities, try to un-break ecpg to the best of my ability. (It's not clear
how ecpg is going to find out the setting of standard_conforming_strings,
though.) I think pg_dump still needs work, too.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
not likely ever to be implemented seeing it's been removed from SQL2003.
This allows getting rid of the 'filter' version of yylex() that we had in
parser.c, which should save at least a few microseconds in parsing.
with main, avoid using a SQL-defined SQLSTATE for what is most definitely
not a SQL-compatible error condition, fix documentation omissions,
adhere to message style guidelines, don't use two GUC_REPORT variables
when one is sufficient. Nothing done about pg_dump issues.
literally.
Add GUC variables:
"escape_string_warning" - warn about backslashes in non-E strings
"escape_string_syntax" - supports E'' syntax?
"standard_compliant_strings" - treats backslashes literally in ''
Update code to use E'' when escapes are used.
scanner anyway) to avoid having any backup states. According to the
flex manual, this should speed things up, and indeed the backend scanner
is about a third faster according to some quick profiling checks.
I haven't tried to measure the speed change in psql, but it probably
is similar.
Document use of macros for pg_printf functions.
Bump major versions of all interfaces to handle movement of get_progname
from libpq to libpgport in 8.0, and probably other libpgport changes in 8.1.
Also performed an initial run through of upgrading our Copyright date to
extend to 2005 ... first run here was very simple ... change everything
where: grep 1996-2004 && the word 'Copyright' ... scanned through the
generated list with 'less' first, and after, to make sure that I only
picked up the right entries ...