Improve parser's one-extra-token lookahead mechanism.


There are a couple of places in our grammar that fail to be strict LALR(1),
by requiring more than a single token of lookahead to decide what to do.
Up to now we've dealt with that by using a filter between the lexer and
parser that merges adjacent tokens into one in the places where two tokens
of lookahead are necessary.  But that creates a number of user-visible
anomalies, for instance that you can't name a CTE "ordinality" because
"WITH ordinality AS ..." triggers folding of WITH and ORDINALITY into one
token.  I realized that there's a better way.

In this patch, we still do the lookahead basically as before, but we never
merge the second token into the first; we replace just the first token by
a special lookahead symbol when one of the lookahead pairs is seen.
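
The idea in the paragraph above can be sketched roughly as follows. This is a simplified illustration in Python, not the actual filter code: tokens are modeled as plain strings, and the names `LOOKAHEAD_PAIRS` and `filter_tokens` are hypothetical. The pairs and the replacement symbols (`WITH_LA`, `NULLS_LA`) are taken from the commit message and the parse.pl hunk in this commit.

```python
# Hypothetical table: first token -> (second tokens that trigger, replacement symbol).
LOOKAHEAD_PAIRS = {
    "NULLS": ({"FIRST", "LAST"}, "NULLS_LA"),
    "WITH": ({"TIME", "ORDINALITY"}, "WITH_LA"),
}


def filter_tokens(tokens):
    """Yield tokens, replacing only the FIRST token of a lookahead pair.

    The second token is never merged away; it is pushed back and
    delivered to the parser as an ordinary token on the next step.
    """
    it = iter(tokens)
    pending = None        # one-token pushback buffer
    have_pending = False
    while True:
        if have_pending:
            cur, have_pending = pending, False
        else:
            cur = next(it, None)
        if cur is None:   # end of input
            return
        rule = LOOKAHEAD_PAIRS.get(cur)
        if rule is None:
            yield cur
            continue
        # Peek one extra token, then always hand it back.
        nxt = next(it, None)
        pending, have_pending = nxt, True
        seconds, replacement = rule
        yield replacement if nxt in seconds else cur


print(list(filter_tokens(["WITH", "ORDINALITY", "AS"])))
# prints ['WITH_LA', 'ORDINALITY', 'AS']
print(list(filter_tokens(["WITH", "RECURSIVE"])))
# prints ['WITH', 'RECURSIVE']
```

Because the second token still reaches the parser on its own, the grammar can decide what it means in context, which is what the token-merging approach could not do.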

This requires a couple extra productions in the grammar, but it involves
fewer special tokens, so that the grammar tables come out a bit smaller
than before.  The filter logic is no slower than before, perhaps a bit
faster.

I also fixed the filter logic so that when backing up after a lookahead,
the current token's terminator is correctly restored; this eliminates some
weird behavior in error message issuance, as is shown by the one change in
existing regression test outputs.

I believe that this patch entirely eliminates odd behaviors caused by
lookahead for WITH.  It doesn't really improve the situation for NULLS
followed by FIRST/LAST unfortunately: those sequences still act like a
reserved word, even though there are cases where they should be seen as two
ordinary identifiers, e.g. "SELECT nulls first FROM ...".  I experimented
with additional grammar hacks but couldn't find any simple solution for
that.  Still, this is better than before, and it seems much more likely
that we *could* somehow solve the NULLS case on the basis of this filter
behavior than the previous one.

Tom Lane

2015-02-24 17:53:42 -05:00

parent 23a78352c0

commit d809fd0008

8 changed files with 173 additions and 113 deletions

									
src/interfaces/ecpg/preproc/parse.pl (6 lines changed)
@@ -42,10 +42,8 @@ my %replace_token = (
 # or in the block
 my %replace_string = (
-	'WITH_TIME'       => 'with time',
-	'WITH_ORDINALITY' => 'with ordinality',
-	'NULLS_FIRST'     => 'nulls first',
-	'NULLS_LAST'      => 'nulls last',
+	'NULLS_LA'        => 'nulls',
+	'WITH_LA'         => 'with',
 	'TYPECAST'        => '::',
 	'DOT_DOT'         => '..',
 	'COLON_EQUALS'    => ':=',);
