mirror of https://github.com/postgres/postgres.git synced 2025-07-26 01:22:12 +03:00

Fix assorted bugs by changing TS_execute's callback API to ternary logic.

Text search sometimes failed to find valid matches, for instance
'!crew:A'::tsquery might fail to locate 'crew:1B'::tsvector during
an index search.  The root of the issue is that TS_execute's callback
functions were not changed to use ternary (yes/no/maybe) reporting
when we made the search logic itself do so.  It's somewhat annoying
to break that API, but on the other hand we now see that any code
using plain boolean logic is almost certainly broken since the
addition of phrase search.  There seem to be very few outside callers
of this code anyway, so we'll just break them intentionally to get
them to adapt.
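
As a rough illustration of the ternary logic described above (a minimal
sketch, not the code in this commit): the enum values match the
TSTernaryValue type added by the patch, while ternary_not, ternary_and and
ternary_or are hypothetical helper names used only for this example.

```c
/* Same three values as the TSTernaryValue enum added by this commit. */
typedef enum
{
	TS_NO,				/* definitely no match */
	TS_YES,				/* definitely does match */
	TS_MAYBE			/* can't verify match */
} TSTernaryValue;

/* Hypothetical helper: NOT keeps "maybe" as "maybe". */
TSTernaryValue
ternary_not(TSTernaryValue v)
{
	if (v == TS_YES)
		return TS_NO;
	if (v == TS_NO)
		return TS_YES;
	return TS_MAYBE;
}

/* Hypothetical helper: AND is "no" if either side is "no". */
TSTernaryValue
ternary_and(TSTernaryValue a, TSTernaryValue b)
{
	if (a == TS_NO || b == TS_NO)
		return TS_NO;
	if (a == TS_MAYBE || b == TS_MAYBE)
		return TS_MAYBE;
	return TS_YES;
}

/* Hypothetical helper: OR is "yes" if either side is "yes". */
TSTernaryValue
ternary_or(TSTernaryValue a, TSTernaryValue b)
{
	if (a == TS_YES || b == TS_YES)
		return TS_YES;
	if (a == TS_MAYBE || b == TS_MAYBE)
		return TS_MAYBE;
	return TS_NO;
}
```

With plain booleans, an index that only knows "crew" is present would
presumably report true, NOT would turn that into false, and the entry
would be skipped even though its weight may not be A.  With three values,
the same uncertainty survives negation as TS_MAYBE and can be rechecked.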

This allows removal of tsginidx.c's private re-implementation of
TS_execute, since that's now entirely duplicative.  It's also no
longer necessary to avoid use of CALC_NOT in tsgistidx.c, since
the underlying callbacks can now do something reasonable.
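
A callback following the new return convention might look roughly like
the sketch below.  DemoOperand, DemoPhraseData, DemoIndexKey and
demo_check_lexeme are hypothetical stand-ins for QueryOperand,
ExecPhraseData and a real index key; only the TS_NO/TS_YES/TS_MAYBE
convention is taken from this commit.  Treating a weight restriction as
unverifiable is an assumption made for the example.

```c
#include <stddef.h>
#include <string.h>

typedef enum
{
	TS_NO,
	TS_YES,
	TS_MAYBE
} TSTernaryValue;

/* Hypothetical, simplified stand-in for QueryOperand. */
typedef struct DemoOperand
{
	const char *lexeme;
	unsigned	weight;		/* 0 means "no weight restriction" */
} DemoOperand;

/* Hypothetical, simplified stand-in for ExecPhraseData. */
typedef struct DemoPhraseData
{
	int			npos;		/* left as zero when positions are unknown */
} DemoPhraseData;

/* Hypothetical index key that stores lexemes without positions or weights. */
typedef struct DemoIndexKey
{
	const char *const *lexemes;
	size_t		nlexemes;
} DemoIndexKey;

/*
 * Report whether val's lexeme is present in the key passed via arg.
 * The key has no position or weight data, so any question that depends
 * on them is answered TS_MAYBE and *data is left untouched.
 */
TSTernaryValue
demo_check_lexeme(void *arg, DemoOperand *val, DemoPhraseData *data)
{
	const DemoIndexKey *key = (const DemoIndexKey *) arg;
	size_t		i;

	for (i = 0; i < key->nlexemes; i++)
	{
		if (strcmp(key->lexemes[i], val->lexeme) == 0)
		{
			if (data != NULL || val->weight != 0)
				return TS_MAYBE;	/* present, but cannot verify details */
			return TS_YES;
		}
	}
	return TS_NO;				/* lexeme is definitely absent */
}
```

The point of the sketch is that "lexeme found but details unknown" is
reported as TS_MAYBE rather than as a plain true, so a surrounding NOT
cannot turn it into a definite non-match.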

Back-patch into v13.  We can't change this in stable branches,
but it seems not quite too late to fix it in v13.

Tom Lane and Pavel Borisov

Discussion: https://postgr.es/m/CALT9ZEE-aLotzBg-pOp2GFTesGWVYzXA3=mZKzRDa_OKnLF7Mg@mail.gmail.com
Tom Lane
2020-07-24 15:26:51 -04:00
parent 25244b8972
commit 2f2007fbb2
10 changed files with 395 additions and 189 deletions

@@ -124,13 +124,21 @@ extern text *generateHeadline(HeadlineParsedText *prs);
  * whether a given primitive tsquery value is matched in the data.
  */
+/* TS_execute requires ternary logic to handle NOT with phrase matches */
+typedef enum
+{
+	TS_NO,			/* definitely no match */
+	TS_YES,			/* definitely does match */
+	TS_MAYBE		/* can't verify match for lack of pos data */
+} TSTernaryValue;
 /*
  * struct ExecPhraseData is passed to a TSExecuteCallback function if we need
  * lexeme position data (because of a phrase-match operator in the tsquery).
- * The callback should fill in position data when it returns true (success).
- * If it cannot return position data, it may leave "data" unchanged, but
- * then the caller of TS_execute() must pass the TS_EXEC_PHRASE_NO_POS flag
- * and must arrange for a later recheck with position data available.
+ * The callback should fill in position data when it returns TS_YES (success).
+ * If it cannot return position data, it should leave "data" unchanged and
+ * return TS_MAYBE. The caller of TS_execute() must then arrange for a later
+ * recheck with position data available.
  *
  * The reported lexeme positions must be sorted and unique. Callers must only
  * consult the position bits of the pos array, ie, WEP_GETPOS(data->pos[i]).
@@ -162,12 +170,13 @@ typedef struct ExecPhraseData
  * val: lexeme to test for presence of
  * data: to be filled with lexeme positions; NULL if position data not needed
  *
- * Return true if lexeme is present in data, else false. If data is not
- * NULL, it should be filled with lexeme positions, but function can leave
- * it as zeroes if position data is not available.
+ * Return TS_YES if lexeme is present in data, TS_MAYBE if it might be
+ * present, TS_NO if it definitely is not present. If data is not NULL,
+ * it must be filled with lexeme positions if available. If position data
+ * is not available, leave *data as zeroes and return TS_MAYBE, never TS_YES.
  */
-typedef bool (*TSExecuteCallback) (void *arg, QueryOperand *val,
-								   ExecPhraseData *data);
+typedef TSTernaryValue (*TSExecuteCallback) (void *arg, QueryOperand *val,
+											 ExecPhraseData *data);
 /*
  * Flag bits for TS_execute
@@ -175,10 +184,7 @@ typedef bool (*TSExecuteCallback) (void *arg, QueryOperand *val,
 #define TS_EXEC_EMPTY (0x00)
 /*
  * If TS_EXEC_CALC_NOT is not set, then NOT expressions are automatically
- * evaluated to be true. Useful in cases where NOT cannot be accurately
- * computed (GiST) or it isn't important (ranking). From TS_execute's
- * perspective, !CALC_NOT means that the TSExecuteCallback function might
- * return false-positive indications of a lexeme's presence.
+ * evaluated to be true. Useful in cases where NOT isn't important (ranking).
  */
 #define TS_EXEC_CALC_NOT (0x01)
 /*