pg_stat_statements anticipates that certain constant locations may be
recorded multiple times and attempts to avoid calculating a length for
these locations in fill_in_constant_lengths().
However, during generate_normalized_query() where normalized query
strings are generated, these locations are not excluded from
consideration. This could increment the parameter number counter for
every recorded occurrence at such a location, leading to an incorrect
normalization in certain cases with gaps in the numbers reported.
For example, take this query:
SELECT WHERE '1' IN ('2'::int, '3'::int::text)
Before this commit, it would be normalized like that, with gaps in the
parameter numbers:
SELECT WHERE $1 IN ($3::int, $4::int::text)
However the correct, less confusing one should be like that:
SELECT WHERE $1 IN ($2::int, $3::int::text)
This commit fixes the computation of the parameter numbers to track the
number of constants replaced with an $n by a separate counter instead of
the iterator used to loop through the list of locations.
The underlying query IDs are not changed, neither are the normalized
strings for existing PGSS hash entries. New entries with fresh
normalized queries would automatically get reshaped based on the new
parameter numbering.
Issue discovered while discussing a separate problem for HEAD, but this
affects all the stable branches.
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tzxvWXsacGyxrixdhy3tTTDfJQqxyFBRFh31nNHBQ5qA@mail.gmail.com
Backpatch-through: 13
This commit adds query ID reports for two code paths when processing
extended query protocol messages:
- When receiving a bind message, setting it to the first Query retrieved
from a cached cache.
- When receiving an execute message, setting it to the first PlannedStmt
stored in a portal.
An advantage of this method is that this is able to cover all the types
of portals handled in the extended query protocol, particularly these
two when the report done in ExecutorStart() is not enough (neither is an
addition in ExecutorRun(), actually, for the second point):
- Multiple execute messages, with multiple ExecutorRun().
- Portal with execute/fetch messages, like a query with a RETURNING
clause and a fetch size that stores the tuples in a first execute
message going though ExecutorStart() and ExecuteRun(), followed by one
or more execute messages doing only fetches from the tuplestore created
in the first message. This corresponds to the case where
execute_is_fetch is set, for example.
Note that the query ID reporting done in ExecutorStart() is still
necessary, as an EXECUTE requires it. Query ID reporting is optimistic
and more calls to pgstat_report_query_id() don't matter as the first
report takes priority except if the report is forced. The comment in
ExecutorStart() is adjusted to reflect better the reality with the
extended query protocol.
The test added in pg_stat_statements is a courtesy of Robert Haas. This
uses psql's \bind metacommand, hence this part is backpatched down to
v16.
Reported-by: Kaido Vaikla, Erik Wienhold
Author: Sami Imseih
Reviewed-by: Jian He, Andrei Lepikhov, Michael Paquier
Discussion: https://postgr.es/m/CA+427g8DiW3aZ6pOpVgkPbqK97ouBdf18VLiHFesea2jUk3XoQ@mail.gmail.com
Discussion: https://postgr.es/m/CA+TgmoZxtnf_jZ=VqBSyaU8hfUkkwoJCJ6ufy4LGpXaunKrjrg@mail.gmail.com
Discussion: https://postgr.es/m/1391613709.939460.1684777418070@office.mailbox.org
Backpatch-through: 14