Prior to this patch, every FETCH call with a different size specified
would generate a unique queryId. Depending on the workload, this could
lead to significant bloat in pg_stat_statements, as repeatedly fetching
from the same cursor with varying sizes would result in a new queryId
each time. For example, FETCH 1 c1; and FETCH 2 c1; would produce
different queryIds.
This patch improves the situation by normalizing the fetch size, so that
semantically similar statements generate the same queryId. As a result,
statements like the following, which differ syntactically but have the
same effect, will now share a single queryId:
FETCH FROM c1
FETCH NEXT c1
FETCH 1 c1
To normalize based on the keyword used in FETCH, FetchStmt is extended
with a new FetchDirectionKeywords. This matters for "howMany", which
could be set to a negative value depending on the direction, and we want
to normalize the queries with enough information about the direction
keywords provided, including RELATIVE, ABSOLUTE or all the ALL variants.
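
The idea described above can be illustrated with a small standalone
sketch. It is not PostgreSQL's jumbling code: the SketchFetch struct,
both enums, and the sketch_* functions are hypothetical names invented
for illustration, and FNV-1a is used only as a convenient stand-in hash.
The fingerprint mixes in the direction, a keyword class, and the cursor
name, and deliberately leaves the fetch size out.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT PostgreSQL's actual definitions. */
typedef enum { SK_FORWARD, SK_BACKWARD, SK_ABSOLUTE, SK_RELATIVE } SketchDirection;
typedef enum { SK_KW_NONE, SK_KW_ABSOLUTE, SK_KW_RELATIVE, SK_KW_ALL } SketchKeyword;

typedef struct SketchFetch
{
    SketchDirection direction;  /* which way the cursor moves */
    SketchKeyword   keyword;    /* keyword class written in the statement */
    long            how_many;   /* fetch size; left out of the fingerprint */
    const char     *cursor;     /* cursor name */
} SketchFetch;

/* FNV-1a over a byte range, used here only as a stand-in hash. */
static uint64_t
sketch_mix(uint64_t h, const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t      i;

    for (i = 0; i < len; i++)
    {
        h ^= p[i];
        h *= 1099511628211ULL;  /* FNV-1a 64-bit prime */
    }
    return h;
}

/* Fingerprint a fetch: direction, keyword class and cursor name only. */
static uint64_t
sketch_query_id(const SketchFetch *f)
{
    uint64_t    h = 14695981039346656037ULL;    /* FNV-1a 64-bit offset basis */
    int32_t     dir = (int32_t) f->direction;
    int32_t     kw = (int32_t) f->keyword;

    h = sketch_mix(h, &dir, sizeof(dir));
    h = sketch_mix(h, &kw, sizeof(kw));
    h = sketch_mix(h, f->cursor, strlen(f->cursor));
    /* f->how_many is intentionally not mixed in. */
    return h;
}
```

In this model, two forward fetches on c1 with sizes 1 and 2 produce the
same value, while a forward fetch tagged with the ALL keyword class
produces a different one, which is the distinction the keyword
information is meant to preserve.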
Author: Sami Imseih <samimseih@gmail.com>
Discussion: https://postgr.es/m/CAA5RZ0tA6LbHCg2qSS+KuM850BZC_+ZgHV7Ug6BXw22TNyF+MA@mail.gmail.com
This is a preliminary patch. It adds NOT NULL checking for the result of
the pg_stat_statements_reset() function. It is needed for the upcoming
patch "Track statement entry timestamp", which will change the result
type of this function to the timestamp of the reset performed.
Discussion: https://postgr.es/m/flat/72e80e7b160a6eb189df9ef6f068cce3765d37f8.camel%40moonset.ru
Author: Andrei Zubkov
Reviewed-by: Julien Rouhaud, Hayato Kuroda, Yuki Seino, Chengxi Sun
Reviewed-by: Anton Melnikov, Darren Rush, Michael Paquier, Sergei Kornilov
Reviewed-by: Alena Rybakina, Andrei Lepikhov
This commit adds more coverage for utility statements so that it is
possible to track down all the effects of query normalization done for
all the queries that use either Const or A_Const nodes, which are the
nodes where normalization makes the most sense as they apply to
constants (well, most of the time, really).
This set of queries is extracted from an analysis done while looking at
full dumps of the regression database when applying different levels of
normalization to either Const or A_Const nodes for utilities, and forms
a minimal set covering:
- All relkinds (CREATE, ALTER, DROP)
- Policies
- Cursors
- Triggers
- Types
- Rules
- Statistics
- CALL
- Transaction statements (isolation level, options)
- EXPLAIN
- COPY
Note that pg_stat_statements is not yet switched to show any
normalization for utilities, but this still improves the default
coverage of the query jumbling code (though not by as much as enabling
query jumbling on the main regression test suite would):
- queryjumblefuncs.funcs.c: 36.8% => 48.5%
- queryjumblefuncs.switch.c: 33.2% => 43.1%
Reviewed-by: Bertrand Drouvot
Discussion: https://postgr.es/m/Y+MRdEq9W9XVa2AB@paquier.xyz