mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Allow sampling of statements depending on duration

This allows logging a sample of statements, without incurring excessive
log traffic (which may impact performance).  This can be useful when
analyzing workloads with lots of short queries.

The sampling is configured using two new GUC parameters:

 * log_min_duration_sample - minimum required statement duration

 * log_statement_sample_rate - sample rate (0.0 - 1.0)

Only statements with duration exceeding log_min_duration_sample are
considered for sampling. To enable sampling, log_min_duration_sample
must be non-negative and log_statement_sample_rate must be greater
than zero.

The existing log_min_duration_statement GUC has a higher priority, i.e.
statements with duration exceeding log_min_duration_statement will
always be logged, irrespective of how the sampling is configured. This
means only configurations

  log_min_duration_sample < log_min_duration_statement

do actually sample the statements, instead of logging everything.
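
The priority rule above can be sketched as a small decision function. This
is an illustrative model only, not the code added by this commit: the
function names, the `draw` parameter (added so the result can be checked
deterministically), and the use of `>=` for "exceeding" (following the
documentation's "ran for at least") are assumptions made for the example.

```python
import random


def threshold_reached(threshold_ms, duration_ms):
    """True if the threshold is enabled (-1 disables it) and the duration meets it."""
    if threshold_ms < 0:
        return False
    return duration_ms >= threshold_ms


def should_log(duration_ms, log_min_duration_statement,
               log_min_duration_sample, log_statement_sample_rate,
               draw=None):
    """Model of the logging decision for one completed statement."""
    # log_min_duration_statement has priority: such statements are always
    # logged and never subject to sampling.
    if threshold_reached(log_min_duration_statement, duration_ms):
        return True
    # Below the sampling threshold (or sampling disabled): not logged.
    if not threshold_reached(log_min_duration_sample, duration_ms):
        return False
    # Eligible for sampling: log with probability log_statement_sample_rate.
    # A draw in [0, 1) never passes a rate of 0.0 and always passes 1.0.
    if draw is None:
        draw = random.random()
    return draw < log_statement_sample_rate


if __name__ == "__main__":
    # 200 ms statement, always-log threshold 500 ms, sampling from 100 ms at 10%.
    logged = sum(should_log(200, 500, 100, 0.1) for _ in range(10000))
    print("logged", logged, "of 10000 eligible statements")
```

With log_min_duration_sample >= log_min_duration_statement, the first check
already catches every eligible statement, which is why only configurations
with the sample threshold below the statement threshold actually sample.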

Author: Adrien Nayrat
Reviewed-by: David Rowley, Vik Fearing, Tomas Vondra
Discussion: https://postgr.es/m/bbe0a1a8-a8f7-3be2-155a-888e661cc06c@anayrat.info
Tomas Vondra
2019-11-04 01:57:45 +01:00
parent 11d9ac28e5
commit 6e3e6cc0e8
5 changed files with 153 additions and 12 deletions


@@ -5950,6 +5950,12 @@ local0.* /var/log/postgresql
Only superusers can change this setting.
</para>
<para>
This overrides <xref linkend="guc-log-min-duration-sample"/>,
meaning that queries with duration exceeding this setting are not
subject to sampling and are always logged.
</para>
<para>
For clients using extended query protocol, durations of the Parse,
Bind, and Execute steps are logged independently.
@@ -5972,6 +5978,85 @@ local0.* /var/log/postgresql
</listitem>
</varlistentry>
<varlistentry id="guc-log-min-duration-sample" xreflabel="log_min_duration_sample">
<term><varname>log_min_duration_sample</varname> (<type>integer</type>)
<indexterm>
<primary><varname>log_min_duration_sample</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Allows sampling the logging of the duration of each completed
statement that ran for at least the specified amount of
time. If this value is specified without units, it is taken as milliseconds.
Setting this to zero makes all statement durations eligible for sampling.
Minus-one (the default) disables sampling of statement durations.
For example, if you set it to <literal>250ms</literal>
then all SQL statements that run 250ms or longer will be considered
for sampling, with the sample rate controlled by <xref linkend="guc-log-statement-sample-rate"/>.
Enabling this parameter can be helpful when the traffic is too high to
log all queries.
Only superusers can change this setting.
</para>
<para>
This option has lower priority than <xref linkend="guc-log-min-duration-statement"/>,
meaning that statements with durations exceeding <xref linkend="guc-log-min-duration-statement"/>
are not subject to sampling and are always logged.
</para>
<para>
For clients using extended query protocol, durations of the Parse,
Bind, and Execute steps are logged independently.
</para>
<note>
<para>
When using this option together with
<xref linkend="guc-log-statement"/>,
the text of statements that are logged because of
<varname>log_statement</varname> will not be repeated in the
duration log message.
If you are not using <application>syslog</application>, it is recommended
that you log the PID or session ID using
<xref linkend="guc-log-line-prefix"/>
so that you can link the statement message to the later
duration message using the process ID or session ID.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry id="guc-log-statement-sample-rate" xreflabel="log_statement_sample_rate">
<term><varname>log_statement_sample_rate</varname> (<type>real</type>)
<indexterm>
<primary><varname>log_statement_sample_rate</varname> configuration parameter</primary>
</indexterm>
</term>
<listitem>
<para>
Determines the fraction of statements with duration exceeding
<xref linkend="guc-log-min-duration-sample"/> to be logged.
Sampling is probabilistic: for example, <literal>0.5</literal>
means each such statement has a one-in-two chance of being logged.
The default is <literal>1.0</literal>, meaning log all such
statements.
Setting this to zero disables sampled logging, the same as setting
<varname>log_min_duration_sample</varname> to
<literal>-1</literal>.
<varname>log_statement_sample_rate</varname> is helpful when the
traffic is too high to log all queries.
Only superusers can change this setting.
</para>
<note>
<para>
Like all statement-logging options, this option can add significant
overhead.
</para>
</note>
</listitem>
</varlistentry>
<varlistentry id="guc-log-transaction-sample-rate" xreflabel="log_transaction_sample_rate">
<term><varname>log_transaction_sample_rate</varname> (<type>real</type>)
<indexterm>