1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-09 06:21:09 +03:00
Commit Graph

1128 Commits

Author SHA1 Message Date
Robert Haas
c7ea68ff8d Limit maximum parallel degree to 1024.
This new limit affects both the max_parallel_degree GUC and the
parallel_degree reloption.  There may some day be a use case for using
more than 1024 CPUs for a single query, but that's surely not the case
right now.  Not only do not very many people have that many CPUs, but
the code hasn't been tested at that kind of scale and is very unlikely
to perform well, or even work at all, without a lot more work.  The
issue addressed by commit 06bd458cb8 is
probably just one problem of many.

The idea of a more reasonable limit here was suggested by Tom Lane;
the value of 1024 was suggested by Amit Kapila.
2016-05-06 14:50:54 -04:00
Robert Haas
1e77949e67 Note that max_worker_processes requires restart.
Since this is a minor issue, no back-patch.

Julien Rouhaud
2016-05-03 10:39:21 -04:00
Tom Lane
4c804fbdfb Clean up parsing of synchronous_standby_names GUC variable.
Commit 989be0810d added a flex/bison lexer/parser to interpret
synchronous_standby_names.  It was done in a pretty crufty way, though,
making assorted end-use sites responsible for calling the parser at the
right times.  That was not only vulnerable to errors of omission, but made
it possible that lexer/parser errors occur at very undesirable times,
and created memory leakages even if there was no error.

Instead, perform the parsing once during check_synchronous_standby_names
and let guc.c manage the resulting data.  To do that, we have to flatten
the parsed representation into a single hunk of malloc'd memory, but that
is not very hard.

While at it, work a little harder on making useful error reports for
parsing problems; the previous code felt that "synchronous_standby_names
parser returned 1" was an appropriate user-facing error message.  (To
be fair, it did also log a syntax error message, but separately from the
GUC problem report, which is at best confusing.)  It had some outright
bugs in the face of invalid input, too.

I (tgl) also concluded that we need to restrict unquoted names in
synchronous_standby_names to be just SQL identifiers.  The previous coding
would accept darn near anything, which (1) makes the quoting convention
both nearly-unnecessary and formally ambiguous, (2) makes it very hard to
understand what is a syntax error and what is a creative interpretation of
the input as a standby name, and (3) makes it impossible to further extend
the syntax in future without a compatibility break.  I presume that we're
intending future extensions of the syntax, else this parsing infrastructure
is massive overkill, so (3) is an important objection.  Since we've taken
a compatibility hit for non-identifier names with this change anyway, we
might as well lock things down now and insist that users use double quotes
for standby names that aren't identifiers.

Kyotaro Horiguchi and Tom Lane
2016-04-27 17:55:25 -04:00
Robert Haas
372ff7cae2 Fix wrong word.
Commit a31212b429 was a little too hasty.

Per report from Tom Lane.
2016-04-27 14:23:56 -04:00
Robert Haas
a31212b429 Change postgresql.conf.sample to say that fsync=off will corrupt data.
Discussion: 24748.1461764666@sss.pgh.pa.us

Per a suggestion from Craig Ringer.  This wording from Tom Lane,
following discussion.
2016-04-27 13:47:07 -04:00
Robert Haas
77cd477c4b Enable parallel query by default.
Change max_parallel_degree default from 0 to 2.  It is possible that
this is not a good idea, or that we should go with 1 worker rather
than 2, but we won't find out without trying it.  Along the way,
reword the documentation for max_parallel_degree a little bit to
hopefully make it more clear.

Discussion: 20160420174631.3qjjhpwsvvx5bau5@alap3.anarazel.de
2016-04-26 08:35:58 -04:00
Andres Freund
8f91d87d43 Fix documentation & config inconsistencies around 428b1d6b2.
Several issues:
1) checkpoint_flush_after doc and code disagreed about the default
2) new GUCs were missing from postgresql.conf.sample
3) Outdated source-code comment about bgwriter_flush_after's default
4) Sub-optimal categories assigned to new GUCs
5) Docs suggested backend_flush_after is PGC_SIGHUP, but it's PGC_USERSET.
6) Spell out int as integer in the docs, as done elsewhere

Reported-By: Magnus Hagander, Fujii Masao
Discussion: CAHGQGwETyTG5VYQQ5C_srwxWX7RXvFcD3dKROhvAWWhoSBdmZw@mail.gmail.com
2016-04-24 12:26:55 -07:00
Kevin Grittner
848ef42bb8 Add the "snapshot too old" feature
This feature is controlled by a new old_snapshot_threshold GUC.  A
value of -1 disables the feature, and that is the default.  The
value of 0 is just intended for testing.  Above that it is the
number of minutes a snapshot can reach before pruning and vacuum
are allowed to remove dead tuples which the snapshot would
otherwise protect.  The xmin associated with a transaction ID does
still protect dead tuples.  A connection which is using an "old"
snapshot does not get an error unless it accesses a page modified
recently enough that it might not be able to produce accurate
results.

This is similar to the Oracle feature, and we use the same SQLSTATE
and error message for compatibility.
2016-04-08 14:36:30 -05:00
Robert Haas
0711803775 Use quicksort, not replacement selection, for external sorting.
We still use replacement selection for the first run of the sort only
and only when the number of tuples is relatively small.  Otherwise,
the first run, and subsequent runs in all cases, are produced using
quicksort.  This tends to be faster except perhaps for very small
amounts of working memory.

Peter Geoghegan, reviewed by Tomas Vondra, Jeff Janes, Mithun Cy,
Greg Stark, and me.
2016-04-08 02:36:26 -04:00
Simon Riggs
137805f89a Use Foreign Key relationships to infer multi-column join selectivity
In cases where joins use multiple columns we currently assess each join
separately causing gross mis-estimates for join cardinality.

This patch adds use of FK information for the first time into the
planner. When FKs are present and we have multi-column join information,
plan estimates will be drastically improved. Cases with multiple FKs
are handled, though partial matches are ignored currently.

Net effect is substantial performance improvements for joins in many
common cases. Additional planning time is isolated to cases that are
currently performing poorly, measured at 0.08 - 0.15 ms.

Please watch for planner performance regressions; circumstances seem
unlikely but the law of unintended consequences may apply somewhen.
Additional complex tests welcome to prove this before release.

Tests can be performed using SET enable_fkey_estimates = on | off
using scripts provided during Hackers discussions, message id:
552335D9.3090707@2ndquadrant.com

Authors: Tomas Vondra and David Rowley
Reviewed and tested by Simon Riggs, adding comments only
2016-04-08 02:51:09 +01:00
Fujii Masao
989be0810d Support multiple synchronous standby servers.
Previously synchronous replication offered only the ability to confirm
that all changes made by a transaction had been transferred to at most
one synchronous standby server.

This commit extends synchronous replication so that it supports multiple
synchronous standby servers. It enables users to consider one or more
standby servers as synchronous, and increase the level of transaction
durability by ensuring that transaction commits wait for replies from
all of those synchronous standbys.

Multiple synchronous standby servers are configured in
synchronous_standby_names which is extended to support new syntax of
'num_sync ( standby_name [ , ... ] )', where num_sync specifies
the number of synchronous standbys that transaction commits need to
wait for replies from and standby_name is the name of a standby
server.

The syntax of 'standby_name [ , ... ]' which was used in 9.5 or before
is also still supported. It's the same as new syntax with num_sync=1.

This commit doesn't include "quorum commit" feature which was discussed
in pgsql-hackers. Synchronous standbys are chosen based on their priorities.
synchronous_standby_names determines the priority of each standby for
being chosen as a synchronous standby. The standbys whose names appear
earlier in the list are given higher priority and will be considered as
synchronous. Other standby servers appearing later in this list
represent potential synchronous standbys.

The regression test for multiple synchronous standbys is not included
in this commit. It should come later.

Authors: Sawada Masahiko, Beena Emerson, Michael Paquier, Fujii Masao
Reviewed-By: Kyotaro Horiguchi, Amit Kapila, Robert Haas, Simon Riggs,
Amit Langote, Thomas Munro, Sameer Thakur, Suraj Kharage, Abhijit Menon-Sen,
Rajeev Rastogi

Many thanks to the various individuals who were involved in
discussing and developing this feature.
2016-04-06 17:18:25 +09:00
Tom Lane
99f3b5613b Disallow newlines in parameter values to be set in ALTER SYSTEM.
As noted by Julian Schauder in bug #14063, the configuration-file parser
doesn't support embedded newlines in string literals.  While there might
someday be a good reason to remove that restriction, there doesn't seem
to be one right now.  However, ALTER SYSTEM SET could accept strings
containing newlines, since many of the variable-specific value-checking
routines would just see a newline as whitespace.  This led to writing a
postgresql.auto.conf file that was broken and had to be removed manually.

Pending a reason to work harder, just throw an error if someone tries this.

In passing, fix several places in the ALTER SYSTEM logic that failed to
provide an errcode() for an ereport(), and thus would falsely log the
failure as an internal XX000 error.

Back-patch to 9.4 where ALTER SYSTEM was introduced.
2016-04-04 18:05:23 -04:00
Robert Haas
314cbfc5da Add new replication mode synchronous_commit = 'remote_apply'.
In this mode, the master waits for the transaction to be applied on
the remote side, not just written to disk.  That means that you can
count on a transaction started on the standby to see all commits
previously acknowledged by the master.

To make this work, the standby sends a reply after replaying each
commit record generated with synchronous_commit >= 'remote_apply'.
This introduces a small inefficiency: the extra replies will be sent
even by standbys that aren't the current synchronous standby.  But
previously-existing synchronous_commit levels make no attempt at all
to optimize which replies are sent based on what the primary cares
about, so this is no worse, and at least avoids any extra replies for
people not using the feature at all.

Thomas Munro, reviewed by Michael Paquier and by me.  Some additional
tweaks by me.
2016-03-29 21:29:49 -04:00
Robert Haas
ae507d9222 Make max_parallel_degree PGC_USERSET.
It was intended to be this way all along, just like other planner
GUCs such as work_mem.  But I goofed.
2016-03-21 10:54:36 -04:00
Peter Eisentraut
b555ed8102 Merge wal_level "archive" and "hot_standby" into new name "replica"
The distinction between "archive" and "hot_standby" existed only because
at the time "hot_standby" was added, there was some uncertainty about
stability.  This is now a long time ago.  We would like to move forward
with simplifying the replication configuration, but this distinction is
in the way, because a primary server cannot tell (without asking a
standby or predicting the future) which one of these would be the
appropriate level.

Pick a new name for the combined setting to make it clearer that it
covers all (non-logical) backup and replication uses.  The old values
are still accepted but are converted internally.

Reviewed-by: Michael Paquier <michael.paquier@gmail.com>
Reviewed-by: David Steele <david@pgmasters.net>
2016-03-18 23:56:03 +01:00
Robert Haas
2d8a1e22b1 Various minor corrections of and improvements to comments.
Aleksander Alekseev
2016-03-18 09:38:59 -04:00
Peter Eisentraut
fc201dfd95 Add syslog_split_messages parameter
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2016-03-16 23:21:44 -04:00
Peter Eisentraut
f4c454e9ba Add syslog_sequence_numbers parameter
Reviewed-by: Andreas Karlsson <andreas@proxel.se>
2016-03-16 23:21:44 -04:00
Robert Haas
c6dda1f48e Add idle_in_transaction_session_timeout.
Vik Fearing, reviewed by Stéphane Schildknecht and me, and revised
slightly by me.
2016-03-16 11:30:45 -04:00
Robert Haas
3aff33aa68 Fix typos.
Oskari Saarenmaa
2016-03-15 18:06:11 -04:00
Andres Freund
428b1d6b29 Allow to trigger kernel writeback after a configurable number of writes.
Currently writes to the main data files of postgres all go through the
OS page cache. This means that some operating systems can end up
collecting a large number of dirty buffers in their respective page
caches.  When these dirty buffers are flushed to storage rapidly, be it
because of fsync(), timeouts, or dirty ratios, latency for other reads
and writes can increase massively.  This is the primary reason for
regular massive stalls observed in real world scenarios and artificial
benchmarks; on rotating disks stalls on the order of hundreds of seconds
have been observed.

On linux it is possible to control this by reducing the global dirty
limits significantly, reducing the above problem. But global
configuration is rather problematic because it'll affect other
applications; also PostgreSQL itself doesn't always generally want this
behavior, e.g. for temporary files it's undesirable.

Several operating systems allow some control over the kernel page
cache. Linux has sync_file_range(2), several posix systems have msync(2)
and posix_fadvise(2). sync_file_range(2) is preferable because it
requires no special setup, whereas msync() requires the to-be-flushed
range to be mmap'ed. For the purpose of flushing dirty data
posix_fadvise(2) is the worst alternative, as flushing dirty data is
just a side-effect of POSIX_FADV_DONTNEED, which also removes the pages
from the page cache.  Thus the feature is enabled by default only on
linux, but can be enabled on all systems that have any of the above
APIs.

While desirable and likely possible this patch does not contain an
implementation for windows.

With the infrastructure added, writes made via checkpointer, bgwriter
and normal user backends can be flushed after a configurable number of
writes. Each of these sources of writes controlled by a separate GUC,
checkpointer_flush_after, bgwriter_flush_after and backend_flush_after
respectively; they're separate because the number of flushes that are
good are separate, and because the performance considerations of
controlled flushing for each of these are different.

A later patch will add checkpoint sorting - after that flushes from the
ckeckpoint will almost always be desirable. Bgwriter flushes are most of
the time going to be random, which are slow on lots of storage hardware.
Flushing in backends works well if the storage and bgwriter can keep up,
but if not it can have negative consequences.  This patch is likely to
have negative performance consequences without checkpoint sorting, but
unfortunately so has sorting without flush control.

Discussion: alpine.DEB.2.10.1506011320000.28433@sto
Author: Fabien Coelho and Andres Freund
2016-03-10 17:04:34 -08:00
Andres Freund
1d4a0ab19a Avoid unlikely data-loss scenarios due to rename() without fsync.
Renaming a file using rename(2) is not guaranteed to be durable in face
of crashes. Use the previously added durable_rename()/durable_link_or_rename()
in various places where we previously just renamed files.

Most of the changed call sites are arguably not critical, but it seems
better to err on the side of too much durability.  The most prominent
known case where the previously missing fsyncs could cause data loss is
crashes at the end of a checkpoint. After the actual checkpoint has been
performed, old WAL files are recycled. When they're filled, their
contents are fdatasynced, but we did not fsync the containing
directory. An OS/hardware crash in an unfortunate moment could then end
up leaving that file with its old name, but new content; WAL replay
would thus not replay it.

Reported-By: Tomas Vondra
Author: Michael Paquier, Tomas Vondra, Andres Freund
Discussion: 56583BDD.9060302@2ndquadrant.com
Backpatch: All supported branches
2016-03-09 18:53:53 -08:00
Joe Conway
dc7d70ea05 Expose control file data via SQL accessible functions.
Add four new SQL accessible functions: pg_control_system(),
pg_control_checkpoint(), pg_control_recovery(), and pg_control_init()
which expose a subset of the control file data.

Along the way move the code to read and validate the control file to
src/common, where it can be shared by the new backend functions
and the original pg_controldata frontend program.

Patch by me, significant input, testing, and review by Michael Paquier.
2016-03-05 11:10:19 -08:00
Joe Conway
a5c43b8869 Add new system view, pg_config
Move and refactor the underlying code for the pg_config client
application to src/common in support of sharing it with a new
system information SRF called pg_config() which makes the same
information available via SQL. Additionally wrap the SRF with a
new system view, as called pg_config.

Patch by me with extensive input and review by Michael Paquier
and additional review by Alvaro Herrera.
2016-02-17 09:12:06 -08:00
Andres Freund
7975c5e0a9 Allow the WAL writer to flush WAL at a reduced rate.
Commit 4de82f7d7 increased the WAL flush rate, mainly to increase the
likelihood that hint bits can be set quickly. More quickly set hint bits
can reduce contention around the clog et al.  But unfortunately the
increased flush rate can have a significant negative performance impact,
I have measured up to a factor of ~4.  The reason for this slowdown is
that if there are independent writes to the underlying devices, for
example because shared buffers is a lot smaller than the hot data set,
or because a checkpoint is ongoing, the fdatasync() calls force cache
flushes to be emitted to the storage.

This is achieved by flushing WAL only if the last flush was longer than
wal_writer_delay ago, or if more than wal_writer_flush_after (new GUC)
unflushed blocks are pending. Based on some tests the default for
wal_writer_delay is 1MB, which seems to work well both on SSD and
rotational media.

To avoid negative performance impact due to 4de82f7d7 an earlier
commit (db76b1e) made SetHintBits() more likely to succeed; preventing
performance regressions in the pgbench tests I performed.

Discussion: 20160118163908.GW10941@awork2.anarazel.de
2016-02-16 00:56:34 +01:00
Robert Haas
7c944bd903 Introduce a new GUC force_parallel_mode for testing purposes.
When force_parallel_mode = true, we enable the parallel mode restrictions
for all queries for which this is believed to be safe.  For the subset of
those queries believed to be safe to run entirely within a worker, we spin
up a worker and run the query there instead of running it in the
original process.  When force_parallel_mode = regress, make additional
changes to allow the regression tests to run cleanly even though parallel
workers have been injected under the hood.

Taken together, this facilitates both better user testing and better
regression testing of the parallelism code.

Robert Haas, with help from Amit Kapila and Rushabh Lathia.
2016-02-07 11:41:33 -05:00
Noah Misch
f4aa3a18a2 Force certain "pljava" custom GUCs to be PGC_SUSET.
Future PL/Java versions will close CVE-2016-0766 by making these GUCs
PGC_SUSET.  This PostgreSQL change independently mitigates that PL/Java
vulnerability, helping sites that update PostgreSQL more frequently than
PL/Java.  Back-patch to 9.1 (all supported versions).
2016-02-05 20:22:51 -05:00
Peter Eisentraut
f8003e07f9 Improve error message 2016-02-04 20:41:32 -05:00
Peter Eisentraut
ac7238dc0f Improve error reporting when location specified by postgres -D does not exist
Previously, the first error seen would be that postgresql.conf does not
exist.  But for the case where the whole directory does not exist, give
an error message about that, together with a hint for how to create one.
2016-02-02 21:03:19 -05:00
Tom Lane
5d35438273 Adjust behavior of row_security GUC to match the docs.
Some time back we agreed that row_security=off should not be a way to
bypass RLS entirely, but only a way to get an error if it was being
applied.  However, the code failed to act that way for table owners.
Per discussion, this is a must-fix bug for 9.5.0.

Adjust the logic in rls.c to behave as expected; also, modify the
error message to be more consistent with the new interpretation.
The regression tests need minor corrections as well.  Also update
the comments about row_security in ddl.sgml to be correct.  (The
official description of the GUC in config.sgml is already correct.)

I failed to resist the temptation to do some other very minor
cleanup as well, such as getting rid of a duplicate extern declaration.
2016-01-04 12:21:41 -05:00
Bruce Momjian
ee94300446 Update copyright for 2016
Backpatch certain files through 9.1
2016-01-02 13:33:40 -05:00
Peter Eisentraut
5db837d3f2 Message improvements 2015-11-16 21:39:23 -05:00
Bruce Momjian
e57646e962 Fix spelling error in postgresql.conf
Report by Greg Clough
2015-11-14 14:00:17 -05:00
Robert Haas
84ef9c596e Put back ssl_renegotiation_limit parameter, but only allow 0.
Per a report from Shay Rojansky, Npgsql sends ssl_renegotiation_limit=0
in the startup packet because it does not support renegotiation; other
clients which have not attempted to support renegotiation might well
behave similarly.  The recent removal of this parameter forces them to
break compatibility with either current PostgreSQL versions, or
previous ones.  Per discussion, the best solution is to accept the
parameter but only allow a value of 0.

Shay Rojansky, edited a little by me.
2015-10-20 09:56:04 -04:00
Stephen Frost
088c83363a ALTER TABLE .. FORCE ROW LEVEL SECURITY
To allow users to force RLS to always be applied, even for table owners,
add ALTER TABLE .. FORCE ROW LEVEL SECURITY.

row_security=off overrides FORCE ROW LEVEL SECURITY, to ensure pg_dump
output is complete (by default).

Also add SECURITY_NOFORCE_RLS context to avoid data corruption when
ALTER TABLE .. FORCE ROW SECURITY is being used. The
SECURITY_NOFORCE_RLS security context is used only during referential
integrity checks and is only considered in check_enable_rls() after we
have already checked that the current user is the owner of the relation
(which should always be the case during referential integrity checks).

Back-patch to 9.5 where RLS was added.
2015-10-04 21:05:08 -04:00
Peter Eisentraut
6390c8c654 Group cluster_name and update_process_title settings together 2015-10-04 12:29:36 -04:00
Noah Misch
3cb0a7e75a Make BYPASSRLS behave like superuser RLS bypass.
Specifically, make its effect independent from the row_security GUC, and
make it affect permission checks pertinent to views the BYPASSRLS role
owns.  The row_security GUC thereby ceases to change successful-query
behavior; it can only make a query fail with an error.  Back-patch to
9.5, where BYPASSRLS was introduced.
2015-10-03 20:19:57 -04:00
Robert Haas
3bd909b220 Add a Gather executor node.
A Gather executor node runs any number of copies of a plan in an equal
number of workers and merges all of the results into a single tuple
stream.  It can also run the plan itself, if the workers are
unavailable or haven't started up yet.  It is intended to work with
the Partial Seq Scan node which will be added in future commits.

It could also be used to implement parallel query of a different sort
by itself, without help from Partial Seq Scan, if the single_copy mode
is used.  In that mode, a worker executes the plan, and the parallel
leader does not, merely collecting the worker's results.  So, a Gather
node could be inserted into a plan to split the execution of that plan
across two processes.  Nested Gather nodes aren't currently supported,
but we might want to add support for that in the future.

There's nothing in the planner to actually generate Gather nodes yet,
so it's not quite time to break out the champagne.  But we're getting
close.

Amit Kapila.  Some designs suggestions were provided by me, and I also
reviewed the patch.  Single-copy mode, documentation, and other minor
changes also by me.
2015-09-30 19:23:36 -04:00
Andres Freund
020235a575 Lower *_freeze_max_age minimum values.
The old minimum values are rather large, making it time consuming to
test related behaviour. Additionally the current limits, especially for
multixacts, can be problematic in space-constrained systems. 10000000
multixacts can contain a lot of members.

Since there's no good reason for the current limits, lower them a good
bit. Setting them to 0 would be a bad idea, triggering endless vacuums,
so still retain a limit.

While at it fix autovacuum_multixact_freeze_max_age to refer to
multixact.c instead of varsup.c.

Reviewed-By: Robert Haas
Discussion: CA+TgmoYmQPHcrc3GSs7vwvrbTkbcGD9Gik=OztbDGGrovkkEzQ@mail.gmail.com
Backpatch: back to 9.0 (in parts)
2015-09-24 14:53:32 +02:00
Noah Misch
7f11724bd6 Remove the SECURITY_ROW_LEVEL_DISABLED security context bit.
This commit's parent made superfluous the bit's sole usage.  Referential
integrity checks have long run as the subject table's owner, and that
now implies RLS bypass.  Safe use of the bit was tricky, requiring
strict control over the SQL expressions evaluating therein.  Back-patch
to 9.5, where the bit was introduced.

Based on a patch by Stephen Frost.
2015-09-20 20:47:17 -04:00
Noah Misch
537bd178c7 Remove the row_security=force GUC value.
Every query of a single ENABLE ROW SECURITY table has two meanings, with
the row_security GUC selecting between them.  With row_security=force
available, every function author would have been advised to either set
the GUC locally or test both meanings.  Non-compliance would have
threatened reliability and, for SECURITY DEFINER functions, security.
Authors already face an obligation to account for search_path, and we
should not mimic that example.  With this change, only BYPASSRLS roles
need exercise the aforementioned care.  Back-patch to 9.5, where the
row_security GUC was introduced.

Since this narrows the domain of pg_db_role_setting.setconfig and
pg_proc.proconfig, one might bump catversion.  A row_security=force
setting in one of those columns will elicit a clear message, so don't.
2015-09-20 20:45:41 -04:00
Fujii Masao
043113e798 Add gin_fuzzy_search_limit to postgresql.conf.sample.
This was forgotten in 8a3631f (commit that originally added the parameter)
and 0ca9907 (commit that added the documentation later that year).

Back-patch to all supported versions.
2015-09-09 02:25:50 +09:00
Alvaro Herrera
1aba62ec63 Allow per-tablespace effective_io_concurrency
Per discussion, nowadays it is possible to have tablespaces that have
wildly different I/O characteristics from others.  Setting different
effective_io_concurrency parameters for those has been measured to
improve performance.

Author: Julien Rouhaud
Reviewed by: Andres Freund
2015-09-08 12:51:42 -03:00
Jeff Davis
f828654e10 Add log_line_prefix option 'n' for Unix epoch.
Prints time as Unix epoch with milliseconds.

Tomas Vondra, reviewed by Fabien Coelho.
2015-09-07 13:46:31 -07:00
Peter Eisentraut
b386271594 Improve whitespace 2015-08-22 21:54:35 -04:00
Andres Freund
6c772c7453 Don't use 'bool' as a struct member name in help_config.c.
Doing so doesn't work if bool is a macro rather than a typedef.

Although c.h spends some effort to support configurations where bool is
a preexisting macro, help_config.c has existed this way since
2003 (b700a6), and there have not been any reports of
problems. Backpatch anyway since this is as riskless as it gets.

Discussion: 20150812084351.GD8470@awork2.anarazel.de
Backpatch: 9.0-master
2015-08-15 16:32:38 +02:00
Robert Haas
369342cf70 Cap wal_buffers to avoid a server crash when it's set very large.
It must be possible to multiply wal_buffers by XLOG_BLCKSZ without
overflowing int, or calculations in StartupXLOG will go badly wrong
and crash the server.  Avoid that by imposing a maximum value on
wal_buffers.  This will be just under 2GB, assuming the usual value
for XLOG_BLCKSZ.

Josh Berkus, per an analysis by Andrew Gierth.
2015-08-04 12:58:54 -04:00
Joe Conway
7b4bfc87d5 Plug RLS related information leak in pg_stats view.
The pg_stats view is supposed to be restricted to only show rows
about tables the user can read. However, it sometimes can leak
information which could not otherwise be seen when row level security
is enabled. Fix that by not showing pg_stats rows to users that would
be subject to RLS on the table the row is related to. This is done
by creating/using the newly introduced SQL visible function,
row_security_active().

Along the way, clean up three call sites of check_enable_rls(). The second
argument of that function should only be specified as other than
InvalidOid when we are checking as a different user than the current one,
as in when querying through a view. These sites were passing GetUserId()
instead of InvalidOid, which can cause the function to return incorrect
results if the current user has the BYPASSRLS privilege and row_security
has been set to OFF.

Additionally fix a bug causing RI Trigger error messages to unintentionally
leak information when RLS is enabled, and other minor cleanup and
improvements. Also add WITH (security_barrier) to the definition of pg_stats.

Bumped CATVERSION due to new SQL functions and pg_stats view definition.

Back-patch to 9.5 where RLS was introduced. Reported by Yaroslav.
Patch by Joe Conway and Dean Rasheed with review and input by
Michael Paquier and Stephen Frost.
2015-07-28 13:21:22 -07:00
Andres Freund
426746b930 Remove ssl renegotiation support.
While postgres' use of SSL renegotiation is a good idea in theory, it
turned out to not work well in practice. The specification and openssl's
implementation of it have lead to several security issues. Postgres' use
of renegotiation also had its share of bugs.

Additionally OpenSSL has a bunch of bugs around renegotiation, reported
and open for years, that regularly lead to connections breaking with
obscure error messages. We tried increasingly complex workarounds to get
around these bugs, but we didn't find anything complete.

Since these connection breakages often lead to hard to debug problems,
e.g. spuriously failing base backups and significant latency spikes when
synchronous replication is used, we have decided to change the default
setting for ssl renegotiation to 0 (disabled) in the released
backbranches and remove it entirely in 9.5 and master.

Author: Andres Freund
Discussion: 20150624144148.GQ4797@alap3.anarazel.de
Backpatch: 9.5 and master, 9.0-9.4 get a different patch
2015-07-28 22:06:31 +02:00
Tom Lane
dd7a8f66ed Redesign tablesample method API, and do extensive code review.
The original implementation of TABLESAMPLE modeled the tablesample method
API on index access methods, which wasn't a good choice because, without
specialized DDL commands, there's no way to build an extension that can
implement a TSM.  (Raw inserts into system catalogs are not an acceptable
thing to do, because we can't undo them during DROP EXTENSION, nor will
pg_upgrade behave sanely.)  Instead adopt an API more like procedural
language handlers or foreign data wrappers, wherein the only SQL-level
support object needed is a single handler function identified by having
a special return type.  This lets us get rid of the supporting catalog
altogether, so that no custom DDL support is needed for the feature.

Adjust the API so that it can support non-constant tablesample arguments
(the original coding assumed we could evaluate the argument expressions at
ExecInitSampleScan time, which is undesirable even if it weren't outright
unsafe), and discourage sampling methods from looking at invisible tuples.
Make sure that the BERNOULLI and SYSTEM methods are genuinely repeatable
within and across queries, as required by the SQL standard, and deal more
honestly with methods that can't support that requirement.

Make a full code-review pass over the tablesample additions, and fix
assorted bugs, omissions, infelicities, and cosmetic issues (such as
failure to put the added code stanzas in a consistent ordering).
Improve EXPLAIN's output of tablesample plans, too.

Back-patch to 9.5 so that we don't have to support the original API
in production.
2015-07-25 14:39:00 -04:00