This can cause valgrind to complain, as the flag marking a buffer as a
temporary copy was not getting initialized.
While on it, fill in with zeros newly-created buffer pages. This does
not matter when loading a block from a temporary file, but it makes the
push of an index tuple into a new buffer page safer.
This has been introduced by 1d27dcf, so backpatch all the way down to
9.4.
Author: Alexander Lakhin
Discussion: https://postgr.es/m/15899-0d24fb273b3dd90c@postgresql.org
Backpatch-through: 9.4
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With
pluggable storage, we can't make any assumptions about what the table AM
might want to do in its equivalent function, so we'd better ensure we
always call table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
This also changes things so that we no longer needlessly call
table_finish_bulk_insert when performing a COPY FROM for a non-partitioned
table when not using multi-inserts.
Reported-by: Robert Haas
Backpatch-through: 12
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
Otherwise the regressplans.sh tests generate extremely slow nested
loop joins. Back-patch to 11 where the hash join tests came in.
Reported-by: Michael Paquier
Discussion: https://postgr.es/m/20190708055256.GB2709%40paquier.xyz
This code failed to account for the possibility that malloc() would
change errno, resulting in wrong output for %m, not to mention the
possibility of message truncation. Such a change is obviously
expected when malloc fails, but there's reason to fear that on some
platforms even a successful malloc call can modify errno.
Discussion: https://postgr.es/m/2576.1527382833@sss.pgh.pa.us
In normal interactive mode, psql's log messages accidentally got a
"psql:" prefix that was not supposed to be there. This only happened
if there was no .psqlrc file being read, so it wasn't discovered for a
while. Fix this by adding the appropriate logging format
configuration call in the right code path.
Discussion: https://www.postgresql.org/message-id/7586.1560540361@sss.pgh.pa.us
The itemlen variable used to be referenced in multiple places, but since
reworking the serialization code it's used only in one assert. Fixed by
removing the variable and calling the macro from the assert directly.
Backpatch to 12, where this code was introduced.
Reported-by: Jeff Janes
Discussion: https://postgr.es/m/CAMkU=1zc_ovH9NZd_9ovuiEWkF9yX06URUDdXCmgDydf-bqB5A@mail.gmail.com
The serialization format of multivariate MCV lists included alignment in
order to allow direct access to part of the serialized data, but despite
multiple fixes (see for example commits d85e0f366a and ea4e1c0e8f) this
proved to be problematic.
This commit abandons alignment in the serialized format, and just copies
everything during deserialization. We now also track amount of memory
needed after deserialization (including alignment), which allows us to
deserialize the MCV list in a single pass.
Bump catversion, as this affects contents of pg_statistic_ext_data.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra
Reviewed-by: Tom Lane
Discussion: https://postgr.es/m/2201.1561521148@sss.pgh.pa.us
The function pg_mcv_list_items() returns values stored in MCV items. The
items may contain columns with different data types, so the function was
generating text array-like representation, but in an ad-hoc way without
properly escaping various characters etc.
Fixed by simply building a text[] array, which also makes it easier to
use from queries etc.
Requires changes to pg_proc entry, so bump catversion.
Backpatch to 12, where multi-column MCV lists were introduced.
Author: Tomas Vondra
Reviewed-by: Dean Rasheed
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
When building multi-column MCV lists, we compute base frequency for each
item, i.e. a product of per-column frequencies for values from the item.
As a value may be in multiple groups, the code was scanning the whole
array of groups while adding items to the MCV list. This works fine as
long as the number of distinct groups is small, but it's easy to trigger
trigger O(N^2) behavior, especially after increasing statistics target.
This commit precomputes frequencies for values in all columns, so that
when computing the base frequency it's enough to make a simple bsearch
lookup in the array.
Backpatch to 12, where multi-column MCV lists were introduced.
Discussion: https://postgr.es/m/20190618205920.qtlzcu73whfpfqne@development
A function that is declared to return a named composite type must
return tuple datums that are physically marked as having that type.
The plpgsql code path that allowed directly returning an expanded-record
datum forgot to check that, so that an expanded record marked as type
RECORDOID could be returned if it had a physically-compatible tupdesc.
This'd be harmless, I think, if the record value never escaped the
current session --- but it's possible for it to get stored into a table,
and then subsequent sessions can't interpret the anonymous record type.
Fix by flattening the record into a tuple datum and overwriting its
type/typmod fields, if its declared type doesn't match the function's
declared type. (In principle it might be possible to just change the
expanded record's stored type ID info, but there are enough tricky
consequences that I didn't want to mess with that, especially not in
a back-patched bug fix.)
Per bug report from Steve Rogerson. Back-patch to v11 where the bug
was introduced.
Discussion: https://postgr.es/m/cbaecae6-7b87-584e-45f6-4d047b92ca2a@yewtc.demon.co.uk
d4c3a156c added code to remove columns that were not part of a table's
PRIMARY KEY constraint from the GROUP BY clause when all the primary key
columns were present in the group by. This is fine to do since we know
that there will only be one row per group coming from this relation.
However, the logic failed to consider inheritance parent relations. These
can have child relations without a primary key, but even if they did, they
could duplicate one of the parent's rows or one from another child
relation. In this case, those additional GROUP BY columns are required.
Fix this by disabling the optimization for inheritance parent tables.
In v11 and beyond, partitioned tables are fine since partitions cannot
overlap and before v11 partitioned tables could not have a primary key.
Reported-by: Manuel Rigger
Discussion: http://postgr.es/m/CA+u7OA7VLKf_vEr6kLF3MnWSA9LToJYncgpNX2tQ-oWzYCBQAw@mail.gmail.com
Backpatch-through: 9.6
This fixes at the same time a set of inconsistencies in the
documentation and the scripts related to the versions of Windows SDK
supported.
Author: Haribabu Kommi
Reviewed-by: Andrew Dunstan, Juan José Santamaría Flecha, Michael
Paquier
Discussion: https://postgr.es/m/CAJrrPGcfqXhfPyMrny9apoDU7M1t59dzVAvoJ9AeAh5BJi+UzA@mail.gmail.com
Backpatch-through: 9.4
Commit 4f3b38fe2 supposed that complete_from_const() is equivalent to
the one-element-list case of complete_from_list(), but that's not
really true at all. complete_from_const() supposes that the completion
is certain enough to justify wiping out whatever the user typed, while
complete_from_list() will only provide completions that match the
word-so-far.
In practice, given the lame parsing technology used by tab-complete.c,
it's fairly hard to believe that we're *ever* certain enough about
a completion to justify auto-correcting user input that doesn't match.
Hence, remove the inappropriate unification of the two cases.
As things now stand, complete_from_const() is used only for the
situation where we have no matches and we need to keep readline
from applying its default complete-with-file-names behavior.
This (mis?) behavior actually exists much further back, but
I'm hesitant to change it in released branches. It's not too
late for v12, though, especially seeing that the aforesaid
commit is new in v12.
Per gripe from Ken Tanzer.
Discussion: https://postgr.es/m/CAD3a31XpXzrZA9TT3BqLSHghdTK+=cXjNCE+oL2Zn4+oWoc=qA@mail.gmail.com
Don't think that the context "UPDATE tab SET var =" is a GUC-setting
command.
If we have "SET var =" but the "var" is not a known GUC variable,
don't offer any completions. The most likely explanation is that
we've misparsed the context and it's not really a GUC-setting command.
Per gripe from Ken Tanzer. Back-patch to 9.6. The issue exists
further back, but before 9.6 the code looks very different and it
doesn't actually know whether the "var" name matches anything,
so I desisted from trying to fix it.
Discussion: https://postgr.es/m/CAD3a31XpXzrZA9TT3BqLSHghdTK+=cXjNCE+oL2Zn4+oWoc=qA@mail.gmail.com
This reverts commit f03a9ca4366d064d89b7cf7ed75d4e43f2ed0667,
in the v12 branch only. We don't want to ship v12 with that,
since it causes occasional test failures (as a result of statistics
transmission not being entirely reliable).
I'll leave it in HEAD though, in hopes that we'll eventually
capture an instance of the original problematic behavior.
4de60244e added the call to table_finish_bulk_insert to the
CopyMultiInsertBufferCleanup function. We use a CopyMultiInsertBuffer even
for non-partitioned tables, so having the cleanup do that meant we would
call table_finsh_bulk_insert twice when performing COPY FROM with
a non-partitioned table.
Here we can just remove the direct call in CopyFrom and let
CopyMultiInsertBufferCleanup handle the call instead.
86b85044e abstracted calls to heap functions in COPY FROM to support a
generic table AM. However, when performing a copy into a partitioned
table, this commit neglected to call table_finish_bulk_insert for each
partition. Before 86b85044e, when we always called the heap functions,
there was no need to call heapam_finish_bulk_insert for partitions since
it only did any work when performing a copy without WAL. For partitioned
tables, this was unsupported anyway, so there was no issue. With pluggable
storage, we can't make any assumptions about what the table AM might want
to do in its equivalent function, so we'd better ensure we always call
table_finish_bulk_insert each partition that's received a row.
For now, we make the table_finish_bulk_insert call whenever we evict a
CopyMultiInsertBuffer out of the CopyMultiInsertInfo. This does mean
that it's possible that we call table_finish_bulk_insert multiple times
per partition, which is not a problem other than being an inefficiency.
Improving this requires a more invasive patch, so let's leave that for
another day.
In passing, move the table_finish_bulk_insert for the target of the COPY
command so that it's only called when we're actually performing bulk
inserts. We don't need to call this when inserting 1 row at a time.
Reported-by: Robert Haas
Discussion: https://postgr.es/m/CA+TgmoYK=6BpxiJ0tN-p9wtH0BTAfbdxzHhwou0mdud4+BkYuQ@mail.gmail.com
UBSan complains about this. Instead, cast to a suitable type requiring
only 4-byte alignment. DatumGetAnyArrayP() already assumes one can cast
between AnyArrayType and ArrayType, so this doesn't introduce a new
assumption. Back-patch to 9.5, where AnyArrayType was introduced.
Reviewed by Tom Lane.
Discussion: https://postgr.es/m/20190629210334.GA1244217@rfd.leadboat.com
The logic in reorder_grouping_sets to order grouping set elements to
match a pre-specified sort ordering was defective, resulting in
unnecessary sort nodes (though the query output would still be
correct). Repair, simplifying the code a little, and add a test.
Per report from Richard Guo, though I didn't use their patch. Original
bug seems to have been my fault.
Backpatch back to 9.5 where grouping sets were introduced.
Discussion: https://postgr.es/m/CAN_9JTzyjGcUjiBHxLsgqfk7PkdLGXiM=pwM+=ph2LsWw0WO1A@mail.gmail.com
Up to now, pg_regress --config-auth had a hard-wired assumption
that the target cluster uses the default bootstrap superuser name.
pg_dump's 010_dump_connstr.pl TAP test uses non-default superuser
names, and was klugily getting around the restriction by listing
the desired superuser name as a role to "create". This is pretty
confusing (or at least, it confused me). Let's make it clearer by
allowing --config-auth mode to be told the bootstrap superuser name.
Repurpose the existing --user switch for that, since it has no
other function in --config-auth mode.
Per buildfarm. I don't have an environment at hand in which I can
test this fix, but the buildfarm should soon show if it works.
Discussion: https://postgr.es/m/3142.1561840611@sss.pgh.pa.us
This test script is unsafe to run in "make installcheck" mode for
(at least) two reasons: it creates and destroys some role names
that don't follow the "regress_xxx" naming convention, and it
sets and then resets the application_name GUC attached to every
existing role. While we've not had complaints, these surely are
not good things to do within a production installation, and
regress.sgml pretty clearly implies that we won't do them.
Rather than lose test coverage altogether, let's just move this
script somewhere where it will get run by "make check" but not
"make installcheck". src/test/modules/ already has that property.
Since it seems likely that we'll want other regression tests in
future that also exceed the constraints of "make installcheck",
create a generically-named src/test/modules/unsafe_tests/
directory to hold them.
Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us
Instead of calling pg_lsn_in() in check_recovery_target_lsn and
timestamptz_in() in check_recovery_target_time, reorganize the
respective code so that we don't raise any errors in the check hooks.
The previous code tried to use PG_TRY/PG_CATCH to handle errors in a
way that is not safe, so now the code contains no ereport() calls and
can operate safely within the GUC error handling system.
Moreover, since the interpretation of the recovery_target_time string
may depend on the time zone, we cannot do the final processing of that
string until all the GUC processing is done. Instead,
check_recovery_target_time() now does some parsing for syntax
checking, but the actual conversion to a timestamptz value is done
later in the recovery code that uses it.
Reported-by: Andres Freund <andres@anarazel.de>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://www.postgresql.org/message-id/flat/20190611061115.njjwkagvxp4qujhp%40alap3.anarazel.de
The date/time values 'current', 'invalid', and 'undefined' were
removed a long time ago, but the code still contains explicit error
handling for the transition. To simplify the code and avoid having to
handle these values everywhere, just remove the recognition of these
tokens altogether now.
Reviewed-by: Michael Paquier <michael@paquier.xyz>
In commit 18555b132 we tentatively established a rule that regression
tests should use names containing "regression" for databases, and names
starting with "regress_" for all other globally-visible object names, so
as to circumscribe the side-effects that "make installcheck" could have
on an existing installation.
This commit adds a simple enforcement mechanism for that rule: if the code
is compiled with ENFORCE_REGRESSION_TEST_NAME_RESTRICTIONS defined, it
will emit a warning (not an error) whenever a database, role, tablespace,
subscription, or replication origin name is created that doesn't obey the
rule. Running one or more buildfarm members with that symbol defined
should be enough to catch new violations, at least in the regular
regression tests. Most TAP tests wouldn't notice such warnings, but
that's actually fine because TAP tests don't execute against an existing
server anyway.
Since it's already the case that running src/test/modules/ tests in
installcheck mode is deprecated, we can use that as a home for tests
that seem unsafe to run against an existing server, such as tests that
might have side-effects on existing roles. Document that (though this
commit doesn't in itself make it any less safe than before).
Update regress.sgml to define these restrictions more clearly, and
to clean up assorted lack-of-up-to-date-ness in its descriptions of
the available regression tests.
Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us
In commit 18555b132 we tentatively established a rule that regression
tests should use names containing "regression" for databases, and names
starting with "regress_" for all other globally-visible object names, so
as to circumscribe the side-effects that "make installcheck" could have on
an existing installation. However, no enforcement mechanism was created,
so it's unsurprising that some new violations have crept in since then.
In fact, a whole new *category* of violations has crept in, to wit we now
also have globally-visible subscription and replication origin names, and
"make installcheck" could very easily clobber user-created objects of
those types. So it's past time to do something about this.
This commit sanitizes the tests enough that they will pass (i.e. not
generate any visible warnings) with the enforcement mechanism I'll add
in the next commit. There are some TAP tests that still trigger the
warnings, but the warnings do not cause test failure. Since these tests
do not actually run against a pre-existing installation, there's no need
to worry whether they could conflict with user-created objects.
The problem with rolenames.sql testing special role names like "user"
is still there, and is dealt with only very cosmetically in this patch
(by hiding the warnings :-(). What we actually need to do to be safe is
to take that test script out of "make installcheck" altogether, but that
seems like material for a separate patch.
Discussion: https://postgr.es/m/16638.1468620817@sss.pgh.pa.us
Since we generate such names internally, it seems like a good idea
to have a policy of disallowing them for user use, as we do for many
other object types. Otherwise attempts to use them will randomly
fail due to collisions with internally-generated names.
Discussion: https://postgr.es/m/3606.1561747369@sss.pgh.pa.us