1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-19 13:42:17 +03:00
Commit Graph

5114 Commits

Author SHA1 Message Date
Michael Paquier
df9133fa63 Move SQL-callable code related to multixacts into its own file
A patch is under discussion to add more SQL capabilities related to
multixacts, and this move avoids bloating the file more than necessary.
This affects pg_get_multixact_members().  A side effect of this move is
the requirement to add mxstatus_to_string() to multixact.h.

Extracted from a larger patch by the same author, tweaked by me.

Author: Naga Appani <nagnrik@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Reviewed-by: Ashutosh Bapat <ashutosh.bapat.oss@gmail.com>
Discussion: https://postgr.es/m/CA+QeY+AAsYK6WvBW4qYzHz4bahHycDAY_q5ECmHkEV_eB9ckzg@mail.gmail.com
2025-08-18 14:57:55 +09:00
Masahiko Sawada
37265ca01f Fix constant when extracting timestamp from UUIDv7.
When extracting a timestamp from a UUIDv7, a conversion from
milliseconds to microseconds was using the incorrect constant
NS_PER_US instead of US_PER_MS. Although both constants have the same
value, this fix improves code clarity by using the semantically
correct constant.

Backpatch to v18, where UUIDv7 was introduced.

Author: Erik Nordström <erik@tigerdata.com>
Reviewed-by: Andrey Borodin <x4mmm@yandex-team.ru>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Discussion: https://postgr.es/m/CACAa4V+i07eaP6h4MHNydZeX47kkLPwAg0sqe67R=M5tLdxNuQ@mail.gmail.com
Backpatch-through: 18
2025-08-15 11:58:53 -07:00
Tom Lane
ee54046601 Grab the low-hanging fruit from forcing USE_FLOAT8_BYVAL to true.
Remove conditionally-compiled code for the other case.

Replace uses of FLOAT8PASSBYVAL with constant "true", mainly because
it was quite confusing in cases where the type we were dealing with
wasn't float8.

I left the associated pg_control and Pg_magic_struct fields in place.
Perhaps we should get rid of them, but it would save little, so it
doesn't seem worth thinking hard about the compatibility implications.
I just labeled them "vestigial" in places where that seemed helpful.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us
2025-08-13 17:18:22 -04:00
Tom Lane
6aebedc384 Grab the low-hanging fruit from forcing sizeof(Datum) to 8.
Remove conditionally-compiled code for smaller Datum widths,
and simplify comments that describe cases no longer of interest.

I also fixed up a few more places that were not using
DatumGetIntXX where they should, and made some cosmetic
adjustments such as using sizeof(int64) not sizeof(Datum)
in places where that fit better with the surrounding code.

One thing I remembered while preparing this part is that SP-GiST
stores pass-by-value prefix keys as Datums, so that the on-disk
representation depends on sizeof(Datum).  That's even more
unfortunate than the existing commentary makes it out to be,
because now there is a hazard that the change of sizeof(Datum)
will break SP-GiST indexes on 32-bit machines.  It appears that
there are no existing SP-GiST opclasses that are actually
affected; and if there are some that I didn't find, the number
of installations that are using them on 32-bit machines is
doubtless tiny.  So I'm proceeding on the assumption that we
can get away with this, but it's something to worry about.

(gininsert.c looks like it has a similar problem, but it's okay
because the "tuples" it's constructing are just transient data
within the tuplesort step.  That's pretty poorly documented
though, so I added some comments.)

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us
2025-08-13 17:18:22 -04:00
Dean Rasheed
22424953cd Fix security checks in selectivity estimation functions.
Commit e2d4ef8de8 (the fix for CVE-2017-7484) added security checks
to the selectivity estimation functions to prevent them from running
user-supplied operators on data obtained from pg_statistic if the user
lacks privileges to select from the underlying table. In cases
involving inheritance/partitioning, those checks were originally
performed against the child RTE (which for plain inheritance might
actually refer to the parent table). Commit 553d2ec271 then extended
that to also check the parent RTE, allowing access if the user had
permissions on either the parent or the child. It turns out, however,
that doing any checks using the child RTE is incorrect, since
securityQuals is set to NULL when creating an RTE for an inheritance
child (whether it refers to the parent table or the child table), and
therefore such checks do not correctly account for any RLS policies or
security barrier views. Therefore, do the security checks using only
the parent RTE. This is consistent with how RLS policies are applied,
and the executor's ACL checks, both of which use only the parent
table's permissions/policies. Similar checks are performed in the
extended stats code, so update that in the same way, centralizing all
the checks in a new function.

In addition, note that these checks by themselves are insufficient to
ensure that the user has access to the table's data because, in a
query that goes via a view, they only check that the view owner has
permissions on the underlying table, not that the current user has
permissions on the view itself. In the selectivity estimation
functions, there is no easy way to navigate from underlying tables to
views, so add permissions checks for all views mentioned in the query
to the planner startup code. If the user lacks permissions on a view,
a permissions error will now be reported at planner-startup, and the
selectivity estimation functions will not be run.

Checking view permissions at planner-startup in this way is a little
ugly, since the same checks will be repeated at executor-startup.
Longer-term, it might be better to move all the permissions checks
from the executor to the planner so that permissions errors can be
reported sooner, instead of creating a plan that won't ever be run.
However, such a change seems too far-reaching to be back-patched.

Back-patch to all supported versions. In v13, there is the added
complication that UPDATEs and DELETEs on inherited target tables are
planned using inheritance_planner(), which plans each inheritance
child table separately, so that the selectivity estimation functions
do not know that they are dealing with a child table accessed via its
parent. Handle that by checking access permissions on the top parent
table at planner-startup, in the same way as we do for views. Any
securityQuals on the top parent table are moved down to the child
tables by inheritance_planner(), so they continue to be checked by the
selectivity estimation functions.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Noah Misch <noah@leadboat.com>
Backpatch-through: 13
Security: CVE-2025-8713
2025-08-11 09:03:11 +01:00
Tom Lane
665c3dbba4 Mop-up for Datum conversion cleanups.
Fix a couple more places where an explicit Datum conversion
is needed (not clear how we missed these in ff89e182d and
previous commits).

Replace the minority usage "(Datum) NULL" with "(Datum) 0".
The former depends on the assumption that Datum is the same
width as Pointer, the latter doesn't.  Anyway consistency
is a good thing.

This is, I believe, the last of the notational mop-up needed
before we can consider changing Datum to uint64 everywhere.
It's also important cleanup for more aggressive ideas such
as making Datum a struct.

Discussion: https://postgr.es/m/1749799.1752797397@sss.pgh.pa.us
Discussion: https://postgr.es/m/8246d7ff-f4b7-4363-913e-827dadfeb145@eisentraut.org
2025-08-08 18:44:57 -04:00
Peter Eisentraut
ff89e182d4 Add missing Datum conversions
Add various missing conversions from and to Datum.  The previous code
mostly relied on implicit conversions or its own explicit casts
instead of using the correct DatumGet*() or *GetDatum() functions.

We think these omissions are harmless.  Some actual bugs that were
discovered during this process have been committed
separately (80c758a2e1, fd2ab03fea).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
2025-08-08 22:06:57 +02:00
Peter Eisentraut
dcfc0f8912 Remove useless/superfluous Datum conversions
Remove useless DatumGetFoo() and FooGetDatum() calls.  These are
places where no conversion from or to Datum was actually happening.

We think these extra calls covered here were harmless.  Some actual
bugs that were discovered during this process have been committed
separately (80c758a2e1, 2242b26ce4).

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
2025-08-08 22:06:57 +02:00
Dean Rasheed
d699687b32 Extend int128.h to support more numeric code.
This adds a few more functions to int128.h, allowing more of numeric.c
to use 128-bit integers on all platforms.

Specifically, int64_div_fast_to_numeric() and the following aggregate
functions can now use 128-bit integers for improved performance on all
platforms, rather than just platforms with native support for int128:

- SUM(int8)
- AVG(int8)
- STDDEV_POP(int2 or int4)
- STDDEV_SAMP(int2 or int4)
- VAR_POP(int2 or int4)
- VAR_SAMP(int2 or int4)

In addition to improved performance on platforms lacking native
128-bit integer support, this significantly simplifies this numeric
code by allowing a lot of conditionally compiled code to be deleted.

A couple of numeric functions (div_var_int64() and sqrt_var()) still
contain conditionally compiled 128-bit integer code that only works on
platforms with native 128-bit integer support. Making those work more
generally would require rolling our own higher precision 128-bit
division, which isn't supported for now.

Author: Dean Rasheed <dean.a.rasheed@gmail.com>
Reviewed-by: John Naylor <johncnaylorls@gmail.com>
Discussion: https://postgr.es/m/CAEZATCWgBMc9ZwKMYqQpaQz2X6gaamYRB+RnMsUNcdMcL2Mj_w@mail.gmail.com
2025-08-07 15:49:24 +01:00
Michael Paquier
2242b26ce4 Fix incorrect Datum conversion in timestamptz_trunc_internal()
The code used a PG_RETURN_TIMESTAMPTZ() where the return type is
TimestampTz and not a Datum.

On 64-bit systems, there is no effect since this just ends up casting
64-bit integers back and forth.  On 32-bit systems, timestamptz is
pass-by-reference.  PG_RETURN_TIMESTAMPTZ() allocates new memory and
returns the address, meaning that the caller could interpret this as a
timestamp value.

The effect is using "date_trunc(..., 'infinity'::timestamptz) will
return random values (instead of the correct return value 'infinity').

Bug introduced in commit d85ce012f9.

Author: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/2d320b6f-b4af-4fbc-9eec-5d0fa15d187b@eisentraut.org
Discussion: https://postgr.es/m/4bf60a84-2862-4a53-acd5-8eddf134a60e@eisentraut.org
Backpatch-through: 18
2025-08-07 11:02:04 +09:00
Peter Eisentraut
0f5ade7a36 Fix varatt versus Datum type confusions
Macros like VARDATA() and VARSIZE() should be thought of as taking
values of type pointer to struct varlena or some other related struct.
The way they are implemented, you can pass anything to it and it will
cast it right.  But this is in principle incorrect.  To fix, add the
required DatumGetPointer() calls.  Or in a couple of cases, remove
superfluous PointerGetDatum() calls.

It is planned in a subsequent patch to change macros like VARDATA()
and VARSIZE() to inline functions, which will enforce stricter typing.
This is in preparation for that.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/928ea48f-77c6-417b-897c-621ef16685a6%40eisentraut.org
2025-08-05 12:11:36 +02:00
Peter Eisentraut
2ad6e80de9 Fix various hash function uses
These instances were using Datum-returning functions where a
lower-level function returning uint32 would be more appropriate.

Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://www.postgresql.org/message-id/flat/8246d7ff-f4b7-4363-913e-827dadfeb145%40eisentraut.org
2025-08-05 11:47:23 +02:00
Amit Kapila
fd5a1a0c3e Detect and report update_deleted conflicts.
This enhancement builds upon the infrastructure introduced in commit
228c370868, which enables the preservation of deleted tuples and their
origin information on the subscriber. This capability is crucial for
handling concurrent transactions replicated from remote nodes.

The update introduces support for detecting update_deleted conflicts
during the application of update operations on the subscriber. When an
update operation fails to locate the target row-typically because it has
been concurrently deleted-we perform an additional table scan. This scan
uses the SnapshotAny mechanism and we do this additional scan only when
the retain_dead_tuples option is enabled for the relevant subscription.

The goal of this scan is to locate the most recently deleted tuple-matching
the old column values from the remote update-that has not yet been removed
by VACUUM and is still visible according to our slot (i.e., its deletion
is not older than conflict-detection-slot's xmin). If such a tuple is
found, the system reports an update_deleted conflict, including the origin
and transaction details responsible for the deletion.

This provides a groundwork for more robust and accurate conflict
resolution process, preventing unexpected behavior by correctly
identifying cases where a remote update clashes with a deletion from
another origin.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Hayato Kuroda <kuroda.hayato@fujitsu.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-08-04 04:02:47 +00:00
Peter Eisentraut
c3019bb778 Update comment
The code being referred to was moved to a different function in commit
eb8312a22a, so update the comment accordingly.
2025-07-29 18:57:14 +02:00
Tom Lane
902f922218 Remove unnecessary complication around xmlParseBalancedChunkMemory.
When I prepared 71c0921b6 et al yesterday, I was thinking that the
logic involving explicitly freeing the node_list output was still
needed to dodge leakage bugs in libxml2.  But I was misremembering:
we introduced that only because with early 2.13.x releases we could
not trust xmlParseBalancedChunkMemory's result code, so we had to
look to see if a node list was returned or not.  There's no reason
to believe that xmlParseBalancedChunkMemory will fail to clean up
the node list when required, so simplify.  (This essentially
completes reverting all the non-cosmetic changes in 6082b3d5d.)

Reported-by: Jim Jones <jim.jones@uni-muenster.de>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/997668.1753802857@sss.pgh.pa.us
Backpatch-through: 13
2025-07-29 12:47:38 -04:00
Tom Lane
71c0921b64 Avoid regression in the size of XML input that we will accept.
This mostly reverts commit 6082b3d5d, "Use xmlParseInNodeContext
not xmlParseBalancedChunkMemory".  It turns out that
xmlParseInNodeContext will reject text chunks exceeding 10MB, while
(in most libxml2 versions) xmlParseBalancedChunkMemory will not.
The bleeding-edge libxml2 bug that we needed to work around a year
ago is presumably no longer a factor, and the argument that
xmlParseBalancedChunkMemory is semi-deprecated is not enough to
justify a functionality regression.  Hence, go back to doing it
the old way.

Reported-by: Michael Paquier <michael@paquier.xyz>
Author: Michael Paquier <michael@paquier.xyz>
Co-authored-by: Erik Wienhold <ewie@ewie.name>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aIGknLuc8b8ega2X@paquier.xyz
Backpatch-through: 13
2025-07-28 16:50:41 -04:00
Amit Kapila
228c370868 Preserve conflict-relevant data during logical replication.
Logical replication requires reliable conflict detection to maintain data
consistency across nodes. To achieve this, we must prevent premature
removal of tuples deleted by other origins and their associated commit_ts
data by VACUUM, which could otherwise lead to incorrect conflict reporting
and resolution.

This patch introduces a mechanism to retain deleted tuples on the
subscriber during the application of concurrent transactions from remote
nodes. Retaining these tuples allows us to correctly ignore concurrent
updates to the same tuple. Without this, an UPDATE might be misinterpreted
as an INSERT during resolutions due to the absence of the original tuple.

Additionally, we ensure that origin metadata is not prematurely removed by
vacuum freeze, which is essential for detecting update_origin_differs and
delete_origin_differs conflicts.

To support this, a new replication slot named pg_conflict_detection is
created and maintained by the launcher on the subscriber. Each apply
worker tracks its own non-removable transaction ID, which the launcher
aggregates to determine the appropriate xmin for the slot, thereby
retaining necessary tuples.

Conflict information retention (deleted tuples and commit_ts) can be
enabled per subscription via the retain_conflict_info option. This is
disabled by default to avoid unnecessary overhead for configurations that
do not require conflict resolution or logging.

During upgrades, if any subscription on the old cluster has
retain_conflict_info enabled, a conflict detection slot will be created to
protect relevant tuples from deletion when the new cluster starts.

This is a foundational work to correctly detect update_deleted conflict
which will be done in a follow-up patch.

Author: Zhijie Hou <houzj.fnst@fujitsu.com>
Reviewed-by: shveta malik <shveta.malik@gmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Dilip Kumar <dilipbalaut@gmail.com>
Reviewed-by: Nisha Moond <nisha.moond412@gmail.com>
Reviewed-by: Amit Kapila <amit.kapila16@gmail.com>
Discussion: https://postgr.es/m/OS0PR01MB5716BE80DAEB0EE2A6A5D1F5949D2@OS0PR01MB5716.jpnprd01.prod.outlook.com
2025-07-23 02:56:00 +00:00
Tom Lane
aadf7db66e Mostly-cosmetic adjustments to estimate_multivariate_bucketsize().
The only practical effect of these changes is to avoid a useless
list_copy() operation when there is a single hashclause.  That's
never going to make any noticeable performance difference, but
the code is arguably clearer this way, especially if we take the
opportunity to add some comments so that readers don't have to
reverse-engineer the usage of these local variables.  Also add
some braces for better/more consistent style.

Author: Tender Wang <tndrwang@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAHewXNnHBOO9NEa=NBDYOrwZL4oHu2NOcTYvqyNyWEswo8f5OQ@mail.gmail.com
2025-07-19 14:23:02 -04:00
Tom Lane
3683af6170 Speed up byteain by not parsing traditional-style input twice.
Instead of laboriously computing the exact output length, use strlen
to get an upper bound cheaply.  (This is still O(N) of course, but
the constant factor is a lot less.)  This will typically result in
overallocating the output datum, but that's of little concern since
it's a short-lived allocation in just about all use-cases.

A simple microbenchmark showed about 40% speedup for long input
strings.

While here, make some cosmetic cleanups and add a test case that
covers the double-backslash code path in byteain and byteaout.

Author: Steven Niu <niushiji@gmail.com>
Reviewed-by: Kirill Reshke <reshkekirill@gmail.com>
Reviewed-by: Stepan Neretin <slpmcf@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/ca315729-140b-426e-81a6-6cd5cfe7ecc5@gmail.com
2025-07-18 16:42:10 -04:00
Tom Lane
2a3a396432 Clarify the ra != rb case in compareJsonbContainers().
It's impossible to reach this case with either ra or rb being
WJB_DONE, because our earlier checks that the structure and
length of the inputs match should guarantee that we reach their
ends simultaneously.  However, the comment completely fails to
explain this, and the Asserts don't cover it either.  The comment
is pretty obscure anyway, so rewrite it, and extend the Asserts
to reject WJB_DONE.

This is only cosmetic, so no need for back-patch.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru
2025-07-15 18:21:12 -04:00
Tom Lane
aad1617b76 Silence uninitialized-value warnings in compareJsonbContainers().
Because not every path through JsonbIteratorNext() sets val->type,
some compilers complain that compareJsonbContainers() is comparing
possibly-uninitialized values.  The paths that don't set it return
WJB_DONE, WJB_END_ARRAY, or WJB_END_OBJECT, so it's clear by
manual inspection that the "(ra == rb)" code path is safe, and
indeed we aren't seeing warnings about that.  But the (ra != rb)
case is much less obviously safe.  In Assert-enabled builds it
seems that the asserts rejecting WJB_END_ARRAY and WJB_END_OBJECT
persuade gcc 15.x not to warn, which makes little sense because
it's impossible to believe that the compiler can prove of its
own accord that ra/rb aren't WJB_DONE here.  (In fact they never
will be, so the code isn't wrong, but why is there no warning?)
Without Asserts, the appearance of warnings is quite unsurprising.

We discussed fixing this by converting those two Asserts into
pg_assume, but that seems not very satisfactory when it's so unclear
why the compiler is or isn't warning: the warning could easily
reappear with some other compiler version.  Let's fix it in a less
magical, more future-proof way by changing JsonbIteratorNext()
so that it always does set val->type.  The cost of that should be
pretty negligible, and it makes the function's API spec less squishy.

Reported-by: Erik Rijkers <er@xs4all.nl>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Andres Freund <andres@anarazel.de>
Discussion: https://postgr.es/m/988bf1bc-3f1f-99f3-bf98-222f1cd9dc5e@xs4all.nl
Discussion: https://postgr.es/m/0c623e8a204187b87b4736792398eaf1@postgrespro.ru
Backpatch-through: 13
2025-07-15 18:11:18 -04:00
Tom Lane
84ce258707 Replace float8 with int in date2isoweek() and date2isoyear().
The values of the "result" variables in these functions are
always integers; using a float8 variable accomplishes nothing
except to incur useless conversions to and from float.  While
that wastes a few nanoseconds, these functions aren't all that
time-critical.  But it seems worth fixing to remove possible
reader confusion.

Also, in the case of date2isoyear(), "result" is a very poorly
chosen variable name because it is *not* the function's result.
Rename it to "week", and do the same in date2isoweek() for
consistency.

Since this is mostly cosmetic, there seems little need
for back-patch.

Author: Sergey Fukanchik <s.fukanchik@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/6323a-68726500-1-7def9d00@137821581
2025-07-12 11:50:37 -04:00
Tom Lane
64840e4624 Fix inconsistent quoting of role names in ACLs.
getid() and putid(), which parse and deparse role names within ACL
input/output, applied isalnum() to see if a character within a role
name requires quoting.  They did this even for non-ASCII characters,
which is problematic because the results would depend on encoding,
locale, and perhaps even platform.  So it's possible that putid()
could elect not to quote some string that, later in some other
environment, getid() will decide is not a valid identifier, causing
dump/reload or similar failures.

To fix this in a way that won't risk interoperability problems
with unpatched versions, make getid() treat any non-ASCII as a
legitimate identifier character (hence not requiring quotes),
while making putid() treat any non-ASCII as requiring quoting.
We could remove the resulting excess quoting once we feel that
no unpatched servers remain in the wild, but that'll be years.

A lesser problem is that getid() did the wrong thing with an input
consisting of just two double quotes ("").  That has to represent an
empty string, but getid() read it as a single double quote instead.
The case cannot arise in the normal course of events, since we don't
allow empty-string role names.  But let's fix it while we're here.

Although we've not heard field reports of problems with non-ASCII
role names, there's clearly a hazard there, so back-patch to all
supported versions.

Reported-by: Peter Eisentraut <peter@eisentraut.org>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/3792884.1751492172@sss.pgh.pa.us
Backpatch-through: 13
2025-07-11 18:50:13 -04:00
Jeff Davis
53cd0b71ee Change wchar2char() and char2wchar() to accept a locale_t.
These are libc-specific functions, so should require a locale_t rather
than a pg_locale_t (which could use another provider).

Discussion: https://postgr.es/m/a8666c391dfcabe79868d95f7160eac533ace718.camel%40j-davis.com
2025-07-09 08:45:34 -07:00
Tom Lane
e03c952877 Fix low-probability memory leak in XMLSERIALIZE(... INDENT).
xmltotext_with_options() did not consider the possibility that
pg_xml_init() could fail --- most likely due to OOM.  If that
happened, the already-parsed xmlDoc structure would be leaked.
Oversight in commit 483bdb2af.

Bug: #18981
Author: Dmitry Kovalenko <d.kovalenko@postgrespro.ru>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/18981-9bc3c80f107ae925@postgresql.org
Backpatch-through: 16
2025-07-08 12:50:33 -04:00
Álvaro Herrera
2633dae2e4 Standardize LSN formatting by zero padding
This commit standardizes the output format for LSNs to ensure consistent
representation across various tools and messages.  Previously, LSNs were
inconsistently printed as `%X/%X` in some contexts, while others used
zero-padding.  This often led to confusion when comparing.

To address this, the LSN format is now uniformly set to `%X/%08X`,
ensuring the lower 32-bit part is always zero-padded to eight
hexadecimal digits.

Author: Japin Li <japinli@hotmail.com>
Reviewed-by: Masahiko Sawada <sawada.mshk@gmail.com>
Reviewed-by: Álvaro Herrera <alvherre@kurilemu.de>
Discussion: https://postgr.es/m/ME0P300MB0445CA53CA0E4B8C1879AF84B641A@ME0P300MB0445.AUSP300.PROD.OUTLOOK.COM
2025-07-07 13:57:43 +02:00
Tom Lane
0059bbe1ec Break out xxx2yyy_opt_overflow APIs for more datetime conversions.
Previous commits invented timestamp2timestamptz_opt_overflow,
date2timestamp_opt_overflow, and date2timestamptz_opt_overflow
functions to perform non-error-throwing conversions between
datetime types.  This patch completes the set by adding
timestamp2date_opt_overflow, timestamptz2date_opt_overflow,
and timestamptz2timestamp_opt_overflow.

In addition, adjust timestamp2timestamptz_opt_overflow so that it
doesn't throw error if timestamp2tm fails, but treats that as an
overflow case.  The situation probably can't arise except with an
invalid timestamp value, and I can't think of a way that that would
happen except data corruption.  However, it's pretty silly to have a
function whose entire reason for existence is to not throw errors for
out-of-range inputs nonetheless throw an error for out-of-range input.

The new APIs are not used in this patch, but will be needed in
upcoming btree_gin changes.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Arseniy Mukhin <arseniy.mukhin.dev@gmail.com>
Discussion: https://postgr.es/m/262624.1738460652@sss.pgh.pa.us
2025-07-03 16:17:08 -04:00
Tom Lane
7374b3a536 Allow width_bucket()'s "operand" input to be NaN.
The array-based variant of width_bucket() has always accepted NaN
inputs, treating them as equal but larger than any non-NaN,
as we do in ordinary comparisons.  But up to now, the four-argument
variants threw errors for a NaN operand.  This is inconsistent
and unnecessary, since we can perfectly well regard NaN as falling
after the last bucket.

We do still throw error for NaN or infinity histogram-bound inputs,
since there's no way to compute sensible bucket boundaries.

Arguably this is a bug fix, but given the lack of field complaints
I'm content to fix it in master.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/2822872.1750540911@sss.pgh.pa.us
2025-07-02 11:34:40 -04:00
Michael Paquier
b45242fd30 Move code for the bytea data type from varlena.c to new bytea.c
This commit moves all the routines related to the bytea data type into
its own new file, called bytea.c, clearing some of the bloat in
varlena.c.  This includes the routines for:
- Input, output, receive and send
- Comparison
- Casts to integer types
- bytea-specific functions

The internals of the routines moved here are unchanged, with one
exception.  This comes with a twist in bytea_string_agg_transfn(), where
the call to makeStringAggState() is replaced by the internals of this
routine, still located in varlena.c.  This simplifies the move to the
new file by not having to expose makeStringAggState().

Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/CAJ7c6TMPVPJ5DL447zDz5ydctB8OmuviURtSwd=PHCRFEPDEAQ@mail.gmail.com
2025-07-02 09:52:21 +09:00
Jeff Davis
8af0d0ab01 Remove provider field from pg_locale_t.
The behavior of pg_locale_t is specified by methods, so a separate
provider field is no longer necessary.

Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-07-01 07:50:46 -07:00
Jeff Davis
5a38104b36 Control ctype behavior internally with a method table.
Previously, pattern matching and case mapping behavior branched based
on the provider. Refactor to use a method table, which is less
error-prone.

This is also a step toward multiple provider versions, which we may
want to support in the future.

Reviewed-by: Andreas Karlsson <andreas@proxel.se>
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/2830211e1b6e6a2e26d845780b03e125281ea17b.camel%40j-davis.com
2025-07-01 07:44:47 -07:00
Jeff Davis
d81dcc8d62 Use pg_ascii_tolower()/pg_ascii_toupper() where appropriate.
Avoids unnecessary dependence on setlocale(). No behavior change.

This commit reverts e1458f2f1b, which reverted some changes
unintentionally committed before the branch for 19.

Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
Discussion: https://postgr.es/m/a8666c391dfcabe79868d95f7160eac533ace718.camel@j-davis.com
Discussion: https://postgr.es/m/7efaaa645aa5df3771bb47b9c35df27e08f3520e.camel@j-davis.com
2025-07-01 07:24:23 -07:00
Michael Paquier
2e94721747 Improve error handling of libxml2 calls in xml.c
This commit fixes some defects in the backend's xml.c, found upon
inspection of the internals of libxml2:
- xmlEncodeSpecialChars() can fail on malloc(), returning NULL back to
the caller.  xmltext() assumed that this could never happen.  Like other
code paths, a TRY/CATCH block is added there, covering also the fact
that cstring_to_text_with_len() could fail a memory allocation, where
the backend would miss to free the buffer allocated by
xmlEncodeSpecialChars().
- Some libxml2 routines called in xmlelement() can return NULL, like
xmlAddChildList() or xmlTextWriterStartElement().  Dedicated errors are
added for them.
- xml_xmlnodetoxmltype() missed that xmlXPathCastNodeToString() can fail
on an allocation failure.  In this case, the call can just be moved to
the existing TRY/CATCH block.

All these code paths would cause the server to crash.  As this is
unlikely a problem in practice, no backpatch is done.  Jim and I have
caught these defects, not sure who has scored the most.  The contrib
module xml2/ has similar defects, which will be addressed in a separate
change.

Reported-by: Jim Jones <jim.jones@uni-muenster.de>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/aEEingzOta_S_Nu7@paquier.xyz
2025-07-01 08:57:05 +09:00
Nathan Bossart
bd09f024a1 Add new OID alias type regdatabase.
This provides a convenient way to look up a database's OID.  For
example, the query

    SELECT * FROM pg_shdepend
    WHERE dbid = (SELECT oid FROM pg_database
                  WHERE datname = current_database());

can now be simplified to

    SELECT * FROM pg_shdepend
    WHERE dbid = current_database()::regdatabase;

Like the regrole type, regdatabase has cluster-wide scope, so we
disallow regdatabase constants from appearing in stored
expressions.

Bumps catversion.

Author: Ian Lawrence Barwick <barwick@gmail.com>
Reviewed-by: Greg Sabino Mullane <htamfids@gmail.com>
Reviewed-by: Jian He <jian.universality@gmail.com>
Reviewed-by: Fabrízio de Royes Mello <fabriziomello@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/aBpjJhyHpM2LYcG0%40nathan
2025-06-30 15:38:54 -05:00
Peter Eisentraut
cc2ac0e6f9 Remove unused #include's in src/backend/utils/adt/*
Author: Aleksander Alekseev <aleksander@timescale.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/CAJ7c6TOowVbR-0NEvvDm6a_mag18krR0XJ2FKrc9DHXj7hFRtQ%40mail.gmail.com
2025-06-30 12:00:00 +02:00
Peter Eisentraut
50fd428b2b Message style improvements 2025-06-28 19:18:06 +02:00
Tom Lane
ea06263c4a Doc: improve documentation about width_bucket().
Specify whether the bucket bounds are inclusive or exclusive,
and improve some other vague language.  Explain the behavior that
occurs when the "low" bound is greater than the "high" bound.
Make width_bucket_numeric's comment more like that for
width_bucket_float8, in particular noting that infinite
bounds are rejected (since they became possible in v14).

Reported-by: Ben Peachey Higdon <bpeacheyhigdon@gmail.com>
Author: Robert Treat <rob@xzilla.net>
Co-authored-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com>
Discussion: https://postgr.es/m/2BD74F86-5B89-4AC1-8F13-23CED3546AC1@gmail.com
Backpatch-through: 13
2025-06-21 12:52:37 -04:00
Tom Lane
b27644bade Sync typedefs.list with the buildfarm.
Our maintenance of typedefs.list has been a little haphazard
(and apparently we can't alphabetize worth a darn).  Replace
the file with the authoritative list from our buildfarm, and
run pgindent using that.

I also updated the additions/exclusions lists in pgindent where
necessary to keep pgindent from messing things up significantly.
Notably, now that regex_t and some related names are macros not real
typedefs, we have to whitelist them explicitly.  The exclusions list
has also drifted noticeably, presumably due to changes of system
headers on the buildfarm animals that contribute to the list.

Unlike in prior years, I've not manually added typedef names that
are missing from the buildfarm's list because they are not used to
declare any variables or fields.  So there are a few places where
the typedef declaration itself is formatted worse than before,
e.g. typedef enum IoMethod.  I could preserve the names that were
manually added to the list previously, but I'd really prefer to find
a less manual way of dealing with these cases.  A quick grep finds
about 75 such symbols, most of which have never gotten any special
treatment.

Per discussion among pgsql-release, doing this now seems appropriate
even though we're still a week or two away from making the v18 branch.
2025-06-15 13:04:24 -04:00
Jeff Davis
e1458f2f1b Revert a few small patches that were intended for version 19.
- 4c787a24e7
- 78bd364ee3
- 7a6880fadc
- 8898082a5d

Suggested-by: Robert Haas <robertmhaas@gmail.com>
Discussion: https://postgr.es/m/CA+TgmoZ=J=PVNZUNKaxULu+KUVSt3Y-aJ1DZ9Y3Co6mu0z62jA@mail.gmail.com
Discussion: https://postgr.es/m/60e8c6d0a6c08e67f15dbbe9e53df0119c710065.camel@j-davis.com
2025-06-11 15:10:12 -07:00
Jeff Davis
8898082a5d inet_net_pton.c: use pg_ascii_tolower() rather than tolower().
Avoid dependence on setlocale(). No behavior change.

Discussion: https://postgr.es/m/9875f7f9-50f1-4b5d-86fc-ee8b03e8c162@eisentraut.org
Reviewed-by: Peter Eisentraut <peter@eisentraut.org>
2025-06-10 11:23:20 -07:00
Peter Eisentraut
58fbfde152 Fix incorrect format placeholders 2025-06-03 21:38:04 +02:00
David Rowley
c3eda50b06 Change internal queryid type from uint64 to int64
uint64 was perhaps chosen in cff440d36 as the type was uint32 prior to
that widening work.

Having this as uint64 doesn't make much sense and just adds the overhead of
having to remember that we always output this in its signed form.  Let's
remove that overhead.

The signed form output is seemingly required since we have no way to
represent the full range of uint64 in an SQL type.  We use BIGINT in places
like pg_stat_statements, which maps directly to int64.

The release notes "Source Code" section may want to mention this
adjustment as some extensions may wish to adjust their code.

Author: David Rowley <dgrowleyml@gmail.com>
Suggested-by: Peter Eisentraut <peter@eisentraut.org>
Reviewed-by: Sami Imseih <samimseih@gmail.com>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/50cb0c8b-994b-48f9-a1c4-13039eb3536b@eisentraut.org
2025-05-30 22:59:39 +12:00
Tom Lane
e5d64fd654 Tighten parsing of datetime input.
ParseFraction only expects to deal with fields that contain a decimal
point and digit(s).  However it's possible in some edge cases for it
to be passed input that doesn't look like that.  In particular the
input could look like a valid floating-point number, such as ".123e6".
strtod() will happily eat that, possibly producing a result that is
not within the expected range 0..1, which can result in integer
overflow in the callers.  That doesn't have any security consequences,
but it's still not very desirable.  Fix by checking that the input
has the expected form.

Similarly, DecodeNumberField only expects to deal with fields that
contain a decimal point and digit(s), but it's sometimes abused to
parse strings that might not look like that.  This could result in
failure to reject bogus input, yielding silly results.  Again, fix
by rejecting input that doesn't look as-expected.  That decision
also means that we can affirmatively answer the very old comment
questioning whether we couldn't save some duplicative code by
using ParseFractionalSecond here.

While these changes should only reject input that nobody would
consider valid, it still doesn't seem like a change to make in
stable branches.  Apply to HEAD only.

Reported-by: Evgeniy Gorbanev <gorbanev.es@gmail.com>
Author: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/1328335.1748371099@sss.pgh.pa.us
2025-05-28 15:10:48 -04:00
Michael Paquier
d46911e584 Fix conversion of SIMILAR TO regexes for character classes
The code that translates SIMILAR TO pattern matching expressions to
POSIX-style regular expressions did not consider that square brackets
can be nested.  For example, in an expression like [[:alpha:]%_], the
logic replaced the placeholders '_' and '%' but it should not.

This commit fixes the conversion logic by tracking the nesting level of
square brackets marking character class areas, while considering that
in expressions like []] or [^]] the first closing square bracket is a
regular character.  Multiple tests are added to show how the conversions
should or should not apply applied while in a character class area, with
specific cases added for all the characters converted outside character
classes like an opening parenthesis '(', dollar sign '$', etc.

Author: Laurenz Albe <laurenz.albe@cybertec.at>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Michael Paquier <michael@paquier.xyz>
Discussion: https://postgr.es/m/16ab039d1af455652bdf4173402ddda145f2c73b.camel@cybertec.at
Backpatch-through: 13
2025-05-28 08:58:40 +09:00
Daniel Gustafsson
fb844b9f06 Revert function to get memory context stats for processes
Due to concerns raised about the approach, and memory leaks found
in sensitive contexts the functionality is reverted. This reverts
commits 45e7e8ca9, f8c115a6c, d2a1ed172, 55ef7abf8 and 042a66291
for v18 with an intent to revisit this patch for v19.

Discussion: https://postgr.es/m/594293.1747708165@sss.pgh.pa.us
2025-05-23 15:44:54 +02:00
Tom Lane
f24605e2dc Fix memory leak in XMLSERIALIZE(... INDENT).
xmltotext_with_options sometimes tries to replace the existing
root node of a libxml2 document.  In that case xmlDocSetRootElement
will unlink and return the old root node; if we fail to free it,
it's leaked for the remainder of the session.  The amount of memory
at stake is not large, a couple hundred bytes per occurrence, but
that could still become annoying in heavy usage.

Our only other xmlDocSetRootElement call is not at risk because
it's working on a just-created document, but let's modify that
code too to make it clear that it's dependent on that.

Author: Tom Lane <tgl@sss.pgh.pa.us>
Reviewed-by: Jim Jones <jim.jones@uni-muenster.de>
Discussion: https://postgr.es/m/1358967.1747858817@sss.pgh.pa.us
Backpatch-through: 16
2025-05-22 13:52:46 -04:00
Heikki Linnakangas
29f7ce6fe7 Fix deparsing FETCH FIRST <expr> ROWS WITH TIES
In the grammar, <expr> is a c_expr, which accepts only a limited set
of integer literals and simple expressions without parens. The
deparsing logic didn't quite match the grammar rule, and failed to use
parens e.g. for "5::bigint".

To fix, always surround the expression with parens. Would be nice to
omit the parens in simple cases, but unfortunately it's non-trivial to
detect such simple cases. Even if the expression is a simple literal
123 in the original query, after parse analysis it becomes a FuncExpr
with COERCE_IMPLICIT_CAST rather than a simple Const.

Reported-by: yonghao lee
Backpatch-through: 13
Discussion: https://www.postgresql.org/message-id/18929-077d6b7093b176e2@postgresql.org
2025-05-19 18:50:26 +03:00
Álvaro Herrera
0588656366 Fix comment of tsquerysend()
The comment describes the order in which fields are sent, and it had one
of the fields in the wrong place.

This has been wrong since e6dbcb72fa (2008), so backpatch all the way
back.

Author: Emre Hasegeli <emre@hasegeli.com>
Discussion: https://postgr.es/m/CAE2gYzzf38bR_R=izhpMxAmqHXKeM5ajkmukh4mNs_oXfxcMCA@mail.gmail.com
2025-05-11 09:47:10 -04:00
Álvaro Herrera
7b2ad43426 Sort includes in alphabetical order
Added by commit 042a66291b, no backpatch needed.
2025-05-11 09:15:05 -04:00
David Rowley
918e7287ed Fix broken indentation
I forgot to run pgindent in d8555e522.

Reported-by: Fujii Masao <masao.fujii@oss.nttdata.com>
Discussion: https://postgr.es/m/156083c9-eac0-418d-9667-92dec4d6d6cd@oss.nttdata.com
2025-04-30 19:18:30 +12:00