postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-21 00:42:43 +03:00

Author	SHA1	Message	Date
Masahiko Sawada	66e94448ab	Restrict accesses to non-system views and foreign tables during pg_dump. When pg_dump retrieves the list of database objects and performs the data dump, there was possibility that objects are replaced with others of the same name, such as views, and access them. This vulnerability could result in code execution with superuser privileges during the pg_dump process. This issue can arise when dumping data of sequences, foreign tables (only 13 or later), or tables registered with a WHERE clause in the extension configuration table. To address this, pg_dump now utilizes the newly introduced restrict_nonsystem_relation_kind GUC parameter to restrict the accesses to non-system views and foreign tables during the dump process. This new GUC parameter is added to back branches too, but these changes do not require cluster recreation. Back-patch to all supported branches. Reviewed-by: Noah Misch Security: CVE-2024-7348 Backpatch-through: 12	2024-08-05 06:05:33 -07:00
David Rowley	ca6fde9225	Optimize JSON escaping using SIMD Here we adjust escape_json_with_len() to make use of SIMD to allow processing of up to 16-bytes at a time rather than processing a single byte at a time. This has been shown to speed up escaping of JSON strings significantly. Escaping is required for both JSON string properties and also the property names themselves, so this should also help improve the speed of the conversion from JSON into text for JSON objects that have property names 16 or more bytes long. Escaping JSON strings was often a significant bottleneck for longer strings. With these changes, some benchmarking has shown a query performing nearly 4 times faster when escaping a JSON object with a 1MB text property. Tests with shorter text properties saw smaller but still significant performance improvements. For example, a test outputting 1024 JSON strings with a text property length ranging from 1 char to 1024 chars became around 2 times faster. Author: David Rowley Reviewed-by: Melih Mutlu Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com	2024-08-05 23:16:44 +12:00
Michael Paquier	7949d95945	Introduce pluggable APIs for Cumulative Statistics This commit adds support in the backend for $subject, allowing out-of-core extensions to plug their own custom kinds of cumulative statistics. This feature has come up a few times into the lists, and the first, original, suggestion came from Andres Freund, about pg_stat_statements to use the cumulative statistics APIs in shared memory rather than its own less efficient internals. The advantage of this implementation is that this can be extended to any kind of statistics. The stats kinds are divided into two parts: - The in-core "builtin" stats kinds, with designated initializers, able to use IDs up to 128. - The "custom" stats kinds, able to use a range of IDs from 128 to 256 (128 slots available as of this patch), with information saved in TopMemoryContext. This can be made larger, if necessary. There are two types of cumulative statistics in the backend: - For fixed-numbered objects (like WAL, archiver, etc.). These are attached to the snapshot and pgstats shmem control structures for efficiency, and built-in stats kinds still do that to avoid any redirection penalty. The data of custom kinds is stored in a first array in snapshot structure and a second array in the shmem control structure, both indexed by their ID, acting as an equivalent of the builtin stats. - For variable-numbered objects (like tables, functions, etc.). These are stored in a dshash using the stats kind ID in the hash lookup key. Internally, the handling of the builtin stats is unchanged, and both fixed and variabled-numbered objects are supported. Structure definitions for builtin stats kinds are renamed to reflect better the differences with custom kinds. Like custom RMGRs, custom cumulative statistics can only be loaded with shared_preload_libraries at startup, and must allocate a unique ID shared across all the PostgreSQL extension ecosystem with the following wiki page to avoid conflicts: https://wiki.postgresql.org/wiki/CustomCumulativeStats This makes the detection of the stats kinds and their handling when reading and writing stats much easier than, say, allocating IDs for stats kinds from a shared memory counter, that may change the ID used by a stats kind across restarts. When under development, extensions can use PGSTAT_KIND_EXPERIMENTAL. Two examples that can be used as templates for fixed-numbered and variable-numbered stats kinds will be added in some follow-up commits, with tests to provide coverage. Some documentation is added to explain how to use this plugin facility. Author: Michael Paquier Reviewed-by: Dmitry Dolgov, Bertrand Drouvot Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz	2024-08-04 19:41:24 +09:00
Michael Paquier	028b4b21df	Fix incorrect format placeholders in pgstat.c These should have been switched from %d to %u in `3188a4582a` in the debugging elogs added in `ca1ba50fcb`. PgStat_Kind should never be higher than INT32_MAX, but let's be clean. Issue noticed while hacking more on this area.	2024-08-04 03:07:20 +09:00
Alexander Korotkov	3c5db1d6b0	Implement pg_wal_replay_wait() stored procedure pg_wal_replay_wait() is to be used on standby and specifies waiting for the specific WAL location to be replayed. This option is useful when the user makes some data changes on primary and needs a guarantee to see these changes are on standby. The queue of waiters is stored in the shared memory as an LSN-ordered pairing heap, where the waiter with the nearest LSN stays on the top. During the replay of WAL, waiters whose LSNs have already been replayed are deleted from the shared memory pairing heap and woken up by setting their latches. pg_wal_replay_wait() needs to wait without any snapshot held. Otherwise, the snapshot could prevent the replay of WAL records, implying a kind of self-deadlock. This is why it is only possible to implement pg_wal_replay_wait() as a procedure working without an active snapshot, not a function. Catversion is bumped. Discussion: https://postgr.es/m/eb12f9b03851bb2583adab5df9579b4b%40postgrespro.ru Author: Kartyshov Ivan, Alexander Korotkov Reviewed-by: Michael Paquier, Peter Eisentraut, Dilip Kumar, Amit Kapila Reviewed-by: Alexander Lakhin, Bharath Rupireddy, Euler Taveira Reviewed-by: Heikki Linnakangas, Kyotaro Horiguchi	2024-08-02 21:16:56 +03:00
Michael Paquier	3188a4582a	Switch PgStat_Kind from an enum to a uint32 type A follow-up patch is planned to make cumulative statistics pluggable, and using a type is useful in the internal routines used by pgstats as PgStat_Kind may have a value that was not originally in the enum removed here, once made pluggable. While on it, this commit switches pgstat_is_kind_valid() to use PgStat_Kind rather than an int, to be more consistent with its existing callers. Some loops based on the stats kind IDs are switched to use PgStat_Kind rather than int, for consistency with the new time. Author: Michael Paquier Reviewed-by: Dmitry Dolgov, Bertrand Drouvot Discussion: https://postgr.es/m/Zmqm9j5EO0I4W8dx@paquier.xyz	2024-08-02 04:49:34 +09:00
Michael Paquier	b860848232	Add redo LSN to pgstats files This is used in the startup process to check that the pgstats file we are reading includes the redo LSN referring to the shutdown checkpoint where it has been written. The redo LSN in the pgstats file needs to match with what the control file has. This is intended to be used for an upcoming change that will extend the write of the stats file to happen during checkpoints, rather than only shutdown sequences. Bump PGSTAT_FILE_FORMAT_ID. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz	2024-08-02 01:57:28 +09:00
Peter Eisentraut	a292c98d62	Convert node test compile-time settings into run-time parameters This converts COPY_PARSE_PLAN_TREES WRITE_READ_PARSE_PLAN_TREES RAW_EXPRESSION_COVERAGE_TEST into run-time parameters debug_copy_parse_plan_trees debug_write_read_parse_plan_trees debug_raw_expression_coverage_test They can be activated for tests using PG_TEST_INITDB_EXTRA_OPTS. The compile-time symbols are kept for build farm compatibility, but they now just determine the default value of the run-time settings. Furthermore, support for these settings is not compiled in at all unless assertions are enabled, or the new symbol DEBUG_NODE_TESTS_ENABLED is defined at compile time, or any of the legacy compile-time setting symbols are defined. So there is no run-time overhead in production builds. (This is similar to the handling of DISCARD_CACHES_ENABLED.) Discussion: https://www.postgresql.org/message-id/flat/30747bd8-f51e-4e0c-a310-a6e2c37ec8aa%40eisentraut.org	2024-08-01 10:09:18 +02:00
Jeff Davis	679c5084cf	Relax check for return value from second call of pg_strnxfrm(). strxfrm() is not guaranteed to return the exact number of bytes needed to store the result; it may return a higher value. Discussion: https://postgr.es/m/32f85d88d1f64395abfe5a10dd97a62a4d3474ce.camel@j-davis.com Reviewed-by: Heikki Linnakangas Backpatch-through: 16	2024-07-30 16:23:20 -07:00
Heikki Linnakangas	f822be3962	Refactor getWeights to write to caller-supplied buffer This gets rid of the static result buffer. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi	2024-07-30 22:06:07 +03:00
Heikki Linnakangas	5bf948d564	Replace static bufs with a StringInfo in cash_words() For clarity. The code was correct, and the buffer was large enough, but string manipulation with no bounds checking is scary. This incurs an extra palloc+pfree to every call, but in quick performance testing, it doesn't seem to be significant. Reviewed-by: Robert Haas Discussion: https://www.postgresql.org/message-id/7f86e06a-98c5-4ce3-8ec9-3885c8de0358@iki.fi	2024-07-30 22:02:58 +03:00
Andrew Dunstan	524d490a9f	Preserve tz when converting to jsonb timestamptz This removes an inconsistency in the treatment of different datatypes by the jsonpath timestamp_tz() function. Conversions from data types that are not timestamp-aware, such as date and timestamp, are now treated consistently with conversion from those that are such as timestamptz. Author: David Wheeler Reviewed-by: Junwang Zhao and Jeevan Chalke Discussion: https://postgr.es/m/7DE080CE-6D8C-4794-9BD1-7D9699172FAB%40justatheory.com Backpatch to release 17.	2024-07-30 07:57:38 -04:00
Jeff Davis	72fe6d24a3	Make collation not depend on setlocale(). Now that the result of pg_newlocale_from_collation() is always non-NULL, then we can move the collate_is_c and ctype_is_c flags into pg_locale_t. That simplifies the logic in lc_collate_is_c() and lc_ctype_is_c(), removing the dependence on setlocale(). This commit also eliminates the multi-stage initialization of the collation cache. As long as we have catalog access, then it's now safe to call pg_newlocale_from_collation() without checking lc_collate_is_c() first. Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org Reviewed-by: Peter Eisentraut, Andreas Karlsson	2024-07-30 00:58:06 -07:00
Richard Guo	9b282a9359	Fix partitionwise join with partially-redundant join clauses To determine if the two relations being joined can use partitionwise join, we need to verify the existence of equi-join conditions involving pairs of matching partition keys for all partition keys. Currently we do that by looking through the join's restriction clauses. However, it has been discovered that this approach is insufficient, because there might be partition keys known equal by a specific EC, but they do not form a join clause because it happens that other members of the EC than the partition keys are constrained to become a join clause. To address this issue, in addition to examining the join's restriction clauses, we also check if any partition keys are known equal by ECs, by leveraging function exprs_known_equal(). To accomplish this, we enhance exprs_known_equal() to check equality per the semantics of the opfamily, if provided. It could be argued that exprs_known_equal() could be called O(N^2) times, where N is the number of partition key expressions, resulting in noticeable performance costs if there are a lot of partition key expressions. But I think this is not a problem. The number of a joinrel's partition key expressions would only be equal to the join degree, since each base relation within the join contributes only one partition key expression. That is to say, it does not scale with the number of partitions. A benchmark with a query involving 5-way joins of partitioned tables, each with 3 partition keys and 1000 partitions, shows that the planning time is not significantly affected by this patch (within the margin of error), particularly when compared to the impact caused by partitionwise join. Thanks to Tom Lane for the idea of leveraging exprs_known_equal() to check if partition keys are known equal by ECs. Author: Richard Guo, Tom Lane Reviewed-by: Tom Lane, Ashutosh Bapat, Robert Haas Discussion: https://postgr.es/m/CAN_9JTzo_2F5dKLqXVtDX5V6dwqB0Xk+ihstpKEt3a1LT6X78A@mail.gmail.com	2024-07-30 15:51:54 +09:00
Michael Paquier	ca1ba50fcb	Add more debugging information when failing to read pgstats files This is useful to know which part of a stats file is corrupted when reading it, adding to the server logs a WARNING with details about what could not be read before giving up with the remaining data in the file. Author: Michael Paquier Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Zp8o6_cl0KSgsnvS@paquier.xyz	2024-07-30 15:08:21 +09:00
Jeff Davis	8240401437	Do not return NULL from pg_newlocale_from_collation(). Previously, pg_newlocale_from_collation() returned NULL as a special case for the DEFAULT_COLLATION_OID if the provider was libc. In that case the behavior would depend on the last call to setlocale(). Now, consistent with the other providers, it will return a pointer to default_locale, which is not dependent on setlocale(). Note: for the C and POSIX locales, the locale_t structure within the pg_locale_t will still be zero, because those locales are implemented with internal logic and do not use libc at all. lc_collate_is_c() and lc_ctype_is_c() still depend on setlocale() to determine the current locale, which will be removed in a subsequent commit. Discussion: https://postgr.es/m/cfd9eb85-c52a-4ec9-a90e-a5e4de56e57d@eisentraut.org Reviewed-by: Peter Eisentraut, Andreas Karlsson	2024-07-29 15:18:06 -07:00
Heikki Linnakangas	679f940740	Remove dead generators for cyrillic encoding conversion tables These tools were used to read the koi-iso.tab, koi-win.tab, and koi-alt.tab files, which contained the mappings between the single-byte cyrillic encodings. However, those data files were removed in commit `4c3c8c048d`, back in 2003. These code generators have been unused and unusable ever since. The generated tables live in cyrillic_and_mic.c. There has been one change to the tables since they were generated in 1999, in commit `f4b7624eb0`. So if we resurrected the original data tables, that change would need to be taken into account. So this code is very dead. The tables in cyrillic_and_mic.c, which were originally generated by these tools, are now the authoritative source for these mappings. Reviewed-by: Tom Lane, Aleksander Alekseev Discussion: https://www.postgresql.org/message-id/flat/a821c3dc-36ec-4cee-8b41-7ccaa17adb18@iki.fi	2024-07-29 20:38:19 +03:00
Heikki Linnakangas	9d9b9d46f3	Move cancel key generation to after forking the backend Move responsibility of generating the cancel key to the backend process. The cancel key is now generated after forking, and the backend advertises it in the ProcSignal array. When a cancel request arrives, the backend handling it scans the ProcSignal array to find the target pid and cancel key. This is similar to how this previously worked in the EXEC_BACKEND case with the ShmemBackendArray, just reusing the ProcSignal array. One notable change is that we no longer generate cancellation keys for non-backend processes. We generated them before just to prevent a malicious user from canceling them; the keys for non-backend processes were never actually given to anyone. There is now an explicit flag indicating whether a process has a valid key or not. I wrote this originally in preparation for supporting longer cancel keys, but it's a nice cleanup on its own. Reviewed-by: Jelte Fennema-Nio Discussion: https://www.postgresql.org/message-id/508d0505-8b7a-4864-a681-e7e5edfe32aa@iki.fi	2024-07-29 15:37:48 +03:00
David Rowley	da87dc07f1	Add missing pointer dereference in pg_backend_memory_contexts view `32d3ed816` moved the logic for setting the context's name and ident into a reusable function. I missed adding a pointer dereference after copying and pasting the code into that function. The ident parameter is a pointer to the ident variable in the calling function, so the dereference is required to correctly determine if the contents of that variable is NULL or not. In passing, adjust the if condition to include an == NULL to make it more clear that it's not checking for == '\0'. Reported-by: Tom Lane, Coverity Discussion: https://postgr.es/m/2256588.1722184287@sss.pgh.pa.us	2024-07-29 09:53:10 +12:00
Jeff Davis	c0ef1234df	Fix whitespace in commit `005c6b833f`.	2024-07-28 13:34:52 -07:00
Jeff Davis	1c461a8d8d	Refactor: make default_locale internal to pg_locale.c. Discussion: https://postgr.es/m/2228884bb1f1a02614b39f71a90c94d2cc8a3a2f.camel@j-davis.com Reviewed-by: Peter Eisentraut, Andreas Karlsson	2024-07-28 13:07:25 -07:00
Jeff Davis	005c6b833f	Change collation cache to use simplehash.h. Speeds up text comparison expressions when using a collation other than the database default collation. Does not affect larger operations such as ORDER BY, because the lookup is only done once. Discussion: https://postgr.es/m/7bb9f018d20a7b30b9a7f6231efab1b5e50c7720.camel@j-davis.com Reviewed-by: John Naylor, Andreas Karlsson	2024-07-28 12:39:57 -07:00
David Rowley	b181062aa5	Fix incorrect return value for pg_size_pretty(bigint) pg_size_pretty(bigint) would return the value in bytes rather than PB for the smallest-most bigint value. This happened due to an incorrect assumption that the absolute value of -9223372036854775808 could be stored inside a signed 64-bit type. Here we fix that by instead storing that value in an unsigned 64-bit type. This bug does exist in versions prior to 15 but the code there is sufficiently different and the bug seems sufficiently non-critical that it does not seem worth risking backpatching further. Author: Joseph Koshakow <koshy44@gmail.com> Discussion: https://postgr.es/m/CAAvxfHdTsMZPWEHUrZ=h3cky9Ccc3Mtx2whUHygY+ABP-mCmUw@mail.gmail.com Backpatch-through: 15	2024-07-28 22:22:52 +12:00
David Rowley	17a5871d9d	Optimize escaping of JSON strings There were quite a few places where we either had a non-NUL-terminated string or a text Datum which we needed to call escape_json() on. Many of these places required that a temporary string was created due to the fact that escape_json() needs a NUL-terminated cstring. For text types, those first had to be converted to cstring before calling escape_json() on them. Here we introduce two new functions to make escaping JSON more optimal: escape_json_text() can be given a text Datum to append onto the given buffer. This is more optimal as it foregoes the need to convert the text Datum into a cstring. A temporary allocation is only required if the text Datum needs to be detoasted. escape_json_with_len() can be used when the length of the cstring is already known or the given string isn't NUL-terminated. Having this allows various places which were creating a temporary NUL-terminated string to just call escape_json_with_len() without any temporary memory allocations. Discussion: https://postgr.es/m/CAApHDvpLXwMZvbCKcdGfU9XQjGCDm7tFpRdTXuB9PVgpNUYfEQ@mail.gmail.com Reviewed-by: Melih Mutlu, Heikki Linnakangas	2024-07-27 23:46:07 +12:00
Nathan Bossart	0dcaea5690	Introduce num_os_semaphores GUC. The documentation for System V IPC parameters provides complicated formulas to determine the appropriate values for SEMMNI and SEMMNS. Furthermore, these formulas have often been wrong because folks forget to update them (e.g., when adding a new auxiliary process). This commit introduces a new runtime-computed GUC named num_os_semaphores that reports the number of semaphores needed for the configured number of allowed connections, worker processes, etc. This new GUC allows us to simplify the formulas in the documentation, and it should help prevent future inaccuracies. Like the other runtime-computed GUCs, users can view it with "postgres -C" before starting the server, which is useful for preconfiguring the necessary operating system resources. Reviewed-by: Tom Lane, Sami Imseih, Andres Freund, Robert Haas Discussion: https://postgr.es/m/20240517164452.GA1914161%40nathanxps13	2024-07-26 15:28:55 -05:00
Heikki Linnakangas	20e0e7da9b	Add test for early backend startup errors The new test tests the libpq fallback behavior on an early error, which was fixed in the previous commit. This adds an IS_INJECTION_POINT_ATTACHED() macro, to allow writing injected test code alongside the normal source code. In principle, the new test could've been implemented by an extra test module with a callback that sets the FrontendProtocol global variable, but I think it's more clear to have the test code right where the injection point is, because it has pretty intimate knowledge of the surrounding context it runs in. Reviewed-by: Michael Paquier Discussion: https://www.postgresql.org/message-id/CAOYmi%2Bnwvu21mJ4DYKUa98HdfM_KZJi7B1MhyXtnsyOO-PB6Ww%40mail.gmail.com	2024-07-26 15:12:21 +03:00
Heikki Linnakangas	b9e5249c29	Fix using injection points at backend startup in EXEC_BACKEND mode Commit `86db52a506` changed the locking of injection points to use only atomic ops and spinlocks, to make it possible to define injection points in processes that don't have a PGPROC entry (yet). However, it didn't work in EXEC_BACKEND mode, because the pointer to shared memory area was not initialized until the process "attaches" to all the shared memory structs. To fix, pass the pointer to the child process along with other global variables that need to be set up early. Backpatch-through: 17	2024-07-26 15:11:50 +03:00
Amit Langote	4fc6a55560	SQL/JSON: Respect OMIT QUOTES when RETURNING domains over jsonb populate_domain() didn't take into account the omit_quotes flag passed down to json_populate_type() by ExecEvalJsonCoercion() and that led to incorrect behavior when the RETURNING type is a domain over jsonb. Fix that by passing the flag by adding a new function parameter to populate_domain(). Reported-by: Jian He <jian.universality@gmail.com> Discussion: https://postgr.es/m/CACJufxEo4sUjKCYtda0_qt9tazqqKPmF1cqhW9KBOUeJFqQd2g@mail.gmail.com Backpatch-through: 17	2024-07-26 16:08:13 +09:00
Peter Eisentraut	37c6923cf3	Fix -Wmissing-variable-declarations warnings for float.c special case This adds extern declarations for the global variables defined in float.c but not meant for external use. This is a workaround to be able to add -Wmissing-variable-declarations to the global set of warning options in the near future. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org	2024-07-25 10:40:04 +02:00
David Rowley	32d3ed8165	Add path column to pg_backend_memory_contexts view "path" provides a reliable method of determining the parent/child relationships between memory contexts. Previously this could be done in a non-reliable way by writing a recursive query and joining the "parent" and "name" columns. This wasn't reliable as the names were not unique, which could result in joining to the wrong parent. To make this reliable, "path" stores an array of numerical identifiers starting with the identifier for TopLevelMemoryContext. It contains an element for each intermediate parent between that and the current context. Incompatibility: Here we also adjust the "level" column to make it 1-based rather than 0-based. A 1-based level provides a convenient way to access elements in the "path" array. e.g. path[level] gives the identifier for the current context. Identifiers are not stable across multiple evaluations of the view. In an attempt to make these more stable for ad-hoc queries, the identifiers are assigned breadth-first. Contexts closer to TopLevelMemoryContext are less likely to change between queries and during queries. Author: Melih Mutlu <m.melihmutlu@gmail.com> Discussion: https://postgr.es/m/CAGPVpCThLyOsj3e_gYEvLoHkr5w=tadDiN_=z2OwsK3VJppeBA@mail.gmail.com Reviewed-by: Andres Freund, Stephen Frost, Atsushi Torikoshi, Reviewed-by: Michael Paquier, Robert Haas, David Rowley	2024-07-25 15:03:28 +12:00
Peter Eisentraut	774d47b6c0	Move all extern declarations for GUC variables to header files Add extern declarations in appropriate header files for global variables related to GUC. In many cases, this was handled quite inconsistently before, with some GUC variables declared in a header file and some only pulled in via ad-hoc extern declarations in various .c files. Also add PGDLLIMPORT qualifications to those variables. These were previously missing because src/tools/mark_pgdllimport.pl has only been used with header files. This also fixes -Wmissing-variable-declarations warnings for GUC variables (not yet part of the standard warning options). Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://www.postgresql.org/message-id/flat/e0a62134-83da-4ba4-8cdb-ceb0111c95ce@eisentraut.org	2024-07-24 06:31:07 +02:00
Nathan Bossart	991f8cf8ab	Detect integer overflow in array_set_slice(). When provided an empty initial array, array_set_slice() fails to check for overflow when computing the new array's dimensions. While such overflows are ordinarily caught by ArrayGetNItems(), commands with the following form are accepted: INSERT INTO t (i[-2147483648:2147483647]) VALUES ('{}'); To fix, perform the hazardous computations using overflow-detecting arithmetic routines. As with commit `18b585155a`, the added test cases generate errors that include a platform-dependent value, so we again use psql's VERBOSITY parameter to suppress printing the message text. Reported-by: Alexander Lakhin Author: Joseph Koshakow Reviewed-by: Jian He Discussion: https://postgr.es/m/31ad2cd1-db94-bdb3-f91a-65ffdb4bef95%40gmail.com Backpatch-through: 12	2024-07-23 21:59:02 -05:00
Peter Eisentraut	65504b747f	Replace remaining strtok() with strtok_r() for thread-safety in the server in the future Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: David Steele <david@pgmasters.net> Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org	2024-07-23 09:20:22 +02:00
Peter Eisentraut	5d2e1cc117	Replace some strtok() with strsep() strtok() considers adjacent delimiters to be one delimiter, which is arguably the wrong behavior in some cases. Replace with strsep(), which has the right behavior: Adjacent delimiters create an empty token. Affected by this are parsing of: - Stored SCRAM secrets ("SCRAM-SHA-256$<iterations>:<salt>$<storedkey>:<serverkey>") - ICU collation attributes ("und@colStrength=primary;colCaseLevel=yes") for ICU older than version 54 - PG_COLORS environment variable ("error=01;31:warning=01;35:note=01;36:locus=01") - pg_regress command-line options with comma-separated list arguments (--dbname, --create-role) (currently only used pg_regress_ecpg) Reviewed-by: Kyotaro Horiguchi <horikyota.ntt@gmail.com> Reviewed-by: David Steele <david@pgmasters.net> Discussion: https://www.postgresql.org/message-id/flat/79692bf9-17d3-41e6-b9c9-fc8c3944222a@eisentraut.org	2024-07-22 15:45:46 +02:00
Michael Paquier	2d8ef5e24f	Add new error code for "file name too long" This new error code, named file_name_too_long, maps internally to the errno ENAMETOOLONG to produce a proper error code rather than an internal code under errcode_for_file_access(). This error code can be reached with some SQL command patterns, like a snapshot file name. Reported-by: Alexander Lakhin Reviewed-by: Daniel Gustafsson Discussion: https://postgr.es/m/Zo4ROR9mgy8bowMo@paquier.xyz	2024-07-22 09:28:01 +09:00
Nathan Bossart	22b0ccd65d	Add overflow checks to money type. None of the arithmetic functions for the the money type handle overflow. This commit introduces several helper functions with overflow checking and makes use of them in the money type's arithmetic functions. Fixes bug #18240. Reported-by: Alexander Lakhin Author: Joseph Koshakow Discussion: https://postgr.es/m/18240-c5da758d7dc1ecf0%40postgresql.org Discussion: https://postgr.es/m/CAAvxfHdBPOyEGS7s%2Bxf4iaW0-cgiq25jpYdWBqQqvLtLe_t6tw%40mail.gmail.com Backpatch-through: 12	2024-07-19 11:52:32 -05:00
Michael Paquier	a0a5869a85	Add INJECTION_POINT_CACHED() to run injection points directly from cache This new macro is able to perform a direct lookup from the local cache of injection points (refreshed each time a point is loaded or run), without touching the shared memory state of injection points at all. This works in combination with INJECTION_POINT_LOAD(), and it is better than INJECTION_POINT() in a critical section due to the fact that it would avoid all memory allocations should a concurrent detach happen since a LOAD(), as it retrieves a callback from the backend-private memory. The documentation is updated to describe in more details how to use this new macro with a load. Some tests are added to the module injection_points based on a new SQL function that acts as a wrapper of INJECTION_POINT_CACHED(). Based on a suggestion from Heikki Linnakangas. Author: Heikki Linnakangas, Michael Paquier Discussion: https://postgr.es/m/58d588d0-e63f-432f-9181-bed29313dece@iki.fi	2024-07-18 09:50:41 +09:00
Nathan Bossart	a99cc6c6b4	Use PqMsg_* macros in more places. Commit `f4b54e1ed9`, which introduced macros for protocol characters, missed updating a few places. It also did not introduce macros for messages sent from parallel workers to their leader processes. This commit adds a new section in protocol.h for those. Author: Aleksander Alekseev Discussion: https://postgr.es/m/CAJ7c6TNTd09AZq8tGaHS3LDyH_CCnpv0oOz2wN1dGe8zekxrdQ%40mail.gmail.com Backpatch-through: 17	2024-07-17 10:51:00 -05:00
Michael Paquier	ec678692f6	Make write of pgstats file durable at shutdown This switches the pgstats write code to use durable_rename() rather than rename(). This ensures that the stats file's data is durable when the statistics are written, which is something only happening at shutdown now with the checkpointer doing the job. This could cause the statistics to be lost even after PostgreSQL is shut down, should a host failure happen, for example. Suggested-by: Konstantin Knizhnik Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/ZpDQTZ0cAz0WEbh7@paquier.xyz	2024-07-17 11:50:36 +09:00
Andres Freund	47ecbfdfcc	Fix bad indentation introduced in `43cd30bcd1` Oops. Reported-by: Nathan Bossart <nathandbossart@gmail.com> Discussion: https://postgr.es/m/ZpVZB9rH5tHllO75@nathan Backpatch: 12-, like `43cd30bcd1`	2024-07-15 15:17:04 -07:00
Andres Freund	43cd30bcd1	Fix type confusion in guc_var_compare() Before this change guc_var_compare() cast the input arguments to const struct config_generic . That's not quite right however, as the input on one side is often just a char on one side. Instead just use char *, the first field in config_generic. This fixes a -Warray-bounds warning with some versions of gcc. While the warning is only known to be triggered for <= 15, the issue the warning points out seems real, so apply the fix everywhere. Author: Nazir Bilal Yavuz <byavuz81@gmail.com> Reported-by: Erik Rijkers <er@xs4all.nl> Suggested-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/a74a1a0d-0fd2-3649-5224-4f754e8f91aa%40xs4all.nl	2024-07-15 09:26:01 -07:00
Heikki Linnakangas	86db52a506	Use atomics to avoid locking in InjectionPointRun() This allows using injection points without having a PGPROC, like early at backend startup, or in the postmaster. The injection points facility is new in v17, so backpatch there. Reviewed-by: Michael Paquier <michael@paquier.xyz> Disussion: https://www.postgresql.org/message-id/4317a7f7-8d24-435e-9e49-29b72a3dc418@iki.fi	2024-07-15 10:22:11 +03:00
Michael Paquier	734c057a89	Add assertion in pgstat_write_statsfile() about processes allowed This routine can currently only be called from the postmaster in single-user mode or the checkpointer, but there was no sanity check to make sure that this was always the case. This has proved to be useful when hacking the zone (at least to me), to make sure that the write of the pgstats file happens at shutdown, as wanted by design, in the correct process context. Discussion: https://postgr.es/m/ZnEiqAITL-VgZDoY@paquier.xyz	2024-07-12 15:09:53 +09:00
Michael Paquier	72c0b24b2d	Improve comment of pgstat_read_statsfile() The comment at the top of pgstat_read_statsfile() mentioned that the stats are read from the on-disk file into the pgstats dshash. This is incorrect for fix-numbered stats as these are loaded directly into shared memory. This commit simplifies the comment to be more general. Author: Bertrand Drouvot Discussion: https://postgr.es/m/Zo/eJIHUcqKxeSgv@ip-10-97-1-34.eu-west-3.compute.internal	2024-07-12 09:31:33 +09:00
Tom Lane	a0f1fce80c	Add min and max aggregates for composite types (records). Like min/max for arrays, these are just thin wrappers around the existing btree comparison function for records. Aleksander Alekseev Discussion: https://postgr.es/m/CAO=iB8L4WYSNxCJ8GURRjQsrXEQ2-zn3FiCsh2LMqvWq2WcONg@mail.gmail.com	2024-07-11 11:50:50 -04:00
Michael Paquier	9e4664d950	Add a new 'F' entry type for fixed-numbered stats in pgstats file This new entry type is used for all the fixed-numbered statistics, making possible support for custom pluggable stats. In short, we need to be able to detect more easily if a stats kind exists or not when reading back its data from the pgstats file without a dependency on the order of the entries read. The kind ID of the stats is added to the data written. The data is written in the same fashion as previously, with the fixed-numbered stats first and the dshash entries next. The read part becomes more flexible, loading fixed-numbered stats into shared memory based on the new entry type found. Bump PGSTAT_FILE_FORMAT_ID. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Zot5bxoPYdS7yaoy@paquier.xyz	2024-07-11 16:12:44 +09:00
Michael Paquier	21471f18e9	Add PgStat_KindInfo.init_shmem_cb This new callback gives fixed-numbered stats the possibility to take actions based on the area of shared memory allocated for them. This removes from pgstat_shmem.c any knowledge specific to the types of fixed-numbered stats, and the initializations happen in their own files. Like `b68b29bc8f`, this change is useful to make this area of the code more pluggable, so as custom fixed-numbered stats can take actions after their shared memory area is initialized. Reviewed-by: Bertrand Drouvot Discussion: https://postgr.es/m/Zot5bxoPYdS7yaoy@paquier.xyz	2024-07-11 09:21:40 +09:00
Dean Rasheed	0dcf753bd8	Improve the numeric width_bucket() computation. Formerly, the computation of the bucket index involved calling div_var() with a scale determined by select_div_scale(), and then taking the floor of the result. That involved computing anything from 16 to 1000 digits after the decimal point, only for floor_var() to throw them away. In addition, the quotient was computed with rounding in the final digit, which meant that in rare cases the whole result could round up to the wrong bucket, and could exceed count. Thus it was also necessary to clamp the result to the range [1, count], though that didn't prevent the result being in the wrong internal bucket. Instead, compute the quotient using floor division, which guarantees the correct result, as specified by the SQL spec, and doesn't need to be clamped. This is both much simpler and more efficient, since it no longer computes any quotient digits after the decimal point. In addition, it is not necessary to have separate code to handle reversed bounds, since the signs cancel out when dividing. As with `b0e9e4d76c` and `a2a0c7c29e`, no back-patch. Dean Rasheed, reviewed by Joel Jacobson. Discussion: https://postgr.es/m/CAEZATCVbJH%2BLE9EXW8Rk3AxLe%3DjbOk2yrT_AUJGGh5Rah6zoeg%40mail.gmail.com	2024-07-10 20:07:20 +01:00
Tom Lane	e7192486dd	Suppress "chunk is not well balanced" errors from libxml2. libxml2 2.13 has an entirely different rule than earlier versions about when to emit "chunk is not well balanced" errors. This causes regression test output discrepancies for three test cases that formerly provoked that error (along with others) and now don't. Closer inspection shows that at least in 2.13, this error is pretty useless because it can only be emitted after some other more-relevant error. So let's get rid of the cross-version discrepancy by just suppressing it. In case some older libxml2 version is capable of emitting this error by itself, suppress only when some other error has already been captured. Like `066e8ac6e` and `6082b3d5d`, this will need to be back-patched, but let's check the results in HEAD first. (The patch for xml_2.out, in particular, is blind since I can't test it here.) Erik Wienhold and Tom Lane, per report from Frank Streitzig. Discussion: https://postgr.es/m/trinity-b0161630-d230-4598-9ebc-7a23acdb37cb-1720186432160@3c-app-gmx-bap25 Discussion: https://postgr.es/m/trinity-361ba18b-541a-4fe7-bc63-655ae3a7d599-1720259822452@3c-app-gmx-bs01	2024-07-09 15:01:13 -04:00
Dean Rasheed	ca481d3c9a	Optimise numeric multiplication for short inputs. When either input has a small number of digits, and the exact product is requested, the speed of numeric multiplication can be increased significantly by using a faster direct multiplication algorithm. This works by fully computing each result digit in turn, starting with the least significant, and propagating the carry up. This save cycles by not requiring a temporary buffer to store digit products, not making multiple passes over the digits of the longer input, and not requiring separate carry-propagation passes. For now, this is used when the shorter input has 1-4 NBASE digits (up to 13-16 decimal digits), and the longer input is of any size, which covers a lot of common real-world cases. Also, the relative benefit increases as the size of the longer input increases. Possible future work would be to try extending the technique to larger numbers of digits in the shorter input. Joel Jacobson and Dean Rasheed. Discussion: https://postgr.es/m/44d2ffca-d560-4919-b85a-4d07060946aa@app.fastmail.com	2024-07-09 10:00:42 +01:00

... 4 5 6 7 8 ...

9018 Commits