postgres

mirror of https://github.com/postgres/postgres.git synced 2025-12-22 17:42:17 +03:00

Author	SHA1	Message	Date
David Rowley	3c569049b7	Allow left join removals and unique joins on partitioned tables This allows left join removals and unique joins to work with partitioned tables. The planner just lacked sufficient proofs that a given join would not cause any row duplication. Unique indexes currently serve as that proof, so have get_relation_info() populate the indexlist for partitioned tables too. Author: Arne Roland Reviewed-by: Alvaro Herrera, Zhihong Yu, Amit Langote, David Rowley Discussion: https://postgr.es/m/c3b2408b7a39433b8230bbcd02e9f302@index.de	2023-01-09 17:15:08 +13:00
Amit Kapila	216a784829	Perform apply of large transactions by parallel workers. Currently, for large transactions, the publisher sends the data in multiple streams (changes divided into chunks depending upon logical_decoding_work_mem), and then on the subscriber-side, the apply worker writes the changes into temporary files and once it receives the commit, it reads from those files and applies the entire transaction. To improve the performance of such transactions, we can instead allow them to be applied via parallel workers. In this approach, we assign a new parallel apply worker (if available) as soon as the xact's first stream is received and the leader apply worker will send changes to this new worker via shared memory. The parallel apply worker will directly apply the change instead of writing it to temporary files. However, if the leader apply worker times out while attempting to send a message to the parallel apply worker, it will switch to "partial serialize" mode - in this mode, the leader serializes all remaining changes to a file and notifies the parallel apply workers to read and apply them at the end of the transaction. We use a non-blocking way to send the messages from the leader apply worker to the parallel apply to avoid deadlocks. We keep this parallel apply assigned till the transaction commit is received and also wait for the worker to finish at commit. This preserves commit ordering and avoid writing to and reading from files in most cases. We still need to spill if there is no worker available. This patch also extends the SUBSCRIPTION 'streaming' parameter so that the user can control whether to apply the streaming transaction in a parallel apply worker or spill the change to disk. The user can set the streaming parameter to 'on/off', or 'parallel'. The parameter value 'parallel' means the streaming will be applied via a parallel apply worker, if available. The parameter value 'on' means the streaming transaction will be spilled to disk. The default value is 'off' (same as current behaviour). In addition, the patch extends the logical replication STREAM_ABORT message so that abort_lsn and abort_time can also be sent which can be used to update the replication origin in parallel apply worker when the streaming transaction is aborted. Because this message extension is needed to support parallel streaming, parallel streaming is not supported for publications on servers < PG16. Author: Hou Zhijie, Wang wei, Amit Kapila with design inputs from Sawada Masahiko Reviewed-by: Sawada Masahiko, Peter Smith, Dilip Kumar, Shi yu, Kuroda Hayato, Shveta Mallik Discussion: https://postgr.es/m/CAA4eK1+wyN6zpaHUkCLorEWNx75MG0xhMwcFhvjqm2KURZEAGw@mail.gmail.com	2023-01-09 07:52:45 +05:30
Alexander Korotkov	cd9479af2a	Improve GIN cost estimation GIN index scans were not taking any descent CPU-based cost into account. That made them look cheaper than other types of indexes when they shouldn't be. We use the same heuristic as for btree indexes, but multiply it by the number of searched entries. Additionally, the CPU cost for the tree was based largely on a genericcostestimate. For a GIN index, we should not charge index quals per tuple, but per entry. On top of this, charge cpu_index_tuple_cost per actual tuple. This should fix the cases where a GIN index is preferred over a btree and the ones where a memoize node is not added on top of the GIN index scan because it seemed too cheap. We don't packpatch this to evade unexpected plan changes in stable versions. Discussion: https://postgr.es/m/CABs3KGQnOkyQ42-zKQqiE7M0Ks9oWDSee%3D%2BJx3-TGq%3D68xqWYw%40mail.gmail.com Discussion: https://postgr.es/m/3188617.44csPzL39Z%40aivenronan Author: Ronan Dunklau Reported-By: Hung Nguyen Reviewed-by: Tom Lane, Alexander Korotkov	2023-01-08 22:51:43 +03:00
Alexander Korotkov	eb5c4e953b	Extract the multiplier for CPU process cost of index page into a macro B-tree, GiST and SP-GiST all charge 50.0 * cpu_operator_cost for processing an index page. Extract this to a macro to avoid repeated magic numbers. Discussion: https://mail.google.com/mail/u/0/?ik=a20b091faa&view=om&permmsgid=msg-f%3A1751459697261369543 Author: Ronan Dunklau	2023-01-08 22:37:33 +03:00
David Rowley	b82557ecc2	Fix some compiler warnings in aset.c and generation.c This fixes a couple of unused variable warnings that could be seen when compiling with MEMORY_CONTEXT_CHECKING but not USE_ASSERT_CHECKING. Defining MEMORY_CONTEXT_CHECKING without asserts is a little unusual, however, we shouldn't be producing any warnings from such a build. Author: Richard Guo Discussion: https://postgr.es/m/CAMbWs4_D-vgLEh7eO47p=73u1jWO78NWf6Qfv1FndY1kG-Q-jA@mail.gmail.com	2023-01-05 12:56:17 +13:00
Michael Paquier	33ab0a2a52	Fix typos in comments, code and documentation While on it, newlines are removed from the end of two elog() strings. The others are simple grammar mistakes. One comment in pg_upgrade referred incorrectly to sequences since `a7e5457`. Author: Justin Pryzby Discussion: https://postgr.es/m/20221230231257.GI1153@telsasoft.com Backpatch-through: 11	2023-01-03 16:26:14 +09:00
Bruce Momjian	c8e1ba736b	Update copyright for 2023 Backpatch-through: 11	2023-01-02 15:00:37 -05:00
Tom Lane	2ceea5adb0	Accept "+infinity" in date and timestamp[tz] input. The float and numeric types accept this variant spelling of "infinity", so it seems like the datetime types should too. Vik Fearing, some cosmetic mods by me Discussion: https://postgr.es/m/d0bef637-2dbd-0a5d-e539-48243b6f6c5e@postgresfriends.org	2023-01-01 14:16:07 -05:00
Michael Paquier	7aa81c61ec	Fix precision handling for some COERCE_SQL_SYNTAX functions `f193883` has been incorrectly setting up the precision used in the timestamp compilations returned by the following functions: - LOCALTIME - LOCALTIMESTAMP - CURRENT_TIME - CURRENT_TIMESTAMP Specifying an out-of-range precision for CURRENT_TIMESTAMP and LOCALTIMESTAMP was raising a WARNING without adjusting the precision, leading to a subsequent error. LOCALTIME and CURRENT_TIME raised a WARNING without an error, still the precision given to the internal routines was not correct, so let's be clean. Ian has reported the problems in timestamp.c, while I have noticed the ones in date.c. Regression tests are added for all of them with precisions high enough to provide coverage for the warnings, something that went missing up to this commit. Author: Ian Lawrence Barwick, Michael Paquier Discussion: https://postgr.es/m/CAB8KJ=jQEnn9sYG+N752spt68wMrhmT-ocHCh4oeNmHF82QMWA@mail.gmail.com	2022-12-30 20:47:57 +09:00
Peter Eisentraut	1f605b82ba	Change argument of appendBinaryStringInfo from char * to void * There is some code that uses this function to assemble some kind of packed binary layout, which requires a bunch of casts because of this. Functions taking binary data plus length should take void * instead, like memcpy() for example. Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com	2022-12-30 11:05:09 +01:00
Peter Eisentraut	33a33f0ba4	Use appendStringInfoString instead of appendBinaryStringInfo where possible For the jsonpath output, we don't need to squeeze out every bit of performance, so instead use a more robust coding style. There are similar calls in jsonb.c, which we leave alone here since there is indeed a performance impact for bulk exports. Discussion: https://www.postgresql.org/message-id/flat/a0086cfc-ff0f-2827-20fe-52b591d2666c%40enterprisedb.com	2022-12-30 11:05:09 +01:00
Peter Eisentraut	faf3750657	Add const to BufFileWrite Make data buffer argument to BufFileWrite a const pointer and bubble this up to various callers and related APIs. This makes the APIs clearer and more consistent. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com	2022-12-30 10:12:24 +01:00
Peter Eisentraut	5f2f99c9c6	Remove unnecessary casts Some code carefully cast all data buffer arguments for data write and read function calls to void , even though the respective arguments are already void . Remove this unnecessary clutter. Discussion: https://www.postgresql.org/message-id/flat/11dda853-bb5b-59ba-a746-e168b1ce4bdb%40enterprisedb.com	2022-12-30 10:12:24 +01:00
Tom Lane	a5434c5258	Remove new locale dependency in regproc regression test. The modified error message for regcollationin failure includes the database encoding, which it should've occurred to me is a portability hazard for the regression tests. Adjust the test so the expected output doesn't include that. In passing, fix a comment typo introduced in `b8c0ffbd2`. Per buildfarm.	2022-12-27 13:06:42 -05:00
Tom Lane	3ea7329c9a	Simplify the implementations of the to_reg* functions. Given the soft-input-error feature, we can reduce these functions to be just thin wrappers around a soft-error call of the corresponding datatype input function. This means less code and more certainty that the to_reg* functions match the normal input behavior. Notably, it also means that they will accept numeric OID input, which they didn't before. It's not clear to me if that omission had more than laziness behind it, but it doesn't seem like something we need to work hard to preserve. Discussion: https://postgr.es/m/3910031.1672095600@sss.pgh.pa.us	2022-12-27 12:33:04 -05:00
Tom Lane	858e776c84	Convert the reg* input functions to report (most) errors softly. This is not really complete, but it catches most cases of practical interest. The main omissions are: * regtype, regprocedure, and regoperator parse type names by calling the main grammar, so any grammar-detected syntax error will still be a hard error. Also, if one includes a type modifier in such a type specification, errors detected by the typmodin function will be hard errors. * Lookup errors are handled just by passing missing_ok = true to the relevant catalog lookup function. Because we've used quite a restrictive definition of "missing_ok", this means that edge cases such as "the named schema exists, but you lack USAGE permission on it" are still hard errors. It would make sense to me to replace most/all missing_ok parameters with an escontext parameter and then allow these additional lookup failure cases to be trapped too. But that's a job for some other day. Discussion: https://postgr.es/m/3342239.1671988406@sss.pgh.pa.us	2022-12-27 12:26:01 -05:00
Tom Lane	78212f2101	Convert tsqueryin and tsvectorin to report errors softly. This is slightly tedious because the adjustments cascade through a couple of levels of subroutines, but it's not very hard. I chose to avoid changing function signatures more than absolutely necessary, by passing the escontext pointer in existing structs where possible. tsquery's nuisance NOTICEs about empty queries are suppressed in soft-error mode, since they're not errors and we surely don't want them to be shown to the user anyway. Maybe that whole behavior should be reconsidered. Discussion: https://postgr.es/m/3824377.1672076822@sss.pgh.pa.us	2022-12-27 12:00:31 -05:00
Tom Lane	eb8312a22a	Detect bad input for types xid, xid8, and cid. Historically these input functions just called strtoul or strtoull and returned the result, with no error detection whatever. Upgrade them to reject garbage input and out-of-range values, similarly to our other numeric input routines. To share the code for this with type oid, adjust the existing "oidin_subr" to be agnostic about the SQL name of the type it is handling, and move it to numutils.c; then clone it for 64-bit types. Because the xid types previously accepted hex and octal input by reason of calling strtoul[l] with third argument zero, I made the common subroutine do that too, with the consequence that type oid now also accepts hex and octal input. In view of `6fcda9aba`, that seems like a good thing. While at it, simplify the existing over-complicated handling of syntax errors from strtoul: we only need one ereturn not three. Discussion: https://postgr.es/m/3526121.1672000729@sss.pgh.pa.us	2022-12-27 11:40:01 -05:00
Amit Kapila	5de94a041e	Add 'logical_decoding_mode' GUC. This enables streaming or serializing changes immediately in logical decoding. This parameter is intended to be used to test logical decoding and replication of large transactions for which otherwise we need to generate the changes till logical_decoding_work_mem is reached. This helps in reducing the timing of existing tests related to logical replication of in-progress transactions and will help in writing tests for for the upcoming feature for parallelly applying large in-progress transactions. Author: Shi yu Reviewed-by: Sawada Masahiko, Shveta Mallik, Amit Kapila, Dilip Kumar, Kuroda Hayato, Kyotaro Horiguchi Discussion: https://postgr.es/m/OSZPR01MB63104E7449DBE41932DB19F1FD1B9@OSZPR01MB6310.jpnprd01.prod.outlook.com	2022-12-26 08:58:16 +05:30
Tom Lane	442e25d248	Convert enum_in() to report errors softly. I missed this in my initial survey, probably because I examined the contents of pg_type in the postgres database, which lacks any enumerated types. Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com	2022-12-25 14:32:30 -05:00
Andrew Dunstan	e37fe1db6e	Convert jsonpath's input function to report errors softly Reviewed by Tom Lane Discussion: https://postgr.es/m/a8dc5700-c341-3ba8-0507-cc09881e6200@dunslane.net	2022-12-24 15:21:20 -05:00
Tom Lane	780ec9f1b2	Make the numeric-OID cases of regprocin and friends be non-throwing. While at it, use a common subroutine already. This doesn't move the needle very far in terms of making these functions non-throwing; the only case we're now able to trap is numeric-OID-is-out-of-range. Still, it seems like a pretty non-controversial step in that direction.	2022-12-24 15:01:21 -05:00
David Rowley	ed1a88ddac	Allow window functions to adjust their frameOptions WindowFuncs such as row_number() don't care if it's called with ROWS UNBOUNDED PRECEDING AND CURRENT ROW or with RANGE UNBOUNDED PRECEDING AND CURRENT ROW. The latter is less efficient as the RANGE option requires that the executor check for peer rows, so using the ROW option instead would cause less overhead. Because RANGE is part of the default frame options for WindowClauses, it means WindowAgg is, by default, working much harder than it needs to for window functions where the ROWS / RANGE option has no effect on the window function's result. On a test query from the discussion thread, a performance improvement of 344% was seen by using ROWS instead of RANGE. Here we add a new support function node type to allow support functions to be called for window functions so that the most optimal version of the frame options can be set. The planner has been adjusted so that the frame options are changed only if all window functions sharing the same window clause agree on what the optimized frame options are. Here we give the ability for row_number(), rank(), dense_rank(), percent_rank(), cume_dist() and ntile() to alter their WindowClause's frameOptions. Reviewed-by: Vik Fearing, Erwin Brandstetter, Zhihong Yu Discussion: https://postgr.es/m/CAGHENJ7LBBszxS+SkWWFVnBmOT2oVsBhDMB1DFrgerCeYa_DyA@mail.gmail.com Discussion: https://postgr.es/m/CAApHDvohAKEtTXxq7Pc-ic2dKT8oZfbRKeEJP64M0B6+S88z+A@mail.gmail.com	2022-12-23 12:43:52 +13:00
Thomas Munro	cc15059634	Improve notation of cacheinfo table in syscache.c. Use C99 designated initializer syntax for the array elements, instead of writing the enumerator name and position in a comment. Replace nkeys and key with a local variadic macro, for a shorter notation. Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Peter Eisentraut <peter.eisentraut@enterprisedb.com> Discussion: https://postgr.es/m/CA%2BhUKGKdpDjKL2jgC-GpoL4DGZU1YPqnOFHbDqFkfRQcPaR5DQ%40mail.gmail.com	2022-12-23 10:40:18 +13:00
David Rowley	439f61757f	Add palloc_aligned() to allow aligned memory allocations This introduces palloc_aligned() and MemoryContextAllocAligned() which allow callers to obtain memory which is allocated to the given size and also aligned to the specified alignment boundary. The alignment boundaries may be any power-of-2 value. Currently, the alignment is capped at 2^26, however, we don't expect values anything like that large. The primary expected use case is to align allocations to perhaps CPU cache line size or to maybe I/O page size. Certain use cases can benefit from having aligned memory by either having better performance or more predictable performance. The alignment is achieved by requesting 'alignto' additional bytes from the underlying allocator function and then aligning the address that is returned to the requested alignment. This obviously does waste some memory, so alignments should be kept as small as what is required. It's also important to note that these alignment bytes eat into the maximum allocation size. So something like: palloc_aligned(MaxAllocSize, 64, 0); will not work as we cannot request MaxAllocSize + 64 bytes. Additionally, because we're just requesting the requested size plus the alignment requirements from the given MemoryContext, if that context is the Slab allocator, then since slab can only provide chunks of the size that's specified when the slab context is created, then this is not going to work. Slab will generate an error to indicate that the requested size is not supported. The alignment that is requested in palloc_aligned() is stored along with the allocated memory. This allows the alignment to remain intact through repalloc() calls. Author: Andres Freund, David Rowley Reviewed-by: Maxim Orlov, Andres Freund, John Naylor Discussion: https://postgr.es/m/CAApHDvpxLPUMV1mhxs6g7GNwCP6Cs6hfnYQL5ffJQTuFAuxt8A%40mail.gmail.com	2022-12-22 13:32:05 +13:00
Andrew Dunstan	33dd895ef3	Introduce float4in_internal This is the guts of float4in, callable as a routine to input floats, which will be useful in an upcoming patch for allowing soft errors in the seg module's input function. A similar operation was performed some years ago for float8in in commit `50861cd683`. Reviewed by Tom Lane Discussion: https://postgr.es/m/cee4e426-d014-c0b7-aa22-a659f2cd9130@dunslane.net	2022-12-21 16:55:52 -05:00
David Rowley	eb706fde83	Fix newly introduced bug in slab.c `d21ded75f` changed the way slab.c works but introduced a bug that meant we could end up with the slab's curBlocklistIndex pointing to the wrong list. The condition which was checking for this was failing to account for two things: 1. The curBlocklistIndex could be 0 as we've currently got no non-full blocks to put chunks on. In this case, the dlist_is_empty() check cannot be performed as there can be any number of completely full blocks at that index. 2. The curBlocklistIndex may be greater than the index we just moved the block onto. Since we need to ensure we fill up fuller blocks first, we must reset curBlocklistIndex when changing any blocklist element that's less than the curBlocklistIndex too. Reported-by: Takamichi Osumi Discussion: https://postgr.es/m/TYCPR01MB8373329C6329768D7E093D68EDEB9@TYCPR01MB8373.jpnprd01.prod.outlook.com	2022-12-22 09:57:49 +13:00
Michael Paquier	22e3b55805	Switch some system functions to use get_call_result_type() This shaves some code by replacing the combinations of CreateTemplateTupleDesc()/TupleDescInitEntry() hardcoding a mapping of the attributes listed in pg_proc.dat by get_call_result_type() to build the TupleDesc needed for the rows generated. get_call_result_type() is more expensive than the former style, but this removes some duplication with the lists of OUT parameters (pg_proc.dat and the attributes hardcoded in these code paths). This is applied to functions that are not considered as critical (aka that could be called repeatedly for monitoring purposes). Author: Bharath Rupireddy Reviewed-by: Robert Haas, Álvaro Herrera, Tom Lane, Michael Paquier Discussion: https://postgr.es/m/CALj2ACV23HW5HP5hFjd89FNS-z5X8r2jNXdMXcpN2BgTtKd87w@mail.gmail.com	2022-12-21 10:11:22 +09:00
Andrew Dunstan	8284cf5f74	Add copyright notices to meson files Discussion: https://postgr.es/m/222b43a5-2fb3-2c1b-9cd0-375d376c8246@dunslane.net	2022-12-20 07:54:39 -05:00
David Rowley	3226f47282	Add enable_presorted_aggregate GUC `1349d279` added query planner support to allow more efficient execution of aggregate functions which have an ORDER BY or a DISTINCT clause. Prior to that commit, the planner would only request that the lower planner produce a plan with the order required for the GROUP BY clause and it would be left up to nodeAgg.c to perform the final sort of records within each group so that the aggregate transition functions were called in the correct order. Now that the planner requests the lower planner produce a plan with the GROUP BY and the ORDER BY / DISTINCT aggregates in mind, there is the possibility that the planner chooses a plan which could be less efficient than what would have been produced before `1349d279`. While developing `1349d279`, I had in mind that Incremental Sort would help us in cases where an index exists only on the GROUP BY column(s). Incremental Sort would just replace the implicit tuplesorts which are being performed in nodeAgg.c. However, because the planner has the flexibility to instead choose a plan which just performs a full sort on both the GROUP BY and ORDER BY / DISTINCT aggregate columns, there is potential for the planner to make a bad choice. The costing for Incremental Sort is not perfect as it assumes an even distribution of rows to sort within each sort group. Here we add an escape hatch in the form of the enable_presorted_aggregate GUC. This will allow users to get the pre-PG16 behavior in cases where they have no other means to convince the query planner to produce a plan which only sorts on the GROUP BY column(s). Discussion: https://postgr.es/m/CAApHDvr1Sm+g9hbv4REOVuvQKeDWXcKUAhmbK5K+dfun0s9CvA@mail.gmail.com	2022-12-20 22:28:58 +13:00
David Rowley	d21ded75fd	Improve the performance of the slab memory allocator Slab has traditionally been fairly slow when compared with the AllocSet or Generation memory allocators. Part of this slowness came from having to write out an entire block when we allocate a new block in order to populate the free list indexes within the block's memory. Additional slowness came from having to move a block onto another dlist each time we palloc or pfree a chunk from it. Here we optimize both of those cases and do a little bit extra to improve the performance of the slab allocator. Here, instead of writing out the free list indexes when allocating a new block, we introduce the concept of "unused" chunks. When a block is first allocated all chunks are unused. These chunks only make it onto the free list when they are pfree'd. When allocating new chunks on an existing block, we have the choice of consuming a chunk from the free list or an unused chunk. When both exist, we opt to use one from the free list, as these have been used already and the memory of them is more likely to be cached by the CPU. Here we also reduce the number of block lists from there being one for every possible value of free chunks on a block to just having a small fixed number of block lists. We keep the 0th block list for completely full blocks and anything else stores blocks for some range of free chunks with fuller blocks appearing on lower block list array elements. This reduces how often we must move a block to another list when we allocate or free chunks, but still allows us to prefer to put new chunks on fuller blocks and perhaps allow blocks with fewer chunks to be free'd later once all their remaining chunks have been pfree'd. Additionally, we now store a list of "emptyblocks", which are blocks that no longer contain any allocated chunks. We now keep up to 10 of these around to avoid having to thrash malloc/free when allocation patterns continually cause blocks to become free of any allocated chunks only to allocate more chunks again. Now only once we have 10 of these, we free the block. This does raise the high water mark for the total memory that a slab context can consume. It does not seem entirely unreasonable that we might one day want to make this a property of SlabContext rather than a compile-time constant. Let's wait and see if there is any evidence to support that this is required before doing it. Author: Andres Freund, David Rowley Tested-by: Tomas Vondra, John Naylor Discussion: https://postgr.es/m/20210717194333.mr5io3zup3kxahfm@alap3.anarazel.de	2022-12-20 21:48:51 +13:00
John Naylor	995a9fb14f	Move variable increment to the end of the loop This is less error prone and matches the placement of other code in the file. Justin Pryzby Reviewed by Tom Lane Discussion: https://www.postgresql.org/message-id/20221123172436.GJ11463@telsasoft.com	2022-12-20 14:13:14 +07:00
Robert Haas	10ea0f924a	Expose some information about backend subxact status. A new function pg_stat_get_backend_subxact() can be used to get information about the number of subtransactions in the cache of a particular backend and whether that cache has overflowed. This can be useful for tracking down performance problems that can result from overflowed snapshots. Dilip Kumar, reviewed by Zhihong Yu, Nikolay Samokhvalov, Justin Pryzby, Nathan Bossart, Ashutosh Sharma, Julien Rouhaud. Additional design comments from Andres Freund, Tom Lane, Bruce Momjian, and David G. Johnston. Discussion: http://postgr.es/m/CAFiTN-ut0uwkRJDQJeDPXpVyTWD46m3gt3JDToE02hTfONEN=Q@mail.gmail.com	2022-12-19 14:43:09 -05:00
Tom Lane	c4939f1215	Clean up dubious error handling in wellformed_xml(). This ancient bit of code was summarily trapping any ereport longjmp whatsoever and assuming that it must represent an invalid-XML report. It's not really appropriate to handle OOM-like situations that way: maybe the input is valid or maybe not, but we couldn't find out. And it'd be a seriously bad idea to ignore, say, a query cancel error that way. (Perhaps that can't happen because there is no CHECK_FOR_INTERRUPTS anywhere within xml_parse, but even if that's true today it's obviously a very fragile assumption.) But in the wake of the previous commit, we can drop the PG_TRY here altogether, and use the soft error mechanism to catch only the kinds of errors that are legitimate to treat as invalid-XML. (This is our first use of the soft error mechanism for something not directly related to a datatype input function. It won't be the last.) xml_is_document can be converted in the same way. That one is not actively broken, because it was checking specifically for ERRCODE_INVALID_XML_DOCUMENT rather than trapping everything; but the code is still shorter and probably faster this way. Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us	2022-12-16 11:10:40 -05:00
Tom Lane	37bef842f5	Convert xml_in to report errors softly. The key idea here is that xml_parse must distinguish hard errors from soft errors. We want to throw a hard error for libxml initialization failures: those might be out-of-memory, or something else, but in any case they are not the fault of the input string. If we get to the point of parsing the input, and something goes wrong, we can fairly consider that to mean bad input. One thing that arguably does mean bad input, but I didn't trouble to handle softly, is encoding conversion failure while converting the server encoding to UTF8. This might be something to improve later, but it seems like a pretty low-probability scenario. Discussion: https://postgr.es/m/3564577.1671142683@sss.pgh.pa.us	2022-12-16 11:10:40 -05:00
Tom Lane	d35a1af468	Convert range_in and multirange_in to report errors softly. This is mostly straightforward, except that if the range type has a canonical function, that might throw an error during range input. (Such errors probably only occur for edge cases: in the in-core canonical functions, it happens only if a bound has the maximum valid value for the underlying type.) Hence, this patch extends the soft-error regime to allow canonical functions to return errors softly as well. Extensions implementing range canonical functions will need modification anyway because of the API change for range_serialize(); while at it, they might want to do something similar to what's been done here in the in-core canonical functions. Discussion: https://postgr.es/m/3284599.1671075185@sss.pgh.pa.us	2022-12-15 12:18:36 -05:00
Peter Eisentraut	75f49221c2	Static assertions cleanup Because we added StaticAssertStmt() first before StaticAssertDecl(), some uses as well as the instructions in c.h are now a bit backwards from the "native" way static assertions are meant to be used in C. This updates the guidance and moves some static assertions to better places. Specifically, since the addition of StaticAssertDecl(), we can put static assertions at the file level. This moves a number of static assertions out of function bodies, where they might have been stuck out of necessity, to perhaps better places at the file level or in header files. Also, when the static assertion appears in a position where a declaration is allowed, then using StaticAssertDecl() is more native than StaticAssertStmt(). Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Discussion: https://www.postgresql.org/message-id/flat/941a04e7-dd6f-c0e4-8cdf-a33b3338cbda%40enterprisedb.com	2022-12-15 10:10:32 +01:00
Tom Lane	3b9d2deb67	Convert a few more datatype input functions to report errors softly. Convert the remaining string-category input functions (bpcharin, varcharin, byteain) to the new style. Discussion: https://postgr.es/m/3038346.1671060258@sss.pgh.pa.us	2022-12-14 19:42:05 -05:00
Tom Lane	90161dad4d	Convert a few more datatype input functions to report errors softly. Convert cash_in and uuid_in to the new style. Amul Sul, minor mods by me Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com	2022-12-14 18:03:11 -05:00
Tom Lane	47f3f97fcd	Convert a few more datatype input functions to report errors softly. Convert assorted internal-ish datatypes, namely aclitemin, int2vectorin, oidin, oidvectorin, pg_lsn_in, pg_snapshot_in, and tidin to the new style. (Some others you might expect to find in this group, such as cidin and xidin, need no changes because they never throw errors at all. That seems a little cheesy ... but it is not in the charter of this patch series to add new error conditions.) Amul Sul, minor mods by me Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com	2022-12-14 17:50:24 -05:00
Tom Lane	332741e739	Convert the geometric input functions to report errors softly. Convert box_in, circle_in, line_in, lseg_in, path_in, point_in, and poly_in to the new style. line_in still throws hard errors for overflows/underflows that can occur when the input is specified as two points rather than in the canonical "Ax + By + C = 0" style. I'm not too concerned about that: it won't be reached in normal dump/restore cases, and it's fairly debatable that such conversion should ever have been made part of a type input function in the first place. But in any case, I don't want to extend the soft error conventions into float.h without more discussion than this patch has had. Amul Sul, minor mods by me Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com	2022-12-14 16:10:20 -05:00
Tom Lane	17407a8eaa	Convert a few more datatype input functions to report errors softly. Convert bit_in, varbit_in, inet_in, cidr_in, macaddr_in, and macaddr8_in to the new style. Amul Sul, minor mods by me Discussion: https://postgr.es/m/CAAJ_b97KeDWUdpTKGOaFYPv0OicjOu6EW+QYWj-Ywrgj_aEy1g@mail.gmail.com	2022-12-14 13:22:08 -05:00
Peter Eisentraut	b18c2decd7	Rearrange some static assertions for consistency Put lengthof first. Reported-by: Peter Smith <smithpb2250@gmail.com> Discussion: https://www.postgresql.org/message-id/CAHut+PsUDMySVRuRc=h+P5N3+=TGvj4W_mi32XXg9dt4o-BXbA@mail.gmail.com	2022-12-14 16:08:13 +01:00
Peter Eisentraut	6fcda9aba8	Non-decimal integer literals Add support for hexadecimal, octal, and binary integer literals: 0x42F 0o273 0b100101 per SQL:202x draft. This adds support in the lexer as well as in the integer type input functions. Reviewed-by: John Naylor <john.naylor@enterprisedb.com> Reviewed-by: Zhihong Yu <zyu@yugabyte.com> Reviewed-by: David Rowley <dgrowleyml@gmail.com> Reviewed-by: Dean Rasheed <dean.a.rasheed@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/b239564c-cad0-b23e-c57e-166d883cb97d@enterprisedb.com	2022-12-14 06:17:07 +01:00
Jeff Davis	60684dd834	Add grantable MAINTAIN privilege and pg_maintain role. Allows VACUUM, ANALYZE, REINDEX, REFRESH MATERIALIZED VIEW, CLUSTER, and LOCK TABLE. Effectively reverts `4441fc704d`. Instead of creating separate privileges for VACUUM, ANALYZE, and other maintenance commands, group them together under a single MAINTAIN privilege. Author: Nathan Bossart Discussion: https://postgr.es/m/20221212210136.GA449764@nathanxps13 Discussion: https://postgr.es/m/45224.1670476523@sss.pgh.pa.us	2022-12-13 17:33:28 -08:00
Tom Lane	b0feda79fd	Fix jsonb subscripting to cope with toasted subscript values. jsonb_get_element() was incautious enough to use VARDATA() and VARSIZE() directly on an arbitrary text Datum. That of course fails if the Datum is short-header, compressed, or out-of-line. The typical result would be failing to match any element of a jsonb object, though matching the wrong one seems possible as well. setPathObject() was slightly brighter, in that it used VARDATA_ANY and VARSIZE_ANY_EXHDR, but that only kept it out of trouble for short-header Datums. push_path() had the same issue. This could result in faulty subscripted insertions, though keys long enough to cause a problem are likely rare in the wild. Having seen these, I looked around for unsafe usages in the rest of the adt/json* files. There are a couple of places where it's not immediately obvious that the Datum can't be compressed or out-of-line, so I added pg_detoast_datum_packed() to cope if it is. Also, remove some other usages of VARDATA/VARSIZE on Datums we just extracted from a text array. Those aren't actively broken, but they will become so if we ever start allowing short-header array elements, which does not seem like a terribly unreasonable thing to do. In any case they are not great coding examples, and they could also do with comments pointing out that we're assuming we don't need pg_detoast_datum_packed. Per report from exe-dealer@yandex.ru. Patch by me, but thanks to David Johnston for initial investigation. Back-patch to v14 where jsonb subscripting was introduced. Discussion: https://postgr.es/m/205321670615953@mail.yandex.ru	2022-12-12 16:17:54 -05:00
Tom Lane	b8c0ffbd2c	Convert domain_in to report errors softly. This is straightforward as far as it goes. However, it does not attempt to trap errors occurring during the execution of domain CHECK constraints. Since those are general user-defined expressions, the only way to do that would involve starting up a subtransaction for each check. Of course the entire point of the soft-errors feature is to not need subtransactions, so that would be self-defeating. For now, we'll rely on the assumption that domain checks are written to avoid throwing errors. Discussion: https://postgr.es/m/1181028.1670635727@sss.pgh.pa.us	2022-12-11 12:56:54 -05:00
Tom Lane	c60c9badba	Convert json_in and jsonb_in to report errors softly. This requires a bit of further infrastructure-extension to allow trapping errors reported by numeric_in and pg_unicode_to_server, but otherwise it's pretty straightforward. In the case of jsonb_in, we are only capturing errors reported during the initial "parse" phase. The value-construction phase (JsonbValueToJsonb) can also throw errors if assorted implementation limits are exceeded. We should improve that, but it seems like a separable project. Andrew Dunstan and Tom Lane Discussion: https://postgr.es/m/3bac9841-fe07-713d-fa42-606c225567d6@dunslane.net	2022-12-11 11:28:15 -05:00
Tom Lane	50428a301d	Change JsonSemAction to allow non-throw error reporting. Formerly, semantic action functions for the JSON parser returned void, so that there was no way for them to affect the parser's behavior. That means in particular that they can't force an error exit except by longjmp'ing. That won't do in the context of our project to make input functions return errors softly. Hence, change them to return the same JsonParseErrorType enum value as the parser itself uses. If an action function returns anything besides JSON_SUCCESS, the parse is abandoned and that error code is returned. Action functions can thus easily return the same error conditions that the parser already knows about. As an escape hatch for expansion, also invent a code JSON_SEM_ACTION_FAILED that the core parser does not know the exact meaning of. When returning this code, an action function must use some out-of-band mechanism for reporting the error details. This commit simply makes the API change and causes all the existing action functions to return JSON_SUCCESS, so that there is no actual change in behavior here. This is long enough and boring enough that it seemed best to commit it separately from the changes that make real use of the new mechanism. In passing, remove a duplicate assignment of transform_string_values_scalar. Discussion: https://postgr.es/m/1436686.1670701118@sss.pgh.pa.us	2022-12-11 10:39:05 -05:00
Tom Lane	d02ef65bce	Standardize error reports in unimplemented I/O functions. We chose a specific wording of the not-implemented errors for pseudotype I/O functions and other cases where there's little value in implementing input and/or output. gtsvectorin never got that memo though, nor did most of contrib. Make these all fall in line, mostly because I'm a neatnik but also to remove unnecessary translatable strings. gbtreekey_in needs a bit of extra love since it supports multiple SQL types. Sadly, gbtreekey_out doesn't have the ability to do that, but I think it's unreachable anyway. Noted while surveying datatype input functions to see what we have left to fix.	2022-12-10 18:26:43 -05:00

1 2 3 4 5 ...

8107 Commits