postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-19 13:42:17 +03:00

Author	SHA1	Message	Date
Alexander Korotkov	033dd02db2	Put abbreviation logic into puttuple_common() Abbreviation code is very similar along tuplesort_put() functions. This commit unifies that code and puts it into puttuple_common(). tuplesort_put() functions differs in the abbreviation condition, so it has been added as an argument to the puttuple_common() function. Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent Reviewed-by: Andres Freund, John Naylor	2022-07-27 08:27:46 +03:00
Alexander Korotkov	cadfdd1edf	Add new Tuplesortstate.removeabbrev function This commit is the preparation to move abbreviation logic into puttuple_common(). The new removeabbrev function turns datum1 representation of SortTuple's from the abbreviated key to the first column value. Therefore, it encapsulates the differential part of abbreviation handling code in tuplesort_put*() functions, making these functions similar. Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent Reviewed-by: Andres Freund, John Naylor	2022-07-27 08:27:29 +03:00
Alexander Korotkov	d47da3162b	Remove Tuplesortstate.copytup function It's currently unclear how do we split functionality between Tuplesortstate.copytup() function and tuplesort_put() functions. For instance, copytup_index() and copytup_datum() return error while tuplesort_putindextuplevalues() and tuplesort_putdatum() do their work. This commit removes Tuplesortstate.copytup() altogether, putting the corresponding code into tuplesort_put(). Discussion: https://postgr.es/m/CAPpHfdvjix0Ahx-H3Jp1M2R%2B_74P-zKnGGygx4OWr%3DbUQ8BNdw%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Pavel Borisov, Maxim Orlov, Matthias van de Meent Reviewed-by: Andres Freund, John Naylor	2022-07-27 08:26:53 +03:00
Tom Lane	e64cdab003	Invent qsort_interruptible(). Justin Pryzby reported that some scenarios could cause gathering of extended statistics to spend many seconds in an un-cancelable qsort() operation. To fix, invent qsort_interruptible(), which is just like qsort_arg() except that it will also do CHECK_FOR_INTERRUPTS every so often. This bloats the backend by a couple of kB, which seems like a good investment. (We considered just enabling CHECK_FOR_INTERRUPTS in the existing qsort and qsort_arg functions, but there are some callers for which that'd demonstrably be unsafe. Opt-in seems like a better way.) For now, just apply qsort_interruptible() in statistics collection. There's probably more places where it could be useful, but we can always change other call sites as we find problems. Back-patch to v14. Before that we didn't have extended stats on expressions, so that the problem was less severe. Also, this patch depends on the sort_template infrastructure introduced in v14. Tom Lane and Justin Pryzby Discussion: https://postgr.es/m/20220509000108.GQ28830@telsasoft.com	2022-07-12 16:30:36 -04:00
David Rowley	0229106afa	Overload index_form_tuple to allow the memory context to be supplied `40af10b57` changed things so we make use of a generation memory context for storing tuples to be sorted by tuplesort.c. That change does not play nicely with the changes made in `9f03ca915` (back in 2014). That commit changed things so that index_form_tuple() is called while switched into the tuplestore's tuplecontext. In order to fetch the tuple from the index, index_form_tuple() must do various memory allocations which are unrelated to the storage of the final returned tuple. Although all of these allocations are pfree'd, the fact that we now use a generation context means that the memory for these pfree'd allocations won't be used again by any other allocation due to generation.c's lack of freelists. This could result in sorts used for building indexes exceeding maintenance_work_mem by a very large amount. Here we fix it so we no longer allocate anything apart from the tuple itself into the generation context by adding a new version of index_form_tuple named index_form_tuple_context, which can be called to specify the MemoryContext to allocate the tuple into. Discussion: https://postgr.es/m/CAApHDvrHQkiFRHiGiAS-LMOvJN-eK-s762=tVzBz8ZqUea-a_A@mail.gmail.com Backpatch-through: 15, where `40af10b57` was added.	2022-07-07 08:14:00 +12:00
John Naylor	6e647ef0e7	Remove debug messages from tuplesort_sort_memtuples() These were of value only during development. Reported by Justin Pryzby Discussion: https://www.postgresql.org/message-id/20220519201254.GU19626%40telsasoft.com	2022-05-23 13:11:43 +07:00
Tom Lane	23e7b38bfe	Pre-beta mechanical code beautification. Run pgindent, pgperltidy, and reformat-dat-files. I manually fixed a couple of comments that pgindent uglified.	2022-05-12 15:17:30 -04:00
David Rowley	c90c16591c	Fix some incorrect preprocessor tests in tuplesort specializations `697492434` added 3 new quicksort specialization functions for common datatypes. That commit was not very consistent in how it would determine if we're compiling for 32-bit or 64-bit machines. It would sometimes use USE_FLOAT8_BYVAL and at other times check if SIZEOF_DATUM == 8. This could cause theoretical problems due to the way USE_FLOAT8_BYVAL is now defined based on SIZEOF_VOID_P >= 8. If pointers for some reason were ever larger than 8-bytes then we'd end up doing 32-bit comparisons mistakenly. Let's just always check SIZEOF_DATUM >= 8. It also seems that ssup_datum_signed_cmp is just never used on 32-bit builds, so let's just ifdef that out to make sure we never accidentally use that comparison function on such machines. This also allows us to ifdef out 1 of the 3 new specialization quicksort functions in 32-bit builds which seems to shrink down the binary by over 4KB on my machine. In passing, also add the missing DatumGetInt32() / DatumGetInt64() macros in the comparison functions. Discussion: https://postgr.es/m/CAApHDvqcQExRhtRa9hJrJB_5egs3SUfOcutP3m+3HO8A+fZTPA@mail.gmail.com Reviewed-by: John Naylor	2022-05-11 11:38:13 +12:00
David Rowley	99c754129d	Fix performance regression in tuplesort specializations `697492434` added 3 new qsort specialization functions aimed to improve the performance of sorting many of the common pass-by-value data types when they're the leading or only sort key. Unfortunately, that has caused a performance regression when sorting datasets where many of the values being compared were equal. What was happening here was that we were falling back to the standard sort comparison function to handle tiebreaks. When the two given Datums compared equally we would incur both the overhead of an indirect function call to the standard comparer to perform the tiebreak and also the standard comparer function would go and compare the leading key needlessly all over again. Here improve the situation in the 3 new comparison functions. We now return 0 directly when the two Datums compare equally and we're performing a 1-key sort. Here we don't do anything to help the multi-key sort case where the leading key uses one of the sort specializations functions. On testing this case, even when the leading key's values are all equal, there appeared to be no performance regression. Let's leave it up to future work to optimize that case so that the tiebreak function no longer re-compares the leading key over again. Another possible fix for this would have been to add 3 additional sort specialization functions to handle single-key sorts for these pass-by-value types. The reason we didn't do that here is that we may deem some other sort specialization to be more useful than single-key sorts. It may be impractical to have sort specialization functions for every single combination of what may be useful and it was already decided that further analysis into which ones are the most useful would be delayed until the v16 cycle. Let's not let this regression force our hand into trying to make that decision for v15. Author: David Rowley Reviewed-by: John Naylor Discussion: https://postgr.es/m/CA+hUKGJRbzaAOUtBUcjF5hLtaSHnJUqXmtiaLEoi53zeWSizeA@mail.gmail.com	2022-04-22 16:02:15 +12:00
Peter Geoghegan	8ab0ebb9a8	Fix CLUSTER tuplesorts on abbreviated expressions. CLUSTER sort won't use the datum1 SortTuple field when clustering against an index whose leading key is an expression. This makes it unsafe to use the abbreviated keys optimization, which was missed by the logic that sets up SortSupport state. Affected tuplesorts output tuples in a completely bogus order as a result (the wrong SortSupport based comparator was used for the leading attribute). This issue is similar to the bug fixed on the master branch by recent commit `cc58eecc5d`. But it's a far older issue, that dates back to the introduction of the abbreviated keys optimization by commit `4ea51cdfe8`. Backpatch to all supported versions. Author: Peter Geoghegan <pg@bowt.ie> Author: Thomas Munro <thomas.munro@gmail.com> Discussion: https://postgr.es/m/CA+hUKG+bA+bmwD36_oDxAoLrCwZjVtST2fqe=b4=qZcmU7u89A@mail.gmail.com Backpatch: 10-	2022-04-20 17:17:43 -07:00
Alvaro Herrera	24d2b2680a	Remove extraneous blank lines before block-closing braces These are useless and distracting. We wouldn't have written the code with them to begin with, so there's no reason to keep them. Author: Justin Pryzby <pryzby@telsasoft.com> Discussion: https://postgr.es/m/20220411020336.GB26620@telsasoft.com Discussion: https://postgr.es/m/attachment/133167/0016-Extraneous-blank-lines.patch	2022-04-13 19:16:02 +02:00
David Rowley	40af10b571	Use Generation memory contexts to store tuples in sorts The general usage pattern when we store tuples in tuplesort.c is that we store a series of tuples one by one then either perform a sort or spill them to disk. In the common case, there is no pfreeing of already stored tuples. For the common case since we do not individually pfree tuples, we have very little need for aset.c memory allocation behavior which maintains freelists and always rounds allocation sizes up to the next power of 2 size. Here we conditionally use generation.c contexts for storing tuples in tuplesort.c when the sort will never be bounded. Unfortunately, the memory context to store tuples is already created by the time any calls would be made to tuplesort_set_bound(), so here we add a new sort option that allows callers to specify if they're going to need a bounded sort or not. We'll use a standard aset.c allocator when this sort option is not set. Extension authors must ensure that the TUPLESORT_ALLOWBOUNDED flag is used when calling tuplesort_begin_* for any sorts that make a call to tuplesort_set_bound(). Author: David Rowley Reviewed-by: Andy Fan Discussion: https://postgr.es/m/CAApHDvoH4ASzsAOyHcxkuY01Qf++8JJ0paw+03dk+W25tQEcNQ@mail.gmail.com	2022-04-04 22:52:35 +12:00
David Rowley	77bae396df	Adjust tuplesort API to have bitwise option flags This replaces the bool flag for randomAccess. An upcoming patch requires adding another option, so instead of breaking the API for that, then breaking it again one day if we add more options, let's just break it once. Any boolean options we add in the future will just make use of an unused bit in the flags. Any extensions making use of tuplesorts will need to update their code to pass TUPLESORT_RANDOMACCESS instead of true for randomAccess. TUPLESORT_NONE can be used for a set of empty options. Author: David Rowley Reviewed-by: Justin Pryzby Discussion: https://postgr.es/m/CAApHDvoH4ASzsAOyHcxkuY01Qf%2B%2B8JJ0paw%2B03dk%2BW25tQEcNQ%40mail.gmail.com	2022-04-04 22:24:59 +12:00
Thomas Munro	cc58eecc5d	Fix tuplesort optimization for CLUSTER-on-expression. When dispatching sort operations to specialized variants, commit `69749243` failed to handle the case where CLUSTER-sort decides not to initialize datum1 and isnull1. Fix by hoisting that decision up a level and advertising whether datum1 can be relied on, in the Tuplesortstate object. Per reports from UBsan and Valgrind build farm animals, while running the cluster.sql test. Reviewed-by: Andres Freund <andres@anarazel.de> Discussion: https://postgr.es/m/CAFBsxsF1TeK5Fic0M%2BTSJXzbKsY6aBqJGNj6ptURuB09ZF6k_w%40mail.gmail.com	2022-04-04 10:52:02 +12:00
John Naylor	6974924347	Specialize tuplesort routines for different kinds of abbreviated keys Previously, the specialized tuplesort routine inlined handling for reverse-sort and NULLs-ordering but called the datum comparator via a pointer in the SortSupport struct parameter. Testing has showed that we can get a useful performance gain by specializing datum comparison for the different representations of abbreviated keys -- signed and unsigned 64-bit integers and signed 32-bit integers. Almost all abbreviatable data types will benefit -- the only exception for now is numeric, since the datum comparison is more complex. The performance gain depends on data type and input distribution, but often falls in the range of 10-20% faster. Thomas Munro Reviewed by Peter Geoghegan, review and performance testing by me Discussion: https://www.postgresql.org/message-id/CA%2BhUKGKKYttZZk-JMRQSVak%3DCXSJ5fiwtirFf%3Dn%3DPAbumvn1Ww%40mail.gmail.com	2022-04-02 15:22:25 +07:00
Peter Eisentraut	94aa7cc5f7	Add UNIQUE null treatment option The SQL standard has been ambiguous about whether null values in unique constraints should be considered equal or not. Different implementations have different behaviors. In the SQL:202x draft, this has been formalized by making this implementation-defined and adding an option on unique constraint definitions UNIQUE [ NULLS [NOT] DISTINCT ] to choose a behavior explicitly. This patch adds this option to PostgreSQL. The default behavior remains UNIQUE NULLS DISTINCT. Making this happen in the btree code is pretty easy; most of the patch is just to carry the flag around to all the places that need it. The CREATE UNIQUE INDEX syntax extension is not from the standard, it's my own invention. I named all the internal flags, catalog columns, etc. in the negative ("nulls not distinct") so that the default PostgreSQL behavior is the default if the flag is false. Reviewed-by: Maxim Orlov <orlovmg@gmail.com> Reviewed-by: Pavel Borisov <pashkin.elfe@gmail.com> Discussion: https://www.postgresql.org/message-id/flat/84e5ee1b-387e-9a54-c326-9082674bde78@enterprisedb.com	2022-02-03 11:48:21 +01:00
Bruce Momjian	27b77ecf9f	Update copyright for 2022 Backpatch-through: 10	2022-01-07 19:04:57 -05:00
Tom Lane	a2ff18e89f	Improve sift up/down code in binaryheap.c and logtape.c. Borrow the logic that's long been used in tuplesort.c: instead of physically swapping the data in two heap entries, keep the value that's being sifted up or down in a local variable, and just move the other values as necessary. This makes the code shorter as well as faster. It's not clear that any current callers are really time-critical enough to notice, but we might as well code heap maintenance the same way everywhere. Ma Liangzhu and Tom Lane Discussion: https://postgr.es/m/17336-fc4e522d26a750fd@postgresql.org	2021-12-14 13:35:22 -05:00
Tom Lane	2de3c1015c	Fix datatype confusion in logtape.c's right_offset(). This could only matter if (a) long is wider than int, and (b) the heap of free blocks exceeds UINT_MAX entries, which seems pretty unlikely. Still, it's a theoretical bug, so backpatch to v13 where the typo came in (in commit `c02fdc922`). In passing, also make swap_nodes() use consistent datatypes. Ma Liangzhu Discussion: https://postgr.es/m/17336-fc4e522d26a750fd@postgresql.org	2021-12-14 11:46:36 -05:00
Heikki Linnakangas	166f94377c	Clarify the logic in a few places in the new balanced merge code. In selectnewtape(), use 'nOutputTapes' rather than 'nOutputRuns' in the check for whether to start a new tape or to append a new run to an existing tape. Until 'maxTapes' is reached, nOutputTapes is always equal to nOutputRuns, so it doesn't change the logic, but it seems more logical to compare # of tapes with # of tapes. Also, currently maxTapes is never modified after the merging begins, but written this way, the code would still work if it was. (Although the nOutputRuns == nOutputTapes assertion would need to be removed and using nOutputRuns % nOutputTapes to distribute the runs evenly across the tapes wouldn't do a good job anymore). Similarly in mergeruns(), change to USEMEM(state->tape_buffer_mem) to account for the memory used for tape buffers. It's equal to availMem currently, but tape_buffer_mem is more direct and future-proof. For example, if we changed the logic to only allocate half of the remaining memory to tape buffers, USEMEM(state->tape_buffer_mem) would still be correct. Coverity complained about these. Hopefully this patch helps it to understand the logic better. Thanks to Tom Lane for initial analysis.	2021-10-25 09:30:49 +03:00
Heikki Linnakangas	fc0f3b4cb0	Fix parallel sort, broken by the balanced merge patch. The code for initializing the tapes on each merge iteration was skipped in a parallel worker. I put the !WORKER(state) check in wrong place while rebasing the patch. That caused failures in the index build in 'multiple-row-versions' isolation test, in multiple buildfarm members. On my laptop it was easier to reproduce by building an index on a larger table, so that you got a parallel sort more reliably.	2021-10-18 20:42:10 +03:00
Heikki Linnakangas	aa3ac6453b	Fix duplicate typedef LogicalTape. To make buildfarm member locust happy.	2021-10-18 17:02:01 +03:00
Heikki Linnakangas	0bd65a3905	Fix format modifier used in elog. The previous commit `65014000b3` changed the variable passed to elog from an int64 to a size_t variable, but neglected to change the modifier in the format string accordingly. Per failure on buildfarm member lapwing.	2021-10-18 16:15:44 +03:00
Heikki Linnakangas	65014000b3	Replace polyphase merge algorithm with a simple balanced k-way merge. The advantage of polyphase merge is that it can reuse the input tapes as output tapes efficiently, but that is irrelevant on modern hardware, when we can easily emulate any number of tape drives. The number of input tapes we can/should use during merging is limited by work_mem, but output tapes that we are not currently writing to only cost a little bit of memory, so there is no need to skimp on them. This makes sorts that need multiple merge passes faster. Discussion: https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi Reviewed-by: Peter Geoghegan, Zhihong Yu, John Naylor	2021-10-18 14:46:01 +03:00
Heikki Linnakangas	c4649cce39	Refactor LogicalTapeSet/LogicalTape interface. All the tape functions, like LogicalTapeRead and LogicalTapeWrite, now take a LogicalTape as argument, instead of LogicalTapeSet+tape number. You can create any number of LogicalTapes in a single LogicalTapeSet, and you don't need to decide the number upfront, when you create the tape set. This makes the tape management in hash agg spilling in nodeAgg.c simpler. Discussion: https://www.postgresql.org/message-id/420a0ec7-602c-d406-1e75-1ef7ddc58d83%40iki.fi Reviewed-by: Peter Geoghegan, Zhihong Yu, John Naylor	2021-10-18 14:46:01 +03:00
Amit Kapila	31c389d8de	Optimize fileset usage in apply worker. Use one fileset for the entire worker lifetime instead of using separate filesets for each streaming transaction. Now, the changes/subxacts files for every streaming transaction will be created under the same fileset and the files will be deleted after the transaction is completed. This patch extends the BufFileOpenFileSet and BufFileDeleteFileSet APIs to allow users to specify whether to give an error on missing files. Author: Dilip Kumar, based on suggestion by Thomas Munro Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila Discussion: https://postgr.es/m/E1mCC6U-0004Ik-Fs@gemulon.postgresql.org	2021-09-02 08:13:46 +05:30
Amit Kapila	dcac5e7ac1	Refactor sharedfileset.c to separate out fileset implementation. Move fileset related implementation out of sharedfileset.c to allow its usage by backends that don't want to share filesets among different processes. After this split, fileset infrastructure is used by both sharedfileset.c and worker.c for the named temporary files that survive across transactions. Author: Dilip Kumar, based on suggestion by Andres Freund Reviewed-by: Hou Zhijie, Masahiko Sawada, Amit Kapila Discussion: https://postgr.es/m/E1mCC6U-0004Ik-Fs@gemulon.postgresql.org	2021-08-30 08:48:15 +05:30
David Rowley	5bd38d2f28	Robustify tuplesort's free_sort_tuple function `41469253e` went to the trouble of removing a theoretical bug from free_sort_tuple by checking if the tuple was NULL before freeing it. Let's make this a little more robust by also setting the tuple to NULL so that should we be called again we won't end up doing a pfree on the already pfree'd tuple. Per advice from Tom Lane. Discussion: https://postgr.es/m/3188192.1626136953@sss.pgh.pa.us Backpatch-through: 9.6, same as `41469253e`	2021-07-13 13:27:05 +12:00
David Rowley	41469253e9	Fix theoretical bug in tuplesort This fixes a theoretical bug in tuplesort.c which, if a bounded sort was used in combination with a byval Datum sort (tuplesort_begin_datum), when switching the sort to a bounded heap in make_bounded_heap(), we'd call free_sort_tuple(). The problem was that when sorting Datums of a byval type, the tuple is NULL and free_sort_tuple() would free the memory for it regardless of that. This would result in a crash. Here we fix that simply by adding a check to see if the tuple is NULL before trying to disassociate and free any memory belonging to it. The reason this bug is only theoretical is that nowhere in the current code base do we do tuplesort_set_bound() when performing a Datum sort. However, let's backpatch a fix for this as if any extension uses the code in this way then it's likely to cause problems. Author: Ronan Dunklau Discussion: https://postgr.es/m/CAApHDvpdoqNC5FjDb3KUTSMs5dg6f+XxH4Bg_dVcLi8UYAG3EQ@mail.gmail.com Backpatch-through: 9.6, oldest supported version	2021-07-13 12:40:16 +12:00
Tom Lane	def5b065ff	Initial pgindent and pgperltidy run for v14. Also "make reformat-dat-files". The only change worthy of note is that pgindent messed up the formatting of launcher.c's struct LogicalRepWorkerId, which led me to notice that that struct wasn't used at all anymore, so I just took it out.	2021-05-12 13:14:10 -04:00
Thomas Munro	8eda3eba30	Use sort_template.h for qsort_tuple() and qsort_ssup(). Replace the Perl code previously used to generate specialized sort functions with sort_template.h. Reviewed-by: Daniel Gustafsson <daniel@yesql.se> Discussion: https://postgr.es/m/CA%2BhUKGJ2-eaDqAum5bxhpMNhvuJmRDZxB_Tow0n-gse%2BHG0Yig%40mail.gmail.com	2021-03-03 17:02:32 +13:00
Bruce Momjian	ca3b37487b	Update copyright for 2021 Backpatch-through: 9.5	2021-01-02 13:06:25 -05:00
Tom Lane	f859c2ffa0	Fix a few more generator scripts to produce pgindent-clean output. This completes the project of making all our derived files be pgindent-clean (or else explicitly excluded from indentation), so that no surprises result when running pgindent in a built-out development tree. Discussion: https://postgr.es/m/79ed5348-be7a-b647-dd40-742207186a22@2ndquadrant.com	2020-09-21 13:58:26 -04:00
Heikki Linnakangas	16fa9b2b30	Add support for building GiST index by sorting. This adds a new optional support function to the GiST access method: sortsupport. If it is defined, the GiST index is built by sorting all data to the order defined by the sortsupport's comparator function, and packing the tuples in that order to GiST pages. This is similar to how B-tree index build works, and is much faster than inserting the tuples one by one. The resulting index is smaller too, because the pages are packed more tightly, upto 'fillfactor'. The normal build method works by splitting pages, which tends to lead to more wasted space. The quality of the resulting index depends on how good the opclass-defined sort order is. A good order preserves locality of the input data. As the first user of this facility, add 'sortsupport' function to the point_ops opclass. It sorts the points in Z-order (aka Morton Code), by interleaving the bits of the X and Y coordinates. Author: Andrey Borodin Reviewed-by: Pavel Borisov, Thomas Munro Discussion: https://www.postgresql.org/message-id/1A36620E-CAD8-4267-9067-FB31385E7C0D%40yandex-team.ru	2020-09-17 11:33:40 +03:00
Jeff Davis	c8aeaf3ab3	Change LogicalTapeSetBlocks() to use nBlocksWritten. Previously, it was based on nBlocksAllocated to account for tapes with open write buffers that may not have made it to the BufFile yet. That was unnecessary, because callers do not need to get the number of blocks while a tape has an open write buffer; and it also conflicted with the preallocation logic added for HashAgg. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/ce5af05900fdbd0e9185747825a7423c48501964.camel@j-davis.com Backpatch-through: 13	2020-09-15 21:42:25 -07:00
Peter Eisentraut	3e0242b24c	Message fixes and style improvements	2020-09-14 06:42:30 +02:00
Jeff Davis	0758964963	logtape.c: do not preallocate for tapes when sorting The preallocation logic is only useful for HashAgg, so disable it when sorting. Also, adjust an out-of-date comment. Reviewed-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wzn_o7tE2+hRVvwSFghRb75AJ5g-nqGzDUqLYMexjOAe=g@mail.gmail.com Backpatch-through: 13	2020-09-11 17:10:02 -07:00
Jeff Davis	0852006a94	Fix bogus MaxAllocSize check in logtape.c. Reported-by: Peter Geoghegan Discussion: https://postgr.es/m/CAH2-Wz=NZPZc3-fkdmvu=w2itx0PiB-G6QpxHXZOjuvFAzPdZw@mail.gmail.com Backpatch-through: 13	2020-09-04 12:09:52 -07:00
Amit Kapila	808e13b282	Extend the BufFile interface. Allow BufFile to support temporary files that can be used by the single backend when the corresponding files need to be survived across the transaction and need to be opened and closed multiple times. Such files need to be created as a member of a SharedFileSet. Additionally, this commit implements the interface for BufFileTruncate to allow files to be truncated up to a particular offset and extends the BufFileSeek API to support the SEEK_END case. This also adds an option to provide a mode while opening the shared BufFiles instead of always opening in read-only mode. These enhancements in BufFile interface are required for the upcoming patch to allow the replication apply worker, to handle streamed in-progress transactions. Author: Dilip Kumar, Amit Kapila Reviewed-by: Amit Kapila Tested-by: Neha Sharma Discussion: https://postgr.es/m/688b0b7f-2f6c-d827-c27b-216a8e3ea700@2ndquadrant.com	2020-08-26 07:36:43 +05:30
Thomas Munro	7897e3bb90	Fix buffile.c error handling. Convert buffile.c error handling to use ereport. This fixes cases where I/O errors were indistinguishable from EOF or not reported. Also remove "%m" from error messages where errno would be bogus. While we're modifying those strings, add block numbers and short read byte counts where appropriate. Back-patch to all supported releases. Reported-by: Amit Khandekar <amitdkhan.pg@gmail.com> Reviewed-by: Melanie Plageman <melanieplageman@gmail.com> Reviewed-by: Alvaro Herrera <alvherre@2ndquadrant.com> Reviewed-by: Robert Haas <robertmhaas@gmail.com> Reviewed-by: Ibrar Ahmed <ibrar.ahmad@gmail.com> Reviewed-by: Michael Paquier <michael@paquier.xyz> Discussion: https://postgr.es/m/CA%2BhUKGJE04G%3D8TLK0DLypT_27D9dR8F1RQgNp0jK6qR0tZGWOw%40mail.gmail.com	2020-06-16 16:59:07 +12:00
Tom Lane	b5d69b7c22	pgindent run prior to branching v13. pgperltidy and reformat-dat-files too, though those didn't find anything to change.	2020-06-07 16:57:08 -04:00
Jeff Davis	1fbb6c93df	Fix platform-specific performance regression in logtape.c. Commit `24d85952` made a change that indirectly caused a performance regression by triggering a change in the way GCC optimizes memcpy() on some platforms. The behavior seemed to contradict a GCC document, so I filed a report: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95556 This patch implements a narrow workaround which eliminates the regression I observed. The workaround is benign enough that it seems unlikely to cause a different regression on another platform. Discussion: https://postgr.es/m/99b2eab335c1592c925d8143979c8e9e81e1575f.camel@j-davis.com	2020-06-07 09:25:55 -07:00
Jeff Davis	896ddf9b3c	Avoid fragmentation of logical tapes when writing concurrently. Disk-based HashAgg relies on writing to multiple tapes concurrently. Avoid fragmentation of the tapes' blocks by preallocating many blocks for a tape at once. No file operations are performed during preallocation; only the block numbers are reserved. Reviewed-by: Tomas Vondra Discussion: https://postgr.es/m/20200519151202.u2p2gpiawoaznsv2%40development	2020-05-26 16:49:43 -07:00
Tom Lane	5cbfce562f	Initial pgindent and pgperltidy run for v13. Includes some manual cleanup of places that pgindent messed up, most of which weren't per project style anyway. Notably, it seems some people didn't absorb the style rules of commit `c9d297751`, because there were a bunch of new occurrences of function calls with a newline just after the left paren, all with faulty expectations about how the rest of the call would get indented.	2020-05-14 13:06:50 -04:00
Tomas Vondra	1a40d37a9f	Fix typos and improve incremental sort comments Author: Justin Pryzby, James Coleman Discussion: https://postgr.es/m/20200419023625.GP26953@telsasoft.com	2020-05-12 19:37:13 +02:00
Tom Lane	0da06d9faf	Get rid of trailing semicolons in C macro definitions. Writing a trailing semicolon in a macro is almost never the right thing, because you almost always want to write a semicolon after each macro call instead. (Even if there was some reason to prefer not to, pgindent would probably make a hash of code formatted that way; so within PG the rule should basically be "don't do it".) Thus, if we have a semi inside the macro, the compiler sees "something;;". Much of the time the extra empty statement is harmless, but it could lead to mysterious syntax errors at call sites. In perhaps an overabundance of neatnik-ism, let's run around and get rid of the excess semicolons whereever possible. The only thing worse than a mysterious syntax error is a mysterious syntax error that only happens in the back branches; therefore, backpatch these changes where relevant, which is most of them because most of these mistakes are old. (The lack of reported problems shows that this is largely a hypothetical issue, but still, it could bite us in some future patch.) John Naylor and Tom Lane Discussion: https://postgr.es/m/CACPNZCs0qWTqJ2QUSGJ07B7uvAvzMb-KbG2q+oo+J3tsWN5cqw@mail.gmail.com	2020-05-01 17:28:00 -04:00
Jeff Davis	0cacb2b79d	Fix missing pfree() in logtape.c, missed by `24d85952`.	2020-04-19 10:33:06 -07:00
Michael Paquier	8128b0c152	Fix collection of typos and grammar mistakes in the tree, volume 2 This fixes some comments and documentation new as of Postgres 13, and is a follow-up of the work done in `dd0f37e`. Author: Justin Pryzby Discussion: https://postgr.es/m/20200408165653.GF2228@telsasoft.com	2020-04-14 14:45:43 +09:00
Andrew Dunstan	7be5d8df1f	Use perl warnings pragma consistently We've had a mixture of the warnings pragma, the -w switch on the shebang line, and no warnings at all. This patch removes the -w swicth and add the warnings pragma to all perl sources missing it. It raises the severity of the TestingAndDebugging::RequireUseWarnings perlcritic policy to level 5, so that we catch any future violations. Discussion: https://postgr.es/m/20200412074245.GB623763@rfd.leadboat.com	2020-04-13 11:55:45 -04:00
Tomas Vondra	d2d8a229bc	Implement Incremental Sort Incremental Sort is an optimized variant of multikey sort for cases when the input is already sorted by a prefix of the requested sort keys. For example when the relation is already sorted by (key1, key2) and we need to sort it by (key1, key2, key3) we can simply split the input rows into groups having equal values in (key1, key2), and only sort/compare the remaining column key3. This has a number of benefits: - Reduced memory consumption, because only a single group (determined by values in the sorted prefix) needs to be kept in memory. This may also eliminate the need to spill to disk. - Lower startup cost, because Incremental Sort produce results after each prefix group, which is beneficial for plans where startup cost matters (like for example queries with LIMIT clause). We consider both Sort and Incremental Sort, and decide based on costing. The implemented algorithm operates in two different modes: - Fetching a minimum number of tuples without check of equality on the prefix keys, and sorting on all columns when safe. - Fetching all tuples for a single prefix group and then sorting by comparing only the remaining (non-prefix) keys. We always start in the first mode, and employ a heuristic to switch into the second mode if we believe it's beneficial - the goal is to minimize the number of unnecessary comparions while keeping memory consumption below work_mem. This is a very old patch series. The idea was originally proposed by Alexander Korotkov back in 2013, and then revived in 2017. In 2018 the patch was taken over by James Coleman, who wrote and rewrote most of the current code. There were many reviewers/contributors since 2013 - I've done my best to pick the most active ones, and listed them in this commit message. Author: James Coleman, Alexander Korotkov Reviewed-by: Tomas Vondra, Andreas Karlsson, Marti Raudsepp, Peter Geoghegan, Robert Haas, Thomas Munro, Antonin Houska, Andres Freund, Alexander Kuzmenkov Discussion: https://postgr.es/m/CAPpHfdscOX5an71nHd8WSUH6GNOCf=V7wgDaTXdDd9=goN-gfA@mail.gmail.com Discussion: https://postgr.es/m/CAPpHfds1waRZ=NOmueYq0sx1ZSCnt+5QJvizT8ndT2=etZEeAQ@mail.gmail.com	2020-04-06 21:35:10 +02:00

1 2 3 4 5 ...

391 Commits