postgres

mirror of https://github.com/postgres/postgres.git synced 2025-07-18 17:42:25 +03:00

Author	SHA1	Message	Date
Tom Lane	c9414e7867	When updating reltuples after ANALYZE, just extrapolate from our sample. The existing logic for updating pg_class.reltuples trusted the sampling results only for the pages ANALYZE actually visited, preferring to believe the previous tuple density estimate for all the unvisited pages. While there's some rationale for doing that for VACUUM (first that VACUUM is likely to visit a very nonrandom subset of pages, and second that we know for sure that the unvisited pages did not change), there's no such rationale for ANALYZE: by assumption, it's looked at an unbiased random sample of the table's pages. Furthermore, in a very large table ANALYZE will have examined only a tiny fraction of the table's pages, meaning it cannot slew the overall density estimate very far at all. In a table that is physically growing, this causes reltuples to increase nearly proportionally to the change in relpages, regardless of what is actually happening in the table. This has been observed to cause reltuples to become so much larger than reality that it effectively shuts off autovacuum, whose threshold for doing anything is a fraction of reltuples. (Getting to the point where that would happen seems to require some additional, not well understood, conditions. But it's undeniable that if reltuples is seriously off in a large table, ANALYZE alone will not fix it in any reasonable number of iterations, especially not if the table is continuing to grow.) Hence, restrict the use of vac_estimate_reltuples() to VACUUM alone, and in ANALYZE, just extrapolate from the sample pages on the assumption that they provide an accurate model of the whole table. If, by very bad luck, they don't, at least another ANALYZE will fix it; in the old logic a single bad estimate could cause problems indefinitely. In HEAD, let's remove vac_estimate_reltuples' is_analyze argument altogether; it was never used for anything and now it's totally pointless. But keep it in the back branches, in case any third-party code is calling this function. Per bug #15005. Back-patch to all supported branches. David Gould, reviewed by Alexander Kuzmenkov, cosmetic changes by me Discussion: https://postgr.es/m/20180117164916.3fdcf2e9@engels	2018-03-13 13:24:27 -04:00
Tom Lane	bd797eaa9a	Improve wording of error message added in commit `714805010`. Per suggestions from Peter Eisentraut and David Johnston. Back-patch, like the previous commit. Discussion: https://postgr.es/m/E1dv9jI-0006oT-Fn@gemulon.postgresql.org	2017-09-26 15:25:56 -04:00
Tom Lane	122289a66b	Give a better error for duplicate entries in VACUUM/ANALYZE column list. Previously, the code didn't think about this case and would just try to analyze such a column twice. That would fail at the point of inserting the second version of the pg_statistic row, with obscure error messsages like "duplicate key value violates unique constraint" or "tuple already updated by self", depending on context and PG version. We could allow the case by ignoring duplicate column specifications, but it seems better to reject it explicitly. The bogus error messages seem like arguably a bug, so back-patch to all supported versions. Nathan Bossart, per a report from Michael Paquier, and whacked around a bit by me. Discussion: https://postgr.es/m/E061A8E3-5E3D-494D-94F0-E8A9B312BBFC@amazon.com	2017-09-21 18:13:11 -04:00
Tom Lane	cb5c14984a	Fix misestimation of n_distinct for a nearly-unique column with many nulls. If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <16143.1470350371@sss.pgh.pa.us>	2016-08-07 18:52:02 -04:00
Tom Lane	5acc58c5e5	Don't reset changes_since_analyze after a selective-columns ANALYZE. If we ANALYZE only selected columns of a table, we should not postpone auto-analyze because of that; other columns may well still need stats updates. As committed, the counter is left alone if a column list is given, whether or not it includes all analyzable columns of the table. Per complaint from Tomasz Ostrowski. It's been like this a long time, so back-patch to all supported branches. Report: <ef99c1bd-ff60-5f32-2733-c7b504eb960c@ato.waw.pl>	2016-06-06 17:44:17 -04:00
Tom Lane	cfb2024ae4	Make ANALYZE compute basic statistics even for types with no "=" operator. Previously, ANALYZE simply ignored columns of datatypes that have neither a btree nor hash opclass (which means they have no recognized equality operator). Without a notion of equality, we can't identify most-common values nor estimate the number of distinct values. But we can still count nulls and compute the average physical column width, and those stats might be of value. Moreover there are some tools out there that don't work so well if rows are missing from pg_statistic. So let's add suitable logic for this case. While this is arguably a bug fix, it also has the potential to change query plans, and the gain seems not worth taking a risk of that in stable branches. So back-patch into 9.5 but not further. Oleksandr Shulgin, rewritten a bit by me.	2015-09-23 18:26:58 -04:00
Bruce Momjian	807b9e0dff	pgindent run for 9.5	2015-05-23 21:35:49 -04:00
Simon Riggs	f6d208d6e5	TABLESAMPLE, SQL Standard and extensible Add a TABLESAMPLE clause to SELECT statements that allows user to specify random BERNOULLI sampling or block level SYSTEM sampling. Implementation allows for extensible sampling functions to be written, using a standard API. Basic version follows SQLStandard exactly. Usable concrete use cases for the sampling API follow in later commits. Petr Jelinek Reviewed by Michael Paquier and Simon Riggs	2015-05-15 14:37:10 -04:00
Simon Riggs	83e176ec18	Separate block sampling functions Refactoring ahead of tablesample patch Requested and reviewed by Michael Paquier Petr Jelinek	2015-05-15 04:02:54 +02:00
Alvaro Herrera	4ff695b17d	Add log_min_autovacuum_duration per-table option This is useful to control autovacuum log volume, for situations where monitoring only a set of tables is necessary. Author: Michael Paquier Reviewed by: A team led by Naoya Anzai (also including Akira Kurosawa, Taiki Kondo, Huong Dangminh), Fujii Masao.	2015-04-03 11:55:50 -03:00
Tom Lane	e4cbfd673d	Add vacuum_delay_point call in compute_index_stats's per-sample-row loop. Slow functions in index expressions might cause this loop to take long enough to make it worth being cancellable. Probably it would be enough to call CHECK_FOR_INTERRUPTS here, but for consistency with other per-sample-row loops in this file, let's use vacuum_delay_point. Report and patch by Jeff Janes. Back-patch to all supported branches.	2015-03-29 15:04:09 -04:00
Tom Lane	cb1ca4d800	Allow foreign tables to participate in inheritance. Foreign tables can now be inheritance children, or parents. Much of the system was already ready for this, but we had to fix a few things of course, mostly in the area of planner and executor handling of row locks. As side effects of this, allow foreign tables to have NOT VALID CHECK constraints (and hence to accept ALTER ... VALIDATE CONSTRAINT), and to accept ALTER SET STORAGE and ALTER SET WITH/WITHOUT OIDS. Continuing to disallow these things would've required bizarre and inconsistent special cases in inheritance behavior. Since foreign tables don't enforce CHECK constraints anyway, a NOT VALID one is a complete no-op, but that doesn't mean we shouldn't allow it. And it's possible that some FDWs might have use for SET STORAGE or SET WITH OIDS, though doubtless they will be no-ops for most. An additional change in support of this is that when a ModifyTable node has multiple target tables, they will all now be explicitly identified in EXPLAIN output, for example: Update on pt1 (cost=0.00..321.05 rows=3541 width=46) Update on pt1 Foreign Update on ft1 Foreign Update on ft2 Update on child3 -> Seq Scan on pt1 (cost=0.00..0.00 rows=1 width=46) -> Foreign Scan on ft1 (cost=100.00..148.03 rows=1170 width=46) -> Foreign Scan on ft2 (cost=100.00..148.03 rows=1170 width=46) -> Seq Scan on child3 (cost=0.00..25.00 rows=1200 width=46) This was done mainly to provide an unambiguous place to attach "Remote SQL" fields, but it is useful for inherited updates even when no foreign tables are involved. Shigeru Hanada and Etsuro Fujita, reviewed by Ashutosh Bapat and Kyotaro Horiguchi, some additional hacking by me	2015-03-22 13:53:21 -04:00
Alvaro Herrera	0d83138974	Rationalize vacuuming options and parameters We were involving the parser too much in setting up initial vacuuming parameters. This patch moves that responsibility elsewhere to simplify code, and also to make future additions easier. To do this, create a new struct VacuumParams which is filled just prior to vacuum execution, instead of at parse time; for user-invoked vacuuming this is set up in a new function ExecVacuum, while autovacuum sets it up by itself. While at it, add a new member VACOPT_SKIPTOAST to enum VacuumOption, only set by autovacuum, which is used to disable vacuuming of the toast table instead of the old do_toast parameter; this relieves the argument list of vacuum() and some callees a bit. This partially makes up for having added more arguments in an effort to avoid having autovacuum from constructing a VacuumStmt parse node. Author: Michael Paquier. Some tweaks by Álvaro Reviewed by: Robert Haas, Stephen Frost, Álvaro Herrera	2015-03-18 11:52:33 -03:00
Robert Haas	4ea51cdfe8	Use abbreviated keys for faster sorting of text datums. This commit extends the SortSupport infrastructure to allow operator classes the option to provide abbreviated representations of Datums; in the case of text, we abbreviate by taking the first few characters of the strxfrm() blob. If the abbreviated comparison is insufficent to resolve the comparison, we fall back on the normal comparator. This can be much faster than the old way of doing sorting if the first few bytes of the string are usually sufficient to resolve the comparison. There is the potential for a performance regression if all of the strings to be sorted are identical for the first 8+ characters and differ only in later positions; therefore, the SortSupport machinery now provides an infrastructure to abort the use of abbreviation if it appears that abbreviation is producing comparatively few distinct keys. HyperLogLog, a streaming cardinality estimator, is included in this commit and used to make that determination for text. Peter Geoghegan, reviewed by me.	2015-01-19 15:28:27 -05:00
Bruce Momjian	4baaf863ec	Update copyright for 2015 Backpatch certain files through 9.0	2015-01-06 11:43:47 -05:00
Simon Riggs	0f66d21201	Emit msg re skipping ANALYZE for absent inh tree When checking a table that has an inheritance tree marked, if no child tables remain, we skip ANALYZE. This patch emits a message to show that the action has been skipped. Author: Etsuro Fujita Reviewer: Furuya Osamu	2014-11-15 22:49:54 +00:00
Tom Lane	fd0f651a86	Test IsInTransactionChain, not IsTransactionBlock, in vac_update_relstats. As noted by Noah Misch, my initial cut at fixing bug #11638 didn't cover all cases where ANALYZE might be invoked in an unsafe context. We need to test the result of IsInTransactionChain not IsTransactionBlock; which is notationally a pain because IsInTransactionChain requires an isTopLevel flag, which would have to be passed down through several levels of callers. I chose to pass in_outer_xact (ie, the result of IsInTransactionChain) rather than isTopLevel per se, as that seemed marginally more apropos for the intermediate functions to know about.	2014-10-30 13:04:06 -04:00
Bruce Momjian	0a78320057	pgindent run for 9.4 This includes removing tabs after periods in C comments, which was applied to back branches, so this change should not effect backpatching.	2014-05-06 12:12:18 -04:00
Robert Haas	b89e151054	Introduce logical decoding. This feature, building on previous commits, allows the write-ahead log stream to be decoded into a series of logical changes; that is, inserts, updates, and deletes and the transactions which contain them. It is capable of handling decoding even across changes to the schema of the effected tables. The output format is controlled by a so-called "output plugin"; an example is included. To make use of this in a real replication system, the output plugin will need to be modified to produce output in the format appropriate to that system, and to perform filtering. Currently, information can be extracted from the logical decoding system only via SQL; future commits will add the ability to stream changes via walsender. Andres Freund, with review and other contributions from many other people, including Álvaro Herrera, Abhijit Menon-Sen, Peter Gheogegan, Kevin Grittner, Robert Haas, Heikki Linnakangas, Fujii Masao, Abhijit Menon-Sen, Michael Paquier, Simon Riggs, Craig Ringer, and Steve Singer.	2014-03-03 16:32:18 -05:00
Tom Lane	6286526207	Fix compute_scalar_stats() for case that all values exceed WIDTH_THRESHOLD. The standard typanalyze functions skip over values whose detoasted size exceeds WIDTH_THRESHOLD (1024 bytes), so as to limit memory bloat during ANALYZE. However, we (I think I, actually :-() failed to consider the possibility that every non-null value in a column is too wide. While compute_minimal_stats() seems to behave reasonably anyway in such a case, compute_scalar_stats() just fell through and generated no pg_statistic entry at all. That's unnecessarily pessimistic: we can still produce valid stanullfrac and stawidth values in such cases, since we do include too-wide values in the average-width calculation. Furthermore, since the general assumption in this code is that too-wide values are probably all distinct from each other, it seems reasonable to set stadistinct to -1 ("all distinct"). Per complaint from Kadri Raudsepp. This has been like this since roughly neolithic times, so back-patch to all supported branches.	2014-01-11 13:42:42 -05:00
Bruce Momjian	7e04792a1c	Update copyright for 2014 Update all files in head, and files COPYRIGHT and legal.sgml in all back branches.	2014-01-07 16:05:30 -05:00
Robert Haas	0518eceec3	Adjust HeapTupleSatisfies* routines to take a HeapTuple. Previously, these functions took a HeapTupleHeader, but upcoming patches for logical replication will introduce new a new snapshot type under which the tuple's TID will be used to lookup (CMIN, CMAX) for visibility determination purposes. This makes that information available. Code churn is minimal since HeapTupleSatisfiesVisibility took the HeapTuple anyway, and deferenced it before calling the satisfies function. Independently of logical replication, this allows t_tableOid and t_self to be cross-checked via assertions in tqual.c. This seems like a useful way to make sure that all callers are setting these values properly, which has been previously put forward as desirable. Andres Freund, reviewed by Álvaro Herrera	2013-07-22 13:38:44 -04:00
Tom Lane	1908abc4a3	Arrange to cache FdwRoutine structs in foreign tables' relcache entries. This saves several catalog lookups per reference. It's not all that exciting right now, because we'd managed to minimize the number of places that need to fetch the data; but the upcoming writable-foreign-tables patch needs this info in a lot more places.	2013-03-06 23:48:09 -05:00
Kevin Grittner	3bf3ab8c56	Add a materialized view relations. A materialized view has a rule just like a view and a heap and other physical properties like a table. The rule is only used to populate the table, references in queries refer to the materialized data. This is a minimal implementation, but should still be useful in many cases. Currently data is only populated "on demand" by the CREATE MATERIALIZED VIEW and REFRESH MATERIALIZED VIEW statements. It is expected that future releases will add incremental updates with various timings, and that a more refined concept of defining what is "fresh" data will be developed. At some point it may even be possible to have queries use a materialized in place of references to underlying tables, but that requires the other above-mentioned features to be working first. Much of the documentation work by Robert Haas. Review by Noah Misch, Thom Brown, Robert Haas, Marko Tiikkaja Security review by KaiGai Kohei, with a decision on how best to implement sepgsql still pending.	2013-03-03 18:23:31 -06:00
Alvaro Herrera	0ac5ad5134	Improve concurrency of foreign key locking This patch introduces two additional lock modes for tuples: "SELECT FOR KEY SHARE" and "SELECT FOR NO KEY UPDATE". These don't block each other, in contrast with already existing "SELECT FOR SHARE" and "SELECT FOR UPDATE". UPDATE commands that do not modify the values stored in the columns that are part of the key of the tuple now grab a SELECT FOR NO KEY UPDATE lock on the tuple, allowing them to proceed concurrently with tuple locks of the FOR KEY SHARE variety. Foreign key triggers now use FOR KEY SHARE instead of FOR SHARE; this means the concurrency improvement applies to them, which is the whole point of this patch. The added tuple lock semantics require some rejiggering of the multixact module, so that the locking level that each transaction is holding can be stored alongside its Xid. Also, multixacts now need to persist across server restarts and crashes, because they can now represent not only tuple locks, but also tuple updates. This means we need more careful tracking of lifetime of pg_multixact SLRU files; since they now persist longer, we require more infrastructure to figure out when they can be removed. pg_upgrade also needs to be careful to copy pg_multixact files over from the old server to the new, or at least part of multixact.c state, depending on the versions of the old and new servers. Tuple time qualification rules (HeapTupleSatisfies routines) need to be careful not to consider tuples with the "is multi" infomask bit set as being only locked; they might need to look up MultiXact values (i.e. possibly do pg_multixact I/O) to find out the Xid that updated a tuple, whereas they previously were assured to only use information readily available from the tuple header. This is considered acceptable, because the extra I/O would involve cases that would previously cause some commands to block waiting for concurrent transactions to finish. Another important change is the fact that locking tuples that have previously been updated causes the future versions to be marked as locked, too; this is essential for correctness of foreign key checks. This causes additional WAL-logging, also (there was previously a single WAL record for a locked tuple; now there are as many as updated copies of the tuple there exist.) With all this in place, contention related to tuples being checked by foreign key rules should be much reduced. As a bonus, the old behavior that a subtransaction grabbing a stronger tuple lock than the parent (sub)transaction held on a given tuple and later aborting caused the weaker lock to be lost, has been fixed. Many new spec files were added for isolation tester framework, to ensure overall behavior is sane. There's probably room for several more tests. There were several reviewers of this patch; in particular, Noah Misch and Andres Freund spent considerable time in it. Original idea for the patch came from Simon Riggs, after a problem report by Joel Jacobson. Most code is from me, with contributions from Marti Raudsepp, Alexander Shulgin, Noah Misch and Andres Freund. This patch was discussed in several pgsql-hackers threads; the most important start at the following message-ids: AANLkTimo9XVcEzfiBR-ut3KVNDkjm2Vxh+t8kAmWjPuv@mail.gmail.com 1290721684-sup-3951@alvh.no-ip.org 1294953201-sup-2099@alvh.no-ip.org 1320343602-sup-2290@alvh.no-ip.org 1339690386-sup-8927@alvh.no-ip.org 4FE5FF020200002500048A3D@gw.wicourts.gov 4FEAB90A0200002500048B7D@gw.wicourts.gov	2013-01-23 12:04:59 -03:00
Bruce Momjian	bd61a623ac	Update copyrights for 2013 Fully update git head, and update back branches in ./COPYRIGHT and legal.sgml files.	2013-01-01 17:15:01 -05:00
Bruce Momjian	927d61eeff	Run pgindent on 9.2 source tree in preparation for first 9.3 commit-fest.	2012-06-10 15:20:04 -04:00
Heikki Linnakangas	9e4637bf89	Update comments that became out-of-date with the PGXACT struct. When the "hot" members of PGPROC were split off to separate PGXACT structs, many PGPROC fields referred to in comments were moved to PGXACT, but the comments were neglected in the commit. Mostly this is just a search/replace of PGPROC with PGXACT, but the way the dummy PGPROC entries are created for prepared transactions changed more, making some of the comments totally bogus. Noah Misch	2012-05-14 10:28:55 +03:00
Tom Lane	cea49fe82f	Dept of second thoughts: improve the API for AnalyzeForeignTable. If we make the initially-called function return the table physical-size estimate, acquire_inherited_sample_rows will be able to use that to allocate numbers of samples among child tables, when the day comes that we want to support foreign tables in inheritance trees.	2012-04-06 16:04:10 -04:00
Tom Lane	263d9de66b	Allow statistics to be collected for foreign tables. ANALYZE now accepts foreign tables and allows the table's FDW to control how the sample rows are collected. (But only manual ANALYZEs will touch foreign tables, for the moment, since among other things it's not very clear how to handle remote permissions checks in an auto-analyze.) contrib/file_fdw is extended to support this. Etsuro Fujita, reviewed by Shigeru Hanada, some further tweaking by me.	2012-04-06 15:02:35 -04:00
Tom Lane	0e5e167aae	Collect and use element-frequency statistics for arrays. This patch improves selectivity estimation for the array <@, &&, and @> (containment and overlaps) operators. It enables collection of statistics about individual array element values by ANALYZE, and introduces operator-specific estimators that use these stats. In addition, ScalarArrayOpExpr constructs of the forms "const = ANY/ALL (array_column)" and "const <> ANY/ALL (array_column)" are estimated by treating them as variants of the containment operators. Since we still collect scalar-style stats about the array values as a whole, the pg_stats view is expanded to show both these stats and the array-style stats in separate columns. This creates an incompatible change in how stats for tsvector columns are displayed in pg_stats: the stats about lexemes are now displayed in the array-related columns instead of the original scalar-related columns. There are a few loose ends here, notably that it'd be nice to be able to suppress either the scalar-style stats or the array-element stats for columns for which they're not useful. But the patch is in good enough shape to commit for wider testing. Alexander Korotkov, reviewed by Noah Misch and Nathan Boley	2012-03-03 20:20:57 -05:00
Bruce Momjian	e126958c2e	Update copyright notices for year 2012.	2012-01-01 18:01:58 -05:00
Tom Lane	c6e3ac11b6	Create a "sort support" interface API for faster sorting. This patch creates an API whereby a btree index opclass can optionally provide non-SQL-callable support functions for sorting. In the initial patch, we only use this to provide a directly-callable comparator function, which can be invoked with a bit less overhead than the traditional SQL-callable comparator. While that should be of value in itself, the real reason for doing this is to provide a datatype-extensible framework for more aggressive optimizations, as in Peter Geoghegan's recent work. Robert Haas and Tom Lane	2011-12-07 00:19:39 -05:00
Robert Haas	ed0b409d22	Move "hot" members of PGPROC into a separate PGXACT array. This speeds up snapshot-taking and reduces ProcArrayLock contention. Also, the PGPROC (and PGXACT) structures used by two-phase commit are now allocated as part of the main array, rather than in a separate array, and we keep ProcArray sorted in pointer order. These changes are intended to minimize the number of cache lines that must be pulled in to take a snapshot, and testing shows a substantial increase in performance on both read and write workloads at high concurrencies. Pavan Deolasee, Heikki Linnakangas, Robert Haas	2011-11-25 08:02:10 -05:00
Tom Lane	e6858e6657	Measure the number of all-visible pages for use in index-only scan costing. Add a column pg_class.relallvisible to remember the number of pages that were all-visible according to the visibility map as of the last VACUUM (or ANALYZE, or some other operations that update pg_class.relpages). Use relallvisible/relpages, instead of an arbitrary constant, to estimate how many heap page fetches can be avoided during an index-only scan. This is pretty primitive and will no doubt see refinements once we've acquired more field experience with the index-only scan mechanism, but it's way better than using a constant. Note: I had to adjust an underspecified query in the window.sql regression test, because it was changing answers when the plan changed to use an index-only scan. Some of the adjacent tests perhaps should be adjusted as well, but I didn't do that here.	2011-10-14 17:23:46 -04:00
Peter Eisentraut	1b81c2fe6e	Remove many -Wcast-qual warnings This addresses only those cases that are easy to fix by adding or moving a const qualifier or removing an unnecessary cast. There are many more complicated cases remaining.	2011-09-11 21:54:32 +03:00
Tom Lane	a7801b62f2	Move Timestamp/Interval typedefs and basic macros into datatype/timestamp.h. As per my recent proposal, this refactors things so that these typedefs and macros are available in a header that can be included in frontend-ish code. I also changed various headers that were undesirably including utils/timestamp.h to include datatype/timestamp.h instead. Unsurprisingly, this showed that half the system was getting utils/timestamp.h by way of xlog.h. No actual code changes here, just header refactoring.	2011-09-09 13:23:41 -04:00
Tom Lane	780a342c90	Avoid possibly accessing off the end of memory in examine_attribute(). Since the last couple of columns of pg_type are often NULL, sizeof(FormData_pg_type) can be an overestimate of the actual size of the tuple data part. Therefore memcpy'ing that much out of the catalog cache, as analyze.c was doing, poses a small risk of copying past the end of memory and incurring SIGSEGV. No such crash has been identified in the field, but we've certainly seen the equivalent happen in other code paths, so patch this one all the way back. Per valgrind testing by Noah Misch, though this is not his proposed patch. I chose to use SearchSysCacheCopy1 rather than inventing special-purpose infrastructure for copying only the minimal part of a pg_type tuple.	2011-09-06 14:37:22 -04:00
Tom Lane	1609797c25	Clean up the #include mess a little. walsender.h should depend on xlog.h, not vice versa. (Actually, the inclusion was circular until a couple hours ago, which was even sillier; but Bruce broke it in the expedient rather than logically correct direction.) Because of that poor decision, plus blind application of pgrminclude, we had a situation where half the system was depending on xlog.h to include such unrelated stuff as array.h and guc.h. Clean up the header inclusion, and manually revert a lot of what pgrminclude had done so things build again. This episode reinforces my feeling that pgrminclude should not be run without adult supervision. Inclusion changes in header files in particular need to be reviewed with great care. More generally, it'd be good if we had a clearer notion of module layering to dictate which headers can sanely include which others ... but that's a big task for another day.	2011-09-04 01:13:16 -04:00
Tom Lane	5b562644fe	Teach ANALYZE to clear pg_class.relhassubclass when appropriate. In the past, relhassubclass always remained true if a relation had ever had child relations, even if the last subclass was long gone. While this had only marginal performance implications in most cases, it was annoying, and I'm now considering some planner changes that would raise the cost of a false positive. It was previously impractical to fix this because of race condition concerns. However, given the recent change that made tablecmds.c take ShareExclusiveLock on relations that are gaining a child (commit `fbcf4b92aa`), we can now allow ANALYZE to clear the flag when it's no longer relevant. There is no additional locking cost to do so, since ANALYZE takes ShareExclusiveLock anyway.	2011-09-02 14:29:31 -04:00
Bruce Momjian	6416a82a62	Remove unnecessary #include references, per pgrminclude script.	2011-09-01 10:04:27 -04:00
Tom Lane	63513b207d	Fix thinko in previous patch to always update pg_class.reltuples/relpages. I mis-simplified the test where ANALYZE decided if it could get away without doing anything: under the new regime, that's never allowed. Per bug #6068 from Jeff Janes. Back-patch to 8.4, just like previous patch.	2011-06-19 14:00:48 -04:00
Tom Lane	bfcb9328e5	Index tuple data arrays using Anum_xxx symbolic constants instead of "i++". We had already converted most places to this style, but this patch gets the last few that were still doing it the old way. The main advantage is that this exposes a greppable name for each target column, rather than having to rely on comments (which a couple of places failed to provide anyhow). Richard Hopkins, additional work by me to clean up update_attstats() too	2011-06-16 17:04:40 -04:00
Bruce Momjian	6560407c7d	Pgindent run before 9.1 beta2.	2011-06-09 14:32:50 -04:00
Tom Lane	b4b6923e03	Fix VACUUM so that it always updates pg_class.reltuples/relpages. When we added the ability for vacuum to skip heap pages by consulting the visibility map, we made it just not update the reltuples/relpages statistics if it skipped any pages. But this could leave us with extremely out-of-date stats for a table that contains any unchanging areas, especially for TOAST tables which never get processed by ANALYZE. In particular this could result in autovacuum making poor decisions about when to process the table, as in recent report from Florian Helmberger. And in general it's a bad idea to not update the stats at all. Instead, use the previous values of reltuples/relpages as an estimate of the tuple density in unvisited pages. This approach results in a "moving average" estimate of reltuples, which should converge to the correct value over multiple VACUUM and ANALYZE cycles even when individual measurements aren't very good. This new method for updating reltuples is used by both VACUUM and ANALYZE, with the result that we no longer need the grotty interconnections that caused ANALYZE to not update the stats depending on what had happened in the parent VACUUM command. Also, fix the logic for skipping all-visible pages during VACUUM so that it looks ahead rather than behind to decide what to do, as per a suggestion from Greg Stark. This eliminates useless scanning of all-visible pages at the start of the relation or just after a not-all-visible page. In particular, the first few pages of the relation will not be invariably included in the scanned pages, which seems to help in not overweighting them in the reltuples estimate. Back-patch to 8.4, where the visibility map was introduced.	2011-05-30 17:06:52 -04:00
Tom Lane	d64713df7e	Pass collations to functions in FunctionCallInfoData, not FmgrInfo. Since collation is effectively an argument, not a property of the function, FmgrInfo is really the wrong place for it; and this becomes critical in cases where a cached FmgrInfo is used for varying purposes that might need different collation settings. Fix by passing it in FunctionCallInfoData instead. In particular this allows a clean fix for bug #5970 (record_cmp not working). This requires touching a bit more code than the original method, but nobody ever thought that collations would not be an invasive patch...	2011-04-12 19:19:24 -04:00
Bruce Momjian	bf50caf105	pgindent run before PG 9.1 beta 1.	2011-04-10 11:42:00 -04:00
Tom Lane	b310b6e31c	Revise collation derivation method and expression-tree representation. All expression nodes now have an explicit output-collation field, unless they are known to only return a noncollatable data type (such as boolean or record). Also, nodes that can invoke collation-aware functions store a separate field that is the collation value to pass to the function. This avoids confusion that arises when a function has collatable inputs and noncollatable output type, or vice versa. Also, replace the parser's on-the-fly collation assignment method with a post-pass over the completed expression tree. This allows us to use a more complex (and hopefully more nearly spec-compliant) assignment rule without paying for it in extra storage in every expression node. Fix assorted bugs in the planner's handling of collations by making collation one of the defining properties of an EquivalenceClass and by converting CollateExprs into discardable RelabelType nodes during expression preprocessing.	2011-03-19 20:30:08 -04:00
Tom Lane	696d1f7f06	Make all comparisons done for/with statistics use the default collation. While this will give wrong answers when estimating selectivity for a comparison operator that's using a non-default collation, the estimation error probably won't be large; and anyway the former approach created estimation errors of its own by trying to use a histogram that might have been computed with some other collation. So we'll adopt this simplified approach for now and perhaps improve it sometime in the future. This patch incorporates changes from Andres Freund to make sure that selfuncs.c passes a valid collation OID to any datatype-specific function it calls, in case that function wants collation information. Said OID will now always be DEFAULT_COLLATION_OID, but at least we won't get errors.	2011-03-12 16:30:36 -05:00
Peter Eisentraut	414c5a2ea6	Per-column collation support This adds collation support for columns and domains, a COLLATE clause to override it per expression, and B-tree index support. Peter Eisentraut reviewed by Pavel Stehule, Itagaki Takahiro, Robert Haas, Noah Misch	2011-02-08 23:04:18 +02:00

1 2 3 4 5

209 Commits