1
0
mirror of https://github.com/postgres/postgres.git synced 2025-11-24 00:23:06 +03:00
Tom Lane 057012b205 Speed up eqjoinsel() with lots of MCV entries.
If both sides of the operator have most-common-value statistics,
eqjoinsel wants to check which MCVs have matches on the other side.
Formerly it did this with a dumb compare-all-the-entries loop,
which had O(N^2) behavior for long MCV lists.  When that code was
written, twenty-plus years ago, that seemed tolerable; but nowadays
people frequently use much larger statistics targets, so that the
O(N^2) behavior can hurt quite a bit.

To add insult to injury, when asked for semijoin semantics, the
entire comparison loop was done over, even though we frequently
know that it will yield exactly the same results.

To improve matters, switch to using a hash table to perform the
matching.  Testing suggests that depending on the data type, we may
need up to about 100 MCVs on each side to amortize the extra costs
of setting up the hash table and performing hash-value computations;
so continue to use the old looping method when there are fewer MCVs
than that.

Also, refactor so that we don't repeat the matching work unless
we really need to, which occurs only in the uncommon case where
eqjoinsel_semi decides to truncate the set of inner MCVs it
considers.  The refactoring also got rid of the need to use the
presented operator's commutator.  Real-world operators that are
using eqjoinsel should pretty much always have commutators, but
at the very least this saves a few syscache lookups.

Author: Ilia Evdokimov <ilya.evdokimov@tantorlabs.com>
Co-authored-by: David Geier <geidav.pg@gmail.com>
Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us>
Discussion: https://postgr.es/m/20ea8bf5-3569-4e46-92ef-ebb2666debf6@tantorlabs.com
2025-11-19 13:22:12 -05:00
2025-08-14 12:09:34 -04:00
2022-12-04 15:23:00 -05:00
2024-11-05 13:56:02 +01:00
2025-11-06 08:00:57 +01:00
2025-11-06 08:00:57 +01:00
2020-02-10 20:47:50 +01:00
2024-02-28 15:17:23 +04:00

PostgreSQL Database Management System

This directory contains the source code distribution of the PostgreSQL database management system.

PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.

Copyright and license information can be found in the file COPYRIGHT.

General documentation about this version of PostgreSQL can be found at https://www.postgresql.org/docs/devel/. In particular, information about building PostgreSQL from the source code can be found at https://www.postgresql.org/docs/devel/installation.html.

The latest version of this software, and related software, may be obtained at https://www.postgresql.org/download/. For more information look at our web site located at https://www.postgresql.org/.

Description
Зеркало официального репозитория PostgreSQL GIT
Readme 1.1 GiB
Languages
C 85.1%
PLpgSQL 6%
Perl 4.6%
Yacc 1.2%
Meson 0.7%
Other 2.2%