mirror of https://github.com/postgres/postgres.git synced 2025-07-30 11:03:19 +03:00

Create a type-specific typanalyze routine for tsvector, which collects stats
on the most common individual lexemes in place of the mostly-useless default
behavior of counting duplicate tsvectors.  Future work: create selectivity
estimation functions that actually do something with these stats.

(Some other things we ought to look at doing: using the Lossy Counting
algorithm in compute_minimal_stats, and using the element-counting idea for
stats on regular arrays.)
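
To illustrate the difference the message describes, here is a minimal sketch in Python. It is not the code from this commit: the function names, the sample data, and the use of exact counting over an in-memory list are all assumptions made for illustration. It contrasts counting duplicate whole values with counting individual lexemes, and computes each frequency as occurrences divided by the total number of rows.

```python
from collections import Counter

# Hypothetical sample: each row stands in for one tsvector, modeled as a
# set of lexemes (a lexeme appears at most once per row in this model).
SAMPLE_ROWS = [
    {"cat", "sat", "mat"},
    {"cat", "dog"},
    {"dog", "bark"},
    {"cat", "purr"},
]


def most_common_whole_values(rows, limit=3):
    """Count duplicate whole values, as a generic per-value analysis would."""
    counts = Counter(frozenset(r) for r in rows)
    return counts.most_common(limit)


def most_common_elements(rows, limit=3):
    """Return (values, freqs) for the most common individual lexemes.

    Each frequency is the number of rows containing the lexeme divided by
    the total number of rows. Exact counting is used here for simplicity.
    """
    total = len(rows)
    if total == 0:
        return [], []
    counts = Counter()
    for lexemes in rows:
        counts.update(set(lexemes))  # count each lexeme once per row
    # Sort by descending count, then alphabetically so ties are deterministic.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:limit]
    vals = [lexeme for lexeme, _ in ranked]
    freqs = [n / total for _, n in ranked]
    return vals, freqs


if __name__ == "__main__":
    # Every whole value is distinct, so whole-value counts carry no signal.
    print(most_common_whole_values(SAMPLE_ROWS))
    # Per-lexeme counts show that "cat" occurs in 3 of the 4 rows.
    print(most_common_elements(SAMPLE_ROWS))
```

In this sample every whole value occurs exactly once, so the whole-value count says nothing useful, while the per-lexeme count ranks "cat" first with a frequency of 0.75.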

Jan Urbanski
This commit is contained in:
Tom Lane
2008-07-14 00:51:46 +00:00
parent 6816577a78
commit 6f6d863258
11 changed files with 467 additions and 41 deletions

@ -1,4 +1,4 @@
-<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.167 2008/07/11 07:02:43 petere Exp $ -->
+<!-- $PostgreSQL: pgsql/doc/src/sgml/catalogs.sgml,v 2.168 2008/07/14 00:51:45 tgl Exp $ -->
<!--
Documentation of the system catalogs, directed toward PostgreSQL developers
-->
@ -6516,6 +6516,8 @@
<entry>
A list of the most common values in the column. (NULL if
no values seem to be more common than any others.)
+For some datatypes such as <type>tsvector</>, this is a list of
+the most common element values rather than values of the type itself.
</entry>
</row>
@ -6524,10 +6526,10 @@
<entry><type>real[]</type></entry>
<entry></entry>
<entry>
-A list of the frequencies of the most common values,
+A list of the frequencies of the most common values or elements,
i.e., number of occurrences of each divided by total number of rows.
(NULL when <structfield>most_common_vals</structfield> is.)
-</entry>
+</entry>
</row>
<row>