Before 9.4, such an aggregate couldn't be declared, because its final
function would have to have polymorphic result type but no polymorphic
argument, which CREATE FUNCTION would quite properly reject. The
ordered-set-aggregate patch found a workaround: allow the final function
to be declared as accepting additional dummy arguments that have types
matching the aggregate's regular input arguments. However, we failed
to notice that this problem applies just as much to regular aggregates,
despite the fact that we had a built-in regular aggregate array_agg()
that was known to be undeclarable in SQL because its final function
had an illegal signature. So what we should have done, and what this
patch does, is to decouple the extra-dummy-arguments behavior from
ordered-set aggregates and make it generally available for all aggregate
declarations. We have to put this into 9.4 rather than waiting till
later because it slightly alters the rules for declaring ordered-set
aggregates.
The patch turned out a bit bigger than I'd hoped because it proved
necessary to record the extra-arguments option in a new pg_aggregate
column. I'd thought we could just look at the final function's pronargs
at runtime, but that didn't work well for variadic final functions.
It's probably just as well though, because it simplifies life for pg_dump
to record the option explicitly.
While at it, fix array_agg() to have a valid final-function signature,
and add an opr_sanity test to notice future deviations from polymorphic
consistency. I also marked the percentile_cont() aggregates as not
needing extra arguments, since they don't.
Until now, when executing an aggregate function as a window function
within a window with moving frame start (that is, any frame start mode
except UNBOUNDED PRECEDING), we had to recalculate the aggregate from
scratch each time the frame head moved. This patch allows an aggregate
definition to include an alternate "moving aggregate" implementation
that includes an inverse transition function for removing rows from
the aggregate's running state. As long as this can be done successfully,
runtime is proportional to the total number of input rows, rather than
to the number of input rows times the average frame length.
This commit includes the core infrastructure, documentation, and regression
tests using user-defined aggregates. Follow-on commits will update some
of the built-in aggregates to use this feature.
David Rowley and Florian Pflug, reviewed by Dean Rasheed; additional
hacking by me
This patch introduces generic support for ordered-set and hypothetical-set
aggregate functions, as well as implementations of the instances defined in
SQL:2008 (percentile_cont(), percentile_disc(), rank(), dense_rank(),
percent_rank(), cume_dist()). We also added mode() though it is not in the
spec, as well as versions of percentile_cont() and percentile_disc() that
can compute multiple percentile values in one pass over the data.
Unlike the original submission, this patch puts full control of the sorting
process in the hands of the aggregate's support functions. To allow the
support functions to find out how they're supposed to sort, a new API
function AggGetAggref() is added to nodeAgg.c. This allows retrieval of
the aggregate call's Aggref node, which may have other uses beyond the
immediate need. There is also support for ordered-set aggregates to
install cleanup callback functions, so that they can be sure that
infrastructure such as tuplesort objects gets cleaned up.
In passing, make some fixes in the recently-added support for variadic
aggregates, and make some editorial adjustments in the recent FILTER
additions for aggregates. Also, simplify use of IsBinaryCoercible() by
allowing it to succeed whenever the target type is ANY or ANYELEMENT.
It was inconsistent that it dealt with other polymorphic target types
but not these.
Atri Sharma and Andrew Gierth; reviewed by Pavel Stehule and Vik Fearing,
and rather heavily editorialized upon by Tom Lane
Formerly the planner had a hard-wired rule of thumb for guessing the amount
of space consumed by an aggregate function's transition state data. This
estimate is critical to deciding whether it's OK to use hash aggregation,
and in many situations the built-in estimate isn't very good. This patch
adds a column to pg_aggregate wherein a per-aggregate estimate can be
provided, overriding the planner's default, and infrastructure for setting
the column via CREATE AGGREGATE.
It may be that additional smarts will be required in future, perhaps even
a per-aggregate estimation function. But this is already a step forward.
This is extracted from a larger patch to improve the performance of numeric
and int8 aggregates. I (tgl) thought it was worth reviewing and committing
this infrastructure separately. In this commit, all built-in aggregates
are given aggtransspace = 0, so no behavior should change.
Hadi Moshayedi, reviewed by Pavel Stehule and Tomas Vondra
There's no inherent reason why an aggregate function can't be variadic
(even VARIADIC ANY) if its transition function can handle the case.
Indeed, this patch to add the feature touches none of the planner or
executor, and little of the parser; the main missing stuff was DDL and
pg_dump support.
It is true that variadic aggregates can create the same sort of ambiguity
about parameters versus ORDER BY keys that was complained of when we
(briefly) had both one- and two-argument forms of string_agg(). However,
the policy formed in response to that discussion only said that we'd not
create any built-in aggregates with varying numbers of arguments, not that
we shouldn't allow users to do it. So the logical extension of that is
we can allow users to make variadic aggregates as long as we're wary about
shipping any such in core.
In passing, this patch allows aggregate function arguments to be named, to
the extent of remembering the names in pg_proc and dumping them in pg_dump.
You can't yet call an aggregate using named-parameter notation. That seems
like a likely future extension, but it'll take some work, and it's not what
this patch is really about. Likewise, there's still some work needed to
make window functions handle VARIADIC fully, but I left that for another
day.
initdb forced because of new aggvariadic field in Aggref parse nodes.
Remove duplicate implementations of catalog munging and miscellaneous
privilege checks. Instead rely on already existing data in
objectaddress.c to do the work.
Author: KaiGai Kohei, changes by me
Reviewed by: Robert Haas, Álvaro Herrera, Dimitri Fontaine
Extracted from a larger patch by Dimitri Fontaine. It is hoped that
this will provide infrastructure for enriching the new event trigger
functionality, but it seems possibly useful for other purposes as
well.
The initial transition value is stored as a text string and not fed to the
transition type's input function until runtime (so that values such as
"now" don't get frozen at creation time). Previously, CREATE AGGREGATE
didn't do anything with it but that, which meant that even erroneous values
would be accepted and not complained of until the aggregate is used. This
seems unhelpful, and it's confused at least one user, as in Rhys Stewart's
recent report. It seems worth taking a few more cycles to invoke the input
function and verify that the value is acceptable. We can't do this if the
transition type is polymorphic, but in normal aggregates we know the actual
transition type so we can call the right input function.
Remove duplicate implementation of catalog munging and miscellaneous
privilege and consistency checks. Instead rely on already existing data
in objectaddress.c to do the work.
Author: KaiGai Kohei
Tweaked by me
Reviewed by Robert Haas
This reduces unnecessary exposure of other headers through htup.h, which
is very widely included by many files.
I have chosen to move the function prototypes to the new file as well,
because that means htup.h no longer needs to include tupdesc.h. In
itself this doesn't have much effect in indirect inclusion of tupdesc.h
throughout the tree, because it's also required by execnodes.h; but it's
something to explore in the future, and it seemed best to do the htup.h
change now while I'm busy with it.
This gets rid of an impressive amount of duplicative code, with only
minimal behavior changes. DROP FOREIGN DATA WRAPPER now requires object
ownership rather than superuser privileges, matching the documentation
we already have. We also eliminate the historical warning about dropping
a built-in function as unuseful. All operations are now performed in the
same order for all object types handled by dropcmds.c.
KaiGai Kohei, with minor revisions by me
Split the old typenameTypeId() into two functions: A new typenameTypeId() that
returns only a type OID, and typenameTypeIdAndMod() that returns type OID and
typmod. This isolates call sites better that actually care about the typmod.
The purpose of this change is to eliminate the need for every caller
of SearchSysCache, SearchSysCacheCopy, SearchSysCacheExists,
GetSysCacheOid, and SearchSysCacheList to know the maximum number
of allowable keys for a syscache entry (currently 4). This will
make it far easier to increase the maximum number of keys in a
future release should we choose to do so, and it makes the code
shorter, too.
Design and review by Tom Lane.
if the user is superuser. This makes available to extension modules the same
sort of trick being practiced by array_agg(). The reason for the superuser
restriction is that you could crash the system by connecting up an
incompatible pair of internal-using functions as an aggregate. It shouldn't
interfere with any legitimate use, since you'd have to be superuser to create
the internal-using transition and final functions anyway.
patches that dealt with object ownership. It wasn't updating pg_shdepend
nor adjusting the aggregate's ACL. In 8.2 and up, fix this permanently
by making it use AlterFunctionOwner_oid. In 8.1, the function code wasn't
factored that way, so just copy and paste.
even in code paths where we don't pay any subsequent attention to the typmod
value. This seems needed in view of the fact that 8.3's generalized typmod
support will accept a lot of bogus syntax, such as "timestamp(foo)" or
"record(int, 42)" --- if we allow such things to pass without comment,
users will get confused. Per a recent example from Greg Stark.
To implement this in a way that's not very vulnerable to future
bugs-of-omission, refactor the API of parse_type.c's TypeName lookup routines
so that typmod validation is folded into the base lookup operation. Callers
can still choose not to receive the encoded typmod, but we'll check the
decoration anyway if it's present.
the opportunity to treat COUNT(*) as a zero-argument aggregate instead
of the old hack that equated it to COUNT(1); this is materially cleaner
(no more weird ANYOID cases) and ought to be at least a tiny bit faster.
Original patch by Sergey Koposov; review, documentation, simple regression
tests, pg_dump and psql support by moi.
CREATE AGGREGATE aggname (input_type) (parameter_list)
along with the old syntax where the input type was named in the parameter
list. This fits more naturally with the way that the aggregate is identified
in DROP AGGREGATE and other utility commands; furthermore it has a natural
extension to handle multiple-input aggregates, where the basetype-parameter
method would get ugly. In fact, this commit fixes the grammar and all the
utility commands to support multiple-input aggregates; but DefineAggregate
rejects it because the executor isn't fixed yet.
I didn't do anything about treating agg(*) as a zero-input aggregate instead
of artificially making it a one-input aggregate, but that should be considered
in combination with supporting multi-input aggregates.
during parse analysis, not only errors detected in the flex/bison stages.
This is per my earlier proposal. This commit includes all the basic
infrastructure, but locations are only tracked and reported for errors
involving column references, function calls, and operators. More could
be done later but this seems like a good set to start with. I've also
moved the ReportSyntaxErrorPosition logic out of psql and into libpq,
which should make it available to more people --- even within psql this
is an improvement because warnings weren't handled by ReportSyntaxErrorPosition.
comment line where output as too long, and update typedefs for /lib
directory. Also fix case where identifiers were used as variable names
in the backend, but as typedefs in ecpg (favor the backend for
indenting).
Backpatch to 8.1.X.
discussion of getting around this by relaxing the checks made for regular
users, but I'm disinclined to toy with the security model right now,
so just special-case it for superusers where needed.
requiring superuserness always, allow an owner to reassign ownership
to any role he is a member of, if that role would have the right to
create a similar object. These three requirements essentially state
that the would-be alterer has enough privilege to DROP the existing
object and then re-CREATE it as the new role; so we might as well
let him do it in one step. The ALTER TABLESPACE case is a bit
squirrely, but the whole concept of non-superuser tablespace owners
is pretty dubious anyway. Stephen Frost, code review by Tom Lane.
and pg_auth_members. There are still many loose ends to finish in this
patch (no documentation, no regression tests, no pg_dump support for
instance). But I'm going to commit it now anyway so that Alvaro can
make some progress on shared dependencies. The catalog changes should
be pretty much done.
indexes. Replace all heap_openr and index_openr calls by heap_open
and index_open. Remove runtime lookups of catalog OID numbers in
various places. Remove relcache's support for looking up system
catalogs by name. Bulky but mostly very boring patch ...
indexes. Extend the macros in include/catalog/*.h to carry the info
about hand-assigned OIDs, and adjust the genbki script and bootstrap
code to make the relations actually get those OIDs. Remove the small
number of RelOid_pg_foo macros that we had in favor of a complete
set named like the catname.h and indexing.h macros. Next phase will
get rid of internal use of names for looking up catalogs and indexes;
but this completes the changes forcing an initdb, so it looks like a
good place to commit.
Along the way, I made the shared relations (pg_database etc) not be
'bootstrap' relations any more, so as to reduce the number of hardwired
entries and simplify changing those relations in future. I'm not
sure whether they ever really needed to be handled as bootstrap
relations, but it seems to work fine to not do so now.
be supported for all datatypes. Add CREATE AGGREGATE and pg_dump support
too. Add specialized min/max aggregates for bpchar, instead of depending
on text's min/max, because otherwise the possible use of bpchar indexes
cannot be recognized.
initdb forced because of catalog changes.
change saves a great deal of space in pg_proc and its primary index,
and it eliminates the former requirement that INDEX_MAX_KEYS and
FUNC_MAX_ARGS have the same value. INDEX_MAX_KEYS is still embedded
in the on-disk representation (because it affects index tuple header
size), but FUNC_MAX_ARGS is not. I believe it would now be possible
to increase FUNC_MAX_ARGS at little cost, but haven't experimented yet.
There are still a lot of vestigial references to FUNC_MAX_ARGS, which
I will clean up in a separate pass. However, getting rid of it
altogether would require changing the FunctionCallInfoData struct,
and I'm not sure I want to buy into that.