MCOL-1201 Modify docs. Fix group concat bug

2025-07-30 19:23:07 +03:00 · 2018-05-15 13:15:45 -05:00
parent 06e9772310
commit c8c3b23e32
13 changed files with 75 additions and 51 deletions
--- a/dbcon/mysql/ha_calpont_execplan.cpp
+++ b/dbcon/mysql/ha_calpont_execplan.cpp
@@ -4165,6 +4165,7 @@ ReturnedColumn* buildAggregateColumn(Item* item, gp_walk_info& gwi)
            rowCol->columnVec(selCols);
            (dynamic_cast<GroupConcatColumn*>(ac))->orderCols(orderCols);
            parm.reset(rowCol);
+            ac->aggParms().push_back(parm);

            if (gc->str_separator())
            {
--- a/utils/udfsdk/docs/source/changelog.rst
+++ b/utils/udfsdk/docs/source/changelog.rst
@@ -5,4 +5,5 @@ Version History
 | Version | Date       | Changes                     |
 +=========+============+=============================+
 | 1.1.0α  | 2017-08-25 | - First alpha release       |
+| 1.2.0α  | 2016-05-18 | - Add multi parm support    |
 +---------+------------+-----------------------------+
--- a/utils/udfsdk/docs/source/reference/ColumnDatum.rst
+++ b/utils/udfsdk/docs/source/reference/ColumnDatum.rst
@@ -1,3 +1,5 @@
+.. _ColumnDatum:
+
 ColumnDatum
 ===========

@@ -13,7 +15,7 @@ Example for int data:
     int myint = valIn.cast<int>();


-For multi-paramter aggregations (not available in Columnstore 1.1), the colsIn vector of next_value() contains the ordered set of row parameters.
+For multi-parameter aggregations (not available in Columnstore 1.1), the colsIn array of next_value() contains the ordered set of row parameters.

 For char, varchar, text, varbinary and blob types, columnData will be std::string.

@@ -59,7 +61,7 @@ The provided values are:
   * - SMALLINT
     - A signed two byte integer
   * - DECIMAL
-     - A Columnstore Decimal value. For Columnstore 1.1, this is stored in the smallest integer type field that will hold the required precision.
+     - A Columnstore Decimal value. This is stored in the smallest integer type field that will hold the required precision.
   * - MEDINT
     - A signed four byte integer
   * - INT
--- a/utils/udfsdk/docs/source/reference/MariaDBUDAF.rst
+++ b/utils/udfsdk/docs/source/reference/MariaDBUDAF.rst
@@ -13,7 +13,7 @@ The library placed in mysql/lib is the name you use in the SQL CREATE AGGREGATE

    CREATE AGGREGATE FUNCTION ssq returns REAL soname 'libudf_mysql.so';

-Unlike the code you write for the Columnstore UDAF, MariaDB does not handle allocation and de-allocation of your memory structures. If writing your function for other engines, you must handle allocation and de-alloaction in :ref:`function_init <func_init>` and :ref:`function_deinit <func_deinit>`
+Unlike the code you write for the Columnstore UDAF, MariaDB does not handle allocation and de-allocation of your memory structures in other engines. If writing your function for other engines, you must handle allocation and de-allocation in :ref:`function_init <func_init>` and :ref:`function_deinit <func_deinit>`.

 All of the MariaDB UDF and UDAF example functions are in a single source file named udfmysql.cpp and linked into libudf_mysql.so.

--- a/utils/udfsdk/docs/source/reference/UDAFMap.rst
+++ b/utils/udfsdk/docs/source/reference/UDAFMap.rst
@@ -3,7 +3,7 @@
 UDAFMap
 =======

-The UDAFMap is where we tell the system about our function. For Columnstore 1.1, you must manually place your function into this map.
+The UDAFMap is where we tell the system about our function. For Columnstore 1.2, you must manually place your function into this map.

 * open mcsv1_udaf.cpp
 * add your header to the #include list
--- a/utils/udfsdk/docs/source/reference/mcsv1Context.rst
+++ b/utils/udfsdk/docs/source/reference/mcsv1Context.rst
@@ -150,7 +150,7 @@ Use these to determine the way your UDA(n)F was called

 .. c:function:: size_t getParameterCount() const;

-:returns: the number of parameters to the function in the SQL query. Columnstore 1.1 only supports one parameter.
+:returns: the number of parameters to the function in the SQL query. 

 .. c:function:: bool isParamNull(int paramIdx);

--- a/utils/udfsdk/docs/source/reference/mcsv1_UDAF.rst
+++ b/utils/udfsdk/docs/source/reference/mcsv1_UDAF.rst
@@ -1,4 +1,4 @@
-.. _ mcsv1_udaf:
+.. _mcsv1_udaf:

 mcsv1_UDAF
 ==========
@@ -11,12 +11,14 @@ The base class has no data members. It is designed to be only a container for yo

 However, adding static const members makes sense.

-For UDAF (not Wndow Functions) Aggregation takes place in three stages:
+For UDAF (not Window Functions) Aggregation takes place in three stages:

 * Subaggregation on the PM. nextValue()
 * Consolodation on the UM. subevaluate()
 * Evaluation of the function on the UM. evaluate()

+In some situations the system chooses to perform all UDAF calculations on the UM. The presence of group_concat() in the query and certain joins can cause the optimizer to make this choice.
+
 For Window Functions, all aggregation occurs on the UM, and thus the subevaluate step is skipped. There is an optional dropValue() function that may be added.

 * Aggregation on the UM. nextValue()
@@ -80,17 +82,11 @@ Callback Methods

 .. _init:

-.. c:function:: ReturnCode init(mcsv1Context* context, COL_TYPES& colTypes);
+.. c:function:: ReturnCode init(mcsv1Context* context, ColumnDatum* colTypes);

 :param context: The context object for this call.

-:param colTypes: A list of the column types of the parameters.
-
- COL_TYPES is defined as::
-
-  typedef std::vector<std::pair<std::string, CalpontSystemCatalog::ColDataType> >COL_TYPES;
-
- In Columnstore 1.1, only one column is supported, so colTyoes will be of length one.
+:param colTypes: A list of ColumnDatum structures. Use this to access the column types of the parameters. colTypes.columnData will be invalid.

 :returns: ReturnCode::ERROR or ReturnCode::SUCCESS
 
@@ -116,13 +112,11 @@ Callback Methods

 .. _nextvalue:

-.. c:function:: ReturnCode nextValue(mcsv1Context* context, 				 std::vector<ColumnDatum>& valsIn);
+.. c:function:: ReturnCode nextValue(mcsv1Context* context, 				 ColumnDatum* valsIn);

 :param context: The context object for this call

-:param valsIn: a vector representing the values to be added for each parameter for this row.
-
- In Columnstore 1.1, this will be a vector of length one.
+:param valsIn: an array representing the values to be added for each parameter for this row.
 
 :returns: ReturnCode::ERROR or ReturnCode::SUCCESS

@@ -130,11 +124,11 @@ Callback Methods

 nextValue() is called for each Window movement that passes the WHERE and HAVING clauses. The context's UserData will contain values that have been sub-aggregated to this point for the group, partition or Window Frame. nextValue is called on the PM for aggregation and on the UM for Window Functions.

- When used in an aggregate, the function may not rely on order or completeness since the sub-aggregation is going on at the PM, it only has access to the data stored on the PM's dbroots.
+ When used in an aggregate, the function should not rely on order or completeness: sub-aggregation runs on the PM, which only has access to the data stored on that PM's dbroots.

- When used as a analytic function (Window Function), nextValue is call for each Window movement in the Window. If dropValue is defined, then it may be called for every value leaving the Window, and nextValue called for each new value entering the Window.
+ When used as an analytic function (Window Function), nextValue is called for each Window movement in the Window. If dropValue is defined, then it may be called for every value leaving the Window, and nextValue called for each new value entering the Window.

- Since this is called for every row, it is important that this method be efficient.
+ Since this may be called for every row, it is important that this method be efficient.

 .. _subevaluate:

@@ -172,13 +166,11 @@ Callback Methods

 .. _dropvalue:

-.. c:function:: ReturnCode dropValue(mcsv1Context* context, 				 std::vector<ColumnDatum>& valsDropped);
+.. c:function:: ReturnCode dropValue(mcsv1Context* context, 				 ColumnDatum* valsDropped);

 :param context: The context object for this call

-:param valsDropped: a vector representing the values to be dropped for each parameter for this row.
-
- In Columnstore 1.1, this will be a vector of length one.
+:param valsDropped: an array representing the values to be dropped for each parameter for this row.

 :returns: ReturnCode::ERROR or ReturnCode::SUCCESS

--- a/utils/udfsdk/docs/source/usage/cmakelists.rst
+++ b/utils/udfsdk/docs/source/usage/cmakelists.rst
@@ -3,7 +3,7 @@
 CMakeLists.txt
 ==============

-For Columnstore 1.1, you compile your function by including it in the CMakeLists.txt file for the udfsdk.
+For Columnstore 1.2, you compile your function by including it in the CMakeLists.txt file for the udfsdk.

 You need only add the new .cpp files to the udfsdk_LIB_SRCS target list::

--- a/utils/udfsdk/docs/source/usage/compile.rst
+++ b/utils/udfsdk/docs/source/usage/compile.rst
@@ -3,7 +3,7 @@
 Compile
 =======

-To compile your function for Columnstore 1.1, simple recompile the udfsdk directory::
+To compile your function for Columnstore 1.2, simply recompile the udfsdk directory::

 cd utils/usdsdk
 cmake .
--- a/utils/udfsdk/docs/source/usage/headerfile.rst
+++ b/utils/udfsdk/docs/source/usage/headerfile.rst
@@ -5,7 +5,7 @@ Header file

 Usually, each UDA(n)F function will have one .h and one .cpp file plus code for the mariadb UDAF plugin which may or may not be in a separate file. It is acceptable to put a set of related functions in the same files or use separate files for each.

-The easiest way to create these files is to copy them an example closest to the type of function you intend to create.
+The easiest way to create these files is to copy them from an example closest to the type of function you intend to create.

 Your header file must have a class defined that will implement your function. This class must be derived from mcsv1_UDAF and be in the mcsv1sdk namespace. The following examples use the "allnull" UDAF.

@@ -29,9 +29,9 @@ allnull uses the Simple Data Model. See :ref:`complexdatamodel` to see how that
    allnull() : mcsv1_UDAF(){};
    virtual ~allnull(){};

-    virtual ReturnCode init(mcsv1Context* context, COL_TYPES& colTypes);
+    virtual ReturnCode init(mcsv1Context* context, ColumnDatum* colTypes);
    virtual ReturnCode reset(mcsv1Context* context);
-    virtual ReturnCode nextValue(mcsv1Context* context, std::vector<ColumnDatum>& valsIn);
+    virtual ReturnCode nextValue(mcsv1Context* context, ColumnDatum* valsIn);
    virtual ReturnCode subEvaluate(mcsv1Context* context, const UserData* userDataIn);
    virtual ReturnCode evaluate(mcsv1Context* context, static_any::any& valOut);
 };
--- a/utils/udfsdk/docs/source/usage/introduction.rst
+++ b/utils/udfsdk/docs/source/usage/introduction.rst
@@ -3,7 +3,7 @@ mcsv1_udaf Introduction

 mcsv1_udaf is a C++ API for writing User Defined Aggregate Functions (UDAF) and User Defined Analytic Functions (UDAnF) for the MariaDB Columstore engine. 

-In Columnstore 1.1.0, functions written using this API must be compiled into the udfsdk and udf_mysql libraries of the Columnstore code branch.
+In Columnstore 1.2, functions written using this API must be compiled into the udfsdk and udf_mysql libraries of the Columnstore code branch.

 The API has a number of features. The general theme is, there is a class that represents the function, there is a context under which the function operates, and there is a data store for intermediate values.

@@ -18,5 +18,5 @@ The steps required to create a function are:
 * :ref:`Compile udfsdk <compile>`.
 * :ref:`Copy the compiled libraries <copylibs>` to the working directories.

-In 1.1.0, Columnstore does not have a plugin framework, so the functions have to be compiled into the libraries that Columnstore already loads.
+In 1.2, Columnstore does not have a plugin framework, so the functions have to be compiled into the libraries that Columnstore already loads.

--- a/utils/udfsdk/docs/source/usage/sourcefile.rst
+++ b/utils/udfsdk/docs/source/usage/sourcefile.rst
@@ -34,21 +34,17 @@ Or, if using the :ref:`complexdatamodel`, type cast the UserData to your UserDat
 init()
 ------

-.. c:function:: ReturnCode init(mcsv1Context* context, COL_TYPES& colTypes);
+.. c:function:: ReturnCode init(mcsv1Context* context, ColumnDatum* colTypes);

 :param context: The context object for this call.

-:param colTypes: A list of the column types of the parameters.
+:param colTypes: A list of the ColumnDatum used to access column types of the parameters. In init(), the columnData member is invalid.

- COL_TYPES is defined as::
-
-  typedef std::vector<std::pair<std::string, CalpontSystemCatalog::ColDataType> >COL_TYPES;
-
- see :ref:`ColDataTypes <coldatatype>`. In Columnstore 1.1, only one column is supported, so colTyoes will be of length one.
+ see :ref:`ColumnDatum`. In Columnstore 1.2, an arbitrary number of parameters is supported.

 :returns: ReturnCode::ERROR or ReturnCode::SUCCESS

-The init() method is where you sanity check the input, set the output type and set any run flags for this instance. init() is called one time from the mysqld process. All settings you do here are propagated through the system.
+The init() method is where you sanity check the input datatypes, set the output type and set any run flags for this instance. init() is called one time from the mysqld process. All settings you do here are propagated through the system.

 init() is the exception to type casting the UserData member of context. UserData has not been created when init() is called, so you shouldn't use it here. 

@@ -60,13 +56,14 @@ If you're using :ref:`simpledatamodel`, you need to set the size of the structur

 .. rubric:: Check parameter count and type

-Each function expects a certain number of columns to entered as parameters in the SQL query. For columnstore 1.1, the number of parameters is limited to one.
+Each function expects a certain number of columns to be entered as parameters in the SQL query. It is possible to create a UDAF that accepts a variable number of parameters. You can discover which ones were actually used in init(), and modify your function's behavior accordingly.

-colTypes is a vector of each parameter name and type. The name is the colum name from the SQL query. You can use this information to sanity check for compatible type(s) and also to modify your functions behavior based on type. To do this, add members to your data struct to be tested in the other Methods. Set these members based on colDataTypes (:ref:`ColDataTypes <coldatatype>`).
+colTypes is an array of ColumnDatum from which the type and name of each parameter can be read. The name is the column name from the SQL query. You can use this information to sanity check for compatible type(s) and also to modify your function's behavior based on type. To do this, add members to your data struct to be tested in the other Methods. Set these members based on colDataTypes (:ref:`ColDataTypes <coldatatype>`).

+The actual number of parameters passed can be obtained from context->getParameterCount().
 ::

-	if (colTypes.size() < 1)
+	if (context->getParameterCount() < 1)
 	{
 		// The error message will be prepended with
 		// "The storage engine for the table doesn't support "
@@ -84,7 +81,7 @@ When you create your function using the SQL CREATE FUNCTION command, you must in

 .. rubric:: Set width and scale

-If you have secial requirements, especially if you might be dealing with decimal types::
+If you have special requirements, especially if you might be dealing with decimal types::

 	context->setColWidth(8);
 	context->setScale(context->getScale()*2);
@@ -117,13 +114,11 @@ This function may be called multiple times from both the UM and the PM. Make no
 nextValue()
 -----------

-.. c:function:: ReturnCode nextValue(mcsv1Context* context, 				 std::vector<ColumnDatum>& valsIn);
+.. c:function:: ReturnCode nextValue(mcsv1Context* context, 				 ColumnDatum* valsIn);

 :param context: The context object for this call

-:param valsIn: a vector representing the values to be added for each parameter for this row.
-
- In Columnstore 1.1, this will be a vector of length one.
+:param valsIn: an array representing the values to be added for each parameter for this row.

 :returns: ReturnCode::ERROR or ReturnCode::SUCCESS

@@ -208,7 +203,7 @@ For AVG, you might see::
 dropValue
 ---------

-.. c:function:: ReturnCode dropValue(mcsv1Context* context, 				 std::vector<ColumnDatum>& valsDropped);
+.. c:function:: ReturnCode dropValue(mcsv1Context* context, 				 ColumnDatum* valsDropped);

 :param context: The context object for this call

--- a/utils/udfsdk/udfsdk.vpj
+++ b/utils/udfsdk/udfsdk.vpj
@@ -238,5 +238,38 @@
                N="Makefile"
                Type="Makefile"/>
        </Folder>
+        <Folder
+            Name="doc"
+            Filters="*.rst">
+            <Folder
+                Name="reference"
+                Filters="">
+                <F N="docs/source/reference/api.rst"/>
+                <F N="docs/source/reference/ByteStream.rst"/>
+                <F N="docs/source/reference/ColumnDatum.rst"/>
+                <F N="docs/source/reference/index.rst"/>
+                <F N="docs/source/reference/MariaDBUDAF.rst"/>
+                <F N="docs/source/reference/mcsv1_UDAF.rst"/>
+                <F N="docs/source/reference/mcsv1Context.rst"/>
+                <F N="docs/source/reference/UDAFMap.rst"/>
+                <F N="docs/source/reference/UserData.rst"/>
+            </Folder>
+            <Folder
+                Name="usage"
+                Filters="">
+                <F N="docs/source/usage/cmakelists.rst"/>
+                <F N="docs/source/usage/compile.rst"/>
+                <F N="docs/source/usage/copylibs.rst"/>
+                <F N="docs/source/usage/headerfile.rst"/>
+                <F N="docs/source/usage/index.rst"/>
+                <F N="docs/source/usage/introduction.rst"/>
+                <F N="docs/source/usage/memoryallocation.rst"/>
+                <F N="docs/source/usage/sourcefile.rst"/>
+            </Folder>
+            <F N="docs/source/changelog.rst"/>
+            <F N="docs/source/conf.py"/>
+            <F N="docs/source/index.rst"/>
+            <F N="docs/source/license.rst"/>
+        </Folder>
    </Files>
 </Project>
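
Note on the signature change: the docs above switch nextValue() and dropValue() from std::vector<ColumnDatum>& to a plain ColumnDatum* array, with the element count taken from context->getParameterCount(). The sketch below shows how a multi-parameter nextValue() might walk that array. It is not SDK code. ReturnCode, ColumnDatum and mcsv1Context here are minimal stand-ins written for this example, std::any stands in for static_any::any, and the SumData argument is a hypothetical simplification (in the real API the running state is reached through the context's UserData).

```cpp
#include <any>
#include <cassert>
#include <cstddef>

// Stand-in types for illustration only; the real ones come from the udfsdk headers.
enum class ReturnCode { ERROR, SUCCESS };

struct ColumnDatum
{
    std::any columnData;  // assumption: the SDK uses static_any::any here
};

class mcsv1Context
{
public:
    explicit mcsv1Context(std::size_t paramCount) : fParamCount(paramCount) {}
    std::size_t getParameterCount() const { return fParamCount; }

private:
    std::size_t fParamCount;
};

// Hypothetical running state for a "sum of all parameters" aggregate.
struct SumData
{
    double total = 0.0;
};

// Adds every non-empty int or double parameter of the current row to data.total.
// The array length is not carried by valsIn itself, so it is read from the context.
ReturnCode nextValue(mcsv1Context* context, ColumnDatum* valsIn, SumData& data)
{
    assert(context != nullptr && valsIn != nullptr);

    for (std::size_t i = 0; i < context->getParameterCount(); ++i)
    {
        const std::any& value = valsIn[i].columnData;

        if (!value.has_value())
            continue;  // treat an empty value as NULL and skip it

        if (const int* asInt = std::any_cast<int>(&value))
            data.total += *asInt;
        else if (const double* asDouble = std::any_cast<double>(&value))
            data.total += *asDouble;
        else
            return ReturnCode::ERROR;  // unsupported type in this sketch
    }

    return ReturnCode::SUCCESS;
}
```

Calling this with a two-element array holding an int and a double adds both values to the running total. Because the pointer carries no length, the loop bound must always come from getParameterCount().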