diff --git a/utils/udfsdk/docs/build/latex/UDAF.pdf b/utils/udfsdk/docs/build/latex/UDAF.pdf
index 70b1c72e4..36f93e15f 100644
Binary files a/utils/udfsdk/docs/build/latex/UDAF.pdf and b/utils/udfsdk/docs/build/latex/UDAF.pdf differ
diff --git a/utils/udfsdk/docs/source/reference/MariaDBUDAF.rst b/utils/udfsdk/docs/source/reference/MariaDBUDAF.rst
new file mode 100644
index 000000000..1f6fa7acb
--- /dev/null
+++ b/utils/udfsdk/docs/source/reference/MariaDBUDAF.rst
@@ -0,0 +1,190 @@
+.. _mariadb_udaf:
+
+MariaDB UDAF
+============
+
+In order for the Columnstore UDAF to be parsed by MariaDB, a standard MariaDB C UDAF must be written and included in a library placed in the mysql/lib directory of the Columnstore install directory (default: /usr/local/mariadb/columnstore/mysql/lib).
+
+This set of C functions may be just a stub that tells the parser about the function, or a full implementation if you want the UDAF to be usable by other engines.
+
+The name of the library placed in mysql/lib is the name you use in the SQL CREATE AGGREGATE FUNCTION statement to tell the parser where to find the function:
+
+.. code-block:: sql
+
+    CREATE AGGREGATE FUNCTION ssq returns REAL soname 'libudf_mysql.so';
+
+Unlike the code you write for the Columnstore UDAF, MariaDB does not handle allocation and de-allocation of your memory structures. If writing your function for other engines, you must handle allocation and de-allocation in :ref:`function_init <func_init>` and :ref:`function_deinit <func_deinit>`.
+
+All of the MariaDB UDF and UDAF example functions are in a single source file named udfmysql.cpp and linked into libudf_mysql.so.
+
+For more information on MariaDB UDF see the `MariaDB library UDF `_
+
+The following functions must be defined. The function names and signatures are dictated by the CREATE AGGREGATE FUNCTION statement and are not optional. Replace "function" with your function name.
+
+* :ref:`my_bool function_init <func_init>`
+* :ref:`void function_deinit <func_deinit>`
+* :ref:`void function_clear <func_clear>`
+* :ref:`void function_add <func_add>`
+* :ref:`long long function <func_body>`
+
+.. _func_init:
+
+function_init()
+---------------
+
+.. c:function:: my_bool function_init(UDF_INIT* initid, UDF_ARGS* args, char* message);
+
+:param initid [in/out]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB UDF calling sequence `_. The ptr member is a char* which can be used to point to user allocated memory.
+
+:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_
+
+:param message [out]: A pre-allocated buffer in which to copy an error message if needed. The size of the buffer is MYSQL_ERRMSG_SIZE, currently 512 bytes.
+
+:returns: 1 on failure or 0 on success.
+
+    The init function does any argument checking, sets values in initid and allocates any function specific memory.
+
+::
+
+    #ifdef _MSC_VER
+    __declspec(dllexport)
+    #endif
+    my_bool ssq_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
+    {
+        struct ssq_data* data;
+        if (args->arg_count != 1)
+        {
+            strcpy(message,"ssq() requires one argument");
+            return 1;
+        }
+
+        if (!(data = (struct ssq_data*) malloc(sizeof(struct ssq_data))))
+        {
+            strcpy(message,"Couldn't allocate memory");
+            return 1;
+        }
+        data->sumsq = 0;
+
+        initid->ptr = (char*)data;
+        return 0;
+    }
+
+
+.. _func_deinit:
+
+function_deinit()
+-----------------
+
+.. c:function:: void function_deinit(UDF_INIT* initid);
+
+:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. If you allocated memory to the ptr member in function_init, then you must deallocate it here.
+
+:returns: nothing.
+
+    The deinit function is used to free any memory allocated in function_init.
+
+::
+
+    #ifdef _MSC_VER
+    __declspec(dllexport)
+    #endif
+    void ssq_deinit(UDF_INIT* initid)
+    {
+        free(initid->ptr);
+    }
+
+.. _func_clear:
+
+function_clear()
+----------------
+
+.. c:function:: void function_clear(UDF_INIT* initid, char* is_null, char* message);
+
+:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. Use the initid->ptr member to access your user allocated memory.
+
+:param is_null [out]: A pointer to a single byte that you can set and use in later functions. is_null is set to 0 before each call to clear.
+
+:param message [out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. The initial value is 0 and is not reset for further calls to any function including clear.
+
+:returns: nothing.
+
+    clear is called to reset the summary results. It is called at the beginning of each group, and may also be called when there are no matching rows.
+
+::
+
+    #ifdef _MSC_VER
+    __declspec(dllexport)
+    #endif
+    void
+    ssq_clear(UDF_INIT* initid, char* is_null __attribute__((unused)),
+              char* message __attribute__((unused)))
+    {
+        struct ssq_data* data = (struct ssq_data*)initid->ptr;
+        data->sumsq = 0;
+    }
+
+.. _func_add:
+
+function_add()
+--------------
+
+.. c:function:: void function_add(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message);
+
+:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. Use the initid->ptr member to access your user allocated memory.
+
+:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. The args->args array holds the argument values as char* pointers. Each must be cast to the type indicated by the matching entry in args->arg_type.
+
+:param is_null [in/out]: A pointer to a single byte that you can set and use in later functions. is_null will contain the most recent value you set since the last clear call.
+
+:param message [in/out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. message will contain the last value you set.
+
+:returns: nothing.
+
+    add is called for each row in the filtered result set. It is used to fold the row data into the function's summary data.
+
+::
+
+    #ifdef _MSC_VER
+    __declspec(dllexport)
+    #endif
+    void ssq_add(UDF_INIT* initid, UDF_ARGS* args,
+                 char* is_null,
+                 char* message __attribute__((unused)))
+    {
+        struct ssq_data* data = (struct ssq_data*)initid->ptr;
+        double val = cvtArgToDouble(args->arg_type[0], args->args[0]);
+        data->sumsq += val*val;
+    }
+
+.. _func_body:
+
+function
+--------
+
+.. c:function:: long long function(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message);
+
+:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. Use the initid->ptr member to access your user allocated memory.
+
+:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library `_. The values in args->args are undefined here.
+
+:param is_null [in/out]: A pointer to a single byte that you can set and use in later functions. is_null will contain the most recent value you set since the last clear call.
+
+:param message [in/out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. message will contain the last value you set.
+
+:returns: A value of the return type declared in the SQL CREATE AGGREGATE FUNCTION statement.
+
+    This is considered the function body. Use the summary data accumulated in the calls to function_add and do any manipulation needed to produce the answer for the group.
+
+::
+
+    #ifdef _MSC_VER
+    __declspec(dllexport)
+    #endif
+    long long ssq(UDF_INIT* initid, UDF_ARGS* args __attribute__((unused)),
+                  char* is_null, char* error __attribute__((unused)))
+    {
+        struct ssq_data* data = (struct ssq_data*)initid->ptr;
+        return data->sumsq;
+    }
+
+
diff --git a/utils/udfsdk/docs/source/reference/UDAFMap.rst b/utils/udfsdk/docs/source/reference/UDAFMap.rst
deleted file mode 100644
index 74fd31a84..000000000
--- a/utils/udfsdk/docs/source/reference/UDAFMap.rst
+++ /dev/null
@@ -1,14 +0,0 @@
-UDAFMap
--------
-
-UDAFMap holds a mapping from the function name to its implementation. The engine uses the map when a UDA(n)F is called by a SQL statement.
-
-You must enter your function into the map. This means you must:
-
-* #include your header file
-* add an entry into UDAFMap::getMap().
-
-The map is fully populated the first time it is called, i.e., the first time any UDA(n)F is called by a SQL statement.
-
-The need for you to manually enter your function into this map may be alleviated by future enhancements.
-
diff --git a/utils/udfsdk/docs/source/reference/index.rst b/utils/udfsdk/docs/source/reference/index.rst
index 2bfc361be..70ad0b593 100644
--- a/utils/udfsdk/docs/source/reference/index.rst
+++ b/utils/udfsdk/docs/source/reference/index.rst
@@ -11,4 +11,5 @@ mcsv1_udaf reference
    ColumnDatum
    mcsv1_UDAF
    ByteStream
+   MariaDBUDAF
 
diff --git a/utils/udfsdk/docs/source/usage/UDAFMap.rst b/utils/udfsdk/docs/source/usage/UDAFMap.rst
new file mode 100644
index 000000000..afd739c12
--- /dev/null
+++ b/utils/udfsdk/docs/source/usage/UDAFMap.rst
@@ -0,0 +1,36 @@
+.. _udaf_map:
+
+UDAFMap
+=======
+
+The UDAFMap is where we tell the system about our function. For Columnstore 1.1, you must manually place your function into this map.
+
+* open mcsv1_udaf.cpp
+* add your header to the #include list
+* add a new line to the UDAFMap::getMap() function
+
+::
+
+    #include "allnull.h"
+    #include "ssq.h"
+    #include "median.h"
+    #include "avg_mode.h"
+    UDAF_MAP& UDAFMap::getMap()
+    {
+        if (fm.size() > 0)
+        {
+            return fm;
+        }
+        // first: function name
+        // second: function pointer
+        // Please use lower case for the function name, because the names might be
+        // case-insensitive in MySQL depending on the setting. In that case,
+        // the function names passed to the interface are always in lower case.
+        fm["allnull"] = new allnull();
+        fm["ssq"] = new ssq();
+        fm["median"] = new median();
+        fm["avg_mode"] = new avg_mode();
+
+        return fm;
+    }
+
diff --git a/utils/udfsdk/docs/source/usage/cmakelists.rst b/utils/udfsdk/docs/source/usage/cmakelists.rst
new file mode 100644
index 000000000..32a218459
--- /dev/null
+++ b/utils/udfsdk/docs/source/usage/cmakelists.rst
@@ -0,0 +1,15 @@
+.. _cmakelists:
+
+CMakeLists.txt
+==============
+
+For Columnstore 1.1, you compile your function by including it in the CMakeLists.txt file for the udfsdk.
+
+You need only add the new .cpp files to the udfsdk_LIB_SRCS target list::
+
+    set(udfsdk_LIB_SRCS udfsdk.cpp mcsv1_udaf.cpp allnull.cpp ssq.cpp median.cpp avg_mode.cpp)
+
+If you create a new file for your MariaDB aggregate function, add it to the target list for udf_mysql_LIB_SRCS::
+
+    set(udf_mysql_LIB_SRCS udfmysql.cpp)
+
diff --git a/utils/udfsdk/docs/source/usage/compile.rst b/utils/udfsdk/docs/source/usage/compile.rst
new file mode 100644
index 000000000..e6319e45b
--- /dev/null
+++ b/utils/udfsdk/docs/source/usage/compile.rst
@@ -0,0 +1,12 @@
+.. _compile:
+
+Compile
+=======
+
+To compile your function for Columnstore 1.1, simply recompile the udfsdk directory::
+
+    cd utils/udfsdk
+    cmake .
+    make
+
+
diff --git a/utils/udfsdk/docs/source/usage/copylibs.rst b/utils/udfsdk/docs/source/usage/copylibs.rst
new file mode 100644
index 000000000..b9d543307
--- /dev/null
+++ b/utils/udfsdk/docs/source/usage/copylibs.rst
@@ -0,0 +1,12 @@
+.. _copylibs:
+
+Copy Libraries
+==============
+
+The libraries will be built in the Columnstore source tree at utils/udfsdk. If you're using an out-of-source build, they will be in the corresponding location in your build tree.
+
+Copy the following to the lib directory of your installation directory. The default install directory is /usr/local/mariadb/columnstore
+
+    * libudfsdk.so.1.1.0
+    * libudf_mysql.so.1.0.0
+
diff --git a/utils/udfsdk/docs/source/usage/headerfile.rst b/utils/udfsdk/docs/source/usage/headerfile.rst
index a24a5b344..d97c1210e 100644
--- a/utils/udfsdk/docs/source/usage/headerfile.rst
+++ b/utils/udfsdk/docs/source/usage/headerfile.rst
@@ -1,3 +1,5 @@
+.. _header_file:
+
 Header file
 ===========
 
diff --git a/utils/udfsdk/docs/source/usage/index.rst b/utils/udfsdk/docs/source/usage/index.rst
index bd5cff529..ce419f574 100644
--- a/utils/udfsdk/docs/source/usage/index.rst
+++ b/utils/udfsdk/docs/source/usage/index.rst
@@ -8,4 +8,7 @@ Using mcsv1_udaf
    memoryallocation
    headerfile
    sourcefile
-
+   UDAFMap
+   cmakelists
+   compile
+   copylibs
diff --git a/utils/udfsdk/docs/source/usage/introduction.rst b/utils/udfsdk/docs/source/usage/introduction.rst
index 9cff92bb7..403c83bc6 100644
--- a/utils/udfsdk/docs/source/usage/introduction.rst
+++ b/utils/udfsdk/docs/source/usage/introduction.rst
@@ -9,14 +9,14 @@ The API has a number of features. The general theme is, there is a class that re
 
 The steps required to create a function are:
 
-* Decide on memory allocation scheme.
-* Create a header file for your function.
-* Create a source file for your function.
-* Implement mariadb udaf api code.
-* Add the function to UDAFMap in mcsv1_udaf.cpp
-* Add the function to CMakeLists.txt in ./utils/udfsdk
-* Compile udfsdk.
-* Copy the compiled libraries to the working directories.
+* Decide on a :ref:`memory allocation <memory_allocation>` scheme.
+* Create a :ref:`header file <header_file>` for your function.
+* Create a :ref:`source file <source_file>` for your function.
+* Implement :ref:`mariadb udaf api <mariadb_udaf>` code.
+* Add the function to :ref:`UDAFMap <udaf_map>` in mcsv1_udaf.cpp
+* Add the function to :ref:`CMakeLists.txt <cmakelists>` in ./utils/udfsdk
+* :ref:`Compile udfsdk <compile>`.
+* :ref:`Copy the compiled libraries <copylibs>` to the working directories.
 
 In 1.1.0, Columnstore does not have a plugin framework, so the functions have to be compiled into the libraries that Columnstore already loads.
 
diff --git a/utils/udfsdk/docs/source/usage/memoryallocation.rst b/utils/udfsdk/docs/source/usage/memoryallocation.rst
index c98a314e2..8f94e3f5c 100644
--- a/utils/udfsdk/docs/source/usage/memoryallocation.rst
+++ b/utils/udfsdk/docs/source/usage/memoryallocation.rst
@@ -1,3 +1,5 @@
+.. _memory_allocation:
+
 Memory allocation and usage
 ===========================
 
diff --git a/utils/udfsdk/docs/source/usage/sourcefile.rst b/utils/udfsdk/docs/source/usage/sourcefile.rst
index d3995c55e..39f11cd7b 100644
--- a/utils/udfsdk/docs/source/usage/sourcefile.rst
+++ b/utils/udfsdk/docs/source/usage/sourcefile.rst
@@ -1,3 +1,5 @@
+.. _source_file:
+
 Source file
 ===========
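
Note on the MariaDBUDAF.rst hunk: the call sequence it documents (function_init, function_clear, function_add once per row, the function body, function_deinit) can be exercised outside the server with a small harness. The sketch below is only an illustration under stated assumptions: `UDF_INIT`, `UDF_ARGS`, `my_bool` and `Item_result` are simplified stand-ins declared locally, not the real definitions from the MariaDB headers; `arg_to_double` is a hypothetical replacement for the `cvtArgToDouble` helper, which the diff does not show; and `ssq_over_rows` is a hypothetical driver that plays the role of the server for a single group.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins (assumptions), NOT the real MariaDB definitions. */
typedef char my_bool;
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT };
typedef struct { char* ptr; } UDF_INIT;      /* only the member used here */
typedef struct {
    unsigned int arg_count;
    enum Item_result* arg_type;
    char** args;
} UDF_ARGS;

struct ssq_data { double sumsq; };

/* Hypothetical helper: cast one argument according to its declared type. */
static double arg_to_double(enum Item_result type, const char* arg)
{
    if (!arg) return 0.0;                     /* treat a missing value as 0 */
    switch (type) {
    case INT_RESULT:  return (double)*(const long long*)arg;
    case REAL_RESULT: return *(const double*)arg;
    default:          return 0.0;             /* other types ignored in this sketch */
    }
}

my_bool ssq_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
{
    struct ssq_data* data;
    if (args->arg_count != 1) {
        strcpy(message, "ssq() requires one argument");
        return 1;
    }
    data = malloc(sizeof *data);
    if (!data) {
        strcpy(message, "Couldn't allocate memory");
        return 1;
    }
    data->sumsq = 0;
    initid->ptr = (char*)data;
    return 0;
}

void ssq_deinit(UDF_INIT* initid) { free(initid->ptr); }

void ssq_clear(UDF_INIT* initid, char* is_null, char* message)
{
    (void)is_null; (void)message;
    ((struct ssq_data*)initid->ptr)->sumsq = 0;   /* reset at the start of a group */
}

void ssq_add(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message)
{
    struct ssq_data* data = (struct ssq_data*)initid->ptr;
    double val = arg_to_double(args->arg_type[0], args->args[0]);
    (void)is_null; (void)message;
    data->sumsq += val * val;                 /* accumulate, do not overwrite */
}

long long ssq(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* error)
{
    (void)args; (void)is_null; (void)error;
    return (long long)((struct ssq_data*)initid->ptr)->sumsq;
}

/* Hypothetical driver: runs the lifecycle for one group of integer rows. */
long long ssq_over_rows(long long* rows, size_t n)
{
    UDF_INIT initid = { NULL };
    enum Item_result type = INT_RESULT;
    char* argv[1] = { NULL };
    UDF_ARGS args = { 1, &type, argv };
    char message[512];
    char is_null = 0, error = 0;
    long long result;
    size_t i;

    assert(n == 0 || rows != NULL);
    if (ssq_init(&initid, &args, message) != 0) return -1;
    ssq_clear(&initid, &is_null, &error);
    for (i = 0; i < n; ++i) {
        argv[0] = (char*)&rows[i];            /* values are passed as char* */
        ssq_add(&initid, &args, &is_null, &error);
    }
    result = ssq(&initid, &args, &is_null, &error);
    ssq_deinit(&initid);
    return result;
}
```

Calling `ssq_over_rows` on the rows 1, 2, 3 walks the whole sequence once and yields 14. The harness keeps the `long long` return type used by the documented example; a real build would include the MariaDB headers instead of the stand-in types.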