mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-07-30 19:23:07 +03:00

MCOL-523 documentation part 2

This commit is contained in:
David Hall
2017-09-05 17:11:22 -05:00
parent bc9bdec1f4
commit bb21c79e87
13 changed files with 284 additions and 23 deletions

View File

@ -0,0 +1,190 @@
.. _mariadb_udaf:
MariaDB UDAF
============
For MariaDB to parse a Columnstore UDAF, you must write a standard MariaDB C UDAF and include it in a library placed in the mysql/lib directory of the Columnstore install directory (default: /usr/local/mariadb/columnstore/mysql/lib).
This set of C functions may be just a stub that tells the parser about the function, or it may be a full implementation if you want the UDAF to be usable by other engines.
The name of the library placed in mysql/lib is the name you use in the SONAME clause of the SQL CREATE AGGREGATE FUNCTION statement to tell the parser where to find the function:
.. code-block:: sql

    CREATE AGGREGATE FUNCTION ssq RETURNS REAL SONAME 'libudf_mysql.so';

Unlike the code you write for the Columnstore UDAF, MariaDB does not handle allocation and de-allocation of your memory structures. If you are writing your function for other engines, you must handle allocation and de-allocation yourself in :ref:`function_init <func_init>` and :ref:`function_deinit <func_deinit>`.
All of the MariaDB UDF and UDAF example functions are in a single source file named udfmysql.cpp and linked into libudf_mysql.so.
For more information on MariaDB UDFs, see the `MariaDB library UDF <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions/>`_ documentation.
The following functions must be defined. Their names and signatures are determined by the function name given in the CREATE AGGREGATE FUNCTION statement and are not optional. Replace "function" with your function name.
* :ref:`my_bool function_init <func_init>`
* :ref:`void function_deinit <func_deinit>`
* :ref:`void function_clear <func_clear>`
* :ref:`void function_add <func_add>`
* :ref:`function <func_body>` (the return type must match the RETURNS clause of CREATE AGGREGATE FUNCTION)
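The order in which the server calls these five functions can be sketched as follows. This is a minimal simulation and not MariaDB code: the UDF_INIT and UDF_ARGS structs are reduced, hypothetical stand-ins for the real ones in the MariaDB headers, the ssq functions are simplified versions of the examples in this section, and ``run_groups`` is an invented driver that plays the role of the server.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for the real MariaDB structs, reduced to the
// members this sketch touches.
typedef char my_bool;
struct UDF_INIT { char* ptr; };
struct UDF_ARGS { unsigned int arg_count; char** args; };

struct ssq_data { double sumsq; };

extern "C" my_bool ssq_init(UDF_INIT* initid, UDF_ARGS*, char*)
{
    initid->ptr = static_cast<char*>(std::malloc(sizeof(ssq_data)));
    return initid->ptr ? 0 : 1;          // 1 signals failure
}
extern "C" void ssq_deinit(UDF_INIT* initid) { std::free(initid->ptr); }
extern "C" void ssq_clear(UDF_INIT* initid, char*, char*)
{
    reinterpret_cast<ssq_data*>(initid->ptr)->sumsq = 0;
}
extern "C" void ssq_add(UDF_INIT* initid, UDF_ARGS* args, char*, char*)
{
    double val;                          // assumes the argument is a double
    std::memcpy(&val, args->args[0], sizeof val);
    reinterpret_cast<ssq_data*>(initid->ptr)->sumsq += val * val;
}
extern "C" double ssq(UDF_INIT* initid, UDF_ARGS*, char*, char*)
{
    return reinterpret_cast<ssq_data*>(initid->ptr)->sumsq;
}

// Invented driver imitating the server: init once, then clear/add.../result
// for every group, then deinit once.
std::vector<double> run_groups(const std::vector<std::vector<double> >& groups)
{
    std::vector<double> results;
    UDF_INIT initid = { nullptr };
    char* argv[1] = { nullptr };
    UDF_ARGS args = { 1, argv };
    char message[512];
    char is_null = 0, error = 0;

    if (ssq_init(&initid, &args, message) != 0)
        return results;
    for (std::size_t g = 0; g < groups.size(); ++g)
    {
        is_null = 0;                     // reset before each clear
        ssq_clear(&initid, &is_null, &error);
        for (std::size_t r = 0; r < groups[g].size(); ++r)
        {
            double row = groups[g][r];
            argv[0] = reinterpret_cast<char*>(&row);
            ssq_add(&initid, &args, &is_null, &error);
        }
        results.push_back(ssq(&initid, &args, &is_null, &error));
    }
    ssq_deinit(&initid);
    return results;
}
```

Calling ``run_groups`` with one inner vector per group returns one aggregate per group, which mirrors how a GROUP BY query yields one result row per group.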
.. _func_init:
function_init()
---------------
.. c:function:: my_bool function_init(UDF_INIT* initid, UDF_ARGS* args, char* message);
:param initid [in/out]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB UDF calling sequence <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. The ptr member is a char* which can be used to point to user-allocated memory.
:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_.
:param message [out]: A pre-allocated buffer in which to copy an error message if needed. The size of the buffer is MYSQL_ERRMSG_SIZE, currently 512 bytes.
:returns: 1 on failure or 0 on success.
The init function does any argument checking, sets values in initid and allocates any function-specific memory.
::

    /* struct ssq_data, defined alongside these functions, holds the
       running sum in its sumsq member. */
    #ifdef _MSC_VER
    __declspec(dllexport)
    #endif
    my_bool ssq_init(UDF_INIT* initid, UDF_ARGS* args, char* message)
    {
        struct ssq_data* data;

        if (args->arg_count != 1)
        {
            strcpy(message, "ssq() requires one argument");
            return 1;
        }

        if (!(data = (struct ssq_data*) malloc(sizeof(struct ssq_data))))
        {
            strcpy(message, "Couldn't allocate memory");
            return 1;
        }

        data->sumsq = 0;
        initid->ptr = (char*)data;
        return 0;
    }

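A stricter init function might also validate the argument type and report a bounded error message. The following is a sketch under stated assumptions: the structs and the Item_result enum are reduced stand-ins that mirror member names from the MariaDB headers, ``ssq_init_checked`` is a hypothetical name, and the 512 byte message size is taken from the parameter description above. Setting ``args->arg_type[0]`` in init asks the server to coerce the argument to that type before each later call.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Reduced stand-ins for the real MariaDB declarations (assumption: only the
// members used below are reproduced).
typedef char my_bool;
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT, ROW_RESULT, DECIMAL_RESULT };
struct UDF_INIT { my_bool maybe_null; unsigned int decimals; unsigned long max_length; char* ptr; };
struct UDF_ARGS { unsigned int arg_count; Item_result* arg_type; char** args; unsigned long* lengths; };

const std::size_t ERRMSG_SIZE = 512;   // size of the message buffer per the text above

struct ssq_data { double sumsq; };

// Hypothetical variant of ssq_init with argument type checking.
extern "C" my_bool ssq_init_checked(UDF_INIT* initid, UDF_ARGS* args, char* message)
{
    if (args->arg_count != 1)
    {
        std::snprintf(message, ERRMSG_SIZE,
                      "ssq() requires one argument, got %u", args->arg_count);
        return 1;
    }
    if (args->arg_type[0] == STRING_RESULT)
    {
        std::snprintf(message, ERRMSG_SIZE, "ssq() requires a numeric argument");
        return 1;
    }

    // Request coercion so that the add function always receives a double.
    args->arg_type[0] = REAL_RESULT;

    ssq_data* data = static_cast<ssq_data*>(std::malloc(sizeof(ssq_data)));
    if (!data)
    {
        std::snprintf(message, ERRMSG_SIZE, "Couldn't allocate memory");
        return 1;
    }
    data->sumsq = 0;

    initid->maybe_null = 1;            // the result may be NULL
    initid->ptr = reinterpret_cast<char*>(data);
    return 0;
}
```

Using ``snprintf`` with the buffer size keeps a long message from overrunning the pre-allocated buffer, which ``strcpy`` cannot guarantee.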
.. _func_deinit:
function_deinit()
-----------------
.. c:function:: void function_deinit(UDF_INIT* initid);
:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. If you allocated memory and stored it in the ptr member in function_init, then you must deallocate it here.
:returns: nothing.
The deinit function is used to free any memory allocated in function_init.
::

    #ifdef _MSC_VER
    __declspec(dllexport)
    #endif
    void ssq_deinit(UDF_INIT* initid)
    {
        free(initid->ptr);
    }

.. _func_clear:
function_clear()
----------------
.. c:function:: void function_clear(UDF_INIT* initid, char* is_null, char* message);
:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. Use the initid->ptr member to access your user-allocated memory.
:param is_null [out]: A pointer to a single byte that you can set and use in later functions. is_null is set to 0 before each call to clear.
:param message [out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. The initial value is 0 and is not reset for further calls to any function including clear.
:returns: nothing.
clear is called to reset the summary results. It is called at the beginning of each group in a GROUP BY, and may also be called when there are no matching rows.
::

    #ifdef _MSC_VER
    __declspec(dllexport)
    #endif
    void
    ssq_clear(UDF_INIT* initid, char* is_null __attribute__((unused)),
              char* message __attribute__((unused)))
    {
        struct ssq_data* data = (struct ssq_data*)initid->ptr;
        data->sumsq = 0;
    }

.. _func_add:
function_add()
--------------
.. c:function:: void function_add(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message);
:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. Use the initid->ptr member to access your user-allocated memory.
:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. The args member is a char** array holding a pointer to each argument value. Each pointer must be cast to the type indicated by the matching entry in args->arg_type.
:param is_null [in/out]: A pointer to a single byte that you can set and use in later functions. is_null will contain the most recent value you set since the last clear call.
:param message [in/out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. message will contain the last value you set.
:returns: nothing.
add is called for each row in the filtered result set. It is used to fold the row data into the function's summary data.
::

    #ifdef _MSC_VER
    __declspec(dllexport)
    #endif
    void ssq_add(UDF_INIT* initid, UDF_ARGS* args,
                 char* is_null,
                 char* message __attribute__((unused)))
    {
        struct ssq_data* data = (struct ssq_data*)initid->ptr;
        double val = cvtArgToDouble(args->arg_type[0], args->args[0]);
        /* Accumulate: plain assignment would keep only the last row's square. */
        data->sumsq += val*val;
    }

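The example relies on a cvtArgToDouble helper whose body is not shown here. A conversion of that kind could look like the sketch below. ``arg_to_double`` is a hypothetical name, the Item_result enum is a stand-in for the one in the MariaDB headers, and the sketch assumes the usual UDF conventions: INT_RESULT values point to a long long, REAL_RESULT values point to a double, string and decimal values arrive as character data with an explicit length, and a SQL NULL arrives as a null pointer.

```cpp
#include <cstdlib>
#include <cstring>
#include <string>

// Stand-in for the Item_result enum from the MariaDB headers.
enum Item_result { STRING_RESULT, REAL_RESULT, INT_RESULT, ROW_RESULT, DECIMAL_RESULT };

// Hypothetical helper: convert one UDF argument to double.
// value  - the entry from args->args
// length - the matching entry from args->lengths (used for string data)
// Sets *was_null when the argument is SQL NULL or has an unsupported type.
double arg_to_double(Item_result type, const char* value,
                     unsigned long length, bool* was_null)
{
    *was_null = (value == nullptr);
    if (*was_null)
        return 0.0;

    switch (type)
    {
        case INT_RESULT:
        {
            long long i;
            std::memcpy(&i, value, sizeof i);
            return static_cast<double>(i);
        }
        case REAL_RESULT:
        {
            double d;
            std::memcpy(&d, value, sizeof d);
            return d;
        }
        case STRING_RESULT:
        case DECIMAL_RESULT:
        {
            // String data is not guaranteed to be NUL-terminated, so copy
            // exactly length bytes before parsing.
            std::string text(value, length);
            return std::strtod(text.c_str(), nullptr);
        }
        default:
            *was_null = true;
            return 0.0;
    }
}
```

Copying with ``memcpy`` instead of dereferencing a cast pointer avoids alignment assumptions about the buffer the server hands in.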
.. _func_body:
function()
----------
.. c:function:: <data type> function(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message);
:param initid [in]: A pointer to a pre-allocated UDF_INIT struct. UDF_INIT is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. Use the initid->ptr member to access your user-allocated memory.
:param args [in]: A pointer to a UDF_ARGS struct defining the input arguments as entered in the SQL query. UDF_ARGS is defined in the libmariadb/include/mariadb_com.h file in the MariaDB source tree. See also the `MariaDB library <https://mariadb.com/kb/en/the-mariadb-library/user-defined-functions-calling-sequences/>`_. The values in args->args are undefined here.
:param is_null [in/out]: A pointer to a single byte that you can set and use in later functions. is_null will contain the most recent value you set since the last clear call.
:param message [in/out]: A pointer to a single byte that you can set and use in later functions. Do not copy a string to this parameter, as it is not a buffer. message will contain the last value you set.
:returns: The data type as set by the SQL CREATE AGGREGATE FUNCTION.
This is considered the function body. Use the summary data accumulated in the calls to function_add and do any manipulation needed to produce the result for the group.
::

    /* The return type is double because the function was created
       with RETURNS REAL. */
    #ifdef _MSC_VER
    __declspec(dllexport)
    #endif
    double ssq(UDF_INIT* initid, UDF_ARGS* args __attribute__((unused)),
               char* is_null, char* error __attribute__((unused)))
    {
        struct ssq_data* data = (struct ssq_data*)initid->ptr;
        return data->sumsq;
    }

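Because clear may be called when there are no matching rows, a function body may want to return SQL NULL for an empty group. One way to do that, sketched here with hypothetical names and reduced stand-in structs, is to count the rows seen by add and set ``*is_null`` to 1 in the function body when the count is zero. The sketch assumes the argument has been coerced to REAL so that it can be read as a double.

```cpp
#include <cstring>

// Reduced stand-ins for the real MariaDB structs.
struct UDF_INIT { char* ptr; };
struct UDF_ARGS { unsigned int arg_count; char** args; };

// Hypothetical summary data: running sum plus a count of non-NULL rows.
struct ssq_state { double sumsq; unsigned long long rows; };

extern "C" void ssq_null_clear(UDF_INIT* initid, char*, char*)
{
    ssq_state* state = reinterpret_cast<ssq_state*>(initid->ptr);
    state->sumsq = 0;
    state->rows = 0;
}

extern "C" void ssq_null_add(UDF_INIT* initid, UDF_ARGS* args, char*, char*)
{
    if (args->args[0] == nullptr)
        return;                        // skip SQL NULL inputs
    ssq_state* state = reinterpret_cast<ssq_state*>(initid->ptr);
    double val;
    std::memcpy(&val, args->args[0], sizeof val);
    state->sumsq += val * val;
    ++state->rows;
}

extern "C" double ssq_null(UDF_INIT* initid, UDF_ARGS*, char* is_null, char*)
{
    ssq_state* state = reinterpret_cast<ssq_state*>(initid->ptr);
    if (state->rows == 0)
    {
        *is_null = 1;                  // report NULL for an empty group
        return 0.0;
    }
    return state->sumsq;
}
```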

View File

@ -1,14 +0,0 @@
UDAFMap
-------
UDAFMap holds a mapping from the function name to its implementation. The engine uses the map when a UDA(n)F is called by a SQL statement.
You must enter your function into the map. This means you must:
* #include your header file
* add an entry into UDAFMap::getMap().
The map is fully populated the first time it is called, i.e., the first time any UDA(n)F is called by a SQL statement.
The need for you to manually enter your function into this map may be alleviated by future enhancements.

View File

@ -11,4 +11,5 @@ mcsv1_udaf reference
ColumnDatum
mcsv1_UDAF
ByteStream
MariaDBUDAF

View File

@ -0,0 +1,36 @@
.. _udaf_map:
UDAFMap
=======
The UDAFMap is where we tell the system about our function. For Columnstore 1.1, you must manually place your function into this map.
* Open mcsv1_udaf.cpp.
* Add your header to the #include list.
* Add a new line to the UDAFMap::getMap() function.
::

    #include "allnull.h"
    #include "ssq.h"
    #include "median.h"
    #include "avg_mode.h"

    UDAF_MAP& UDAFMap::getMap()
    {
        if (fm.size() > 0)
        {
            return fm;
        }
        // first: function name
        // second: function pointer
        // Use lower case for the function name. Names may be
        // case-insensitive in MySQL depending on the settings, in which case
        // the function name passed to the interface is always lower case.
        fm["allnull"] = new allnull();
        fm["ssq"] = new ssq();
        fm["median"] = new median();
        fm["avg_mode"] = new avg_mode();
        return fm;
    }

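The pattern behind getMap(), a map that is filled on first use and keyed by lower-case names, can be tried in isolation with the sketch below. Everything in it is hypothetical: ``Aggregate``, ``SumOfSquares``, ``Median``, ``getAggregateMap`` and ``findAggregate`` are illustrative names and not part of the udfsdk, and the lookup helper that lower-cases the requested name is an assumption about how a caller might normalize names.

```cpp
#include <algorithm>
#include <cctype>
#include <map>
#include <string>

// Hypothetical base class standing in for a UDAF implementation.
class Aggregate
{
public:
    virtual ~Aggregate() {}
    virtual const char* describe() const = 0;
};

class SumOfSquares : public Aggregate
{
public:
    const char* describe() const { return "ssq"; }
};

class Median : public Aggregate
{
public:
    const char* describe() const { return "median"; }
};

typedef std::map<std::string, Aggregate*> AggregateMap;

// Populate the map the first time it is requested, then reuse it.
AggregateMap& getAggregateMap()
{
    static AggregateMap fm;
    if (!fm.empty())
        return fm;

    // Keys are lower case so that lookups can be case-insensitive.
    fm["ssq"] = new SumOfSquares();
    fm["median"] = new Median();
    return fm;
}

// Look up an implementation by name, ignoring case.
// Returns a null pointer when the name is not registered.
Aggregate* findAggregate(std::string name)
{
    std::transform(name.begin(), name.end(), name.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    AggregateMap& fm = getAggregateMap();
    AggregateMap::iterator it = fm.find(name);
    return it == fm.end() ? nullptr : it->second;
}
```

With this layout, registering a new function is a one-line change inside the populate step, which is the same kind of edit the list above asks for in UDAFMap::getMap().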

View File

@ -0,0 +1,15 @@
.. _cmakelists:
CMakeLists.txt
==============
For Columnstore 1.1, you compile your function by including it in the CMakeLists.txt file for the udfsdk.
You need only add the new .cpp files to the udfsdk_LIB_SRCS target list::

    set(udfsdk_LIB_SRCS udfsdk.cpp mcsv1_udaf.cpp allnull.cpp ssq.cpp median.cpp avg_mode.cpp)

If you create a new file for your MariaDB aggregate function, add it to the target list for udf_mysql_LIB_SRCS::

    set(udf_mysql_LIB_SRCS udfmysql.cpp)


View File

@ -0,0 +1,12 @@
.. _compile:
Compile
=======
To compile your function for Columnstore 1.1, simply recompile the udfsdk directory::

    cd utils/udfsdk
    cmake .
    make


View File

@ -0,0 +1,12 @@
.. _copylibs:
Copy Libraries
==============
The libraries will be built in the Columnstore source tree at utils/udfsdk. If you are using an out-of-source build, they will be in the corresponding location in your build directory.
Copy the following files to the lib directory of your installation directory. The default install directory is /usr/local/mariadb/columnstore.
* libudfsdk.so.1.1.0
* libudf_mysql.so.1.0.0

View File

@ -1,3 +1,5 @@
.. _header_file:
Header file
===========

View File

@ -8,4 +8,7 @@ Using mcsv1_udaf
memoryallocation
headerfile
sourcefile
UDAFMap
cmakelists
compile
copylibs

View File

@ -9,14 +9,14 @@ The API has a number of features. The general theme is, there is a class that re
The steps required to create a function are:
* Decide on memory allocation scheme.
* Create a header file for your function.
* Create a source file for your function.
* Implement mariadb udaf api code.
* Add the function to UDAFMap in mcsv1_udaf.cpp
* Add the function to CMakeLists.txt in ./utils/udfsdk
* Compile udfsdk.
* Copy the compiled libraries to the working directories.
* Decide on a :ref:`memory allocation <memory_allocation>` scheme.
* Create a :ref:`header file <header_file>` for your function.
* Create a :ref:`source file <source_file>` for your function.
* Implement :ref:`MariaDB UDAF API <mariadb_udaf>` code.
* Add the function to :ref:`UDAFMap <udaf_map>` in mcsv1_udaf.cpp
* Add the function to :ref:`CMakeLists.txt <cmakelists>` in ./utils/udfsdk
* :ref:`Compile udfsdk <compile>`.
* :ref:`Copy the compiled libraries <copylibs>` to the working directories.
In 1.1.0, Columnstore does not have a plugin framework, so the functions have to be compiled into the libraries that Columnstore already loads.

View File

@ -1,3 +1,5 @@
.. _memory_allocation:
Memory allocation and usage
===========================

View File

@ -1,3 +1,5 @@
.. _source_file:
Source file
===========