mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-12-18 13:54:11 +03:00

Author	SHA1	Message	Date
Denis Khalikov	3fcb9b66f5	MCOL-5555 Add support for `startreadonly` command. This patch adds support for the `startreadonly` command, which waits until all active cpimport jobs are done and then puts the controller node into read-only mode.	2023-10-16 16:11:12 +03:00
Gagan Goel	931f2b36a1	MCOL-4931 Make cpimport charset-aware. (#2938 ) 1. Extend the following CalpontSystemCatalog member functions to set CalpontSystemCatalog::ColType::charsetNumber, after the system catalog update to add charset number to calpontsys.syscolumn in MCOL-5005: CalpontSystemCatalog::lookupOID CalpontSystemCatalog::colType CalpontSystemCatalog::columnRIDs CalpontSystemCatalog::getSchemaInfo 2. Update cpimport to use the CHARSET_INFO object associated with the charset number retrieved from the system catalog, for a dictionary/non-dictionary CHAR/VARCHAR/TEXT column, to truncate long strings that exceed the target column character length. 3. Add MTR test cases.	2023-09-05 17:17:20 +03:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
Leonid Fedorov	f5b2a6885f	MCOL-5013: Load Data from S3 into Columnstore Introduced UDF and stored procedure. usage: set columnstore_s3_key='<s3_key>'; set columnstore_s3_secret='<s3_secret>'; set columnstore_s3_region='region'; and then use UDF select columnstore_dataload("<tablename>", "<filename>", "<bucket>", "<db_name>"); for UDF db_name can be omitted, then current connection db will be used or stored function call calpontsys.columnstore_load_from_s3("<tablename>", "<filename>", "<bucket>", "<db_name>");	2022-07-04 19:52:37 +03:00
Roman Nozdrin	4c26e4f960	MCOL-4912 This patch introduces Extent Map index to improve EM scalability. The EM scalability project has two parts: phase1 and phase2. This is phase1, which brings an EM index to speed up (from O(n) down to the speed of boost::unordered_map) EM lookups that look for a <dbroot, oid, partition> tuple to turn it into an LBID, e.g. most bulk insertion meta info operations. The basis is boost::shared_managed_object where EMIndex is stored. Whilst it is not debug-friendly, it allows putting nested structs into shmem. EMIndex has 3 tiers. Top down description: vector of dbroots, map of oids to partition vectors, partition vectors that have EM indices. Separate EM methods now query the index before they do an EM run. EMIndex has a separate shmem file with the fixed id MCS-shm-00060001.	2022-05-04 12:59:16 +00:00
benthompson15	2ec502aaf3	MCOL-4576: remove S3 options from cpimport. (#2328 )	2022-04-01 09:18:34 -05:00
Roman Nozdrin	2b4946f53a	Revert "MCOL-4576: remove S3 options from cpimport. (#2307 )" This reverts commit `14c4840d53`.	2022-03-23 12:33:23 +00:00
benthompson15	14c4840d53	MCOL-4576: remove S3 options from cpimport. (#2307 )	2022-03-21 09:54:39 -05:00
Gagan Goel	973e5024d8	MCOL-4957 Fix performance slowdown for processing TIMESTAMP columns. Part 1: As part of MCOL-3776 to address a synchronization issue while accessing the fTimeZone member of the Func class, mutex locks were added to the accessor and mutator methods. However, this slows down processing of TIMESTAMP columns in PrimProc significantly as all threads across all concurrently running queries would serialize on the mutex. This is because PrimProc only has a single global object for the functor class (class derived from Func in utils/funcexp/functor.h) for a given function name. To fix this problem: (1) We remove the fTimeZone as a member of the Func derived classes (hence removing the mutexes) and instead use the fOperationType member of the FunctionColumn class to propagate the timezone values down to the individual functor processing functions such as FunctionColumn::getStrVal(), FunctionColumn::getIntVal(), etc. (2) To achieve (1), a timezone member is added to the execplan::CalpontSystemCatalog::ColType class. Part 2: Several functors in the Funcexp code call dataconvert::gmtSecToMySQLTime() and dataconvert::mySQLTimeToGmtSec() functions for conversion between seconds since unix epoch and broken-down representation. These functions in turn call the C library function localtime_r() which currently has a known bug of holding a global lock via a call to __tz_convert. This significantly reduces performance in multi-threaded applications where multiple threads concurrently call localtime_r(). More details on the bug: https://sourceware.org/bugzilla/show_bug.cgi?id=16145 This bug in localtime_r() caused processing of the Functors in PrimProc to slow down significantly since a query execution causes Functors code to be processed in a multi-threaded manner. As a fix, we remove the calls to localtime_r() from gmtSecToMySQLTime() and mySQLTimeToGmtSec() by performing the timezone-to-offset conversion (done in dataconvert::timeZoneToOffset()) during the execution plan creation in the plugin. Note that localtime_r() is only called when the time_zone system variable is set to "SYSTEM". This fix also required changing the timezone type from a std::string to a long across the system.	2022-02-14 14:12:27 -05:00
Leonid Fedorov	04752ec546	clang format apply	2022-01-21 16:43:49 +00:00
Alexander Barkov	9608533d92	MCOL-4734 Compilation failure: MariaDB-10.6 + ColumnStore-develop mcsconfig.h and my_config.h have the following pre-processor definitions: 1. Conflicting definitions coming from the standard cmake definitions: - PACKAGE - PACKAGE_BUGREPORT - PACKAGE_NAME - PACKAGE_STRING - PACKAGE_TARNAME - PACKAGE_VERSION - VERSION 2. Conflicting definitions of other kinds: - HAVE_STRTOLL - this is dirt in MariaDB headers. Should be fixed in the server code. my_config.h erroneously performs "#define HAVE_STRTOLL" instead of "#define HAVE_STRTOLL 1" in some cases. The former is not CMake compatible style. The latter is. 3. Non-conflicting definitions: Otherwise, mcsconfig.h and my_config.h should be mutually compatible, because both are generated by cmake on the same host machine. So they should have exactly equal definitions like "HAVE_XXX", "SIZEOF_XXX", etc. Observations: - It's OK to include both mcsconfig.h and my_config.h providing that we suppress duplicate definition of the above conflicting types #1 and #2. - There is no need to suppress duplicate definitions mentioned in #3, as they are compatible! - my_sys.h and m_ctype.h must always follow a CMake configuration header, either my_config.h or mcsconfig.h (or both). They must never be included without any preceding configuration header. This change makes sure that we resolve conflicts by: - either disallowing inclusion of mcsconfig.h and my_config.h at the same time - or by hiding conflicting definitions #1 and #2 (and restoring them later). - also, by making sure that my_sys.h and m_ctype.h always follow a CMake configuration file. Details: - idb_mysql.h can now be included only after my_config.h. An attempt to use idb_mysql.h with mcsconfig.h instead of my_config.h is caught by the "#error" preprocessor directive. - mariadb_my_sys.h can now be included only after mcsconfig.h. An attempt to use mariadb_my_sys.h without mcsconfig.h (e.g. with my_config.h) is also caught by "#error". - collation.h can now be included in two ways. It now has the following effective structure: #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Remember current conflicting definitions on the preprocessor stack // Undefine current conflicting definitions #endif #include "mcsconfig.h" #include "m_ctype.h" #if defined(PREFER_MY_CONFIG_H) && defined(MY_CONFIG_H) // Restore conflicting definitions from the preprocessor stack #endif and can be included as follows: a. using only mcsconfig.h as a configuration header: // my_config.h must not be included so far #include "collation.h" b. using my_config.h as the first included configuration file: #define PREFER_MY_CONFIG_H // Force conflict resolution #include "my_config.h" // can be included directly or indirectly ... #include "collation.h" Other changes: - Adding helper header files utils/common/mcsconfig_conflicting_defs_remember.h utils/common/mcsconfig_conflicting_defs_restore.h utils/common/mcsconfig_conflicting_defs_undef.h to make conflict resolution easier. - Removing `#include "collation.h"` from a number of files, as it's automatically included from rowgroup.h. - Removing redundant `#include "utils_utf8.h"`. This change is not directly related to the problem being fixed, but it's nice to remove redundant directives for both collation.h and utils_utf8.h from all the files that do not really need them. (this change could probably have gone as a separate commit) - Changing my_init() to MY_INIT(argv[0]) in the MCS services sources. After the fix of the compilation failure it appeared that ColumnStore services compiled with the debug build crash due to recent changes in safemalloc. The crash happened in strcmp() with `my_progname` as an argument (where my_progname is a mysys global variable). This problem should probably be fixed on the server side as well to avoid passing NULL. But, the majority of MariaDB executable programs also use MY_INIT(argv[0]) rather than my_init(). So let's make MCS do like the other programs do.	2021-05-25 12:34:36 +04:00
benthompson15	afa88866bb	MCOL-4483: Fix and consolidate log files and cpimport logging.	2021-02-12 15:40:16 -06:00
Roman Nozdrin	328ae25650	MCOL-4328 There is a new option in both cpimport and cpimport.bin to assign an owner for all data files created by cpimport. The patch consists of two parts: cpimport.bin changes and cpimport splitter changes. cpimport.bin computes uid_t and gid_t early and propagates them down the stack to where MCS creates data files.	2020-10-03 14:05:29 +00:00
Gagan Goel	2ba9263df4	Silence -Werror=implicit-fallthrough compiler errors - Patch from Monty. The patch also fixes some potential bugs due to missing break statements.	2020-06-26 12:32:57 -04:00
David Hall	06e50e0926	MCOL-3536 collation	2020-05-26 12:42:11 -05:00
David Hall	1f3d1e6fd6	MCOL-3536 collation	2020-05-14 16:02:49 -05:00
Andrew Hutchings	8633859dd4	MCOL-3514 Add support for S3 to cpimport cpimport now has the ability to use libmarias3 to read an object from an S3 bucket instead of a file on local disk. This also moves libmarias3 to utils/libmarias3.	2019-09-24 10:31:22 +01:00
Andrew Hutchings	7b006b6e74	MCOL-104 Remove InfiniDB Engine Remove the InfiniDB engine which is a duplicate of the ColumnStore engine.	2019-08-09 11:51:55 +01:00
Andrew Hutchings	5e4f1b9933	Merge branch 'develop' into MCOL-265	2019-06-10 13:58:03 +01:00
Roman Nozdrin	e12a2acd53	MCOL-537 Regression test doesn't tolerate 'failed' in stderr, stdout. I reformulate the messages. Changed version in preprocessor conditions to avoid compilation warnings in Debian 9. Disabled sign-compare check for generated files in DML/DDL.	2019-05-20 18:30:52 +03:00
Roman Nozdrin	b2436502cb	MCOL-537 Enabled -Wno-unused-result for OAM code. Fixed pragmas that disable compilation checks. DDLProc now returns an error if it couldn't cwd. Use either auto_ptr or unique_ptr depending on GCC version.	2019-05-08 19:44:01 +03:00
Roman Nozdrin	7e2cb05624	MCOL-537 There are no CS-specific warnings building with gcc 8.2.	2019-05-07 16:00:05 +03:00
Gagan Goel	e89d1ac3cf	MCOL-265 Add support for TIMESTAMP data type	2019-04-23 00:00:09 -04:00
Andrew Hutchings	01446d1e22	Reformat all code to coding standard	2017-10-26 17:18:17 +01:00
david hill	7d8de28b43	MCOL-59, change calpont.xml	2016-06-22 16:00:00 -05:00
david hill	f6afc42dd0	the beginning	2016-01-06 14:08:59 -06:00

26 Commits
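
The three-tier index described in the MCOL-4912 message (a vector of dbroots, a map of OIDs to partition vectors, and partition vectors holding Extent Map indices) can be illustrated with a minimal sketch. It uses plain standard-library containers in ordinary process memory rather than shared memory, and every name in it (`ExtentMapIndexSketch`, `insert`, `find`, the type aliases) is hypothetical and not taken from the ColumnStore sources. It only shows how a <dbroot, oid, partition> lookup can avoid a linear scan of the Extent Map.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical names throughout; this is NOT the ColumnStore implementation.
// Tier 3: positions (indices) of matching entries in the Extent Map.
using ExtentMapPositions = std::vector<std::size_t>;
// Partition vector, indexed by partition number.
using PartitionVector = std::vector<ExtentMapPositions>;
// Tier 2: OID -> partition vector.
using OidMap = std::unordered_map<std::uint32_t, PartitionVector>;

class ExtentMapIndexSketch
{
 public:
  // Record that the Extent Map entry at emPosition belongs to the tuple.
  void insert(std::size_t dbroot, std::uint32_t oid, std::size_t partition, std::size_t emPosition)
  {
    if (dbroot >= dbroots_.size())
      dbroots_.resize(dbroot + 1);
    PartitionVector& partitions = dbroots_[dbroot][oid];
    if (partition >= partitions.size())
      partitions.resize(partition + 1);
    partitions[partition].push_back(emPosition);
  }

  // Return the Extent Map positions for the tuple, or an empty vector if absent.
  ExtentMapPositions find(std::size_t dbroot, std::uint32_t oid, std::size_t partition) const
  {
    if (dbroot >= dbroots_.size())
      return {};
    const OidMap& oids = dbroots_[dbroot];
    auto it = oids.find(oid);
    if (it == oids.end() || partition >= it->second.size())
      return {};
    return it->second[partition];
  }

 private:
  std::vector<OidMap> dbroots_;  // Tier 1: indexed by dbroot number.
};
```

A caller would consult `find` first and fall back to a full Extent Map scan only when the result is empty, which mirrors the "query the index before the EM run" behaviour the commit message describes.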