mariadb-columnstore-engine

mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-10-17 01:27:36 +03:00

Author	SHA1	Message	Date
Sergey Zefirov	3bcc2e2fda	fix(memory leaks): MCOL-5791 - get rid of memory leaks in plugin code (#3365 ) There were numerous memory leaks in plugin's code and associated code. During typical run of MTR tests it leaked around 65 megabytes of objects. As a result they may severely affect long-lived connections. This patch fixes (almost) all leaks found in the plugin. The exceptions are two leaks associated with SHOW CREATE TABLE columnstore_table and getting information of columns of columnstore-handled table. These should be fixed on the server side and work is on the way.	2024-12-06 09:04:55 +00:00
drrtuy	6f6e69815d	feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing	2024-11-08 16:28:51 +04:00
Leonid Fedorov	5824a0cebe	MCOL-5746 Cpimport: convert blob data from ascii hex when reads from STDIN (#3217 ) Co-authored-by: Denis Khalikov <dennis.khalikov@gmail.com>	2024-06-27 14:23:07 +04:00
Leonid Fedorov	3f8758ba4c	MCOL-5747 gcc-14.1.1 compile error - calloc - transposed args (#3216 ) The arguments of calloc are the number of members and the sizeof the member. Gcc-14.1.1 worked out how to tell the difference. We correct this by transposing to gcc's will. Co-authored-by: Daniel Black <daniel@mariadb.org>	2024-06-27 14:21:26 +04:00
Serguey Zefirov	2cd8f716c1	Fix MCOL-5035, a difference in INSERT and UPDATE behavior The UPDATE statement wrote NULL when the column set is DATETIME and value is '0000-00-00 00:00:00'. The problem was inside WriteEngine's handling of UPDATE statements and this is where heart of change is. Other changes are related to some obsolete data structures in DML/DDL handling that just hanging around there, doing nothing.	2024-06-27 13:07:49 +03:00
Denis Khalikov	55ffacf546	MCOL-5555 Add support for `startreadonly` command. (#3081 ) This patch adds support for `startreadonly` command which waits until all active cpimport jobs are done and then puts controller node to readonly mode.	2023-12-26 15:11:33 +04:00
Denis Khalikov	6315546557	Revert "MCOL-5555 Add support for `startreadonly` command." This reverts commit `441cd9d34f`.	2023-12-20 01:05:37 +03:00
Leonid Fedorov	97abd9866b	fix(writeengine) MCOL-4202: use schema name when renaming table and change it's fields in syscat	2023-12-18 09:59:44 +03:00
Denis Khalikov	441cd9d34f	MCOL-5555 Add support for `startreadonly` command. This patch adds support for `startreadonly` command which waits until all active cpimport jobs are done and then puts controller node to readonly mode.	2023-10-16 16:11:50 +03:00
Gagan Goel	7f9c624626	MCOL-5573 Fix cpimport truncation of TEXT columns. 1. Restore the utf8_truncate_point() function in utils/common/utils_utf8.h that I removed as part of the patch for MCOL-4931. 2. As per the definition of TEXT columns, the default column width represents the maximum number of bytes that can be stored in the TEXT column. So the effective maximum length is less if the value contains multi-byte characters. However, if the user explicitly specifies the length of the TEXT column in a table DDL, such as TEXT(65535), then the DDL logic ensures that enough number of bytes are allocated (upto a system maximum) to allow upto that many number of characters (multi-byte characters if the charset for the column is multi-byte, such as utf8mb3).	2023-09-20 12:23:22 -04:00
Gagan Goel	e0d9b82705	MCOL-5021 Fix a minor bug related to the AUX column support in cpimport.	2023-09-06 16:48:54 -04:00
Gagan Goel	931f2b36a1	MCOL-4931 Make cpimport charset-aware. (#2938 ) 1. Extend the following CalpontSystemCatalog member functions to set CalpontSystemCatalog::ColType::charsetNumber, after the system catalog update to add charset number to calpontsys.syscolumn in MCOL-5005: CalpontSystemCatalog::lookupOID CalpontSystemCatalog::colType CalpontSystemCatalog::columnRIDs CalpontSystemCatalog::getSchemaInfo 2. Update cpimport to use the CHARSET_INFO object associated with the charset number retrieved from the system catalog, for a dictionary/non-dictionary CHAR/VARCHAR/TEXT column, to truncate long strings that exceed the target column character length. 3. Add MTR test cases.	2023-09-05 17:17:20 +03:00
Gagan Goel	d50a0fa2e6	MCOL-5005 Add charset number to system catalog - Part 2. 1. Extend the calpontsys.syscolumn system catalog table with a new column, 'charsetnum'. 'charsetnum' field is set to the 'number' member of the 'charset_info_st' struct defined in the server in m_ctype.h. For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is initialized to the charset/collation of the column, which is set at the column-level or at the table-level in the DDL. For BLOB/VARBINARY binary column types, 'charset_info_st' is initialized to my_charset_bin (charsetnum=63). For all other column types, charsetnum is set to 0. 2. Add support for the newly added 'charsetnum' column in the automatic system catalog upgrade logic in dbbuilder. For existing table definitions, charsetnum for the column is defaulted to 0. 3. Add MTR test case that creates a few table definitions with a range of charset/collation combinations and queries the calpontsys.syscolumn system catalog table with the charsetnum field for the columns in the table DDLs.	2023-08-15 17:21:47 +00:00
Sergei Golubchik	30064e3d4a	cpimport.bin doesn't really need libboost_program_options linking with unused libraries creates a difference in dependencies between --no-as-needed (gcc default) and --as-needed (default on Fedora rpmbuild) builds.	2023-07-04 12:58:18 -04:00
Gagan Goel	ab7dfaa25b	Fix resource leak in DDLProc/DMLProc/PrimProc/WriteengineServer processes. As part of the charset support, a call to MY_INIT() was added at the initialization of the above processes. This call initializes the MySQL thread environment required by the charset library. However, the accompanying my_end() call required to terminate this thread environment was not added at the termination of these process, hence leaking resources. As a fix, we move the MY_INIT() calls to the Child() functions of these services and also add the missing my_end() call.	2023-06-23 20:18:32 +00:00
Leonid Fedorov	8f93fc3623	MCOL-5493: First portion of UBSan fixes (#2842 ) Multiple UB fixes	2023-06-02 17:02:09 +03:00
Roman Nozdrin	4fe9cd64a3	Revert "No boost condition (#2822 )" (#2828 ) This reverts commit `f916e64927`.	2023-04-22 15:49:50 +03:00
Leonid Fedorov	f916e64927	No boost condition (#2822 ) This patch replaces boost primitives with stdlib counterparts.	2023-04-22 00:42:45 +03:00
Leonid Fedorov	3ce19abdae	Options to build with TSAN, UBSAN and skipping smoke (#2826 )	2023-04-21 21:24:48 +03:00
Leonid Fedorov	c2d0fa24da	replace boost::shared_array<T> to std::shared_ptr<T[]>	2023-04-14 10:33:27 +00:00
Leonid Fedorov	2e1394149b	MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792 ) * Fixes of bugs from ASAN warnings, part one * MQC as static library, with nifty counter for global map and mutex * Switch clang to 16 * link messageqcpp to execplan	2023-04-04 02:33:23 +03:00
Sergey Zefirov	b53c231ca6	MCOL-271 empty strings should not be NULLs (#2794 ) This patch improves handling of NULLs in textual fields in ColumnStore. Previously empty strings were considered NULLs and it could be a problem if data scheme allows for empty strings. It was also one of major reasons of behavior difference between ColumnStore and other engines in MariaDB family. Also, this patch fixes some other bugs and incorrect behavior, for example, incorrect comparison for "column <= ''" which evaluates to constant True for all purposes before this patch.	2023-03-30 21:18:29 +03:00
Otto Kekäläinen	70124ecc01	Fix trivial spelling errors - occured -> occurred - reponse -> response - seperated -> separated All new code of the whole pull request, including one or several files that are either new files or modified ones, are contributed under the BSD-new license. I am contributing on behalf of my employer Amazon Web Services, Inc.	2023-03-11 11:59:47 -08:00
Leonid Fedorov	56f2346083	Remove windows ifdefs	2023-03-02 15:59:42 +00:00
Gagan Goel	006b92bba2	Revert "This commit fixes an incorrect predicate in the if condition (#2608 )" This reverts commit `f4e3022fbd`. The commit apparently caused MCOL-5318 and MCOL-5319 which involve the internal ColumnStore batch insert mechanism passing through the SQL layer. The code block involved in this change is a predicate checking for the HWM extent in WriteEngineServer at the end of the batch insert. This is done in WE_DMLCommandProc::processBatchInsertHwm(). The original predicate check in this function for the HWM extent is restored until further investigation.	2023-02-02 08:07:18 -05:00
Gagan Goel	ad59ed5402	MCOL-5367 Fix a bug introduced in MCOL-5021 (AUX column implementation). In the implementation of MCOL-5021, an assert was added in `WE_DMLCommandProc::processBatchInsertHwm()` that assumed the `WriteEngine::TableMetaData` cache is uniform across the cluster. However, this assumption is incorrect. This bug caused undefined behaviour in ColumnStore resulting in bugs such as MCOL-5367. In MCOL-5367, in a multi-node ColumnStore cluster, an INSERT ... SELECT in a transaction with system variable `columnstore_use_import_for_batchinsert=OFF/ON` did not show inserted records when a SELECT query was issued. Assuming a 3-node cluster setup, DMLProc only sends a given batch of records to be inserted to one of the 3 nodes, and not all nodes. As a result, the `WriteEngine::TableMetaData` cache is only populated for that one node and is not uniform across the cluster, causing the assert to fail. As a fix, we simply remove this assert as it is redundant and should not have been added in the first place.	2023-01-16 05:54:44 -05:00
Denis Khalikov	d61780cab1	MCOL-5263 Add support to ROLLBACK when PP were restarted. DMLProc starts ROLLBACK when SELECT part of UPDATE fails b/c EM facility in PP were restarted. Unfortunately this ROLLBACK stuck if EM/PP are not yet available. DMLProc must have a t/o with re-try doing ROLLBACK.	2022-12-13 16:18:53 +03:00
Leonid Fedorov	b936ed8b2e	Fix some GCC-12 Build errors	2022-11-22 03:28:17 +03:00
Leonid Fedorov	37fd915a08	Serg`s patch for develop-6 revised for develop https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/2614	2022-11-09 22:41:38 +00:00
Gagan Goel	f4e3022fbd	This commit fixes an incorrect predicate in the if condition (#2608 ) that checks for HWM extent in WE_DMLCommandProc::processBatchInsertHwm().	2022-11-08 14:51:42 -06:00
Leonid Fedorov	d2432f9bf6	get rid of pointers for 128 fields	2022-08-26 15:12:22 +00:00
mariadb-AndreyPiskunov	0863ecd279	Replace getBinaryField	2022-08-25 18:21:43 +03:00
Gagan Goel	cbfdae3481	MCOL-5021 Code changes based on review feedback.	2022-08-05 14:40:50 -04:00
Gagan Goel	1355237ca3	MCOL-5021 Some minor fixes.	2022-08-05 14:40:50 -04:00
Gagan Goel	9b6d3c3870	MCOL-5021 Add support for AUX column in the client code calling CalpontSystemCatalog::columnRIDs().	2022-08-05 14:40:49 -04:00
Gagan Goel	262cd5c501	MCOL-5021 Remove hard-coded values for data type, column width and compression type for the AUX column, and replace them with constants defined in the execplan namespace.	2022-08-05 14:40:49 -04:00
Gagan Goel	c8b6b154bf	MCOL-5021 Add an option in Columnstore.xml, fastdelete (disabled by default), which when enabled, indiscriminately invalidates all column extents and performs the actual DELETE only on the AUX column. The trade-off with this approach would now be that the first SELECT for certain query patterns (those containing a WHERE predicate) after the DELETE operation will slow down as the invalidated column extent would need to be scanned again to set the min/max values.	2022-08-05 14:40:49 -04:00
Gagan Goel	60eb0f86ec	MCOL-5021 non-AUX column files are opened in read-only mode during the DELETE operation. ColumnOp::readBlock() calls can cause writes to database files when the active chunk list in ChunkManager is full. Since non-AUX columns are read-only for the DELETE operation, we prevent writes of compressed chunks and header for these columns by passing an isReadOnly flag to CompFileData which indicates whether the column is read-only or read-write.	2022-08-05 14:40:49 -04:00
Gagan Goel	35a3a93964	MCOL-5021 For the DELETE operation, empty magic values are only written to database files for AUX column. Perform read-only operation for other columns in the table to update the Casual Partitioning information.	2022-08-05 14:40:49 -04:00
Gagan Goel	86df9a972c	MCOL-5021 Add prototype support for the AUX column in CREATE/DROP DDL commands, single and multi-value INSERTs, cpimport, and DELETE.	2022-08-05 14:40:49 -04:00
Denis Khalikov	fb1e23bb83	[MCOL-5106] Add support to work with StorageManager. This patch eliminates boost::filesystem from `mcsRebuildEM` tool. After this change we should be able to work with any filesystem even S3.	2022-07-28 16:47:34 +03:00
Leonid Fedorov	f5b2a6885f	MCOL-5013: Load Data from S3 into Columnstore Introduced UDF and stored prodecure. usage: set columnstore_s3_key='<s3_key>'; set columnstore_s3_secret='<s3_secret>'; set columnstore_s3_region='region'; and then use UDF select columnstore_dataload("<tablename>", "<filename>", "<bucket>", "<db_name>"); for UDF db_name can be ommited, then current connection db will be used or stored function call calpontsys.columnstore_load_from_s3("<tablename>", "<filename>", "<bucket>", "<db_name>");	2022-07-04 19:52:37 +03:00
Roman Nozdrin	4c26e4f960	MCOL-4912 This patch introduces Extent Map index to improve EM scaleability EM scaleability project has two parts: phase1 and phase2. This is phase1 that brings EM index to speed up(from O(n) down to the speed of boost::unordered_map) EM lookups looking for <dbroot, oid, partition> tuple to turn it into LBID, e.g. most bulk insertion meta info operations. The basis is boost::shared_managed_object where EMIndex is stored. Whilst it is not debug-friendly it allows to put a nested structs into shmem. EMIndex has 3 tiers. Top down description: vector of dbroots, map of oids to partition vectors, partition vectors that have EM indices. Separate EM methods now queries index before they do EM run. EMIndex has a separate shmem file with the fixed id MCS-shm-00060001.	2022-05-04 12:59:16 +00:00
Commander thrashdin	4bb3743110	Added word bytes	2022-04-20 16:04:14 +03:00
benthompson15	2ec502aaf3	MCOL-4576: remove S3 options from cpimport. (#2328 )	2022-04-01 09:18:34 -05:00
Leonid Fedorov	65252df4f6	C++20 fixes	2022-03-28 12:32:29 +00:00
Roman Nozdrin	2b4946f53a	Revert "MCOL-4576: remove S3 options from cpimport. (#2307 )" This reverts commit `14c4840d53`.	2022-03-23 12:33:23 +00:00
benthompson15	14c4840d53	MCOL-4576: remove S3 options from cpimport. (#2307 )	2022-03-21 09:54:39 -05:00
Serguey Zefirov	53b9a2a0f9	MCOL-4580 extent elimination for dictionary-based text/varchar types The idea is relatively simple - encode prefixes of collated strings as integers and use them to compute extents' ranges. Then we can eliminate extents with strings. The actual patch does have all the code there but miss one important step: we do not keep collation index, we keep charset index. Because of this, some of the tests in the bugfix suite fail and thus main functionality is turned off. The reason of this patch to be put into PR at all is that it contains changes that made CHAR/VARCHAR columns unsigned. This change is needed in vectorization work.	2022-03-02 23:53:39 +03:00
Leonid Fedorov	6b1c696991	chars are unsigned on arm y default	2022-02-17 23:18:27 +00:00

1 2 3 4 5 ...

387 Commits