1
0
mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-06-13 16:01:32 +03:00
Commit Graph

392 Commits

Author SHA1 Message Date
a0bee173f6 chore(build): fixes to satisfy clang19 warnings 2025-05-15 19:05:38 +04:00
80277a5e7d DBBuilder: check syscat files instead dirs 2025-04-01 19:53:28 +04:00
02b8ea1331 feat(PP,ByteStream): new counting memory allocator 2025-03-27 22:12:48 +00:00
87981483e2 chore(compilation): to resolve unexpected boost warnings 2025-03-27 21:35:11 +04:00
0ab03c7258 chore(codestyle): mark virtual methods as override 2025-02-21 20:01:34 +04:00
3bcc2e2fda fix(memory leaks): MCOL-5791 - get rid of memory leaks in plugin code (#3365)
There were numerous memory leaks in plugin's code and associated code.
During typical run of MTR tests it leaked around 65 megabytes of
objects. As a result they may severely affect long-lived connections.

This patch fixes (almost) all leaks found in the plugin. The exceptions
are two leaks associated with SHOW CREATE TABLE columnstore_table and
getting information of columns of columnstore-handled table. These
should be fixed on the server side and work is on the way.
2024-12-06 09:04:55 +00:00
6f6e69815d feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing 2024-11-08 16:28:51 +04:00
5824a0cebe MCOL-5746 Cpimport: convert blob data from ascii hex when reads from STDIN (#3217)
Co-authored-by: Denis Khalikov <dennis.khalikov@gmail.com>
2024-06-27 14:23:07 +04:00
3f8758ba4c MCOL-5747 gcc-14.1.1 compile error - calloc - transposed args (#3216)
The arguments of calloc are the number of members and the sizeof the
member. Gcc-14.1.1 worked out how to tell the difference.

We correct this by transposing to gcc's will.

Co-authored-by: Daniel Black <daniel@mariadb.org>
2024-06-27 14:21:26 +04:00
2cd8f716c1 Fix MCOL-5035, a difference in INSERT and UPDATE behavior
The UPDATE statement wrote NULL when the column set is DATETIME and
value is '0000-00-00 00:00:00'. The problem was inside WriteEngine's
handling of UPDATE statements and this is where heart of change is.

Other changes are related to some obsolete data structures in DML/DDL
handling that just hanging around there, doing nothing.
2024-06-27 13:07:49 +03:00
55ffacf546 MCOL-5555 Add support for startreadonly command. (#3081)
This patch adds support for `startreadonly` command which waits
until all active cpimport jobs are done and then puts controller node to readonly
mode.
2023-12-26 15:11:33 +04:00
6315546557 Revert "MCOL-5555 Add support for startreadonly command."
This reverts commit 441cd9d34f.
2023-12-20 01:05:37 +03:00
97abd9866b fix(writeengine) MCOL-4202: use schema name when renaming table and change it's fields in syscat 2023-12-18 09:59:44 +03:00
441cd9d34f MCOL-5555 Add support for startreadonly command.
This patch adds support for `startreadonly` command which waits
until all active cpimport jobs are done and then puts controller node to readonly
mode.
2023-10-16 16:11:50 +03:00
7f9c624626 MCOL-5573 Fix cpimport truncation of TEXT columns.
1. Restore the utf8_truncate_point() function in utils/common/utils_utf8.h
that I removed as part of the patch for MCOL-4931.

2. As per the definition of TEXT columns, the default column width represents
the maximum number of bytes that can be stored in the TEXT column. So the
effective maximum length is less if the value contains multi-byte characters.
However, if the user explicitly specifies the length of the TEXT column in a
table DDL, such as TEXT(65535), then the DDL logic ensures that enough number
of bytes are allocated (upto a system maximum) to allow upto that many number
of characters (multi-byte characters if the charset for the column is multi-byte,
such as utf8mb3).
2023-09-20 12:23:22 -04:00
e0d9b82705 MCOL-5021 Fix a minor bug related to the AUX column support in cpimport. 2023-09-06 16:48:54 -04:00
931f2b36a1 MCOL-4931 Make cpimport charset-aware. (#2938)
1. Extend the following CalpontSystemCatalog member functions to
   set CalpontSystemCatalog::ColType::charsetNumber, after the
   system catalog update to add charset number to calpontsys.syscolumn
   in MCOL-5005:
     CalpontSystemCatalog::lookupOID
     CalpontSystemCatalog::colType
     CalpontSystemCatalog::columnRIDs
     CalpontSystemCatalog::getSchemaInfo

2. Update cpimport to use the CHARSET_INFO object associated with the
   charset number retrieved from the system catalog, for a
   dictionary/non-dictionary CHAR/VARCHAR/TEXT column, to truncate
   long strings that exceed the target column character length.

3. Add MTR test cases.
2023-09-05 17:17:20 +03:00
d50a0fa2e6 MCOL-5005 Add charset number to system catalog - Part 2.
1. Extend the calpontsys.syscolumn system catalog table
  with a new column, 'charsetnum'.

  'charsetnum' field is set to the 'number' member of the
  'charset_info_st' struct defined in the server in m_ctype.h.

  For CHAR/VARCHAR/TEXT column types, 'charset_info_st' is
  initialized to the charset/collation of the column, which
  is set at the column-level or at the table-level in the DDL.

  For BLOB/VARBINARY binary column types, 'charset_info_st' is
  initialized to my_charset_bin (charsetnum=63).

  For all other column types, charsetnum is set to 0.

  2. Add support for the newly added 'charsetnum' column in the
  automatic system catalog upgrade logic in dbbuilder.

  For existing table definitions, charsetnum for the column is
  defaulted to 0.

  3. Add MTR test case that creates a few table definitions with
  a range of charset/collation combinations and queries the
  calpontsys.syscolumn system catalog table with the charsetnum
  field for the columns in the table DDLs.
2023-08-15 17:21:47 +00:00
30064e3d4a cpimport.bin doesn't really need libboost_program_options
linking with unused libraries creates a difference in dependencies
between --no-as-needed (gcc default) and --as-needed (default on
Fedora rpmbuild) builds.
2023-07-04 12:58:18 -04:00
ab7dfaa25b Fix resource leak in DDLProc/DMLProc/PrimProc/WriteengineServer processes.
As part of the charset support, a call to MY_INIT() was added at the
initialization of the above processes. This call initializes the MySQL
thread environment required by the charset library. However, the
accompanying my_end() call required to terminate this thread environment
was not added at the termination of these process, hence leaking
resources. As a fix, we move the MY_INIT() calls to the Child()
functions of these services and also add the missing my_end() call.
2023-06-23 20:18:32 +00:00
8f93fc3623 MCOL-5493: First portion of UBSan fixes (#2842)
Multiple UB fixes
2023-06-02 17:02:09 +03:00
4fe9cd64a3 Revert "No boost condition (#2822)" (#2828)
This reverts commit f916e64927.
2023-04-22 15:49:50 +03:00
f916e64927 No boost condition (#2822)
This patch replaces boost primitives with stdlib counterparts.
2023-04-22 00:42:45 +03:00
3ce19abdae Options to build with TSAN, UBSAN and skipping smoke (#2826) 2023-04-21 21:24:48 +03:00
c2d0fa24da replace boost::shared_array<T> to std::shared_ptr<T[]> 2023-04-14 10:33:27 +00:00
2e1394149b MCOL-5464: Fixes of bugs from ASAN warnings, part one (#2792)
* Fixes of bugs from ASAN warnings, part one

* MQC as static library, with nifty counter for global map and mutex

* Switch clang to 16

* link messageqcpp to execplan
2023-04-04 02:33:23 +03:00
b53c231ca6 MCOL-271 empty strings should not be NULLs (#2794)
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.

Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
2023-03-30 21:18:29 +03:00
70124ecc01 Fix trivial spelling errors
- occured -> occurred
- reponse -> response
- seperated -> separated

All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services, Inc.
2023-03-11 11:59:47 -08:00
56f2346083 Remove windows ifdefs 2023-03-02 15:59:42 +00:00
006b92bba2 Revert "This commit fixes an incorrect predicate in the if condition (#2608)"
This reverts commit f4e3022fbd.

The commit apparently caused MCOL-5318 and MCOL-5319 which involve the
internal ColumnStore batch insert mechanism passing through the SQL
layer. The code block involved in this change is a predicate checking
for the HWM extent in WriteEngineServer at the end of the batch insert.
This is done in WE_DMLCommandProc::processBatchInsertHwm(). The original
predicate check in this function for the HWM extent is restored until
further investigation.
2023-02-02 08:07:18 -05:00
ad59ed5402 MCOL-5367 Fix a bug introduced in MCOL-5021 (AUX column implementation).
In the implementation of MCOL-5021, an assert was added in
`WE_DMLCommandProc::processBatchInsertHwm()` that assumed the
`WriteEngine::TableMetaData` cache is uniform across the cluster.
However, this assumption is incorrect.

This bug caused undefined behaviour in ColumnStore resulting in bugs
such as MCOL-5367. In MCOL-5367, in a multi-node ColumnStore cluster,
an INSERT ... SELECT in a transaction with system variable
`columnstore_use_import_for_batchinsert=OFF/ON` did not show inserted
records when a SELECT query was issued. Assuming a 3-node cluster setup,
DMLProc only sends a given batch of records to be inserted to one of the
3 nodes, and not all nodes. As a result, the `WriteEngine::TableMetaData`
cache is only populated for that one node and is not uniform across the
cluster, causing the assert to fail.

As a fix, we simply remove this assert as it is redundant and should not
have been added in the first place.
2023-01-16 05:54:44 -05:00
d61780cab1 MCOL-5263 Add support to ROLLBACK when PP were restarted.
DMLProc starts ROLLBACK when SELECT part of UPDATE fails b/c EM facility in PP were restarted.
Unfortunately this ROLLBACK stuck if EM/PP are not yet available.
DMLProc must have a t/o with re-try doing ROLLBACK.
2022-12-13 16:18:53 +03:00
b936ed8b2e Fix some GCC-12 Build errors 2022-11-22 03:28:17 +03:00
37fd915a08 Serg`s patch for develop-6 revised for develop https://github.com/mariadb-corporation/mariadb-columnstore-engine/pull/2614 2022-11-09 22:41:38 +00:00
f4e3022fbd This commit fixes an incorrect predicate in the if condition (#2608)
that checks for HWM extent in WE_DMLCommandProc::processBatchInsertHwm().
2022-11-08 14:51:42 -06:00
d2432f9bf6 get rid of pointers for 128 fields 2022-08-26 15:12:22 +00:00
0863ecd279 Replace getBinaryField 2022-08-25 18:21:43 +03:00
cbfdae3481 MCOL-5021 Code changes based on review feedback. 2022-08-05 14:40:50 -04:00
1355237ca3 MCOL-5021 Some minor fixes. 2022-08-05 14:40:50 -04:00
9b6d3c3870 MCOL-5021 Add support for AUX column in the client code calling
CalpontSystemCatalog::columnRIDs().
2022-08-05 14:40:49 -04:00
262cd5c501 MCOL-5021 Remove hard-coded values for data type, column width
and compression type for the AUX column, and replace them with
constants defined in the execplan namespace.
2022-08-05 14:40:49 -04:00
c8b6b154bf MCOL-5021 Add an option in Columnstore.xml, fastdelete (disabled
by default), which when enabled, indiscriminately invalidates all
column extents and performs the actual DELETE only on the AUX
column. The trade-off with this approach would now be that the
first SELECT for certain query patterns (those containing a WHERE
predicate) after the DELETE operation will slow down as the
invalidated column extent would need to be scanned again to set
the min/max values.
2022-08-05 14:40:49 -04:00
60eb0f86ec MCOL-5021 non-AUX column files are opened in read-only mode during
the DELETE operation. ColumnOp::readBlock() calls can cause writes
to database files when the active chunk list in ChunkManager is full.
Since non-AUX columns are read-only for the DELETE operation, we prevent
writes of compressed chunks and header for these columns by passing
an isReadOnly flag to CompFileData which indicates whether the column
is read-only or read-write.
2022-08-05 14:40:49 -04:00
35a3a93964 MCOL-5021 For the DELETE operation, empty magic values are only
written to database files for AUX column. Perform read-only operation
for other columns in the table to update the Casual Partitioning information.
2022-08-05 14:40:49 -04:00
86df9a972c MCOL-5021 Add prototype support for the AUX column in CREATE/DROP
DDL commands, single and multi-value INSERTs, cpimport, and
DELETE.
2022-08-05 14:40:49 -04:00
fb1e23bb83 [MCOL-5106] Add support to work with StorageManager.
This patch eliminates boost::filesystem from `mcsRebuildEM` tool.
After this change we should be able to work with any filesystem
even S3.
2022-07-28 16:47:34 +03:00
f5b2a6885f MCOL-5013: Load Data from S3 into Columnstore
Introduced UDF and stored prodecure.
usage:

set columnstore_s3_key='<s3_key>';
set columnstore_s3_secret='<s3_secret>';
set columnstore_s3_region='region';

and then use UDF
select columnstore_dataload("<tablename>", "<filename>", "<bucket>", "<db_name>");
for UDF db_name can be ommited, then current connection db will be used

or stored function
call calpontsys.columnstore_load_from_s3("<tablename>", "<filename>", "<bucket>", "<db_name>");
2022-07-04 19:52:37 +03:00
4c26e4f960 MCOL-4912 This patch introduces Extent Map index to improve EM scaleability
EM scaleability project has two parts: phase1 and phase2.
        This is phase1 that brings EM index to speed up(from O(n) down
        to the speed of boost::unordered_map) EM lookups looking for
        <dbroot, oid, partition> tuple to turn it into LBID,
        e.g. most bulk insertion meta info operations.
        The basis is boost::shared_managed_object where EMIndex is
        stored. Whilst it is not debug-friendly it allows to put a
        nested structs into shmem. EMIndex has 3 tiers. Top down description:
        vector of dbroots, map of oids to partition vectors, partition
        vectors that have EM indices.
        Separate EM methods now queries index before they do EM run.
        EMIndex has a separate shmem file with the fixed id
        MCS-shm-00060001.
2022-05-04 12:59:16 +00:00
4bb3743110 Added word bytes 2022-04-20 16:04:14 +03:00
2ec502aaf3 MCOL-4576: remove S3 options from cpimport. (#2328) 2022-04-01 09:18:34 -05:00