Sometimes server assigns DOUBLE type for arithmetic operations over
DECIMAL arguments. In this rare case width of result was incorrectly
adjusted and it triggered an assertion.
Now width of result gets adjusted only if result type is also DECIMAL.
This patch introduces an internal aggregate operator SELECT_SOME that
is automatically added to columns that are not in GROUP BY. It
"computes" some plausible value of the column (actually, last one
passed).
Along the way it fixes incorrect handling of HAVING being transferred
into WHERE, window function handling and a bit of other inconsistencies.
There were numerous memory leaks in plugin's code and associated code.
During typical run of MTR tests it leaked around 65 megabytes of
objects. As a result they may severely affect long-lived connections.
This patch fixes (almost) all leaks found in the plugin. The exceptions
are two leaks associated with SHOW CREATE TABLE columnstore_table and
getting information of columns of columnstore-handled table. These
should be fixed on the server side and work is on the way.
* fix(rowgroup): RGData now uses uint64_t counter for the fixed sizes columns data buf.
The buffer can utilize > 4GB RAM that is necessary for PM side join.
RGData ctor uses uint32_t allocating data buffer.
This fact causes implicit heap overflow.
* feat(bytestream,serdes): BS buffer size type is uint64_t
This necessary to handle 64bit RGData, that comes as
a separate patch. The pair of patches would allow to
have PM joins when SmallSide size > 4GB.
* feat(bytestream,serdes): Distribute BS buf size data type change to avoid implicit data type narrowing
* feat(rowgroup): this returns bits lost during cherry-pick. The bits lost caused the first RGData::serialize to crash a process
* Fixes of bugs from ASAN warnings, part one
* MQC as static library, with nifty counter for global map and mutex
* Switch clang to 16
* link messageqcpp to execplan
This patch improves handling of NULLs in textual fields in ColumnStore.
Previously empty strings were considered NULLs and it could be a problem
if data scheme allows for empty strings. It was also one of major
reasons of behavior difference between ColumnStore and other engines in
MariaDB family.
Also, this patch fixes some other bugs and incorrect behavior, for
example, incorrect comparison for "column <= ''" which evaluates to
constant True for all purposes before this patch.
* toCppCode for ParseTree and TreeNode
* generated tree is compiling
* Put tree constructors into tests
* Minor fixes
* Fixed parse + some constructors
* Fixed includes, removed debug and old data
* Hopefully fix clang errors
* Forgot an override
* More overrides
Added logical transformation of the execplan::ParseTrees with the taking out the common factor in expression of the form "(A and B) or (A and C)" for the purposes of passing a TPCH 19 query.
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
* Add MTR_SUITE_LIST
* Typo
* Add data download
* Install tar and lz4
* Change the way MTR_SUITE_LIST is set up
* Use bash for MTR_SUITE_LIST
* Another one
* Fix reference results for full MTR develop, disable broken JSON test and tests with 10GB database
* Fix timestamps and truncate cos
* Fix some more references
* Fix dokcerhub step for custom build
* One more fix for dockerhub step on custom build
* Fix tests for regr functions with truncate
* Full mtr set on nghtly + MTR_FULL_SET flag
* One more fix for dockerhub
* Fix MTR_FULL_SET
* Testing MTR_FULL_SET
* sorted_result in tests + fix typo
* Truncate even more
* Typo
* truncate 2 more tests
* Disable regr_* functions tests
* fix setup mtr step
* correct settings for table creation
* Put setup for tests into drone
* Fix for debian based distros
* More truncates
* Disable the rest
---------
Co-authored-by: Leonid Fedorov <leonid.fedorov@mariadb.com>
ARM platforms. This patch resolves this issue and unifies AUX column
processing at x86 and ARM using tempate class SimdProcessor.
The patch also replaces uint16_t mask previously used in column.cpp and
SimProcessor code with a native masks that platform uses, e.g. __m128i
or __m128 on x86 and variety of masks on ARM.
To unify the processing I introduced a new filtering Compare Operator - COMPARE_NULLEQ.
with a 'c1 IS NULL semantics'.
* more build dependencies
* fix for cmake < 3.11
It cannot do ADD_LIBRARY(... ALIAS ...) on IMPORTED targets
* another fix for cmake 3.10.2
It doesn't know about CMAKE_CXX_STANDARD=20,
let's add the correct flag manually
* gcc 8 on aarch64
utils/common/simd_arm.h:241:16: error: need ‘typename’ before ‘simd::TypeToVecWrapperType<T>::WrapperType’ because ‘simd::TypeToVecWrapperType<T>’ is a dependent scope
JobList low-level code relateod to primitive jobs now uses shared pointers instead of ByteStream refs talking to DEC
b/c same-node EM-PP communication now goes over a queue in DEC instead of a network hop.
PP now has a separate thread that processes the primitive job messages from that DEC queue.
Despite we have another number of tests in result, they all still run
gtests_add_test cannot parse TYPED_TEST_SUITE one by one and run them
in one bunch
operations + morsel size weight model to equally allocate CPU b/w parallel query morsels.
This patch delivers better parallel query timings distribution(timings graph resembles normal
distribution with a bigger left side thus more queries runs faster comparing with PrioThreadPool-based
single-node installation).
See changes in batchprimitiveprocessor-jl.h and comments in fair_threadpool.h for
important implementation details
PP now uses PriorityThreadPool that arbitrary picks another jobs pack
to run. This scheduling discipline tend to run portions of a single query
forcing other simultaneous queries to wait. In result parallel queries
timings variance is high. The FairThreadPool picks the job with the smallest
amount of work done so far(see the code for details)
EM scaleability project has two parts: phase1 and phase2.
This is phase1 that brings EM index to speed up(from O(n) down
to the speed of boost::unordered_map) EM lookups looking for
<dbroot, oid, partition> tuple to turn it into LBID,
e.g. most bulk insertion meta info operations.
The basis is boost::shared_managed_object where EMIndex is
stored. Whilst it is not debug-friendly it allows to put a
nested structs into shmem. EMIndex has 3 tiers. Top down description:
vector of dbroots, map of oids to partition vectors, partition
vectors that have EM indices.
Separate EM methods now queries index before they do EM run.
EMIndex has a separate shmem file with the fixed id
MCS-shm-00060001.