mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-08-05 16:15:50 +03:00
Commit Graph

110 Commits

Author SHA1 Message Date
Andrey Piskunov
589b786fda Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
Andrey Piskunov
04ac04ff74 Temporary test fix 2022-08-04 16:16:38 +03:00
Andrey Piskunov
5c6cd2cca3 use vect update for everything except TEXT 2022-08-04 16:16:38 +03:00
Andrey Piskunov
225f54fd79 Tests for simd min/max 2022-08-04 16:16:38 +03:00
Andrey Piskunov
b8200acd3b Don't ignore null or empty in calculation 2022-08-04 16:16:38 +03:00
Andrey Piskunov
2a7da39610 Temporary test fix 2022-08-04 16:16:38 +03:00
Andrey Piskunov
c4df7925d1 use vect update for everything except TEXT 2022-08-04 16:16:38 +03:00
Andrey Piskunov
1681edaca0 Tests for simd min/max 2022-08-04 16:16:38 +03:00
Roman Nozdrin
3b87532413 Revert "This patch disables FairThreadPool to double check if this feature contributes to multiple strange side-effects and occasional failed MTR tests"
This reverts commit b78cbffa93.
2022-07-22 14:04:06 +00:00
Roman Nozdrin
b78cbffa93 This patch disables FairThreadPool to double check if this feature contributes to multiple strange side-effects and occasional failed MTR tests 2022-07-20 11:17:19 +00:00
Leonid Fedorov
1cd382ba3b Clang warning fix 2022-07-16 16:26:10 +03:00
Leonid Fedorov
140770d6f4 Delete tests/shared_components_tests.cpp, erase legacy code from tests/primitives_scan_bench.cpp, option to run benchmarks from build/bootstrap_mcs.sh 2022-07-15 15:56:24 +00:00
Leonid Fedorov
56b01fdefc Workaround for gtest compile bug 2022-07-11 22:27:25 +02:00
Roman Nozdrin
0907ca414f MCOL-5044 This patch simplifies addJob interfaces by removing the extra bool that controls mutex locking,
adds an additional nullptr dereference check in removeJobs and fixes FairThreadPool
hashmap iterator invalidation issues
2022-07-09 12:50:30 +00:00
Roman Nozdrin
6cff14997d Revert "This reverts MCOL-5044 AKA FairThreadPool that breaks regr test002"
This reverts commit 61359119ad.
2022-07-09 12:38:51 +00:00
NTH19
a4842ef998 rename 2022-06-24 16:53:02 +08:00
NTH19
4c0b8fd829 simd of arm neon
unit testing

pass unit test for simdprocessor

add test cases

implement specific _mm_movemask for different types

float movemask change

rename
2022-06-24 11:24:59 +08:00
Leonid Fedorov
3638f4ac8c Replace gtest_discovery_tests with gtests_add_tests
Although the resulting number of tests differs, they all still run.
gtests_add_test cannot parse TYPED_TEST_SUITE tests one by one and runs them
in one batch
2022-06-13 15:05:10 +00:00
Roman Nozdrin
61359119ad This reverts MCOL-5044 AKA FairThreadPool that breaks regr test002
This reverts commit e40c16bd56, reversing
changes made to 18e6b1d77b.
2022-06-10 14:17:59 +00:00
Roman Nozdrin
fd8ba33f21 MCOL-5044 This patch replaces PriorityThreadPool with FairThreadPool that uses a simple
operations + morsel size weight model to allocate CPU equally b/w parallel query morsels.
This patch delivers a better distribution of parallel query timings (the timings graph resembles a normal
distribution with a bigger left side, so more queries run faster compared with a PrioThreadPool-based
single-node installation).
See changes in batchprimitiveprocessor-jl.h and comments in fair_threadpool.h for
important implementation details
2022-06-03 10:08:12 +00:00
Roman Nozdrin
0f0b3a2bed Disable FairThreadPool unit tests in develop-6 b/c its unit test segfaults in containers 2022-06-02 17:05:30 +00:00
Roman Nozdrin
c92dc08264 MCOL-5044 Initial version of a fair thread pool
PP now uses PriorityThreadPool that arbitrarily picks another pack of jobs
    to run. This scheduling discipline tends to run portions of a single query,
    forcing other simultaneous queries to wait. As a result, the variance of
    parallel query timings is high. The FairThreadPool picks the job with the smallest
    amount of work done so far (see the code for details)
2022-06-02 17:05:12 +00:00
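The scheduling rule described in c92dc08264 above (run the job whose query has done the least work so far) can be sketched roughly as follows. This is a minimal single-threaded illustration, not the code in fair_threadpool.h: `Job`, `FairScheduler`, `txnId` and `weight` are hypothetical names, and the real pool adds threads, locking and its own weight model.

```cpp
#include <cassert>
#include <cstdint>
#include <deque>
#include <map>

// Hypothetical job descriptor: which query it belongs to and how costly it is.
struct Job
{
  std::uint32_t txnId;
  std::uint32_t weight;
};

// Hypothetical scheduler, for illustration only (no threads, no locking).
class FairScheduler
{
 public:
  void addJob(const Job& job)
  {
    assert(job.weight > 0);
    queues_[job.txnId].push_back(job);
    doneWeight_.emplace(job.txnId, 0);  // leaves an existing counter untouched
  }

  // Pops the next job of the query with the least accumulated weight.
  // Returns false when no jobs are queued.
  bool pickNext(Job& out)
  {
    auto best = queues_.end();
    for (auto it = queues_.begin(); it != queues_.end(); ++it)
    {
      if (best == queues_.end() || doneWeight_[it->first] < doneWeight_[best->first])
        best = it;
    }
    if (best == queues_.end())
      return false;

    out = best->second.front();
    best->second.pop_front();
    doneWeight_[out.txnId] += out.weight;  // account for the work handed out
    if (best->second.empty())
      queues_.erase(best);
    return true;
  }

 private:
  std::map<std::uint32_t, std::deque<Job>> queues_;     // pending jobs per query
  std::map<std::uint32_t, std::uint64_t> doneWeight_;   // work handed out per query
};
```

With two queries queued, the one that has received less work is always served next, so a long query cannot starve a short one in this model.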
Roman Nozdrin
4c26e4f960 MCOL-4912 This patch introduces Extent Map index to improve EM scalability
EM scalability project has two parts: phase1 and phase2.
        This is phase1 that brings the EM index to speed up (from O(n) down
        to the speed of boost::unordered_map) EM lookups looking for
        <dbroot, oid, partition> tuple to turn it into LBID,
        e.g. most bulk insertion meta info operations.
        The basis is boost::shared_managed_object where EMIndex is
        stored. Whilst it is not debug-friendly, it allows putting
        nested structs into shmem. EMIndex has 3 tiers. Top-down description:
        vector of dbroots, map of oids to partition vectors, partition
        vectors that have EM indices.
        Separate EM methods now query the index before they do an EM run.
        EMIndex has a separate shmem file with the fixed id
        MCS-shm-00060001.
2022-05-04 12:59:16 +00:00
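The three tiers listed in 4c26e4f960 above (vector of dbroots, map of oids to partition vectors, partition vectors holding EM indices) might look like the sketch below. It is a process-local model built on standard containers, assumed here only for illustration: `EMIndexSketch` and its methods are hypothetical, and the patch itself stores the structure in shared memory.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Tier 3: positions of extents inside the Extent Map.
using ExtentIndices = std::vector<std::size_t>;
// Tier 3 container: one ExtentIndices per partition number.
using PartitionVector = std::vector<ExtentIndices>;
// Tier 2: oid -> partitions.
using OidMap = std::unordered_map<std::uint32_t, PartitionVector>;

// Hypothetical model of the index; names are not taken from the patch.
class EMIndexSketch
{
 public:
  void insert(std::uint16_t dbroot, std::uint32_t oid, std::uint32_t partition, std::size_t emIdx)
  {
    const std::size_t root = dbroot;
    if (root >= dbroots_.size())
      dbroots_.resize(root + 1);
    PartitionVector& parts = dbroots_[root][oid];
    if (partition >= parts.size())
      parts.resize(static_cast<std::size_t>(partition) + 1);
    parts[partition].push_back(emIdx);
  }

  // Returns the Extent Map positions for the tuple, or an empty vector if unknown.
  ExtentIndices find(std::uint16_t dbroot, std::uint32_t oid, std::uint32_t partition) const
  {
    const std::size_t root = dbroot;
    if (root >= dbroots_.size())
      return {};
    const auto it = dbroots_[root].find(oid);
    if (it == dbroots_[root].end() || partition >= it->second.size())
      return {};
    return it->second[partition];
  }

 private:
  std::vector<OidMap> dbroots_;  // Tier 1: indexed by dbroot id
};
```

A lookup touches one vector slot, one hash map entry and one more vector slot, which is where the move away from an O(n) scan comes from in this model.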
Roman Nozdrin
7cdc914b4e MCOL-4809 This patch introduces vectorized scanning/filtering for short CHAR/VARCHAR columns
Short CHAR/VARCHAR column values contain integer-encoded strings.
    After certain manipulations (orderSwap(strnxfrm(str))) the values
    become integers that preserve the original string order
    according to certain translation rules (collation). Prepared
    values are ready to be SIMD-processed.
2022-04-01 10:28:33 +00:00
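The idea in 7cdc914b4e above, turning a short string into an integer whose numeric order matches the string order, can be illustrated with the sketch below. It assumes a binary collation, so the strnxfrm step is skipped entirely, and it packs bytes most-significant first instead of byte-swapping a stored value. `packForCompare` is a hypothetical helper, not a ColumnStore function.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Packs up to 8 bytes of s into a 64-bit key, first character in the most
// significant byte, zero-padded on the right. Under a binary collation
// (an assumption of this sketch) key order equals bytewise string order.
std::uint64_t packForCompare(const std::string& s)
{
  std::uint64_t key = 0;
  for (std::size_t i = 0; i < 8; ++i)
  {
    const std::uint8_t byte = i < s.size() ? static_cast<std::uint8_t>(s[i]) : 0;
    key = (key << 8) | byte;
  }
  return key;
}
```

Once values are in this form, a filter such as `col < 'abd'` becomes a plain integer comparison, which is the kind of operation SIMD instructions handle well.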
Leonid Fedorov
c847f6ce25 Fix segfault for vector scan tests on clang 2022-03-25 13:49:25 +00:00
Leonid Fedorov
fbd043b036 Fixing alignment for clang tests of rowgroup 2022-03-23 14:29:19 +00:00
Roman Nozdrin
b46f4b42b3 MCOL-4809 Vectorized comparison operations unit tests
This commit replaces system googletest with 0.11.1 version compiled from sources
    to enable typed tests feature
2022-02-25 14:32:47 +03:00
Leonid Fedorov
3919c541ac New warnfixes (#2254)
* Fix clang warnings

* Remove vim tab guides

* initialize variables

* 'strncpy' output truncated before terminating nul copying as many bytes from a string as its length

* Fix ISO C++17 does not allow 'register' storage class specifier for outdated bison

* chars are unsigned on ARM, making if (ival < 0) always false

* chars are unsigned by default on ARM and comparison with -1 is always true
2022-02-17 13:08:58 +03:00
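The last two items of 3919c541ac above concern plain `char` being unsigned on ARM, so a test like `c < 0` is always false there. One portable way to express such a check is sketched below. `isNegativeByte` is a hypothetical helper and not necessarily the fix used in the commit.

```cpp
// Reports whether the byte has its high bit set, independent of whether
// plain char is signed or unsigned on the target platform.
inline bool isNegativeByte(char c)
{
  return static_cast<signed char>(c) < 0;
}
```

Casting to `signed char` makes the intended signedness explicit instead of relying on the platform default.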
Roman Nozdrin
c79dfc4925 MCOL-4809 This patch adds support for float data types filtering and scanning vectorization 2022-02-03 16:38:56 +00:00
Leonid Fedorov
04752ec546 clang format apply 2022-01-21 16:43:49 +00:00
Leonid Fedorov
01f3ceb437 replace header guards with #pragma once 2022-01-21 15:24:58 +00:00
Roman Nozdrin
af36f9940f This patch introduces support for vectorized scanning/filtering execution for numeric-based
data types. TEXT, CHAR, VARCHAR, FLOAT and DOUBLE are not yet supported by the vectorized path.
This patch also introduces an example for the Google benchmarking suite to measure the perf diff
b/w legacy scan/filtering code and the templated version
2021-12-10 10:30:00 +00:00
Roman Nozdrin
3de038c1da MCOL-4876 This patch enables a continuous buffer to be used by ColumnCommand and aligns BPP::blockData
that in most cases was unaligned
2021-10-06 09:23:40 +00:00
Roman Nozdrin
4cb9fe4850 This patch migrates filtering UT to ctest and eliminates static file dependencies of the UT 2021-10-05 15:03:18 +00:00
Roman Nozdrin
67c85dae15 MCOL-4809 The patch replaces legacy scanning/filtering code with a number of templates that
simplify control flow by removing needless expressions
2021-09-06 17:04:52 +00:00
Leonid Fedorov
f584e90718 Drone build
run unittests
2021-08-03 05:36:05 +03:00
Leonid Fedorov
73e710ed52 Add ctest for google unittests 2021-08-02 19:41:04 +03:00
Denis Khalikov
cc1c3629c5 MCOL-987 Add LZ4 compression.
* Adds CompressInterfaceLZ4 which uses LZ4 API for compress/uncompress.
* Adds CMake machinery to search for LZ4 on the running host.
* All methods which use static data and do not modify any internal data - become `static`,
  so we can use them without creation of the specific object. This is possible, because
  the header specification has not been modified. We still use 2 sections in header, first
  one with file meta data, the second one with pointers for compressed chunks.
* Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` - become
  pure virtual, so we can override them for the other compression algos.
* Adds method `getChunkMagicNumber`, so we can verify chunk magic number
  for each compression algo.
* Renames "s/IDBCompressInterface/CompressInterface/g" according to requirement.
2021-07-06 18:04:37 +03:00
Denis Khalikov
5d497e8821 MCOL-4566: Add rebuildEM tool support to work with compressed files.
* This patch adds rebuildEM tool support to work with compressed files.
* This patch increases a version of the file header.

Note: The default version of the `rebuildEM` tool was using a very old API whose
functions are no longer present. So `rebuildEM` will not work with
files created without compression, because we cannot deduce some of the info
needed to create a column extent.
2021-04-02 10:55:01 +03:00
Roman Nozdrin
508d5455a8 Merge pull request #1795 from denis0x0D/MCOL-4566/CompressedHeader
MCOL-4566: Extend CompressedDBFileHeader struct with new fields.
2021-03-08 12:24:59 +03:00
Denis Khalikov
a2efa1efeb MCOL-4566: Extend CompressedDBFileHeader struct with new fields.
* This patch extends CompressedDBFileHeader struct with new fields:
  `fColumWidth`, `fColDataType`, which are necessary to rebuild extent map
  from the given file. Note: new fields do not change the memory
  layout of the struct, because the size is calculated as
  max(sizeof(CompressedDBFileHeader), HDR_BUF_LEN).

* This patch changes API of some functions, by adding new function
  argument `colDataType` when needed, to be able to call `initHdr`
  function with colDataType value.
2021-03-05 22:15:34 +03:00
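The claim in a2efa1efeb above, that new fields do not change the layout because the size is max(sizeof(CompressedDBFileHeader), HDR_BUF_LEN), can be checked with a sketch like the one below. The value 4096 for `HDR_BUF_LEN`, the struct names and all fields other than `fColumWidth` and `fColDataType` are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdint>

constexpr std::size_t HDR_BUF_LEN = 4096;  // assumed value for this sketch

// Hypothetical header before the patch.
struct HeaderOld
{
  std::uint64_t fMagicNumber;
  std::uint64_t fVersionNum;
};

// Hypothetical header after the patch, with the two new fields appended.
struct HeaderNew
{
  std::uint64_t fMagicNumber;
  std::uint64_t fVersionNum;
  std::uint64_t fColumWidth;
  std::uint32_t fColDataType;
};

// The stored block is as large as the bigger of the struct and the raw buffer.
template <typename Header>
union HeaderBlock
{
  Header fields;
  char raw[HDR_BUF_LEN];
};

static_assert(sizeof(HeaderBlock<HeaderOld>) == sizeof(HeaderBlock<HeaderNew>),
              "appending small fields must not change the block size");
```

As long as the struct stays smaller than the buffer, the block size is fixed by `HDR_BUF_LEN`, so older files keep the same header size.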
Denis Khalikov
797716ef13 MCOL-4566: Add file2Oid function.
* This patch adds file2Oid function. This function is needed
  to map ColumnStore file name to an oid, partition and segment.
* Tests added to check that this function works correctly.
* This patch is related to MCOL-4566, so it adds a new file with GTests.

Note: The description for the functions follows the description style
in the current file.
2021-03-04 23:37:23 +03:00
Denis Khalikov
ef8915a884 Fixes for shared_components_tests build, move test to tests directory.
* Use const uint8_t* instead of uint64_t.
* Turn off the 'testExtentCrWOPreallocBin' test body, since this test was
turned off after MCOL-641 when the CalpontSystemCatalog::BINARY type was removed.
* Move shared_components_tests to tests directory.
2021-03-03 14:16:08 +03:00
Roman Nozdrin
bb70f845fb Trim up Decimal comparison 2020-12-05 12:19:50 +00:00
Roman Nozdrin
494bde61e1 MCOL-4409 Moved static Decimal conversion methods into VDecimal class
MCOL-4409 This patch combines VDecimal and Decimal and makes
IDB_Decimal an alias for the result class

MCOL-4409 More boilerplate reduction in Func_mod

Removed a couple of TSInt128::toType() methods
2020-11-30 12:08:52 +00:00
Roman Nozdrin
5ba6737965 Fixes for Decimal multiplication overflow check and RowGroup UTs 2020-11-22 17:55:22 +00:00
Roman Nozdrin
58495d0d2f MCOL-4387 Convert dataconvert::decimalToString() into VDecimal and TSInt128 methods 2020-11-18 13:53:16 +00:00
Alexander Barkov
d5c6645ba1 Adding mcs_basic_types.h
For now it consists of only:

using int128_t = __int128;
using uint128_t = unsigned __int128;

All new primitive data types should go into this file in the future.
2020-11-18 13:53:15 +00:00
Alexander Barkov
129d5b5a0f MCOL-4174 Review/refactor frontend/connector code 2020-11-18 13:53:15 +00:00
Roman Nozdrin
1588ebe439 MCOL-641 Clean up primitives code
Add int128_t support into ByteStream

Fixed UTs broken after collation patch
2020-11-18 13:52:19 +00:00