mirror of https://github.com/mariadb-corporation/mariadb-columnstore-engine.git synced 2025-04-18 21:44:02 +03:00

An initial Obsidian vault as a knowledge base.

This commit is contained in:
Roman Nozdrin 2023-05-20 15:39:49 +03:00
parent 4eced2c23a
commit 26aa9abaf0
20 changed files with 305 additions and 0 deletions

@ -0,0 +1,7 @@
[[makeJobSteps]]
- [[subquery preprocessing]]
- [[preprocessSelectSubquery]]
- [[preprocessHavingClause]]
- [[parseExecutionPlan]]
- [[makeVtableModeSteps]]

@ -0,0 +1,105 @@
The following introduction will take this SQL statement as an example:
```SQL
select a+b*c, c from test where a>10;
```
First, according to the link you provided, we can see that the main calculation code starts in `BatchPrimitiveProcessor::execute()` here:
```c++
fe2Output.resetRowGroup(baseRid);
fe2Output.getRow(0, &fe2Out);
fe2Input->getRow(0, &fe2In);
for (j = 0; j < outputRG.getRowCount(); j++, fe2In.nextRow())
{
if (fe2->evaluate(&fe2In))
{
applyMapping(fe2Mapping, fe2In, &fe2Out);
fe2Out.setRid(fe2In.getRelRid());
fe2Output.incRowCount();
fe2Out.nextRow();
}
}
```
The `evaluate` method of `fe2` is called with `fe2In`, a reference to a `Row`, as the parameter. Execution then enters `FuncExpWrapper`, which contains the current calculation expression. It mainly performs two steps:
1. Filtering by the conditions of the `where` clause through the functions held in the `filters` variable. If the `where` condition is met, the calculation proceeds to the next step; otherwise, it returns immediately.
2. Calculating the data in the `row` through `expression` and obtaining the final result.
```c++
bool FuncExpWrapper::evaluate(Row* r)
{
uint32_t i;
for (i = 0; i < filters.size(); i++)
// filter the conditions in the where statement
if (!fe->evaluate(*r, filters[i].get()))
return false;
// calculate the row that meets the where condition
fe->evaluate(*r, rcs);
return true;
}
```
That is to say, the function that actually performs the calculation is the `evaluate` method of `fe`. Its parameters are a `Row` and `expression`, a vector of `execplan::SRCP` elements (shared pointers to `ReturnedColumn`).
```c++
void FuncExp::evaluate(rowgroup::Row& row, std::vector<execplan::SRCP>& expression)
{
bool isNull;
for (uint32_t i = 0; i < expression.size(); i++)
{
isNull = false;
switch (expression[i]->resultType().colDataType)
{
case CalpontSystemCatalog::DATE:
{
int64_t val = expression[i]->getIntVal(row, isNull);
// @bug6061, workaround date_add always return datetime for both date and datetime
if (val & 0xFFFFFFFF00000000)
val = (((val >> 32) & 0xFFFFFFC0) | 0x3E);
if (isNull)
row.setUintField<4>(DATENULL, expression[i]->outputIndex());
else
row.setUintField<4>(val, expression[i]->outputIndex());
break;
}
....
}
```
In this function, each expression is looped over, and the calculation result is obtained by calling the appropriate `get*Val` method (`getIntVal` in the snippet above). Taking the SQL statement mentioned at the beginning as an example, there are two expressions, representing the columns `a + b * c` and `c`. Next, we will use the `a + b * c` expression as an example to introduce its data type and calculation process.
`a + b * c` is an arithmetic operation column, so its type is `ArithmeticColumn`, which inherits from `ReturnedColumn`. It contains a member variable of type `ParseTree* fExpression`, as well as an implementation of the `getIntVal` method:
```c++
virtual int64_t getIntVal(rowgroup::Row& row, bool& isNull)
{
return fExpression->getIntVal(row, isNull);
}
```
![](./image-20230321175135576.png) Here is a graphical representation of the AST for the given math expression.
After obtaining the structure of `fExpression`, we continue with the calculation process and enter the `getIntVal` method called on `fExpression`:
```c++
//fExpression->getIntVal(row, isNull);
inline int64_t getIntVal(rowgroup::Row& row, bool& isNull)
{
if (fLeft && fRight)
return (reinterpret_cast<Operator*>(fData))->getIntVal(row, isNull, fLeft, fRight);
else
return fData->getIntVal(row, isNull);
}
```
From the code, we can see that the entire calculation is a recursive process. When both the left and right subtrees of the current node are non-empty, the current node's data is cast to an `Operator` type (in this case, `ArithmeticOperator`) and its `getIntVal` method is called recursively on both subtrees; when the node is a leaf, its value is extracted and returned. The binary tree in the image illustrates this.
Every row of data goes through this recursive process, which affects calculation performance to some extent. Therefore, here is my implementation plan for the GSoC project:
Since the `ParseTree` is available before the calculation starts, we can use JIT technology to dynamically generate LLVM code from the `ParseTree`. When executing `void FuncExp::evaluate(rowgroup::Row& row, std::vector<execplan::SRCP>& expression);`, it is then only necessary to call the generated code, which avoids the recursive evaluation for each row of data. In this way, there is only one recursive traversal (when the code is generated via JIT).
The above are some thoughts I had after reading this part of the code. I hope you can give me some suggestions. Thank you very much.

@ -0,0 +1,8 @@
Firstly, here is the context for GROUP BY (GB) in MCS in general. The whole facility is spread across multiple compilation units. Speaking in terms of classical SQL engine processing, there is code for the PLAN and PREPARE phases. It is glued together b/c MCS currently doesn't have a clear distinction b/w the two. It lives in dbcon/mysql/ and dbcon/joblist. I don't think you will need this part, so I will omit it for simplicity.
There is the EXECUTE phase code that you will optimize. It mainly consists of symbols that are defined:
- [here](https://github.com/mariadb-corporation/mariadb-columnstore-engine/blob/develop/dbcon/joblist/tupleaggregatestep.cpp). The SQL AST is translated into a flat program called a JobList, where jobs are called Steps and closely relate to the nodes of the original AST. TupleAggregateStep is the equivalent of GROUP BY, DISTINCT and derivatives. This file describes the high-level control of TupleAggregateStep.
- [here](https://github.com/mariadb-corporation/mariadb-columnstore-engine/blob/develop/utils/rowgroup/rowaggregation.cpp) is the aggregation and distinct machinery that operates on rows.
- [here](https://github.com/mariadb-corporation/mariadb-columnstore-engine/blob/develop/utils/rowgroup/rowstorage.cpp) is an abstraction over the hash map, based (as I said previously) on the RobinHood header-only hash map.

@ -0,0 +1,9 @@
MTR is a regression test framework for MariaDB/MySQL. It is written in Perl.
[Here](https://github.com/mariadb-corporation/mariadb-columnstore-engine/blob/develop/mysql-test/columnstore/basic/t/mcol271-empty-string-is-not-null.test) is an example of an MTR test. When you write a test, you can ask MTR to produce a golden file automatically like this:
```shell
./mtr --record --extern socket=/run/mysqld/mysqld.sock --suite=columnstore/basic test_name
```
The golden file goes into `mysql-test/columnstore/basic/r/test_name.result`.

docs/SubQueries TBD.md Normal file
@ -0,0 +1,14 @@
Statements:
- WFS adds extra aux columns into projection.
- WFS sorts using the outer ORDER BY key column list()
jobInfo.windowDels is used as temp storage to pass the unchanged list of jobInfo.deliveredCols (read as the projection list of delivered columns) through WFS if there is no GB/DISTINCT involved.
GB/DISTINCT can clean the list up on their own.
This method might implicitly add a GB column in the case of aggregates over a WF without an explicit GROUP BY. This is disabled in WFS::AddSimpleColumn.
```SQL
SELECT min(first_value(i) over (partition by i)) from tab1;
```

docs/WFS_initialize.md Normal file
@ -0,0 +1,5 @@
This method runs ORDER BY using the built-in sort. Not great in the case of ORDER BY + LIMIT, but not the end of the world. Remove this. The new approach is to save all ORDER BY key columns in the output RG.
Retain those ReturnedColumns that are not in the WFS columns list but are in orderByColsVec.
!!! Remove ORDER BY duplication in WFS.

@ -0,0 +1,2 @@
WFS adds extra columns into the RG that must be removed after WFS. There is a block here that restores the original deliveredCols only if there is no DISTINCT or GROUP BY, b/c DISTINCT/GB does similar filtering by modifying jobInfo.distinctColVec.
[[WFS_initialize]]

@ -0,0 +1,2 @@
There must be 3 projection steps, but the TBPS RowGroup must be set to something else.
psv size is 4 vs 3

docs/adjustLastStep.md Normal file
@ -0,0 +1,3 @@
[[WFS_makeWindowFunctionStep]]
[[prepAggregate]]
windowDels -> jobInfo.nonConstDelCols

@ -0,0 +1,8 @@
- if no tables involved makeNoTableJobStep
- Check if constantBooleanSteps is false => sets jobInfo.constantFalse | constant expression evaluates to false
- [[JOIN-related processing]]
- spanningTreeCheck
- [[combineJobStepsByTable]]
- joinTables
- add JOIN query steps in JOIN order
- [[adjustLastStep]]

@ -0,0 +1 @@
[[addProjectStepsToBps]]

docs/custom-build.md Normal file
@ -0,0 +1,87 @@
How to trigger a custom build
=============================
- Open <https://ci.columnstore.mariadb.net>.
- Click the Continue button to log in via GitHub.
- After you have logged in, select the mariadb/mariadb-columnstore-engine repository. Please note that the recipes below do not work for branches in forked repositories. The branch you want to build against should be in the main engine repository.
- Click the New Build button in the top right corner.
- Fill in the Branch field (the branch you want to build).
- Fill in the desired parameters in key-value style.
Supported parameters with some of their values for the develop/develop-6 branches:
| parameter name | develop | develop-6 |
|--|--|--|
|`SERVER_REF` | 10.9 |10.6-enterprise |
|`SERVER_SHA` | 10.9| 10.6-enterprise |
|`SERVER_REMOTE` | https://github.com/MariaDB/server|https://github.com/mariadb-corporation/MariaDBEnterprise |
|`REGRESSION_REF` |develop |develop-6 |
|`REGRESSION_TESTS` |`test000.sh,test001.sh` |`test000.sh,test001.sh` |
|`BUILD_DELAY_SECONDS` |0 |0 |
|`SMOKE_DELAY_SECONDS` |0 |0 |
|`MTR_DELAY_SECONDS` |0 |0 |
| `REGRESSION_DELAY_SECONDS`|0 |0 |
| `MTR_SUITE_LIST`|`basic,bugfixes` |`basic,bugfixes` |
| `MTR_FULL_SUITE`|`false` |`false` |
The `REGRESSION_TESTS` parameter has an empty value on `cron` (nightly) builds, and it is passed to the build as an argument to the regression script like this:
`./go.sh --tests=${REGRESSION_TESTS}`
So you can set it, for example, to `test000.sh,test001.sh` (a comma-separated list).
Build artifacts (packages and tests results) will be available [here](https://cspkg.s3.amazonaws.com/index.html?prefix=custom/%5D).
Trigger a build against external packages (built by external ci-systems like Jenkins)
=============================
- Start the build just like a regular custom build, but choose the branch `external-packages`.
- Add `EXTERNAL_PACKAGES_URL` variable. For example, if you want to run tests for packages from URL `https://es-repo.mariadb.net/jenkins/ENTERPRISE/bb-10.6.9-5-cs-22.08.1-2/a71ceba3a33888a62ee0a783adab8b34ffc9c046/`, you should set
`EXTERNAL_PACKAGES_URL=https://es-repo.mariadb.net/jenkins/ENTERPRISE/10.6-enterprise-undo/d296529db9a1e31eab398b5c65fc72e33d0d6a8a`.
|parameter name | mandatory |default value |
|--|--|--|
|`EXTERNAL_PACKAGES_URL` | true | |
|`REGRESSION_REF` |false |`develop` |
Get into the live build on mtr/regression steps
===============================================
Prerequisites:
- [docker binary](https://docs.docker.com/engine/install/) (we only need the client, no need to run the docker daemon)
- [drone cli binary](https://docs.drone.io/cli/install/)
- [your personal drone token](https://ci.columnstore.mariadb.net/account)
- run your custom build with the `MTR_DELAY_SECONDS` or `REGRESSION_DELAY_SECONDS` parameter and note the build number.
Build number example:
![](https://lh4.googleusercontent.com/bUXokNezygP7Xx8KqIAYrJEXzFJua6QqP1aDKkr2LTmb3VXASem8MYSzYfB3K3ZySmJTs6ylfh37oYsnFMp0arVT4iNZonJH4kClFlzja_Un89g9n9En6M8kw-VM4VwF3d_ONI18I00Zdsbard1MTmg)
1. Export environment variables:
```Shell
export DRONE_AUTOSCALER=https://autoscaler.columnstore.mariadb.net
export DRONE_SERVER=https://ci.columnstore.mariadb.net
export DRONE_TOKEN=your-personal-token-from-drone-ui-account-page
```
Note: use https://autoscaler-arm.columnstore.mariadb.net as the ARM autoscaler.
2. Run:
```Shell
for i in $(drone server ls); do eval "$(drone server env $i)" && drone server info $i --format="{{ .Name }}" && docker ps --format="{{ .Image }} {{ .Names }}" --filter=name=5107; done
```
Here 5107 is your build number.
You should see output that looks like this:
![](https://lh5.googleusercontent.com/O5gbs6bHH9PnlqP_R-nUkGUM_V98c9s9AvDhEDcNx0R22Wlpka4O1-G7GkdZCJNxzxmsMLn5rlRKcYjRakOgF4FQkVZrCSVYQueaqxaL8-lmQg45Yc6ZOEIUOUZhiXe4YQNid1L3N4YqlDiNjSq4FfE)
3. Run:
```Shell
eval "$(drone server env agent-A4kVtsDU)"
```
4. Run:
```Shell
docker exec -it regression5107 bash
```

docs/doAggProject.md Normal file
@ -0,0 +1,4 @@
Here is a filter that removes all aux columns' ids from JI::returnedColVec.
JI::returnedColVec : map<id, aggOp>, where aggOp == 0 (noAggOp) means a non-aggregate column.
JI::returnedColVec is used in TAS methods.

docs/doProject.md Normal file
@ -0,0 +1,3 @@
- [[Limit and Order By]]
- [[associateTupleJobSteps]]
- [[numberSteps]]

@ -0,0 +1,35 @@
- SessionManagerServer has a number of state flags:
- SS_READY = 1 // Set by dmlProc one time, when dmlProc is ready
- SS_SUSPENDED = 2 // Set by console when the system has been suspended by the user
- SS_SUSPEND_PENDING = 4 // Set by console when the user wants to suspend, but writing is occurring
- SS_SHUTDOWN_PENDING = 8 // Set by console when the user wants to shut down, but writing is occurring
- SS_ROLLBACK = 16 // In combination with a PENDING flag, force a rollback as soon as possible
- SS_FORCE = 32 // In combination with a PENDING flag, force a shutdown without rollback
- SS_QUERY_READY = 64 // Set by PrimProc after the ExeMgr thread is up and running
- The actual state is a combination of flags.
- The state of a running cluster resides in a SessionManagerServer instance attribute inside the controllernode process.
- There is cold storage for the state: a file pointed to by SessionManager.TxnIDFile, /var/lib/columnstore/data1/systemFiles/dbrm/SMTxnID. This cold state is loaded when controllernode starts.
- The following FSM diagram demonstrates some transitions. It is not complete yet!
```mermaid
stateDiagram-v2
ZeroState --> InitState : Controllernode reads cold state
InitState --> InitState_6 : ExeMgr thread starts in PP / SS_QUERY_READY
InitState_6 --> InitState_6_!1 : DMLProc begins rollbackAll / !SS_READY
InitState_6_!1 --> InitState_6_!1 : DMLProc fails rollbackAll /
InitState_6_!1 --> InitState_6_1 : DMLProc finishes rollbackAll successfully / SS_READY
InitState_6_1 --> InitState_16_8_6_1 : cmapi_gets_shutdown
InitState_16_8_6_1 --> ZeroState : see rollback_is_ok
InitState_16_8_6_1 --> InitState_32_16_8_6_1 : see failed_rollback
InitState_32_16_8_6_1 --> ZeroState : force DMLProc shutdown
```
cmapi_gets_shutdown: CMAPI gets a shutdown request with TO / SS_SHUTDOWN_PENDING + SS_ROLLBACK
rollback_is_ok: DMLProc successfully rolls back active txns within the TO and the cluster stops
failed_rollback: DMLProc fails to roll back active txns within the TO

@ -0,0 +1,11 @@
- [[walkTree filters]]
- [[optimizeFilterOrder]]
- [[SubQueries TBD]]
- [[WFS_checkWindowFunction]]
- [[TBD checkAggregation]]
- [[JI::havingStepVec -> querySteps]]
- [[iterate querySteps]]
- Filter VARBINARY from querySteps
- pColStep -> pColScanStep translation; fill in seenTableIds
- [[doAggProject]]
- [[doProject]]

docs/prepAggregate.md Normal file
@ -0,0 +1 @@
creates RowGroups for aggregation