This patch:
1. Properly processes situation when pm join result count is exceeded.
2. Adds session variable 'columnstore_max_pm_join_result_count` to control the limit.
This patch is the JSON_ARRAYAGG clone of the changes done in MCOL-5429
where we enabled usage of StringStore for long strings in
GROUP_CONCAT() processing to reduce memory footprint of PrimProc and
thus avoiding a potential OS triggered OOM crash.
1. Input and output RowGroup's used in GROUP_CONCAT classes
are currently allocating a raw memory buffer of size equal
to the actual width of the string datatype. As an example,
for the following query:
SELECT col1, GROUP_CONCAT(col2) FROM t GROUP BY col1;
If col2 is a TEXT field with default width, the input
RowGroup containing the target rows to be concatenated will
assign 64kb of memory for every input row in the RowGroup.
This is wasteful as actual field values in real workloads
would be much smaller. We fix this by enabling the
RowGroup to use the StringStore when the RowGroup contains
long strings.
2. RowAggregation::initialize() allocates a memory buffer
for a NULL row. The size of this buffer is equal to the
row size for the output RowGroup. For the above scenario,
using the default group_concat_max_len (which is a server
variable that sets the maximum length of the GROUP_CONCAT string)
value of 1mb, the buffer size would be
(1mb + 64kb + some additional metadata). If the user sets
group_concat_max_len to a higher value, say 3gb, this buffer
size would be ~3gb. Now if the runtime initiates several
instances of RowAggregation, total memory consumption by
PrimProc could exceed the hardware memory limits causing the
OS OOM to kill the process. We fix this problem by again
enabling the StringStore for the NULL row allocation.
3. In the plugin code in buildAggregateColumn(), there is
an integer overflow when the server group_concat_max_len
variable (which is an uint32_t) is set to a value > INT32_MAX
(such as 3gb) and is assigned to
CalpontSystemCatalog::ColType::colWidth (which is an int32_t).
As a short term fix, we saturate the assigned value to colWidth
to INT32_MAX. Proper fix would be to upgrade
CalpontSystemCatalog::ColType::colWidth to an uint32_t.
1. In TupleUnion::writeNull(), add the missing switch case for
wide decimal with 16bytes column width.
2. MCOL-5432 Disable complete/partial pushdown of UNION operation
if the query involves an ORDER BY or a LIMIT clause, until
MCOL-5222 is fixed. Also add MTR test cases for this.
When a UNION operation involving DECIMAL datatypes with scale and digits
before the decimal exceeds the currently supported maximum precision
of 38, we throw an error to the user:
"MCS-2060: Union operation exceeds maximum DECIMAL precision of 38".
This is until MCOL-5417 is implemented where ColumnStore will have
full parity with MariaDB server in terms of maximum supported DECIMAL
precision and scale of 65 and 38 digits respectively.
DMLProc starts ROLLBACK when SELECT part of UPDATE fails b/c EM facility in PP were restarted.
Unfortunately this ROLLBACK stuck if EM/PP are not yet available.
DMLProc must have a t/o with re-try doing ROLLBACK.
Some changes made to 10.6-enterprise make a build using the out-of-band method of compiling columnstore not work. Out-of band means the source for the engine is not in the storage subdir of server, but rather in a stand alone directory. This is used by developers for easier develop work. In the case of out-of-band, INSTALL_LAYOUT is false in CMakeLists.txt
boost::uniqie_lock dtor calls a fancy unlock logic that throws twice.
First if the mutex is 0 and second lock doesn't own the mutex.
The first condition failure causes unhandled exception for one of the clients
in DEC::writeToClient(). I was unable to find out why Linux can have a 0
mutex and replaced boost::mutex with std::mutex b/c stdlibc++ should
be more stable comparing with boost.
ColumnStore used to include server's mysql.h
but link all tools with libmariadb.so
There's no guarantee that this would work, even with workarounds
it had in dbcon/mysql/sm.cpp
Fix:
* tools (linked with libmariadb.so) *must* include libmariadb's mysql.h
* as a hack prevent service_thd_timezone.h from being loaded into tools,
as it conflicts with libmariadb's mysql.h
* server plugin *must* include server's mysql.h
* also don't link every tool with libmariadb.so, link the helper library
(liblibmysqlclient.so) that actually needs it, tools use this
helper library, not libmariadb.so directly
This is attempt to make some part of the code more stable.
For some reason we can get a spurious nullptr for boost::shared_ptr
which cause an assert and abort.
* MCOL-5279 Send ByteStream with Primitive message to remote PPs first
and to the local PP last
MCOL-5166 forces EM to interact with PP over a messaging queue when they are in the same process. When ByteStream is taken from the queue it is drained so it is impossible to re-use the same ByteStream to send to the other nodes.
* MCOL-5279 Fix a corner case when sending PP messages
Multi-node can break when flow control is used to backpressure PPs that overflows EM with Primitive messages.
and to the local PP last
MCOL-5166 forces EM to interact with PP over a messaging queue when they are in the same process. When ByteStream is taken from the queue it is drained so it is impossible to re-use the same ByteStream to send to the other nodes.
MCOL-5166 forces EM to interact with PP over a messaging queue when they are in the same process. When ByteStream is taken from the queue it is drained so it is impossible to re-use the same ByteStream to send to the other nodes.
ARM platforms. This patch resolves this issue and unifies AUX column
processing at x86 and ARM using tempate class SimdProcessor.
The patch also replaces uint16_t mask previously used in column.cpp and
SimProcessor code with a native masks that platform uses, e.g. __m128i
or __m128 on x86 and variety of masks on ARM.
To unify the processing I introduced a new filtering Compare Operator - COMPARE_NULLEQ.
with a 'c1 IS NULL semantics'.
This patch improves the runtime performance of UNION processing in CS, as reported JIRA issue MCOL 4590. The idea of the optimization is to infer the normalize seperate functions beforehand and perform the normalization individually later, instead of a huge switch body of all normalization. This patch also cover engineering optimization, removing the hotspots in UNION processing. After application of this patch, the normalize part takes only about 25% of the whole UNION query in our experiment avg case.
Signed-off-by: Jigao Luo <luojigao@outlook.com>