HWM for DML and API was being calculated using the first column in a
table instead of the smallest column. This shifts the calculation to the
correct column.
It turns out -c wasn't actually connected to anything and now with have
BLOB/TEXT it is pretty useful. If -c is set to < 1MB then 1MB is used,
otherwise it will use the selected buffer size.
When the API inserts data into ColumnStore which will roll over into a
new extent that data wasn't being put into the new extent and corruption
occured. This patch now tracks the additional data and inserts it into
the new extent. It also makes sure the LBIDs are stored so that they are
correctly committed.
cpimport would truncate UTF8 data half way through a character which
would cause problems for functions using that data. This patch
calculates the correct truncation point when inserting the data.
The signatures were getting destroyed whilst processing before being
used in the dictionary de-duplication cache making the cache full of
empty strings. This fix resets the signature after insertion for the
cache.
This fix also lets the signature cache be read when there is 1000
entries in it.
The rowID and therefore HWM for an insert was being calculated based on
the first column. If there are smaller columns in the table these will
insert in the middle of blocks instead of creating new blocks. This means
that we would no longer be crash safe and PrimProc gets very confused until
a cache flush.
We now use the smallest column to calculate the rowID and HWM increment
(as cpimport does).
With 1.1 we have removed libdrizzle and used MariaDB's client library
instead for both CrossEngine and QueryStats. Unfortunately MariaDB 10.2
has two client libraries which have different structs with the same
name. When QueryStats was running inside the ColumnStore plugin this
symbol conflict was causing a crash.
The server's built-in client API has several different and several
missing functions so some additions to sm.cpp were made to fill the
gaps.
This patch does the following:
* Make sure that libmariadb is only linked to executables, not the
ColumnStore Plugin (to avoid symbol conflicts). Note that all
executables that link to CrossEngine and/or QueryStats need to link to
libmariadb to avoid missing symbol issues.
* Use the server's built-in client API for QueryStats when run in the
plugin
* Replace missing server built-in client API calls in sm.cpp (this is
for QueryStats and CrossEngine to keep the dynamic linker happy)
* Fixes issue where using 'localhost' as the MariaDB Server hostname
would fail in QueryStats.
* 64KB TEXT column had off-by-one length pointer counting
* TEXT I_S/LDI was looping where it shouldn't causing pointer issues
* TEXT data type wasn't fully understood by cpimport
* TEXT and BLOB now have separate identifiers internally
* TEXT columns are identified as such in system catalog
* cpimport only requires hex input for BLOB, not TEXT
* DML writes for multi-block dictionary (blob) now works
* PrimProc fixed so that the first block in multi-block is read
correctly
* Performance optimisation (removed string copy into stack) for new
dictionary entries
The end of a cpimport adds a delay which is 1 second multiplied by the
number of PMs. This in-turn is multipled up with the number of
connections/threads causing a long delay when you have several PMs.
The delay is not required as we already have a 20ms delay and we will
re-enter the recv() if there is no data and we are still connected.
The end of a cpimport adds a delay which is 1 second multiplied by the
number of PMs. This in-turn is multipled up with the number of
connections/threads causing a long delay when you have several PMs.
The delay is not required as we already have a 20ms delay and we will
re-enter the recv() if there is no data and we are still connected.
This does the following:
* Switch resource manager to a singleton which reduces the amount of
times the XML data is scanned and objects allocated.
* Make the I_S tables use the FE implementation of the system catalog
* Make the I_S.columnstore_columns table use the RID list cache
* Make the extentmap pre-allocate a vector instead of many small allocs