This bug was caused by a patch for the task MDEV-32733.
Incorrect memory root was used for allocation of memory
pointed by the data memebr Item_func_json_contains_path::p_found.
This is the follow-up patch that removes explicit use of thd->stmt_arena
for memory allocation and replaces it with call of the method
THD::active_stmt_arena_to_use()
Additionally, this patch adds extra DBUG_ASSERT to check that right
query arena is in use.
The tests main.func_json and json.json_no_table fail on server built with
the option -DWITH_PROTECT_STATEMENT_MEMROOT=YES by the reason that a memory
is allocated on the statement's memory root on the second execution of
a query that uses the function json_contains_path().
The reason that a memory is allocated on second execution of a prepared
statement that ivokes the function json_contains_path() is that a memory
allocated on every call of the method Item_json_str_multipath::fix_fields
To fix the issue, memory allocation should be done only once on first
call of the method Item_json_str_multipath::fix_fields. Simmilar issue
take place inside the method Item_func_json_contains_path::fix_fields.
Both methods are modified to make memory allocation only once on its
first execution and later re-use the allocated memory.
Before this patch the memory referenced by the pointers stored in the array
tmp_paths were released by the method Item_func_json_contains_path::cleanup
that is called on finishing execution of a prepared statement. Now that
memory allocation performed inside the method Item_json_str_multipath::fix_fields
is done only once, the item clean up has degenerate form and can be
delegated to the cleanup() method of the base class and memory deallocation
can be performed in the destructor.
from Item::val_json, UBSAN: member access within null pointer of
type 'struct String' in sql/item_jsonfunc.cc
Analysis:
The first argument of json_schema_valid() needs to be a constant.
Fix:
Parse the schema if the item is constant otherwise set it to return null.
data from a table similar to other JSON functions
Analysis:
Since we are fetching values for every row ( because we are running SELECT
for all rows of a table ), correct value can be only obtained at the time of
calling val_int() because it is called to get value for each row.
Fix:
Set up hash for each row instead of doing it during fixing fields.
* invoke parent's cleanup()
* don't reinit memroot, if already inited (causes memory leak)
also move free_root() from destructor to cleanup() to not accumulate
allocations from prepare and multiple executes
The idea is to have simple functions that the user can combine to produce
the exact result one wants, whether the user wants JSON object that has
common keys with another JSON object, or same key/value pair etc. So
making simpler function helps here.
We accomplish this by making three separate functions.
1) JSON_OBJECT_FILTER_KEYS(Obj, Arr_keys):
Put keys ( which are basically strings ) in hash, go over the object and
get key one by one. If the key is present in the hash,
add the key-value pair to result.
2) JSON_OBJECT_TO_ARRAY(Obj) : Create a string variable, Go over the json
object, and add each key value pair as an array into the result.
3) JSON_ARRAY_INTERSECT(arr1, arr2) :
Go over one of the json and add each item of the array
in hash (after normalizing each item). Go over the second array,
search the normalized item one by one in the hash. If item is found,
add it to the result.
Implementation Idea: Holyfoot ( Alexey Botchkov)
Author: tanruixiang and Rucha Deodhar
objects
Idea behind implementation:
We get the json object specified by the json path. Then, transform it into
key-value pairs by going over the json. Get each key-value pair
one-by-one and return the result.
Analysis: null_value is not set if any one of the arguments is NULL. So it
returns 1.
Fix: when either argument is NULL, set null_value to true, so that null can
be returned
starts with '['
Analysis:
When type is non-scalar and the json document has syntax error
then it is not detected during validating type. And Since other validate
functions take const argument, the error state is not stored eventually.
Fix:
After we run out of all schemas (in case of no error during validation) from
the schema list, go over the json document until there is error in parsing
or json doc has ended.
input json schema
Analysis: In case of syntax error while scanning json schema, true is
returned inspite of it being wanring and not error.
Fix: return true instead of false.
Implementation:
Implementation is made according to json schema validation draft 2020
JSON schema basically has same structure as that of json object, consisting
of key-value pairs. So it can be parsed in the same manner as
any json object.
However, none of the keywords are mandatory, so making guess about the
json value type based only on the keywords would be incorrect.
Hence we need separate objects denoting each keyword.
So during create_object_and_handle_keyword() we create appropriate objects
based on the keywords and validate each of them individually on the json
document by calling respective validate() function if the type matches.
If any of them fails, return false, else return true.
Analysis:
When we skip level when path is found, it changes the state of the json
engine. This breaks the sequence for json_get_path_next() which is called at
the end to ensure json document is valid and leads to crash.
Fix:
Use json_scan_next() at the end to check if json document has correct
syntax (is valid).
Analysis:
Parsing json path happens only once. When paring, we set types of path
(types_used) to use later. If the path type has range or wild card, only
then multiple values get added to the result set.
However for each row in the table, types_used still gets
overwritten to default (no multiple values) and is also not set again
(because path is already parsed). Since multiple values depend on the
type of path, they dont get added to result set either.
Fix:
set default for types_used only if path is not parsed already.
Analysis: The JSON functions(JSON_ARRAY[OBJECT|ARRAY_APPEND|ARRAY_INSERT|INSERT|SET|REPLACE]) result is truncated when the function is called based on LONGTEXT field. The overflow occurs when computing the result length due to the LONGTEXT max length is same as uint32 max length. It lead to wrong result length.
Fix: Add static_cast<ulonglong> to avoid uint32 overflow and fix the arguments used.
Analysis: JSON_VALUE() returns "null" string instead of NULL pointer.
Fix: When the type is JSON_VALUE_NULL (which is also a scalar) set
null_value to true and return 0 instead of returning string.
Analysis: JSON_OVERLAPS() does not check nested key-value pair completely.
If there is nested object, then it only scans and validates if two json values
overlap until one of the value (which is of type object) is exhausted.
This does not really check if the two values of keys are exacly the same, instead
it only checks if key-value pair of one is present in key-value pair of the
other
Fix: Normalize the values (which are of type object) and compare
using string compare. This will validate if two values
are exactly the same.