Analysis:
There were two problems that needed to be fixed:
1) The crash itself.
2) After fixing the crash, the result was wrong.
Reason for crash: When we pass the hash to get_intersect_between_arrays(),
we were initially not passing it by reference, so the operations were not
performed on the correct hash.
Reason for wrong result: The number of rows it returned was the same as in
the table, but only the first row had correct output; the rest were NULL
(they should also contain the result of the intersection). This was because
we modified the "items" HASH by deleting the "seen" elements, so for the
following rows the hash no longer had the elements it should have.
Fix:
1) To fix the crash: pass the HASH by reference.
2) To fix the incorrect result: maintain a separate "seen" hash. If an item
is found in the "items" hash, delete it only temporarily and put it in the
"seen" hash. At the end, put the items from "seen" back into "items" and
reset "seen" (a sketch of this pattern follows below).
called with empty json arrays, UBSAN runtime error: member access within
null pointer of type 'struct String' in
Item_func_json_array_intersect::prepare_json_and_create_hash
Analysis:
Arguments are not initialized.
Fix:
If the arguments are not initialized then val_json() returns NULL, so
return NULL if val_json() returns NULL for either of the arguments.
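A sketch of that guard with simplified stand-ins (the pointer-returning
val_json() and the signature below are assumptions for illustration, not the
server's actual API):

  #include <cstdio>
  #include <string>

  static std::string *val_json(std::string *arg) { return arg; }  // stand-in

  // Returns true and flags SQL NULL when either argument has no JSON value.
  static bool prepare_json_and_create_hash(std::string *arg1, std::string *arg2,
                                           bool &null_value)
  {
    std::string *j1= val_json(arg1);
    std::string *j2= val_json(arg2);
    if (!j1 || !j2)                 // bail out before touching a null String
    {
      null_value= true;
      return true;
    }
    null_value= false;
    // ... build the hash from *j1 and *j2 ...
    return false;
  }

  int main()
  {
    std::string a= "[1,2]";
    bool null_value;
    std::printf("%d\n", (int) prepare_json_and_create_hash(&a, nullptr, null_value));
    return 0;
  }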
Regression from MDEV-36765 / 2b24ed87f0.
json_unescape can return a string of length 0 without it being an error.
The regression caused this zero-length empty string to be treated as an
error, resulting in a NULL return value.
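A sketch of the distinction, assuming the usual convention that
json_unescape() returns the unescaped length and signals an error with a
negative value (the stub below stands in for the real function):

  #include <cassert>

  static int json_unescape_stub(const char *in, int len, char *out)
  {
    (void) in; (void) out;
    return len;            // "" unescapes to length 0, which is a valid result
  }

  int main()
  {
    char buf[8];
    int len= json_unescape_stub("", 0, buf);
    bool regression_error= (len <= 0);  // empty string wrongly treated as error
    bool fixed_error=      (len < 0);   // only a negative length is an error
    assert(regression_error && !fixed_error);
    return 0;
  }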
numerous bugs in JSON_DETAILED and multibyte charsets:
* String::chop() must be charset-aware and not simply length--
* String::append(char) must be charset-aware and not simply length++
  (both sketched after this list)
* json_nice() first removes value_len bytes, then a
certain number of characters
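A standalone sketch of why the first two points matter with multibyte
charsets; the UTF-8 helper below is a hand-rolled illustration, not the
server's CHARSET_INFO machinery:

  #include <iostream>
  #include <string>

  // Byte length of the last character of a valid UTF-8 string.
  static size_t last_char_bytes(const std::string &s)
  {
    size_t n= 1;
    while (n < s.size() &&
           (static_cast<unsigned char>(s[s.size() - n]) & 0xC0) == 0x80)
      n++;                               // step over continuation bytes
    return n;
  }

  int main()
  {
    std::string s= "\u4fa1";             // one character, three bytes in UTF-8
    // Wrong chop(): s.resize(s.size() - 1) leaves a broken 2-byte tail behind.
    // Charset-aware chop(): remove the whole last character.
    s.resize(s.size() - last_char_bytes(s));
    std::cout << s.size() << '\n';       // 0
    return 0;
  }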
Analysis:
The value gets appended as a raw string instead of as the unescaped JSON
value.
Fix:
Append the JSON value to a temporary string first and then store that in
the field, instead of directly storing it as a string.
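A sketch of the fix, with a toy unescape helper standing in for the real JSON
unescaping (names and behaviour here are illustrative only):

  #include <iostream>
  #include <string>

  // Toy stand-in: strip the surrounding quotes and resolve \n; the real code
  // would use the JSON library's unescaping.
  static std::string unescape_json_string(const std::string &raw)
  {
    std::string out;
    for (size_t i= 1; i + 1 < raw.size(); i++)
    {
      if (raw[i] == '\\' && i + 2 < raw.size())
        out+= (raw[++i] == 'n') ? '\n' : raw[i];
      else
        out+= raw[i];
    }
    return out;
  }

  int main()
  {
    std::string raw= "\"a\\nb\"";        // the scanned JSON string value
    // Wrong: store raw in the field -> the escaped text ends up in the result.
    // Fix: unescape into a temporary string first, then store that.
    std::string tmp= unescape_json_string(raw);
    std::cout << tmp << '\n';
    return 0;
  }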
non-default collation_connection
Analysis:
Due to the different collation, the string has nothing to chop off.
Fix:
Got rid of chop(); append " ," only when we have more elements to
add to the result.
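A plain std::string sketch of that separator logic (the exact separator text
follows the pretty-printer's formatting; the point is that the separator is
appended only between elements, so there is never a trailing one to chop off):

  #include <iostream>
  #include <string>
  #include <vector>

  int main()
  {
    std::vector<std::string> elems= {"1", "2", "3"};
    std::string out= "[";
    for (size_t i= 0; i < elems.size(); i++)
    {
      if (i)                     // more elements already added: separator first
        out+= ", ";
      out+= elems[i];
    }
    out+= "]";
    std::cout << out << '\n';    // [1, 2, 3]
    return 0;
  }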
Analysis:
When we scan the json to get to a beginning according to the path, we end
up scanning the json even if we have exhausted it, which eventually returns
an error.
Fix:
Continue scanning the json only if we have not exhausted it, and return the
result accordingly.
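A control-flow sketch with illustrative names only: the scanner is not called
again once the document is exhausted, so a normal end-of-document no longer
turns into an error:

  #include <cstdio>
  #include <cstring>

  struct scanner { const char *pos, *end; };

  // Returns 1 while more input remains, 0 at end of document, -1 on error.
  static int scan_next(scanner &s)
  {
    if (s.pos >= s.end)
      return -1;                 // scanning an exhausted document is an error
    s.pos++;
    return s.pos < s.end ? 1 : 0;
  }

  int main()
  {
    const char *doc= "[1]";
    scanner s= { doc, doc + std::strlen(doc) };
    int r;
    do                           // continue only while not exhausted
      r= scan_next(s);
    while (r == 1);
    std::printf("%s\n", r < 0 ? "error" : "ok");   // ok
    return 0;
  }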
Analysis:
When scanning the json and getting the exact path at each step, if a path
is reached we end up adding the item to the result and immediately getting
the next item, which changes the current path.
Fix:
Instead of immediately returning the item, count the occurrences of the
path in the argument and append to the result as needed.
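A sketch of the idea with made-up data structures: collect the matches for
the wanted path while scanning, and build the result only after the scan,
instead of appending while the scanner's current path is about to change:

  #include <iostream>
  #include <string>
  #include <utility>
  #include <vector>

  int main()
  {
    // Pretend these (path, value) pairs are produced by scanning the json.
    std::vector<std::pair<std::string, std::string>> scan=
        {{"$.a", "1"}, {"$.b", "2"}, {"$.a", "3"}};
    const std::string wanted= "$.a";

    std::vector<std::string> matches;          // occurrences of the path
    for (const auto &pv : scan)
      if (pv.first == wanted)
        matches.push_back(pv.second);

    std::string result= "[";                   // append only after the scan
    for (size_t i= 0; i < matches.size(); i++)
      result+= (i ? ", " : "") + matches[i];
    result+= "]";
    std::cout << result << '\n';               // [1, 3]
    return 0;
  }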
(returns NULL) and for Date/DateTime returns "INTEGER"
Analysis:
When the first character of the json is scanned and it is a number,
"INTEGER" is returned based on that alone.
Fix:
Scan the rest of the json before returning the final result, to ensure the
json is valid in the first place and therefore has a valid type.
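A sketch of that ordering, with a trivial validator standing in for the real
JSON parser (names and the grammar check are simplified assumptions):

  #include <cctype>
  #include <cstdio>

  static bool rest_is_valid_number(const char *s)   // toy validator
  {
    while (std::isdigit(static_cast<unsigned char>(*s)))
      s++;
    return *s == '\0';
  }

  static const char *json_type(const char *doc)
  {
    if (!std::isdigit(static_cast<unsigned char>(*doc)))
      return "OTHER";                    // non-number cases elided
    // Wrong: return "INTEGER" here just because the first character is a digit.
    // Fix: scan the rest of the document first; invalid json has no type.
    return rest_is_valid_number(doc) ? "INTEGER" : nullptr;
  }

  int main()
  {
    std::printf("%s\n", json_type("123")   ? json_type("123")   : "NULL");  // INTEGER
    std::printf("%s\n", json_type("12:34") ? json_type("12:34") : "NULL");  // NULL
    return 0;
  }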
Some fixes related to commit f838b2d799 and
Rows_log_event::do_apply_event() and Update_rows_log_event::do_exec_row()
for system-versioned tables were provided by Nikita Malyavin.
This was required by test versioning.rpl,trx_id,row.
Modify the NS_ZERO state in the JSON number parser to allow
exponential notation with a zero coefficient (e.g. 0E-4).
The NS_ZERO state transition on 'E' was updated to move to the
NS_EX state rather than returning a syntax error. A similar change
was made for the NS_ZE1 (negative zero) starter state.
This allows the accepted number grammar to include cases like:
- 0E4
- -0E-10
which were previously disallowed. Numeric parsing remains
the same for all other states.
Test cases are added to func_json.test to validate parsing for
various exponential numbers starting with zero coefficients.
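A minimal, self-contained state machine illustrating the described
transition; the states follow the names above, but the table is a simplified
stand-in for the real parser (fractions are omitted and the NS_ZE1 case is
folded into the minus handling):

  #include <cctype>
  #include <cstdio>

  enum num_state { NS_BEG, NS_MINUS, NS_ZERO, NS_NUM, NS_EX, NS_EXSIGN,
                   NS_EXNUM, NS_ERR };

  static num_state step(num_state s, char c)
  {
    bool d= std::isdigit(static_cast<unsigned char>(c)) != 0;
    switch (s)
    {
    case NS_BEG:    return c == '-' ? NS_MINUS : c == '0' ? NS_ZERO :
                           d ? NS_NUM : NS_ERR;
    case NS_MINUS:  return c == '0' ? NS_ZERO : d ? NS_NUM : NS_ERR;
    case NS_ZERO:   // previously a syntax error on 'E'; now move to NS_EX
                    return (c == 'E' || c == 'e') ? NS_EX : NS_ERR;
    case NS_NUM:    return d ? NS_NUM : (c == 'E' || c == 'e') ? NS_EX : NS_ERR;
    case NS_EX:     return (c == '-' || c == '+') ? NS_EXSIGN :
                           d ? NS_EXNUM : NS_ERR;
    case NS_EXSIGN: return d ? NS_EXNUM : NS_ERR;
    case NS_EXNUM:  return d ? NS_EXNUM : NS_ERR;
    default:        return NS_ERR;
    }
  }

  static bool parse_number(const char *s)
  {
    num_state st= NS_BEG;
    for (; *s && st != NS_ERR; s++)
      st= step(st, *s);
    return st == NS_ZERO || st == NS_NUM || st == NS_EXNUM;
  }

  int main()
  {
    std::printf("0E4    -> %d\n", parse_number("0E4"));     // 1: now accepted
    std::printf("-0E-10 -> %d\n", parse_number("-0E-10"));  // 1: now accepted
    std::printf("0E     -> %d\n", parse_number("0E"));      // 0: still rejected
    return 0;
  }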
All new code of the whole pull request, including one or several files
that are either new files or modified ones, are contributed under the
BSD-new license. I am contributing on behalf of my employer Amazon Web
Services.