1
0
mirror of https://github.com/sqlite/sqlite.git synced 2025-07-29 08:01:23 +03:00
Commit Graph

14 Commits

Author SHA1 Message Date
dan
0cd2ffffb7 Fix the fts5 trigram tokenizer so that it handles non-nul-terminated strings.
FossilOrigin-Name: 84f4e37178a65e3128ac0240d37ac40df08b4050ab070d10707e35d11dcbeb10
2024-11-11 19:49:26 +00:00
dan
59b4f75e0f Add test case for fts5 trigram tokenizer.
FossilOrigin-Name: ba358d265b7ee360d62b5219faaa1010ea90dac4e20cc7adc3ebd46161a65f94
2024-10-26 18:09:13 +00:00
dan
ef2401f669 Tests to improve coverage of fts5_expr.c.
FossilOrigin-Name: f4b839e5265700b1a89066d1b6e0d0d010852a69c5da3d75d2c41624dbf3c0af
2024-08-17 19:07:13 +00:00
dan
b651084713 Add tests to restore coverage of fts5_tokenizer.c.
FossilOrigin-Name: 8f9257361b05e368bf433e56d0698923b0f97d12e7c0ad7760aaab6746c0e467
2024-08-17 17:22:49 +00:00
dan
189c41221d Further tests and fixes for this branch.
FossilOrigin-Name: d27985245a0e8c0d6b04323c98b26b6a8fb4e489fa8f5f3234252c7c198f23c8
2024-08-15 18:50:13 +00:00
drh
e9b919d550 Improved robustness of parsing of tokenize= arguments in FTS5.
[forum:/forumpost/171bcc2bcd|Forum post 171bcc2bcd].

FossilOrigin-Name: d9f726ade6b258f8723f90d0b04a4682e885e30939eb29773913e4dfc8e85503
2024-08-06 22:49:01 +00:00
dan
d548f74024 Fix a problem with the fts5 highlight() and snippet() functions when used with tokenizers like "trigram" that output overlapping tokens. Forum post [forum:/forumpost/63735293ec|63735293ec].
FossilOrigin-Name: d570aa02f79b1d7d3889e33f9eebab1b7edcf5231b1357451eed9a538607de54
2023-10-24 15:53:02 +00:00
dan
a3e6192941 Fix a problem with the fts5 trigram tokenizer and LIKE or GLOB patterns for which contain runs of 2 or fewer non-wildcard characters that are 3 or more bytes when encoded as utf-8.
FossilOrigin-Name: 00714b39b39c51519edbc0194f98c7275fecf96763a06fd95db6e1d81bb9f1f1
2023-02-10 17:17:04 +00:00
drh
8210233c7b Revise tests cases to align with the new EXPLAIN QUERY PLAN output.
FossilOrigin-Name: 50fbd532602d2c316813046ed6be8be2991c281eb5f295c4c28520a0de73862c
2021-03-20 15:11:29 +00:00
dan
f46be6a1b9 Allow fts5 trigram tables created with detail=column or detail=none to optimize LIKE and GLOB queries. Allow case-insensitive tables to optimize GLOB as well as LIKE.
FossilOrigin-Name: 64782463be62b72b5cd0bfaa7c9b69aa487d807c5fe0e65a272080b7739fd21b
2020-10-05 16:41:56 +00:00
dan
12a6a1eaf9 Fix a segfault caused by running "column LIKE NULL" against an fts5 table using the trigram tokenizer. Fix for [e33ee62575fc22].
FossilOrigin-Name: 6e72a08de764077f2bba6f7e3b99ea29001941671a971f2ccf7ceeb9c682fb1a
2020-10-03 17:06:02 +00:00
dan
95dca8d0cf FTS5 does not handle tokens that contain embedded nul characters. Prevent the trigram tokenizer from returning such tokens. Fix for [2ba5930b2].
FossilOrigin-Name: b1d048748c054575425a4bebf0c5d09962f9329d5ce6a978cf54e508b238584c
2020-10-03 14:36:06 +00:00
dan
ccf578d435 Add tests for the trigram tokenizer. Fix minor issues.
FossilOrigin-Name: 897ced99b44085012aa44d3264940dcbd4c77b295a894a1b58fb2c03a0f7fee8
2020-10-01 16:10:22 +00:00
dan
33a99fad08 Add experimental unicode-aware trigram tokenizer to fts5. And support for LIKE and GLOB optimizations for fts5 tables that use said tokenizer.
FossilOrigin-Name: 0d7810c1aea93c0a3da1ccc4911dbce8a1b6e1dbfe1ab7e800289a0c783b5985
2020-09-30 20:35:37 +00:00