Fix behavior of ~> (cube, int) operator

~> (cube, int) operator was especially designed for knn-gist search. However, it appears that knn-gist search can't work correctly with current behavior of this operator when dataset contains cubes of variable dimensionality. In this case, the same value of second operator argument can point to different dimension depending on dimensionality of particular cube. Such behavior is incompatible with gist indexing of cubes, and knn-gist doesn't work correctly for it. This patch changes behavior of ~> (cube, int) operator by introducing dimension numbering where value of second argument unambiguously identifies number of dimension. With new behavior, this operator can be correctly supported by knn-gist. Relevant changes to cube operator class are also included. Backpatch to v9.6 where operator was introduced. Since behavior of ~> (cube, int) operator is changed, depending entities must be refreshed after upgrade. Such as, expression indexes using this operator must be reindexed, materialized views must be rebuilt, stored procedures and client code must be revised to correctly use new behavior. That should be mentioned in release notes. Noticed by: Tomas Vondra Author: Alexander Korotkov Reviewed by: Tomas Vondra, Andrey Borodin Discussion: https://www.postgresql.org/message-id/flat/a9657f6a-b497-36ff-e56-482a2c7e3292@2ndquadrant.com
2025-11-03 09:13:20 +03:00 · 2018-01-11 14:43:13 +03:00
parent 1226051948
commit bda5281fdc
7 changed files with 901 additions and 527 deletions
--- a/contrib/cube/sql/cube.sql
+++ b/contrib/cube/sql/cube.sql
@@ -372,20 +372,29 @@ SELECT * FROM test_cube WHERE c && '(3000,1000),(0,0)' ORDER BY c;
 -- Test sorting
 SELECT * FROM test_cube WHERE c && '(3000,1000),(0,0)' GROUP BY c ORDER BY c;

-- kNN with index
+-- Test kNN
+INSERT INTO test_cube VALUES ('(1,1)'), ('(100000)'), ('(0, 100000)'); -- Some corner cases
+SET enable_seqscan = false;
+
+-- Test different metrics
 SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
 SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
 SELECT *, c <#> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <#> '(100, 100),(500, 500)'::cube LIMIT 5;

-- kNN-based sorting
-SELECT * FROM test_cube ORDER BY c~>1 LIMIT 15; -- ascending by 1st coordinate of lower left corner
-SELECT * FROM test_cube ORDER BY c~>4 LIMIT 15; -- ascending by 2nd coordinate or upper right corner
-SELECT * FROM test_cube ORDER BY c~>1 DESC LIMIT 15; -- descending by 1st coordinate of lower left corner
-SELECT * FROM test_cube ORDER BY c~>4 DESC LIMIT 15; -- descending by 2nd coordinate or upper right corner
+-- Test sorting by coordinates
+SELECT c~>1, c FROM test_cube ORDER BY c~>1 LIMIT 15; -- ascending by left bound
+SELECT c~>2, c FROM test_cube ORDER BY c~>2 LIMIT 15; -- ascending by right bound
+SELECT c~>3, c FROM test_cube ORDER BY c~>3 LIMIT 15; -- ascending by lower bound
+SELECT c~>4, c FROM test_cube ORDER BY c~>4 LIMIT 15; -- ascending by upper bound

-- same thing for index with points
-CREATE TABLE test_point(c cube);
-INSERT INTO test_point(SELECT cube(array[c->1,c->2,c->3,c->4]) FROM test_cube);
-CREATE INDEX ON test_point USING gist(c);
-SELECT * FROM test_point ORDER BY c~>1, c~>2 LIMIT 15; -- ascending by 1st then by 2nd coordinate
-SELECT * FROM test_point ORDER BY c~>4 DESC LIMIT 15; -- descending by 1st coordinate
+-- Same queries with sequential scan (should give the same results as above)
+RESET enable_seqscan;
+SET enable_indexscan = OFF;
+SELECT *, c <-> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <-> '(100, 100),(500, 500)'::cube LIMIT 5;
+SELECT *, c <=> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <=> '(100, 100),(500, 500)'::cube LIMIT 5;
+SELECT *, c <#> '(100, 100),(500, 500)'::cube as dist FROM test_cube ORDER BY c <#> '(100, 100),(500, 500)'::cube LIMIT 5;
+SELECT c~>1, c FROM test_cube ORDER BY c~>1 LIMIT 15; -- ascending by left bound
+SELECT c~>2, c FROM test_cube ORDER BY c~>2 LIMIT 15; -- ascending by right bound
+SELECT c~>3, c FROM test_cube ORDER BY c~>3 LIMIT 15; -- ascending by lower bound
+SELECT c~>4, c FROM test_cube ORDER BY c~>4 LIMIT 15; -- ascending by upper bound
+RESET enable_indexscan;