Fix array size allocation for HashAggregate hash keys.

When there were duplicate columns in the hash key list, the array sizes could be miscomputed, resulting in access off the end of the array. Adjust the computation to ensure the array is always large enough. (I considered whether the duplicates could be removed in planning, but I can't rule out the possibility that duplicate columns might have different hash functions assigned. Simpler to just make sure it works at execution time regardless.) Bug apparently introduced in fc4b3dea2 as part of narrowing down the tuples stored in the hashtable. Reported by Colm McHugh of Salesforce, though I didn't use their patch. Backpatch back to version 10 where the bug was introduced. Discussion: https://postgr.es/m/CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
2025-12-10 14:22:35 +03:00 · 2019-05-23 15:26:01 +01:00
parent 156c0c2dff
commit 44e95b5728
3 changed files with 47 additions and 7 deletions
--- a/src/test/regress/expected/aggregates.out
+++ b/src/test/regress/expected/aggregates.out
@@ -2270,3 +2270,21 @@ select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
 ba       |    0 |     1
 (2 rows)

+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+   where (hundred, thousand) in (select twothousand, twothousand from onek);
+                         QUERY PLAN                          
+-------------------------------------------------------------
+ Hash Join
+   Hash Cond: (tenk1.hundred = onek.twothousand)
+   ->  Seq Scan on tenk1
+         Filter: (hundred = thousand)
+   ->  Hash
+         ->  HashAggregate
+               Group Key: onek.twothousand, onek.twothousand
+               ->  Seq Scan on onek
+(8 rows)
+
--- a/src/test/regress/sql/aggregates.sql
+++ b/src/test/regress/sql/aggregates.sql
@@ -988,3 +988,10 @@ select v||'a', case v||'a' when 'aa' then 1 else 0 end, count(*)
 select v||'a', case when v||'a' = 'aa' then 1 else 0 end, count(*)
  from unnest(array['a','b']) u(v)
 group by v||'a' order by 1;
+
+-- Make sure that generation of HashAggregate for uniqification purposes
+-- does not lead to array overflow due to unexpected duplicate hash keys
+-- see CAFeeJoKKu0u+A_A9R9316djW-YW3-+Gtgvy3ju655qRHR3jtdA@mail.gmail.com
+explain (costs off)
+  select 1 from tenk1
+   where (hundred, thousand) in (select twothousand, twothousand from onek);