Add levenshtein_less_equal, optimized version for small distances.

Alexander Korotkov, heavily revised by me.
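
The contract of the new function can be illustrated with a small sketch. This is not the C code added by this commit. It is a plain dynamic-programming Levenshtein in Python with an early exit, written on the assumption that all costs are non-negative. The choice to return exactly max_d + 1 once the bound is exceeded is also an assumption; the documentation only promises some value greater than max_d.

```python
def levenshtein_less_equal(source, target, max_d,
                           ins_cost=1, del_cost=1, sub_cost=1):
    """Illustrative sketch only; assumes non-negative costs.

    Returns the exact distance when it is <= max_d, otherwise max_d + 1
    (any value greater than max_d would satisfy the documented contract).
    """
    # Row 0: cost of building each prefix of target by insertions alone.
    prev = [j * ins_cost for j in range(len(target) + 1)]
    for i, s_ch in enumerate(source, start=1):
        # Column 0: cost of deleting the first i characters of source.
        curr = [i * del_cost]
        for j, t_ch in enumerate(target, start=1):
            curr.append(min(
                prev[j] + del_cost,                               # delete s_ch
                curr[j - 1] + ins_cost,                           # insert t_ch
                prev[j - 1] + (0 if s_ch == t_ch else sub_cost),  # match/substitute
            ))
        # Every edit path crosses this row and costs never decrease, so the
        # row minimum is a lower bound on the final distance.
        if min(curr) > max_d:
            return max_d + 1
        prev = curr
    return prev[-1] if prev[-1] <= max_d else max_d + 1
```

With the strings used in the documentation examples, this sketch returns 3 for a bound of 2 (the true distance, 4, exceeds the bound) and 4 for a bound of 4.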
2025-07-28 23:42:10 +03:00 · 2010-10-19 09:51:06 -04:00
parent 262c1a42dc
commit 604ab08145
5 changed files with 460 additions and 214 deletions
--- a/doc/src/sgml/fuzzystrmatch.sgml
+++ b/doc/src/sgml/fuzzystrmatch.sgml
@@ -84,6 +84,8 @@ SELECT * FROM s WHERE difference(s.nm, 'john') &gt; 2;
 <synopsis>
 levenshtein(text source, text target, int ins_cost, int del_cost, int sub_cost) returns int
 levenshtein(text source, text target) returns int
+levenshtein_less_equal(text source, text target, int ins_cost, int del_cost, int sub_cost, int max_d) returns int
+levenshtein_less_equal(text source, text target, int max_d) returns int
 </synopsis>

  <para>
@@ -92,6 +94,11 @@ levenshtein(text source, text target) returns int
   specify how much to charge for a character insertion, deletion, or
   substitution, respectively.  You can omit the cost parameters, as in
   the second version of the function; in that case they all default to 1.
+   <literal>levenshtein_less_equal</literal> is an accelerated version of
+   the levenshtein function for cases where only small distances matter.
+   If the actual distance is less than or equal to <literal>max_d</literal>,
+   <literal>levenshtein_less_equal</literal> returns the exact distance;
+   otherwise it returns some value greater than <literal>max_d</literal>.
  </para>

  <para>
@@ -110,6 +117,18 @@ test=# SELECT levenshtein('GUMBO', 'GAMBOL', 2,1,1);
 -------------
           3
 (1 row)
+
+test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',2);
+ levenshtein_less_equal
+------------------------
+                      3
+(1 row)
+
+test=# SELECT levenshtein_less_equal('extensive', 'exhaustive',4);
+ levenshtein_less_equal
+------------------------
+                      4
+(1 row)
 </screen>
 </sect2>