Add SQL function CASEFOLD().

Useful for caseless matching. Similar to LOWER(), but avoids edge-case problems with using LOWER() for caseless matching.

For collations that support it, CASEFOLD() handles characters with more than two case variations or multi-character case variations. Some characters may fold to uppercase. The results of case folding are also more stable across Unicode versions than LOWER() or UPPER().

Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com
Reviewed-by: Ian Lawrence Barwick
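
The difference between lowercasing and case folding for caseless matching can be sketched outside the database. The snippet below is an illustration only: it uses Python's built-in str.casefold() as a stand-in for Unicode case folding, not PostgreSQL's CASEFOLD(), whose results depend on the collation. The helper name caseless_equal is hypothetical.

```python
# Illustration only: Python's str.casefold(), not PostgreSQL's CASEFOLD().
# PostgreSQL's behavior depends on the collation; Python's does not.


def caseless_equal(a: str, b: str) -> bool:
    """Compare two strings after case folding both sides (hypothetical helper)."""
    return a.casefold() == b.casefold()


# Multi-character case variation: ß (U+00DF) folds to "ss",
# while lowercasing leaves it unchanged.
lower_match = "Straße".lower() == "STRASSE".lower()
fold_match = caseless_equal("Straße", "STRASSE")

# More than two case variations: Σ (U+03A3), σ (U+03C3) and ς (U+03C2).
sigma_forms = ["\u03a3", "\u03c3", "\u03c2"]
lowered = {s.lower() for s in sigma_forms}
folded = {s.casefold() for s in sigma_forms}

print("lower() match:", lower_match)
print("casefold() match:", fold_match)
print("distinct lowercase forms:", len(lowered))
print("distinct folded forms:", len(folded))
```

Comparing lowercased strings misses the ß/SS pair and leaves two distinct sigma forms, whereas folding both sides collapses each case into a single form.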
2025-07-30 11:03:19 +03:00 · 2025-01-24 14:56:22 -08:00
parent f15538cd27
commit bfc5992069
14 changed files with 278 additions and 3 deletions
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2596,7 +2596,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in

      <row>
       <entry role="func_table_entry"><para role="func_signature">
-        <indexterm>
+        <indexterm id="function-lower">
         <primary>lower</primary>
        </indexterm>
        <function>lower</function> ( <type>text</type> )
@@ -2657,7 +2657,7 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in

      <row>
       <entry role="func_table_entry"><para role="func_signature">
-        <indexterm>
+        <indexterm id="function-normalize">
         <primary>normalize</primary>
        </indexterm>
        <indexterm>
@@ -3109,6 +3109,48 @@ SELECT NOT(ROW(table.*) IS NOT NULL) FROM TABLE; -- detect at least one null in
       </para></entry>
      </row>

+      <row>
+       <entry role="func_table_entry"><para role="func_signature">
+        <indexterm>
+         <primary>casefold</primary>
+        </indexterm>
+        <function>casefold</function> ( <type>text</type> )
+        <returnvalue>text</returnvalue>
+       </para>
+       <para>
+        Performs case folding of the input string according to the collation.
+        Case folding is similar to case conversion, but the purpose of case
+        folding is to facilitate case-insensitive comparison of strings,
+        whereas the purpose of case conversion is to convert to a particular
+        cased form.  This function can only be used when the server encoding
+        is <literal>UTF8</literal>.
+       </para>
+       <para>
+        Ordinarily, case folding simply converts to lowercase, but there are a
+        few notable exceptions depending on the collation.  For instance, the
+        character <literal>Σ</literal> (U+03A3) has two lowercase forms:
+        <literal>σ</literal> (U+03C3) and <literal>ς</literal> (U+03C2); case
+        folding in the <literal>PG_C_UTF8</literal> collation maps all three
+        forms to <literal>σ</literal>.  Additionally, the result is not
+        necessarily lowercase; some characters may be folded to uppercase.
+       </para>
+       <para>
+        Case folding may change the length of the string.  For instance, in
+        the <literal>PG_UNICODE_FAST</literal> collation, <literal>ß</literal>
+        (U+00DF) folds to <literal>ss</literal>.
+       </para>
+       <para>
+        <function>casefold</function> can be used for Unicode Default Caseless
+        Matching.  It does not always preserve the normalized form of the
+        input string (see <xref linkend="function-normalize"/>).
+       </para>
+       <para>
+        The <literal>libc</literal> provider doesn't support case folding, so
+        <function>casefold</function> is identical to <xref
+        linkend="function-lower"/>.
+       </para></entry>
+      </row>
+
      <row>
       <entry role="func_table_entry"><para role="func_signature">
        <indexterm>