postgres

mirror of https://github.com/postgres/postgres.git synced 2025-11-19 13:42:17 +03:00

Author	SHA1	Message	Date
Jeff Davis	bfc5992069	Add SQL function CASEFOLD(). Useful for caseless matching. Similar to LOWER(), but avoids edge-case problems with using LOWER() for caseless matching. For collations that support it, CASEFOLD() handles characters with more than two case variations or multi-character case variations. Some characters may fold to uppercase. The results of case folding are also more stable across Unicode versions than LOWER() or UPPER(). Discussion: https://postgr.es/m/a1886ddfcd8f60cb3e905c93009b646b4cfb74c5.camel%40j-davis.com Reviewed-by: Ian Lawrence Barwick	2025-01-24 14:56:22 -08:00
Jeff Davis	d3d0983169	Support PG_UNICODE_FAST locale in the builtin collation provider. The PG_UNICODE_FAST locale uses code point sort order (fast, memcmp-based) combined with Unicode character semantics. The character semantics are based on Unicode full case mapping. Full case mapping can map a single codepoint to multiple codepoints, such as "ß" uppercasing to "SS". Additionally, it handles context-sensitive mappings like the "final sigma", and it uses titlecase mappings such as "ǅ" when titlecasing (rather than plain uppercase mappings). Importantly, the uppercasing of "ß" as "SS" is specifically mentioned by the SQL standard. In Postgres, UCS_BASIC uses plain ASCII semantics for case mapping and pattern matching, so if we changed it to use the PG_UNICODE_FAST locale, it would offer better compliance with the standard. For now, though, do not change the behavior of UCS_BASIC. Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org Reviewed-by: Peter Eisentraut, Daniel Verite	2025-01-17 15:56:30 -08:00
Jeff Davis	286a365b9c	Support Unicode full case mapping and conversion. Generate tables from Unicode SpecialCasing.txt to support more sophisticated case mapping behavior: * support case mappings to multiple codepoints, such as "ß" uppercasing to "SS" * support conditional case mappings, such as the "final sigma" * support titlecase variants, such as "ǆ" uppercasing to "Ǆ" but titlecasing to "ǅ" Discussion: https://postgr.es/m/ddfd67928818f138f51635712529bc5e1d25e4e7.camel@j-davis.com Discussion: https://postgr.es/m/27bb0e52-801d-4f73-a0a4-02cfdd4a9ada@eisentraut.org Reviewed-by: Peter Eisentraut, Daniel Verite	2025-01-17 15:56:20 -08:00
Jeff Davis	4f5cef2607	Move code for collation version into provider-specific files. Author: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477%40proxel.se	2025-01-08 13:54:07 -08:00
Bruce Momjian	50e6eb731d	Update copyright for 2025 Backpatch-through: 13	2025-01-01 11:21:55 -05:00
Jeff Davis	86a5d6006a	Refactor string case conversion into provider-specific files. Create API entry points pg_strlower(), etc., that work with any provider and give the caller control over the destination buffer. Then, move provider-specific logic into pg_locale_builtin.c, pg_locale_icu.c, and pg_locale_libc.c as appropriate. Discussion: https://postgr.es/m/7aa46d77b377428058403723440862d12a8a129a.camel@j-davis.com	2024-12-16 09:35:18 -08:00
Jeff Davis	1ba0782ce9	Perform provider-specific initialization in new functions. Reviewed-by: Andreas Karlsson Discussion: https://postgr.es/m/4548a168-62cd-457b-8d06-9ba7b985c477@proxel.se	2024-12-02 23:24:35 -08:00

7 Commits