Mirror of https://github.com/postgres/postgres.git (synced 2026-01-26 09:41:40 +03:00)

It's been almost a year since we last did this, and upstream has been busy. They've added stemmers for Polish and Esperanto, and also deprecated their old Dutch stemmer in favor of the Kraaij-Pohlmann algorithm. (The "dutch" stemmer is now the latter, and "dutch_porter" is the old algorithm.)

Upstream also decided to rename their internal header "header.h" to something less generic: "snowball_runtime.h". Seems like a good thing, but it complicates this patch a bit because we were relying on interposing our own version of "header.h" to control system header inclusion order. (We're partially failing at that now, because now the generated stemmer files include <stddef.h> before snowball_runtime.h. I think that'll be okay, but if the buildfarm complains then we'll have to do more-extensive editing of the generated files.)

I realized that we weren't documenting the available stemmers in any user-visible place, except indirectly through sample \dFd output. That's incomplete because we only provide built-in dictionaries for the recommended stemmers for each language, not alternative stemmers such as dutch_porter. So I added a list to the documentation.

I did not do anything with the stopword lists. If those are still available from snowballstem.org, they are mighty well hidden.

Discussion: https://postgr.es/m/1185975.1767569534@sss.pgh.pa.us