mirror of
https://github.com/postgres/postgres.git
synced 2025-06-23 14:01:44 +03:00
While syncing our timezone code with IANA's updates in commit 1c1a7cbd6,
I'd chosen not to adopt the code they conditionally compile under
#ifdef ALL_STATE.  The main thing that that drives is that the space for
gmtime and localtime timezone definitions isn't statically allocated, but
is malloc'd on first use.  I reasoned we didn't need that logic: we don't
have localtime() at all, and we always initialize TimeZone to GMT so we
always need that one.  But there is one other thing ALL_STATE does, which
is to make tzload() malloc its transient workspace instead of just
declaring it as a local variable.  It turns out that that local variable
occupies 78K.  Even worse is that, at least for common US timezone
settings, there's a recursive call to parse the "posixrules" zone name,
making peak stack consumption to select a time zone upwards of 150K.

That's an uncomfortably large fraction of our STACK_DEPTH_SLOP safety
margin, and could result in outright crashes if we try to reduce
STACK_DEPTH_SLOP as has been discussed recently.  Furthermore, this means
that the postmaster's peak stack consumption is several times that of a
backend running typical queries (since, except on Windows, backends
inherit the timezone GUC values and don't ever run this code themselves
unless you do SET TIMEZONE).  That's completely backwards from a safety
perspective.

Hence, adopt the ALL_STATE rather than non-ALL_STATE variant of tzload(),
while not changing the other code aspects that symbol controls.  The risk
of an ENOMEM error from malloc() seems less than that of a SIGSEGV from
stack overrun.

This should probably get back-patched along with 1c1a7cbd6 and followon
fixes, whenever we decide we have enough confidence in the updates to do
that.
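
The trade-off described above can be sketched in a few lines of C.  This
is only an illustration, not the actual tzload() code: the names
zone_workspace, load_zone_body, load_zone_on_stack and load_zone_on_heap
are hypothetical, and the 78K buffer merely stands in for the real
transient workspace.  The first wrapper keeps the workspace as a local
variable, so every call (including a recursive one) costs about 78K of
stack; the second malloc's it and reports ENOMEM on failure instead.

```c
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the large transient workspace (about 78K). */
#define WORKSPACE_SIZE (78 * 1024)

typedef struct zone_workspace
{
	char		buf[WORKSPACE_SIZE];
} zone_workspace;

/*
 * Hypothetical worker: does the "real" work using scratch space supplied
 * by the caller.  Here it just copies the zone name into the workspace
 * and reports its length.  Returns 0 on success, else an errno value.
 */
static int
load_zone_body(const char *name, zone_workspace *ws, size_t *len_out)
{
	size_t		len = strlen(name);

	if (len >= sizeof(ws->buf))
		return EINVAL;
	memcpy(ws->buf, name, len + 1);
	*len_out = len;
	return 0;
}

/* Local-variable style: the workspace lives in this function's stack frame. */
int
load_zone_on_stack(const char *name, size_t *len_out)
{
	zone_workspace ws;

	return load_zone_body(name, &ws, len_out);
}

/* malloc style: the stack frame stays small; allocation failure is ENOMEM. */
int
load_zone_on_heap(const char *name, size_t *len_out)
{
	zone_workspace *ws = malloc(sizeof(zone_workspace));
	int			err;

	if (ws == NULL)
		return ENOMEM;
	err = load_zone_body(name, ws, len_out);
	free(ws);
	return err;
}
```

Both wrappers give the same result for the same input; they differ only in
where the scratch space lives and in which failure mode the caller can see
(an error return versus a possible stack overrun).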

src/timezone/README

This is a PostgreSQL-adapted version of the IANA timezone library from

	http://www.iana.org/time-zones

The latest versions of both the tzdata and tzcode tarballs are normally
available right from that page.  Historical versions can be found
elsewhere on the site.

Since time zone rules change frequently in some parts of the world,
we should endeavor to update the data files before each PostgreSQL
release.  The code need not be updated as often, but we must track
changes that might affect interpretation of the data files.


Time Zone data
==============

The data files under data/ are an exact copy of the latest tzdata set,
except that we omit some files that are not of interest for our purposes.

While the files under data/ can just be duplicated when updating, manual
effort is needed to update the time zone abbreviation lists under tznames/.
These need to be changed whenever new abbreviations are invented or the
UTC offset associated with an existing abbreviation changes.  To detect
if this has happened, after installing new files under data/ do
	make abbrevs.txt
which will produce a file showing all abbreviations that are in current
use according to the data/ files.  Compare this to known_abbrevs.txt,
which is the list that existed last time the tznames/ files were updated.
Update tznames/ as seems appropriate, then replace known_abbrevs.txt
in the same commit.  Usually, if a known abbreviation has changed meaning,
the appropriate fix is to make it refer to a long-form zone name instead
of a fixed GMT offset.

When there has been a new release of Windows (probably including Service
Packs), the list of matching timezones needs to be updated.  Run the
script in src/tools/win32tzlist.pl on a Windows machine running this new
release and apply any new timezones that it detects.  Never remove any
mappings, even if they have been removed in Windows, since we still need
to match properly on the old version.


Time Zone code
==============

The code in this directory is currently synced with tzcode release 2016c.
There are many cosmetic (and not so cosmetic) differences from the
original tzcode library, but diffs in the upstream version should usually
be propagated to our version.  Here are some notes about that.

For the most part we want to use the upstream code as-is, but there are
several considerations preventing an exact match:

* For readability/maintainability we reformat the code to match our own
conventions; this includes pgindent'ing it and getting rid of upstream's
overuse of "register" declarations.  (It used to include conversion of
old-style function declarations to C89 style, but thank goodness they
fixed that.)

* We need the code to follow Postgres' portability conventions; this
includes relying on configure's results rather than hand-hacked #defines,
and not relying on <stdint.h> features that may not exist on old systems.
(In particular this means using Postgres' definitions of the int32 and
int64 typedefs, not int_fast32_t/int_fast64_t.)

* Since Postgres is typically built on a system that has its own copy
of the <time.h> functions, we must avoid conflicting with those.  This
mandates renaming typedef time_t to pg_time_t, and similarly for most
other exposed names.

* We have exposed the tzload() and tzparse() internal functions, and
slightly modified the API of the former, in part because it now relies
on our own pg_open_tzfile() rather than opening files for itself.

* There's a fair amount of code we don't need and have removed,
including all the nonstandard optional APIs.  We have also added
a few functions of our own at the bottom of localtime.c.

* In zic.c, we have added support for a -P (print_abbrevs) switch, which
is used to create the "abbrevs.txt" summary of currently-in-use zone
abbreviations that was described above.
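
As a rough illustration of the renaming and typedef points in the list
above, the following sketch shows pg_-prefixed names living alongside the
platform's <time.h> in one translation unit.  Several things here are
assumptions made only so the sketch is self-contained: the int32/int64
typedefs are derived from <stdint.h> (the real tree gets them from its own
headers and configure results), the struct pg_tm layout is trimmed to
three fields, and pg_seconds_since_midnight is a hypothetical helper.

```c
#include <stdint.h>
#include <time.h>				/* the platform's own time_t and struct tm */

/* Assumption: stand-ins for Postgres' int32 and int64 typedefs. */
typedef int32_t int32;
typedef int64_t int64;

/* Renamed so that it cannot collide with the system's time_t. */
typedef int64 pg_time_t;

/* Trimmed, illustrative layout; renamed so it cannot collide with struct tm. */
struct pg_tm
{
	int			tm_sec;
	int			tm_min;
	int			tm_hour;
};

/* Hypothetical helper showing the renamed types in use. */
pg_time_t
pg_seconds_since_midnight(const struct pg_tm *tm)
{
	return (pg_time_t) tm->tm_hour * 3600 + tm->tm_min * 60 + tm->tm_sec;
}

/* The system type is still usable next to the renamed one. */
int
widths_are_independent(void)
{
	time_t		sys_now = 0;
	pg_time_t	pg_now = 0;

	/* pg_time_t is always 64 bits here, whatever width time_t happens to be. */
	return sizeof(pg_now) == 8 && sizeof(sys_now) > 0;
}
```

Because every exposed name carries its own prefix, including <time.h> in
the same file causes no redefinition errors, which is the point of the
renaming.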

The most convenient way to compare a new tzcode release to our code is
to first run the tzcode source files through a sed filter like this:

    sed -r \
        -e 's/^([ \t]*)\*\*([ \t])/\1 *\2/' \
        -e 's/^([ \t]*)\*\*$/\1 */' \
        -e 's|^\*/| */|' \
        -e 's/\bregister[ \t]//g' \
        -e 's/int_fast32_t/int32/g' \
        -e 's/int_fast64_t/int64/g' \
        -e 's/struct[ \t]+tm\b/struct pg_tm/g' \
        -e 's/\btime_t\b/pg_time_t/g' \

and then run them through pgindent.  (The first three sed patterns deal
with conversion of their block comment style to something pgindent
won't make a hash of; the remainder address other points noted above.)
After that, the files can be diff'd directly against our corresponding
files.