This patch fixes a couple of low-probability bugs that could lead to
reporting an irrelevant errno value (and hence possibly a wrong SQLSTATE)
concerning directory-open or file-open failures.  It also fixes places
where we took shortcuts in reporting such errors, either by using elog
instead of ereport or by using ereport but forgetting to specify an
errcode.  And it eliminates a lot of just plain redundant error-handling
code.

In service of all this, export fd.c's formerly-static function
ReadDirExtended, so that external callers can make use of the coding
pattern

    dir = AllocateDir(path);
    while ((de = ReadDirExtended(dir, path, LOG)) != NULL)

if they'd like to treat directory-open failures as mere LOG conditions
rather than errors.  Also fix FreeDir to be a no-op if we reach it
with dir == NULL, as such a coding pattern would cause.

Then, remove code at many call sites that was throwing an error or log
message for AllocateDir failure, as ReadDir or ReadDirExtended can handle
that job just fine.  Aside from being a net code savings, this gets rid
of a lot of not-quite-up-to-snuff reports, as mentioned above.  (In some
places these changes result in replacing a custom error message such as
"could not open tablespace directory" with more generic wording "could
not open directory", but it was agreed that the custom wording buys
little as long as we report the directory name.)  In some other call
sites where we can't just remove code, change the error reports to be
fully project-style-compliant.

Also reorder code in restoreTwoPhaseData that was acquiring a lock
between AllocateDir and ReadDir; in the unlikely but surely not
impossible case that LWLockAcquire changes errno, AllocateDir failures
would be misreported.  There is no great value in opening the directory
before acquiring TwoPhaseStateLock, so just do it in the other order.

Also fix CheckXLogRemoved to guarantee that it preserves errno,
as quite a number of call sites are implicitly assuming.  (Again,
it's unlikely but I think not impossible that errno could change
during a SpinLockAcquire.  If so, this function was broken for its
own purposes as well as breaking callers.)

And change a few places that were using not-per-project-style messages,
such as "could not read directory" when "could not open directory" is
more correct.

Back-patch the exporting of ReadDirExtended, in case we have occasion
to back-patch some fix that makes use of it; it's not needed right now
but surely making it global is pretty harmless.  Also back-patch the
restoreTwoPhaseData and CheckXLogRemoved fixes.  The rest of this is
essentially cosmetic and need not get back-patched.

Michael Paquier, with a bit of additional work by me

Discussion: https://postgr.es/m/CAB7nPqRpOCxjiirHmebEFhXVTK7V5Jvw4bz82p7Oimtsm3TyZA@mail.gmail.com

src/timezone/README

This is a PostgreSQL adapted version of the IANA timezone library from

    https://www.iana.org/time-zones

The latest version of the timezone data and library source code is
available right from that page.  It's best to get the merged file
tzdb-NNNNX.tar.lz, since the other archive formats omit tzdata.zi.
Historical versions, as well as release announcements, can be found
elsewhere on the site.

Since time zone rules change frequently in some parts of the world,
we should endeavor to update the data files before each PostgreSQL
release.  The code need not be updated as often, but we must track
changes that might affect interpretation of the data files.


Time Zone data
==============

We distribute the time zone source data as-is under src/timezone/data/.
Currently, we distribute just the abbreviated single-file format
"tzdata.zi", to reduce the size of our tarballs as well as churn
in our git repo.  Feeding that file to zic produces the same compiled
output as feeding the bulkier individual data files would do.

While data/tzdata.zi can just be duplicated when updating, manual effort
is needed to update the time zone abbreviation lists under tznames/.
These need to be changed whenever new abbreviations are invented or the
UTC offset associated with an existing abbreviation changes.  To detect
if this has happened, after installing new files under data/ do

    make abbrevs.txt

which will produce a file showing all abbreviations that are in current
use according to the data/ files.  Compare this to known_abbrevs.txt,
which is the list that existed last time the tznames/ files were updated.
Update tznames/ as seems appropriate, then replace known_abbrevs.txt
in the same commit.  Usually, if a known abbreviation has changed meaning,
the appropriate fix is to make it refer to a long-form zone name instead
of a fixed GMT offset.

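The comparison step described above can be sketched as a small shell
helper.  This is only an illustration: the function name compare_abbrevs
is hypothetical, the sample entries are made up rather than taken from
real tzdata output, and both input files are assumed to be sorted with
one entry per line.

```shell
#!/bin/sh
# Sketch: list the entries that differ between the previously accepted
# abbreviation list and a freshly generated one.
# Assumption: both files are sorted (C locale), one entry per line.

# compare_abbrevs OLD NEW
#   prints "- entry" for lines found only in OLD,
#   then   "+ entry" for lines found only in NEW
compare_abbrevs() {
    LC_ALL=C comm -23 "$1" "$2" | sed 's/^/- /'
    LC_ALL=C comm -13 "$1" "$2" | sed 's/^/+ /'
}

# Demonstration with invented sample data (NOT real zic output).
tmpdir=$(mktemp -d)
printf '%s\n' 'AAA 3600' 'BBB 7200' > "$tmpdir/known_abbrevs.txt"
printf '%s\n' 'AAA 3600' 'BBB 10800' 'CCC 0' > "$tmpdir/abbrevs.txt"

changes=$(compare_abbrevs "$tmpdir/known_abbrevs.txt" "$tmpdir/abbrevs.txt")
printf '%s\n' "$changes"

rm -rf "$tmpdir"
```

In a source tree, one would presumably run "make abbrevs.txt" in
src/timezone first and then pass known_abbrevs.txt and abbrevs.txt to
the helper; a plain diff of the two files serves the same purpose.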
The core regression test suite does some simple validation of the zone
data and abbreviations data (notably by checking that the pg_timezone_names
and pg_timezone_abbrevs views don't throw errors).  It's worth running it
as a cross-check on proposed updates.

When there has been a new release of Windows (probably including Service
Packs), the list of matching timezones needs to be updated.  Run the
script in src/tools/win32tzlist.pl on a Windows machine running this new
release and apply any new timezones that it detects.  Never remove any
mappings, even if they have been removed in Windows, since we still need
to match properly on older versions.


Time Zone code
==============

The code in this directory is currently synced with tzcode release 2017c.
There are many cosmetic (and not so cosmetic) differences from the
original tzcode library, but diffs in the upstream version should usually
be propagated to our version.  Here are some notes about that.

For the most part we want to use the upstream code as-is, but there are
several considerations preventing an exact match:

* For readability/maintainability we reformat the code to match our own
conventions; this includes pgindent'ing it and getting rid of upstream's
overuse of "register" declarations.  (It used to include conversion of
old-style function declarations to C89 style, but thank goodness they
fixed that.)

* We need the code to follow Postgres' portability conventions; this
includes relying on configure's results rather than hand-hacked #defines,
and not relying on <stdint.h> features that may not exist on old systems.
(In particular this means using Postgres' definitions of the int32 and
int64 typedefs, not int_fast32_t/int_fast64_t.)

* Since Postgres is typically built on a system that has its own copy
of the <time.h> functions, we must avoid conflicting with those.  This
mandates renaming typedef time_t to pg_time_t, and similarly for most
other exposed names.

* We have exposed the tzload() and tzparse() internal functions, and
slightly modified the API of the former, in part because it now relies
on our own pg_open_tzfile() rather than opening files for itself.

* tzparse() is adjusted to cache the result of loading the TZDEFRULES
zone, so that that's not repeated more than once per process.

* There's a fair amount of code we don't need and have removed,
including all the nonstandard optional APIs.  We have also added
a few functions of our own at the bottom of localtime.c.

* In zic.c, we have added support for a -P (print_abbrevs) switch, which
is used to create the "abbrevs.txt" summary of currently-in-use zone
abbreviations that was described above.


The most convenient way to compare a new tzcode release to our code is
to first run the tzcode source files through a sed filter like this:

    sed -r \
        -e 's/^([ \t]*)\*\*([ \t])/\1 *\2/' \
        -e 's/^([ \t]*)\*\*$/\1 */' \
        -e 's|^\*/| */|' \
        -e 's/\bregister[ \t]//g' \
        -e 's/int_fast32_t/int32/g' \
        -e 's/int_fast64_t/int64/g' \
        -e 's/struct[ \t]+tm\b/struct pg_tm/g' \
        -e 's/\btime_t\b/pg_time_t/g' \

and then run them through pgindent.  (The first three sed patterns deal
with conversion of their block comment style to something pgindent
won't make a hash of; the remainder address other points noted above.)
After that, the files can be diff'd directly against our corresponding
files.
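
To see what the filter does before pgindent is applied, it can help to
run it over a few lines of upstream-style code.  The sketch below wraps
the same sed expressions in a shell function; the name tz_filter and the
sample input are invented for illustration, and GNU sed is assumed
(for -r and \b).

```shell
#!/bin/sh
# Sketch: the sed expressions from the README, wrapped in a function that
# filters stdin to stdout.  Assumes GNU sed.
tz_filter() {
    sed -r \
        -e 's/^([ \t]*)\*\*([ \t])/\1 *\2/' \
        -e 's/^([ \t]*)\*\*$/\1 */' \
        -e 's|^\*/| */|' \
        -e 's/\bregister[ \t]//g' \
        -e 's/int_fast32_t/int32/g' \
        -e 's/int_fast64_t/int64/g' \
        -e 's/struct[ \t]+tm\b/struct pg_tm/g' \
        -e 's/\btime_t\b/pg_time_t/g'
}

# Invented sample in the upstream comment and naming style (not real tzcode).
output=$(printf '%s\n' \
    '/*' \
    '** Sample comment.' \
    '**' \
    '*/' \
    'static time_t' \
    'f(register struct tm *tmp, int_fast32_t off)' | tz_filter)

printf '%s\n' "$output"
```

For the sample above, the comment lines come out in the " * " style and
the declaration becomes "static pg_time_t" followed by
"f(struct pg_tm *tmp, int32 off)".  Each upstream file would be filtered
the same way and then passed through pgindent before diffing.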