Presently, pg_dump generates commands like SELECT pg_catalog.lo_create('5432'); ALTER LARGE OBJECT 5432 OWNER TO alice; GRANT SELECT ON LARGE OBJECT 5432 TO bob; for each large object. This is particularly slow at restore time, especially when there are tens or hundreds of millions of large objects. From reports and personal experience, such slow restores seem to be most painful when encountered during pg_upgrade. This commit teaches pg_dump to instead dump pg_largeobject_metadata and the corresponding pg_shdepend rows when in binary upgrade mode, i.e., pg_dump now generates commands like COPY pg_catalog.pg_largeobject_metadata (oid, lomowner, lomacl) FROM stdin; 5432 16384 {alice=rw/alice,bob=r/alice} \. COPY pg_catalog.pg_shdepend (dbid, classid, objid, objsubid, refclassid, refobjid, deptype) FROM stdin; 5 2613 5432 0 1260 16384 o 5 2613 5432 0 1260 16385 a \. Testing indicates the COPY approach can be significantly faster. To do any better, we'd probably need to find a way to copy/link pg_largeobject_metadata's files during pg_upgrade, which would be limited to upgrades from >= v16 (since commit7b378237aa
changed the storage format for aclitem, which is used for pg_largeobject_metadata.lomacl). Note that this change only applies to binary upgrade mode (i.e., dumps initiated by pg_upgrade) since it inserts rows directly into catalogs. Also, this optimization can only be used for upgrades from >= v12 because pg_largeobject_metadata was created WITH OIDS in older versions, which prevents pg_dump from handling pg_largeobject_metadata.oid properly. With some extra effort, it might be possible to support upgrades from older versions, but the added complexity didn't seem worth it to support versions that will have been out-of-support for nearly 3 years by the time this change is released. Experienced hackers may remember that prior to v12, pg_upgrade copied/linked pg_largeobject_metadata's files (see commit12a53c732c
). Besides the aforementioned storage format issues, this approach failed to transfer the relevant pg_shdepend rows, and pg_dump still had to generate an lo_create() command per large object so that creating the dependent comments and security labels worked. We could perhaps adopt a hybrid approach for upgrades from v16 and newer (i.e., generate lo_create() commands for each large object, copy/link pg_largeobject_metadata's files, and COPY the relevant pg_shdepend rows), but further testing is needed. Reported-by: Hannu Krosing <hannuk@google.com> Suggested-by: Tom Lane <tgl@sss.pgh.pa.us> Reviewed-by: Hannu Krosing <hannuk@google.com> Reviewed-by: Nitin Motiani <nitinmotiani@google.com> Reviewed-by: Tom Lane <tgl@sss.pgh.pa.us> Discussion: https://postgr.es/m/CAMT0RQSS-6qLH%2BzYsOeUbAYhop3wmQTkNmQpo5--QRDUR%2BqYmQ%40mail.gmail.com
PostgreSQL Database Management System
This directory contains the source code distribution of the PostgreSQL database management system.
PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. This distribution also contains C language bindings.
Copyright and license information can be found in the file COPYRIGHT.
General documentation about this version of PostgreSQL can be found at https://www.postgresql.org/docs/devel/. In particular, information about building PostgreSQL from the source code can be found at https://www.postgresql.org/docs/devel/installation.html.
The latest version of this software, and related software, may be obtained at https://www.postgresql.org/download/. For more information look at our web site located at https://www.postgresql.org/.