1
0
mirror of https://github.com/postgres/postgres.git synced 2025-07-28 23:42:10 +03:00

Allow configurable LZ4 TOAST compression.

There is now a per-column COMPRESSION option which can be set to pglz
(the default, and the only option in up until now) or lz4. Or, if you
like, you can set the new default_toast_compression GUC to lz4, and
then that will be the default for new table columns for which no value
is specified. We don't have lz4 support in the PostgreSQL code, so
to use lz4 compression, PostgreSQL must be built --with-lz4.

In general, TOAST compression means compression of individual column
values, not the whole tuple, and those values can either be compressed
inline within the tuple or compressed and then stored externally in
the TOAST table, so those properties also apply to this feature.

Prior to this commit, a TOAST pointer has two unused bits as part of
the va_extsize field, and a compessed datum has two unused bits as
part of the va_rawsize field. These bits are unused because the length
of a varlena is limited to 1GB; we now use them to indicate the
compression type that was used. This means we only have bit space for
2 more built-in compresison types, but we could work around that
problem, if necessary, by introducing a new vartag_external value for
any further types we end up wanting to add. Hopefully, it won't be
too important to offer a wide selection of algorithms here, since
each one we add not only takes more coding but also adds a build
dependency for every packager. Nevertheless, it seems worth doing
at least this much, because LZ4 gets better compression than PGLZ
with less CPU usage.

It's possible for LZ4-compressed datums to leak into composite type
values stored on disk, just as it is for PGLZ. It's also possible for
LZ4-compressed attributes to be copied into a different table via SQL
commands such as CREATE TABLE AS or INSERT .. SELECT.  It would be
expensive to force such values to be decompressed, so PostgreSQL has
never done so. For the same reasons, we also don't force recompression
of already-compressed values even if the target table prefers a
different compression method than was used for the source data.  These
architectural decisions are perhaps arguable but revisiting them is
well beyond the scope of what seemed possible to do as part of this
project.  However, it's relatively cheap to recompress as part of
VACUUM FULL or CLUSTER, so this commit adjusts those commands to do
so, if the configured compression method of the table happens not to
match what was used for some column value stored therein.

Dilip Kumar. The original patches on which this work was based were
written by Ildus Kurbangaliev, and those were patches were based on
even earlier work by Nikita Glukhov, but the design has since changed
very substantially, since allow a potentially large number of
compression methods that could be added and dropped on a running
system proved too problematic given some of the architectural issues
mentioned above; the choice of which specific compression method to
add first is now different; and a lot of the code has been heavily
refactored.  More recently, Justin Przyby helped quite a bit with
testing and reviewing and this version also includes some code
contributions from him. Other design input and review from Tomas
Vondra, Álvaro Herrera, Andres Freund, Oleg Bartunov, Alexander
Korotkov, and me.

Discussion: http://postgr.es/m/20170907194236.4cefce96%40wp.localdomain
Discussion: http://postgr.es/m/CAFiTN-uUpX3ck%3DK0mLEk-G_kUQY%3DSNOTeqdaNRR9FMdQrHKebw%40mail.gmail.com
This commit is contained in:
Robert Haas
2021-03-19 15:10:38 -04:00
parent e589c4890b
commit bbe0a81db6
61 changed files with 2261 additions and 160 deletions

170
configure vendored
View File

@ -699,6 +699,9 @@ with_gnu_ld
LD
LDFLAGS_SL
LDFLAGS_EX
LZ4_LIBS
LZ4_CFLAGS
with_lz4
with_zlib
with_system_tzdata
with_libxslt
@ -864,6 +867,7 @@ with_libxml
with_libxslt
with_system_tzdata
with_zlib
with_lz4
with_gnu_ld
with_ssl
with_openssl
@ -891,6 +895,8 @@ ICU_LIBS
XML2_CONFIG
XML2_CFLAGS
XML2_LIBS
LZ4_CFLAGS
LZ4_LIBS
LDFLAGS_EX
LDFLAGS_SL
PERL
@ -1569,6 +1575,7 @@ Optional Packages:
--with-system-tzdata=DIR
use system time zone data in DIR
--without-zlib do not use Zlib
--with-lz4 build with LZ4 support
--with-gnu-ld assume the C compiler uses GNU ld [default=no]
--with-ssl=LIB use LIB for SSL/TLS support (openssl)
--with-openssl obsolete spelling of --with-ssl=openssl
@ -1596,6 +1603,8 @@ Some influential environment variables:
XML2_CONFIG path to xml2-config utility
XML2_CFLAGS C compiler flags for XML2, overriding pkg-config
XML2_LIBS linker flags for XML2, overriding pkg-config
LZ4_CFLAGS C compiler flags for LZ4, overriding pkg-config
LZ4_LIBS linker flags for LZ4, overriding pkg-config
LDFLAGS_EX extra linker flags for linking executables only
LDFLAGS_SL extra linker flags for linking shared libraries only
PERL Perl program
@ -8563,6 +8572,137 @@ fi
#
# LZ4
#
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking whether to build with LZ4 support" >&5
$as_echo_n "checking whether to build with LZ4 support... " >&6; }
# Check whether --with-lz4 was given.
if test "${with_lz4+set}" = set; then :
withval=$with_lz4;
case $withval in
yes)
$as_echo "#define USE_LZ4 1" >>confdefs.h
;;
no)
:
;;
*)
as_fn_error $? "no argument expected for --with-lz4 option" "$LINENO" 5
;;
esac
else
with_lz4=no
fi
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $with_lz4" >&5
$as_echo "$with_lz4" >&6; }
if test "$with_lz4" = yes; then
pkg_failed=no
{ $as_echo "$as_me:${as_lineno-$LINENO}: checking for liblz4" >&5
$as_echo_n "checking for liblz4... " >&6; }
if test -n "$LZ4_CFLAGS"; then
pkg_cv_LZ4_CFLAGS="$LZ4_CFLAGS"
elif test -n "$PKG_CONFIG"; then
if test -n "$PKG_CONFIG" && \
{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"liblz4\""; } >&5
($PKG_CONFIG --exists --print-errors "liblz4") 2>&5
ac_status=$?
$as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
test $ac_status = 0; }; then
pkg_cv_LZ4_CFLAGS=`$PKG_CONFIG --cflags "liblz4" 2>/dev/null`
test "x$?" != "x0" && pkg_failed=yes
else
pkg_failed=yes
fi
else
pkg_failed=untried
fi
if test -n "$LZ4_LIBS"; then
pkg_cv_LZ4_LIBS="$LZ4_LIBS"
elif test -n "$PKG_CONFIG"; then
if test -n "$PKG_CONFIG" && \
{ { $as_echo "$as_me:${as_lineno-$LINENO}: \$PKG_CONFIG --exists --print-errors \"liblz4\""; } >&5
($PKG_CONFIG --exists --print-errors "liblz4") 2>&5
ac_status=$?
$as_echo "$as_me:${as_lineno-$LINENO}: \$? = $ac_status" >&5
test $ac_status = 0; }; then
pkg_cv_LZ4_LIBS=`$PKG_CONFIG --libs "liblz4" 2>/dev/null`
test "x$?" != "x0" && pkg_failed=yes
else
pkg_failed=yes
fi
else
pkg_failed=untried
fi
if test $pkg_failed = yes; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
if $PKG_CONFIG --atleast-pkgconfig-version 0.20; then
_pkg_short_errors_supported=yes
else
_pkg_short_errors_supported=no
fi
if test $_pkg_short_errors_supported = yes; then
LZ4_PKG_ERRORS=`$PKG_CONFIG --short-errors --print-errors --cflags --libs "liblz4" 2>&1`
else
LZ4_PKG_ERRORS=`$PKG_CONFIG --print-errors --cflags --libs "liblz4" 2>&1`
fi
# Put the nasty error message in config.log where it belongs
echo "$LZ4_PKG_ERRORS" >&5
as_fn_error $? "Package requirements (liblz4) were not met:
$LZ4_PKG_ERRORS
Consider adjusting the PKG_CONFIG_PATH environment variable if you
installed software in a non-standard prefix.
Alternatively, you may set the environment variables LZ4_CFLAGS
and LZ4_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details." "$LINENO" 5
elif test $pkg_failed = untried; then
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: no" >&5
$as_echo "no" >&6; }
{ { $as_echo "$as_me:${as_lineno-$LINENO}: error: in \`$ac_pwd':" >&5
$as_echo "$as_me: error: in \`$ac_pwd':" >&2;}
as_fn_error $? "The pkg-config script could not be found or is too old. Make sure it
is in your PATH or set the PKG_CONFIG environment variable to the full
path to pkg-config.
Alternatively, you may set the environment variables LZ4_CFLAGS
and LZ4_LIBS to avoid the need to call pkg-config.
See the pkg-config man page for more details.
To get pkg-config, see <http://pkg-config.freedesktop.org/>.
See \`config.log' for more details" "$LINENO" 5; }
else
LZ4_CFLAGS=$pkg_cv_LZ4_CFLAGS
LZ4_LIBS=$pkg_cv_LZ4_LIBS
{ $as_echo "$as_me:${as_lineno-$LINENO}: result: yes" >&5
$as_echo "yes" >&6; }
fi
LIBS="$LZ4_LIBS $LIBS"
CFLAGS="$LZ4_CFLAGS $CFLAGS"
fi
#
# Assignments
#
@ -13379,6 +13519,36 @@ Use --without-zlib to disable zlib support." "$LINENO" 5
fi
fi
if test "$with_lz4" = yes; then
for ac_header in lz4/lz4.h
do :
ac_fn_c_check_header_mongrel "$LINENO" "lz4/lz4.h" "ac_cv_header_lz4_lz4_h" "$ac_includes_default"
if test "x$ac_cv_header_lz4_lz4_h" = xyes; then :
cat >>confdefs.h <<_ACEOF
#define HAVE_LZ4_LZ4_H 1
_ACEOF
else
for ac_header in lz4.h
do :
ac_fn_c_check_header_mongrel "$LINENO" "lz4.h" "ac_cv_header_lz4_h" "$ac_includes_default"
if test "x$ac_cv_header_lz4_h" = xyes; then :
cat >>confdefs.h <<_ACEOF
#define HAVE_LZ4_H 1
_ACEOF
else
as_fn_error $? "lz4.h header file is required for LZ4" "$LINENO" 5
fi
done
fi
done
fi
if test "$with_gssapi" = yes ; then