Boost 1.85 removed some deprecated code from its filesystem module that
columnstore still uses:
- The boost/filesystem/convenience.hpp header was removed. Columnstore
uses nothing from that file directly and relied on it only for indirect
includes, so the include is removed or replaced with the more general
boost/filesystem.hpp. The convenience.hpp header was deprecated in
filesystem V3, which was introduced in Boost 1.46.0.
- The `normalize` method was removed and users are advised to replace it
with the `lexically_normal` method, which was introduced in Boost 1.60.0.
The original `normalize` call is preserved for backward compatibility
with older Boost versions, but `lexically_normal` is used in preference
with Boost 1.60.0 and newer.
- The `copy_option` enum was removed in favor of `copy_options` (note
the trailing 's'), and the enum values were renamed: `fail_if_exists`
is replaced with `none` and `overwrite_if_exists` is replaced with
`overwrite_existing`. `copy_options` was introduced in Boost 1.74.0.
The new form is now used everywhere, and a backward compatibility layer
for Boost 1.73.0 and older is provided in the
boost_copy_options_compat.hpp file. This seems less awkward than
scattering multiple #if #else #endif blocks through the source code.
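The idea behind the compatibility layer can be sketched as follows. This is
only an illustration, not the contents of boost_copy_options_compat.hpp:
`legacy_fs` is a hypothetical stand-in for the pre-1.74 Boost.Filesystem API
so that the sketch builds without Boost, and `copyOverwriting` is an invented
call site. In a real header the aliasing part would presumably sit behind
`#if BOOST_VERSION < 107400`.

```cpp
#include <string>

// Hypothetical stand-in for the old API (Boost 1.73.0 and older).
// Only the pieces needed for the illustration are modelled.
namespace legacy_fs
{
enum class copy_option { fail_if_exists, overwrite_if_exists };

// Pretend copy: reports what would happen instead of touching the disk.
inline std::string copy_file(const std::string& from, const std::string& to, copy_option opt)
{
  return from + " -> " + to +
         (opt == copy_option::overwrite_if_exists ? " (overwrite)" : " (fail if exists)");
}
}  // namespace legacy_fs

// Compatibility layer: expose the new names on top of the old enum.
// Assumption: a real header would guard this with a BOOST_VERSION check.
namespace legacy_fs
{
namespace copy_options
{
constexpr copy_option none = copy_option::fail_if_exists;
constexpr copy_option overwrite_existing = copy_option::overwrite_if_exists;
}  // namespace copy_options
}  // namespace legacy_fs

// Call site written once against the new names, with no #if blocks.
inline std::string copyOverwriting(const std::string& from, const std::string& to)
{
  return legacy_fs::copy_file(from, to, legacy_fs::copy_options::overwrite_existing);
}
```

The point of the design is that call sites use `copy_options::...` only, and
the version check lives in a single header.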
* Adds CompressInterfaceLZ4, which uses the LZ4 API for compress/uncompress.
* Adds CMake machinery to search for LZ4 on the host.
* All methods that use only static data and do not modify any internal data become `static`,
so they can be used without creating a specific object. This is possible because
the header specification has not been modified. The header still has 2 sections: the first
one holds file metadata, the second one holds pointers to the compressed chunks.
* Methods `compress`, `uncompress`, `maxCompressedSize`, `getUncompressedSize` become
pure virtual, so they can be overridden for other compression algorithms.
* Adds method `getChunkMagicNumber`, so the chunk magic number can be verified
for each compression algorithm.
* Renames IDBCompressInterface to CompressInterface
(s/IDBCompressInterface/CompressInterface/g), as required.
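The resulting class shape can be sketched roughly as below. All signatures
here are assumptions made for illustration and may differ from the real
CompressInterface; `CompressInterfaceSketch`, `StoredCompressor`,
`headerSize` and the magic value 0x5A are hypothetical, and the toy subclass
stores bytes unchanged instead of calling LZ4.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, simplified interface (not the real CompressInterface).
class CompressInterfaceSketch
{
 public:
  virtual ~CompressInterfaceSketch() = default;

  // Algorithm-specific operations are pure virtual.
  virtual int compress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const = 0;
  virtual int uncompress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const = 0;
  virtual std::size_t maxCompressedSize(std::size_t uncompressedSize) const = 0;
  virtual std::uint8_t getChunkMagicNumber() const = 0;

  // Header helpers need no object state, so they can be static.
  // The header has two sections: file metadata and chunk pointers.
  static std::size_t headerSize(std::size_t sectionSize) { return 2 * sectionSize; }
};

// Toy algorithm: one magic byte followed by the unmodified payload.
class StoredCompressor final : public CompressInterfaceSketch
{
 public:
  std::size_t maxCompressedSize(std::size_t uncompressedSize) const override
  {
    return uncompressedSize + 1;
  }
  std::uint8_t getChunkMagicNumber() const override { return 0x5A; }  // assumed value

  int compress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const override
  {
    if (*outLen < maxCompressedSize(inLen)) return -1;  // output buffer too small
    out[0] = static_cast<char>(getChunkMagicNumber());
    std::memcpy(out + 1, in, inLen);
    *outLen = inLen + 1;
    return 0;
  }

  int uncompress(const char* in, std::size_t inLen, char* out, std::size_t* outLen) const override
  {
    // Reject chunks that were not produced by this algorithm.
    if (inLen == 0 || static_cast<std::uint8_t>(in[0]) != getChunkMagicNumber()) return -1;
    if (*outLen < inLen - 1) return -1;  // output buffer too small
    std::memcpy(out, in + 1, inLen - 1);
    *outLen = inLen - 1;
    return 0;
  }
};
```

Static helpers such as `CompressInterfaceSketch::headerSize(...)` can be
called without an instance, while per-algorithm behavior goes through the
virtual methods.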
* Introduce multigeneration aggregation
* Do not save unused part of RGDatas to disk
* Add IO error explanation (strerror)
* Reduce memory usage while aggregating
* Introduce in-memory generations to improve memory utilization
* Try to keep the number of buckets at a low limit
* Refactor disk aggregation a bit
* Pass the calculated hash into RowAggregation
* Try to keep some RGData with free space in memory
* Do not dump more than half of the rowgroups to disk if generations are
allowed; start a new generation instead
* For each thread, shift the first processed bucket at each iteration,
so the generations start more evenly
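The "IO error explanation (strerror)" item above amounts to attaching the
system error text to a failed I/O call. A minimal sketch, assuming a helper
of this shape (the name `describeIoError` and its parameters are hypothetical
and not taken from the patch):

```cpp
#include <cerrno>
#include <cstring>
#include <string>

// Hypothetical helper: turn a failed I/O call into a readable message.
// `operation` and `path` are illustrative parameters.
// Note: std::strerror is not required to be thread-safe.
inline std::string describeIoError(const std::string& operation, const std::string& path, int errnum)
{
  return operation + " failed for '" + path + "': " + std::strerror(errnum) +
         " (errno " + std::to_string(errnum) + ")";
}
```

A caller would typically capture `errno` right after the failing call, for
example `describeIoError("write", fileName, errno)`, before any other call
can overwrite it.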
* Unify temp data location
* Explicitly create temp subdirectories
whether or not disk aggregation/join is enabled
MCS now chowns the whole hierarchy of created directories, not only the
files and their immediate parent directories
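One way to picture the change is to enumerate every directory between a base
path and the newly created leaf, then chown each of them. This is a sketch
under assumptions: `createdHierarchy` and `chownHierarchy` are hypothetical
names, paths are plain POSIX strings, and the real code may walk the
hierarchy differently.

```cpp
#include <cstddef>
#include <string>
#include <vector>
#include <sys/types.h>
#include <unistd.h>  // ::chown

// Hypothetical helper: list every path from `base` (exclusive) down to
// `leaf` (inclusive), top-most first. Returns nothing if `leaf` is not
// located under `base`.
inline std::vector<std::string> createdHierarchy(const std::string& base, const std::string& leaf)
{
  std::vector<std::string> dirs;
  if (base.empty() || leaf.compare(0, base.size(), base) != 0) return dirs;
  if (leaf.size() > base.size() && base.back() != '/' && leaf[base.size()] != '/') return dirs;

  std::size_t pos = base.size();
  while (pos < leaf.size())
  {
    std::size_t next = leaf.find('/', pos + 1);
    if (next == std::string::npos) next = leaf.size();
    dirs.push_back(leaf.substr(0, next));
    pos = next;
  }
  return dirs;
}

// Chown every level, not only the leaf and its parent.
// Returns the number of paths for which ::chown failed.
inline int chownHierarchy(const std::string& base, const std::string& leaf, uid_t uid, gid_t gid)
{
  int failures = 0;
  for (const std::string& dir : createdHierarchy(base, leaf))
  {
    if (::chown(dir.c_str(), uid, gid) != 0) ++failures;
  }
  return failures;
}
```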
Minor changes to cpimport's help printout
cpimport's -f option is now mandatory with mode 2
an owner for all data files created by cpimport
The patch consists of two parts: changes to cpimport.bin and changes to
the cpimport splitter
cpimport.bin computes uid_t and gid_t early and propagates them down the
stack to where MCS creates data files
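The "compute early, propagate down" pattern can be sketched as below. It
assumes the owner is supplied as a user name and resolved with POSIX
`getpwnam`; `FileOwner`, `resolveOwner`, `DataFileRequest` and
`makeDataFileRequest` are hypothetical names used only for illustration.

```cpp
#include <pwd.h>
#include <sys/types.h>
#include <string>

// Resolved once near program start and then passed by value or reference.
struct FileOwner
{
  uid_t uid;
  gid_t gid;
};

// Look the user up a single time; returns false if the name is unknown.
inline bool resolveOwner(const std::string& userName, FileOwner& owner)
{
  const passwd* entry = ::getpwnam(userName.c_str());
  if (entry == nullptr) return false;
  owner.uid = entry->pw_uid;
  owner.gid = entry->pw_gid;
  return true;
}

// Lower layers receive the already-resolved ids instead of repeating the
// lookup wherever a data file is created.
struct DataFileRequest
{
  std::string path;
  FileOwner owner;
};

inline DataFileRequest makeDataFileRequest(const std::string& path, const FileOwner& owner)
{
  return DataFileRequest{path, owner};
}
```

Resolving the ids once keeps the lookup and its error handling in a single
place, and the file-creating code only needs the numeric `uid_t`/`gid_t`.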