mirror of https://github.com/facebook/zstd.git synced 2025-07-29 11:21:22 +03:00

14 Commits

Author SHA1 Message Date
dd8cb5a0f1 added documentation for the seekable format,
notably providing additional context for the
Maximum Frame Size parameter.

requested by @P-E-Meunier
at 1df9f36c6c (commitcomment-103856979).
2023-03-10 15:54:31 -08:00
1df9f36c6c Improved seekable format ingestion speed for small frame size
As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186,
seekable format ingestion speed can be particularly slow
when the selected `FRAME_SIZE` is very small,
especially in combination with the recent row_hash compression mode.
The specific scenario mentioned was `pijul`,
using frame sizes of 256 bytes and level 10.

This is improved in this PR,
by providing approximate parameter adaptation to the compression process.

Tested locally on an M1 laptop,
ingestion of `enwik8` using `pijul` parameters
went from 35 sec (before this PR) to 2.5 sec (with this PR).
For the specific corner case of a file full of zeroes,
this is even more pronounced, going from 45 sec to 0.5 sec.

These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically.

The `seekable_compress` test program has been updated to allow setting the compression level,
in order to produce these performance results.
2023-03-09 18:00:30 -08:00
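
To give a sense of scale for the small-frame scenario described in this commit, the sketch below counts how many frames a given input produces for a chosen maximum frame size. It is illustrative arithmetic only, not code from the PR. The helper name `demo_frame_count` is hypothetical, the sketch assumes every frame except possibly the last is filled to the maximum, and the `enwik8` size of 100,000,000 bytes is an assumption that the commit message does not state.

```c
#include <stdint.h>

/* Hypothetical helper (not part of zstd): number of frames needed to hold
 * src_size bytes when each frame stores at most max_frame_size decompressed
 * bytes. Assumes all frames except possibly the last are completely filled. */
uint64_t demo_frame_count(uint64_t src_size, uint64_t max_frame_size)
{
    if (max_frame_size == 0) {
        return 0; /* invalid frame size: report no frames */
    }
    /* Ceiling division written so it cannot overflow. */
    return src_size / max_frame_size + (src_size % max_frame_size != 0);
}
```

Under those assumptions, `demo_frame_count(100000000, 256)` gives 390,625 frames, so any fixed per-frame cost in the compressor is paid several hundred thousand times for a 100 MB input.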
2fa4c8c405 added code comments for new API ZSTD_seekTable 2021-03-03 22:54:04 -08:00
029f974ddc strengthen compilation flags 2021-03-02 15:43:52 -08:00
c7e42e147b fixed const guarantees
read-only objects are properly const-ified in parameters
2021-03-02 15:24:30 -08:00
9b8f337357 [contrib] Support seek table-only API
Memory constrained use cases that manage multiple archives benefit from
retaining multiple archive seek tables without retaining a ZSTD_seekable
instance for each.

* New opaque type for seek table: ZSTD_seekTable.
* ZSTD_seekable_copySeekTable() supports copying seek table out of a
  ZSTD_seekable.
* ZSTD_seekTable_[eachSeekTableOp]() defines seek table API that mirrors
  existing seek table operations.
* Existing ZSTD_seekable_[eachSeekTableOp]() retained; they delegate to
  the ZSTD_seekTable variant.

These changes allow the above-mentioned use cases to initialize a
ZSTD_seekable, extract its ZSTD_seekTable, then throw the ZSTD_seekable
away to save memory. Standard ZSTD operations can then be used to
decompress frames based on seek table offsets.

The copy and delegate patterns are intended to minimize impact on
existing code and clients. Using copy instead of move for the infrequent
operation extracting a seek table ensures that the extraction does not
render the ZSTD_seekable useless. Delegating to *new* seek
table-oriented APIs ensures that this is not a breaking change for
existing clients while supporting all meaningful operations that depend
only on seek table data.
2020-05-07 09:31:43 -04:00
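
The idea of keeping only the seek table and using its offsets to locate frames can be sketched with a small stand-alone structure. This is a minimal illustration, not the library's implementation: `demo_seek_entry`, `demo_seek_table`, `demo_offset_to_frame` and `demo_frame_compressed_size` are hypothetical names, and the layout (cumulative offsets plus one trailing sentinel entry holding the totals) is an assumption made for the example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a seek table (not the ZSTD_seekTable type).
 * Each entry stores the cumulative offsets at which a frame starts. */
typedef struct {
    uint64_t c_offset; /* start of the frame in the compressed archive */
    uint64_t d_offset; /* start of the frame in the decompressed data */
} demo_seek_entry;

typedef struct {
    const demo_seek_entry *entries; /* num_frames + 1 entries; the last is a
                                       sentinel holding the total sizes */
    size_t num_frames;
} demo_seek_table;

/* Returns the index of the frame containing decompressed offset pos,
 * or num_frames when pos lies at or beyond the end of the data. */
size_t demo_offset_to_frame(const demo_seek_table *t, uint64_t pos)
{
    size_t lo = 0;
    size_t hi = t->num_frames;

    if (pos >= t->entries[t->num_frames].d_offset) {
        return t->num_frames;
    }
    /* Binary search; invariant:
     * entries[lo].d_offset <= pos < entries[hi].d_offset */
    while (lo + 1 < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (t->entries[mid].d_offset <= pos) {
            lo = mid;
        } else {
            hi = mid;
        }
    }
    return lo;
}

/* Compressed size of frame i, derived from two adjacent cumulative offsets.
 * The caller must ensure i < num_frames. */
uint64_t demo_frame_compressed_size(const demo_seek_table *t, size_t i)
{
    return t->entries[i + 1].c_offset - t->entries[i].c_offset;
}
```

With a table like this held in memory, a caller could look up the frame for a requested position, read `demo_frame_compressed_size` bytes starting at that frame's `c_offset`, and pass them to an ordinary one-shot decompression call, which is the usage pattern the commit message describes.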
97c60cdf36 fixed seekable_format type mismatch
and some minor "unused variable" warnings.
Also: zstd_seekable.h actually depends on zstd.h for ZSTDLIB_API
2018-06-06 13:10:29 -07:00
470993c9b1 Add raw seek table construction API and parallel compression example 2017-04-28 12:17:09 -07:00
35186e65b0 Address comments and make sure all prototypes are rendered by gen_html 2017-04-20 16:48:54 -07:00
0f7bd772e6 Update seekable API to simplify IO 2017-04-18 16:48:30 -07:00
9626cf1ac6 Address @terrelln's comments 2017-04-13 17:48:35 -07:00
2785b28e05 Reduce the limit on frame decompressed size to 2 GB 2017-04-12 14:09:13 -07:00
5ee1135f30 s/chunk/frame/ 2017-04-12 11:15:50 -07:00
d048fefef7 Move seekable format content to /contrib 2017-04-11 14:38:56 -07:00