In main, resources were freed on the success path but not on the error path.
This change ensures all allocated resources are released before returning.
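A minimal sketch of the single-exit cleanup pattern this kind of fix usually relies on. The function `process` and its `simulate_failure` parameter are hypothetical and only illustrate the control flow; they are not taken from the changed file.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: allocates two buffers, does some work, and
 * returns 0 on success or 1 on failure. Every exit goes through
 * the `cleanup` label, so the error path frees the same resources
 * as the success path. */
static int process(size_t size, int simulate_failure)
{
    int result = 1;      /* assume failure until the work is done */
    char* src = NULL;
    char* dst = NULL;

    assert(size > 0);

    src = malloc(size);
    if (src == NULL) goto cleanup;
    dst = malloc(size);
    if (dst == NULL) goto cleanup;

    if (simulate_failure) goto cleanup;   /* error path: still frees both */

    memset(src, 0, size);
    memcpy(dst, src, size);
    result = 0;

cleanup:
    free(dst);           /* free(NULL) is a no-op */
    free(src);
    return result;
}
```

Calling `process(16, 0)` returns 0 and `process(16, 1)` returns 1; in both cases both buffers are released before returning.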
There was no memory barrier between writing and reading `done`, which
would allow reordering to cause races. With so little data to handle
after each job completes, we might as well just join.
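A sketch of the join-based approach, assuming POSIX threads. The names `job_t`, `run_job` and `run_all` are hypothetical. `pthread_join` synchronizes memory with the finished thread, so the result can be read afterwards without polling a `done` flag.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    int input;
    int output;   /* written by the worker, read by the caller after join */
} job_t;

static void* run_job(void* arg)
{
    job_t* job = arg;
    job->output = job->input * 2;   /* stand-in for the real work */
    return NULL;
}

/* Starts one thread per job, joins them all, and returns the sum of
 * the outputs, or -1 if a thread could not be created. */
static int run_all(job_t* jobs, pthread_t* threads, size_t count)
{
    size_t started = 0;
    size_t i;
    int failed = 0;
    int sum = 0;

    assert(jobs != NULL && threads != NULL);

    for (; started < count; started++) {
        if (pthread_create(&threads[started], NULL, run_job, &jobs[started]) != 0) {
            failed = 1;
            break;
        }
    }
    /* Joining is the synchronization point: no flag, no polling. */
    for (i = 0; i < started; i++) {
        pthread_join(threads[i], NULL);
        sum += jobs[i].output;
    }
    return failed ? -1 : sum;
}
```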
Previously, parallel_compression would only handle each job's results
after ALL jobs were successfully queued. This caused all src/dst
buffers to remain in memory until then!
It also polled to check whether a job completed, which is racy without
any memory barrier.
Now, we flush results as a side effect of completing a job. Completed
frames are placed in an ordered linked-list, and any eligible frames
are flushed. This may be zero or multiple frames, depending on the
order in which jobs finish.
This design also makes it simple to support streaming input, so that
is now available. Just pass `-` as the filename, and stdin/stdout will
be used for I/O.
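The ordered flush described above can be sketched roughly as follows. This is an illustration only, not the code from this change: `frame_t`, `pending_t` and `complete_frame` are hypothetical names, and the actual write of the compressed bytes is left as a comment.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct frame_s {
    unsigned id;              /* position of the frame in the output */
    struct frame_s* next;
} frame_t;

typedef struct {
    frame_t* head;            /* completed, unflushed frames, ascending id */
    unsigned next_id;         /* id of the next frame allowed to be written */
} pending_t;

/* Records frame `id` as completed, then flushes every frame that is
 * now eligible. Returns the number of frames flushed (possibly zero),
 * or -1 on allocation failure. */
static int complete_frame(pending_t* p, unsigned id)
{
    frame_t* node;
    frame_t** link = &p->head;
    int flushed = 0;

    assert(id >= p->next_id);
    node = malloc(sizeof *node);
    if (node == NULL) return -1;
    node->id = id;

    /* Ordered insert. */
    while (*link != NULL && (*link)->id < id) link = &(*link)->next;
    node->next = *link;
    *link = node;

    /* Flush the contiguous run starting at next_id. */
    while (p->head != NULL && p->head->id == p->next_id) {
        frame_t* done = p->head;
        p->head = done->next;
        /* a real implementation would write the frame and free its buffers here */
        free(done);
        p->next_id++;
        flushed++;
    }
    return flushed;
}
```

If jobs 2, 1 and 0 finish in that order, the first two calls flush nothing and the third flushes all three frames.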
Some of these examples are intended to be parallel, and don't make
sense to link against single-threaded libzstd.
The mt and nomt builds of libzstd share the same filename, so it's still
possible to link against the single-threaded one, just harder.
Make the function ZSTD_compressSequencesAndLiterals() available in kernel
space. This will be used by the Intel QAT driver.
Additionally, (1) expose the function ZSTD_CCtx_setParameter(), which is
required to set parameters before calling ZSTD_compressSequencesAndLiterals(),
(2) update the build process to include `compress/zstd_preSplit.o` and
(3) replace `asm/unaligned.h` with `linux/unaligned.h`.
Signed-off-by: Giovanni Cabiddu <giovanni.cabiddu@intel.com>
When two threads are using a WorkQueue and the reader thread exits due
to an error, it must call WorkQueue::finish() to wake up the writer
thread. Otherwise, if the queue is full and the writer thread is waiting
for a free slot, it could hang forever.
This can happen in practice when decompressing a large, corrupted file
that does not contain pzstd skippable frames.
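To illustrate why the wake-up matters, here is a rough C analogue of a bounded queue with a `finish` operation, assuming POSIX threads. It is not pzstd's WorkQueue (which is C++); `queue_t`, `queue_push` and `queue_finish` are hypothetical, and the pop side is omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

#define QUEUE_CAPACITY 2

typedef struct {
    pthread_mutex_t mutex;
    pthread_cond_t changed;   /* signalled on every state change */
    int items[QUEUE_CAPACITY];
    int count;
    bool finished;
} queue_t;

static void queue_init(queue_t* q)
{
    assert(q != NULL);
    pthread_mutex_init(&q->mutex, NULL);
    pthread_cond_init(&q->changed, NULL);
    q->count = 0;
    q->finished = false;
}

/* Blocks while the queue is full. Returns false, without blocking,
 * once the queue has been finished. */
static bool queue_push(queue_t* q, int item)
{
    bool pushed = false;
    pthread_mutex_lock(&q->mutex);
    while (q->count == QUEUE_CAPACITY && !q->finished)
        pthread_cond_wait(&q->changed, &q->mutex);
    if (!q->finished) {
        q->items[q->count++] = item;
        pushed = true;
        pthread_cond_broadcast(&q->changed);
    }
    pthread_mutex_unlock(&q->mutex);
    return pushed;
}

/* Called by whichever side stops early, including on error, so that
 * a thread blocked in queue_push() wakes up instead of hanging. */
static void queue_finish(queue_t* q)
{
    pthread_mutex_lock(&q->mutex);
    q->finished = true;
    pthread_cond_broadcast(&q->changed);
    pthread_mutex_unlock(&q->mutex);
}
```

Without the call to `queue_finish`, a writer pushing into a full queue would wait on the condition variable indefinitely once the reader has exited.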
Some older Android libc implementations don't support `fseeko` or `ftello`.
This commit adds a new compile-time macro `LIBC_NO_FSEEKO` as well as a usage in CMake for old Android APIs.
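A sketch of how such a macro can select a fallback. Only `LIBC_NO_FSEEKO` comes from this commit; the `FILE_SEEK`/`FILE_TELL` wrappers and `file_size` are hypothetical names, and the fallback shown here (plain `fseek`/`ftell`, limited to `long` offsets) is an assumption about what a reasonable substitute looks like.

```c
#ifndef _POSIX_C_SOURCE
#  define _POSIX_C_SOURCE 200809L   /* exposes fseeko/ftello on POSIX libcs */
#endif
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

#if defined(LIBC_NO_FSEEKO)
/* Fallback: offsets are limited to the range of `long`. */
#  define FILE_SEEK(f, off, whence) fseek((f), (long)(off), (whence))
#  define FILE_TELL(f)              ((long long)ftell(f))
#else
#  define FILE_SEEK(f, off, whence) fseeko((f), (off_t)(off), (whence))
#  define FILE_TELL(f)              ((long long)ftello(f))
#endif

/* Returns the size of an open file and rewinds it, or -1 on error. */
static long long file_size(FILE* f)
{
    long long size;
    assert(f != NULL);
    if (FILE_SEEK(f, 0, SEEK_END) != 0) return -1;
    size = FILE_TELL(f);
    if (FILE_SEEK(f, 0, SEEK_SET) != 0) return -1;
    return size;
}
```

Building with `-DLIBC_NO_FSEEKO` switches the wrappers to the fallback without touching the callers.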
Do some include shuffling for `**.h` files within lib, programs, tests, and zlibWrapper.
`lib/legacy` and `lib/deprecated` are untouched.
`#include`s within `extern "C"` blocks in .cpp files are untouched.
todo: shuffling for `xxhash.h`
Doing this check with a direct c++ snippet is prone to portability problems:
- \043 is not portable between shells: dash expands it to #,
bash does not;
- using # directly works with make 4.3 but does not with make 4.2.
Let's just use the c++ version that covers both the code and the gtest.
ZSTD_resetDStream() is deprecated and replaced by ZSTD_DCtx_reset().
This removes deprecation warnings from the kernel build.
This change is a no-op, see the docs suggesting this replacement.
fcbf2fde9a/lib/zstd.h (L2655-L2663)
To the best of my knowledge:
* `_WIN32` and `_WIN64` are defined by the compiler,
* `WIN32` and `WIN64` are defined by the user, to indicate whatever
the user chooses them to indicate. They mean 32-bit and 64-bit Windows
compilation by convention only.
See:
https://accu.org/journals/overload/24/132/wilson_2223/
Windows compilers in general, and MSVC in particular, have been defining
`_WIN32` and `_WIN64` for a long time, provably at least since Visual Studio
2015, and in practice as early as in the days of 16-bit Windows.
See:
https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-140
https://learn.microsoft.com/en-us/windows/win32/winprog64/the-tools
Tests used to be inconsistent, sometimes testing `_WIN32`, sometimes
`_WIN32` and `WIN32`. This brings consistency to Windows detection.
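For reference, detection based only on the compiler-defined macro looks like the sketch below. `PLATFORM_NAME` and `platform_name` are hypothetical names used for illustration.

```c
#include <assert.h>
#include <string.h>

/* `_WIN32` is predefined by the compiler for both 32-bit and 64-bit
 * Windows targets, so no user-defined `WIN32` is needed. */
#if defined(_WIN32)
#  define PLATFORM_NAME "windows"
#else
#  define PLATFORM_NAME "other"
#endif

static const char* platform_name(void)
{
    return PLATFORM_NAME;
}
```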
This does the following:
1. Compress test data into multiple frames
2. Perform a series of small decompressions and seeks forward, checking
that compressed data wasn't reread unnecessarily.
3. Perform some seeks forward and backward to ensure correctness.
When decompressing a seekable file, if seeking forward within
a frame (by issuing multiple ZSTD_seekable_decompress calls
with a small gap between them), the frame will be unnecessarily
reread from the beginning. This patch makes it continue using
the current frame data and simply skip over the unneeded bytes.
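The decision this patch changes can be modelled roughly as below. This is a simplified sketch, not the seekable decompressor's code: `reader_t` and `seek_to` are hypothetical, and the actual decoding is reduced to bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    unsigned cur_frame;           /* frame currently being decoded */
    unsigned long long decoded;   /* bytes of cur_frame decoded so far */
    unsigned restarts;            /* times a frame was restarted from its beginning */
} reader_t;

/* Positions the reader at `offset` within `frame` and returns how many
 * bytes must be decoded and discarded to get there. */
static unsigned long long seek_to(reader_t* r, unsigned frame,
                                  unsigned long long offset)
{
    unsigned long long skip;
    assert(r != NULL);

    if (frame != r->cur_frame || offset < r->decoded) {
        /* Different frame, or a backward seek: start the frame over. */
        r->cur_frame = frame;
        r->decoded = 0;
        r->restarts++;
    }
    /* Forward seek within the current frame: keep the decoder state
     * and only skip the gap. */
    skip = offset - r->decoded;
    r->decoded = offset;
    return skip;
}
```

With the reader at byte 150 of a frame, seeking to byte 200 of the same frame costs a 50-byte skip and no restart; seeking back to byte 10 restarts the frame.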
Rather than remove the flag entirely, as proposed in #3499, this commit uses
the newest C++ standard the compiler supports. This retains the selection of
using only standardized features (excluding GNU extensions) and keeps the
recency requirements of the codebase explicit.
Tested with various versions of `g++` and `clang++`.
As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186,
seekable format ingestion speed can be particularly slow
when selected `FRAME_SIZE` is very small,
especially in combination with the recent row_hash compression mode.
The specific scenario mentioned was `pijul`,
using frame sizes of 256 bytes and level 10.
This is improved in this PR,
by providing approximate parameter adaptation to the compression process.
Tested locally on an M1 laptop,
ingestion of `enwik8` using `pijul` parameters
went from 35 sec (before this PR) to 2.5 sec (with this PR).
For the specific corner case of a file full of zeroes,
this is even more pronounced, going from 45 sec to 0.5 sec.
These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically.
The `seekable_compress` test program has been updated to allow setting the compression level,
in order to produce these performance results.
* Fixes zstd-dll build (https://github.com/facebook/zstd/issues/3492):
- Adds pool.o and threading.o dependency to the zstd-dll target
- Moves custom allocation functions into header to avoid needing to add dependency on common.o
- Adds test target for zstd-dll
- Adds GitHub workflow that builds zstd-dll