lib/zstd

mirror of https://github.com/facebook/zstd.git synced 2025-07-29 11:21:22 +03:00

Author	SHA1	Message	Date
shixuantong	de8d9e8914	Fix several locations with potential memory leak	2025-06-09 21:23:23 +08:00
Dave Vasilevsky	448a09ff78	seekable_format: Fix conversion warnings in parallel_compression	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	13cb7a10ae	seekable_format: Add test for parallel_compression memory usage Use ulimit to fail the test if we use O(filesize) memory, rather than O(threads).	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	01c973de8d	seekable_format: Fix race in parallel_processing There was no memory barrier between writing and reading `done`, which would allow reordering to cause races. With so little data to handle after each job completes, we might as well just join.	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	6fc8455a72	seekable_format: Cleanup POOL in parallel_compression	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	2d4cff69c4	seekable_format: Make parallel_compression use memory properly Previously, parallel_compression would only handle each job's results after ALL jobs were successfully queued. This caused all src/dst buffers to remain in memory until then! It also polled to check whether a job completed, which is racy without any memory barrier. Now, we flush results as a side effect of completing a job. Completed frames are placed in an ordered linked-list, and any eligible frames are flushed. This may be zero or multiple frames, depending on the order in which jobs finish. This design also makes it simple to support streaming input, so that is now available. Just pass `-` as the filename, and stdin/stdout will be used for I/O.	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	f5b6531902	seekable_format: Link against multi-threaded libzstd.a Some of these examples are intended to be parallel, and don't make sense to link against single-threaded libzstd. The filename of mt and nomt libzstd are identical, so it's still possible to link against the single-threaded one, just harder.	2025-05-07 22:01:49 -07:00
Dave Vasilevsky	6b0039abcf	seekable_format: Build with $(MAKE) This passes make flags, such as `-jN` for building in parallel, to the underlying make.	2025-05-07 22:01:49 -07:00
Victor Zhang	a610550e2c	Merge pull request #4218 from facebook/externC Move #includes out of `extern "C"` blocks	2025-01-07 10:06:08 -08:00
Victor Zhang	6b046f5841	PR feedback	2025-01-02 15:05:58 -08:00
Victor Zhang	54c3d998a0	Support for libc variants without fseeko/ftello Some older Android libc implementations don't support `fseeko` or `ftello`. This commit adds a new compile-time macro `LIBC_NO_FSEEKO` as well as a usage in CMake for old Android APIs.	2025-01-02 14:02:10 -08:00
Victor Zhang	fc726da774	Move #includes out of `extern "C"` blocks Do some include shuffling for `**.h` files within lib, programs, tests, and zlibWrapper. `lib/legacy` and `lib/deprecated` are untouched. `#include`s within `extern "C"` blocks in .cpp files are untouched. todo: shuffling for `xxhash.h`	2024-12-17 17:55:07 -08:00
Robert Rose	b683c0dbe2	prevent possible segfault when creating seek table Add a check whether the seek table of a `ZSTD_seekable` is initialized before creating a new seek table from it. Return `NULL`, if the check fails.	2024-11-25 08:57:25 +01:00
Dimitri Papadopoulos	585aaa0ed3	Do not test WIN32, instead test _WIN32 To the best of my knowledge: * `_WIN32` and `_WIN64` are defined by the compiler, * `WIN32` and `WIN64` are defined by the user, to indicate whatever the user chooses them to indicate. They mean 32-bit and 64-bit Windows compilation by convention only. See: https://accu.org/journals/overload/24/132/wilson_2223/ Windows compilers in general, and MSVC in particular, have been defining `_WIN32` and `_WIN64` for a long time, provably at least since Visual Studio 2015, and in practice as early as in the days of 16-bit Windows. See: https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-140 https://learn.microsoft.com/en-us/windows/win32/winprog64/the-tools Tests used to be inconsistent, sometimes testing `_WIN32`, sometimes `_WIN32` and `WIN32`. This brings consistency to Windows detection.	2023-09-23 19:03:18 +02:00
Yoni Gilad	649a9c85c3	seekable_format: Add unit test for multiple decompress calls This does the following: 1. Compress test data into multiple frames 2. Perform a series of small decompressions and seeks forward, checking that compressed data wasn't reread unnecessarily. 3. Perform some seeks forward and backward to ensure correctness.	2023-03-29 21:35:52 -07:00
Yoni Gilad	618bf84e0d	seekable_format: Prevent rereading frame when seeking forward When decompressing a seekable file, if seeking forward within a frame (by issuing multiple ZSTD_seekable_decompress calls with a small gap between them), the frame will be unnecessarily reread from the beginning. This patch makes it continue using the current frame data and simply skip over the unneeded bytes.	2023-03-29 21:24:12 -07:00
Yann Collet	dd8cb5a0f1	added documentation for the seekable format and notably provide additional context for the Maximum Frame Size parameter. requested by @P-E-Meunier at `1df9f36c6c (commitcomment-103856979)`.	2023-03-10 15:54:31 -08:00
Yann Collet	1df9f36c6c	Improved seekable format ingestion speed for small frame size As reported by @P-E-Meunier in https://github.com/facebook/zstd/issues/2662#issuecomment-1443836186, seekable format ingestion speed can be particularly slow when selected `FRAME_SIZE` is very small, especially in combination with the recent row_hash compression mode. The specific scenario mentioned was `pijul`, using frame sizes of 256 bytes and level 10. This is improved in this PR, by providing approximate parameter adaptation to the compression process. Tested locally on a M1 laptop, ingestion of `enwik8` using `pijul` parameters went from 35sec. (before this PR) to 2.5sec (with this PR). For the specific corner case of a file full of zeroes, this is even more pronounced, going from 45sec. to 0.5sec. These benefits are unrelated to (and come on top of) other improvement efforts currently being made by @yoniko for the row_hash compression method specifically. The `seekable_compress` test program has been updated to allow setting the compression level, in order to produce these performance results.	2023-03-09 18:00:30 -08:00
Danielle Rozenblit	63042f1f11	fix 32bit build errors in zstd seekable	2023-01-24 15:53:59 -08:00
W. Felix Handte	5d693cc38c	Coalesce Almost All Copyright Notices to Standard Phrasing ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright ./Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0.c lib/legacy/zstd_v0*.h nano ./programs/windres/zstd.rc nano ./build/VS2010/zstd/zstd.rc nano ./build/VS2010/libzstd-dll/libzstd-dll.rc ```	2022-12-20 12:52:34 -05:00
W. Felix Handte	7f12f24cf4	Rewrite Copyright Date Ranges from `-present` to `-2022` Apparently it's better. Somehow. ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done g co HEAD -- build/meson/ ```	2022-12-20 12:44:56 -05:00
W. Felix Handte	8927f985ff	Update Copyright Headers 'Facebook' -> 'Meta Platforms' ``` for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f); do sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f; done ```	2022-12-20 12:37:57 -05:00
daniellerozenblit	e2fc93340f	Merge branch 'dev' into http-to-https	2022-12-15 10:46:13 -05:00
Danielle Rozenblit	4dffc35f2e	Convert references to https from http	2022-12-14 06:58:35 -08:00
Danielle Rozenblit	aece0f258a	free memory in test case	2022-12-13 08:15:16 -08:00
yhoogstrate	f17652931c	seekable_format no header when compressing empty string to stream	2022-02-08 14:06:00 +01:00
Yann Collet	03903f5701	fixed minor compression difference in btlazy2 subtle dependency on sumtype numeric representation	2021-12-29 18:51:03 -08:00
sen	d6be7659b0	Add seekable roundtrip fuzzer (#2617)	2021-05-06 10:08:21 -04:00
Azat Khuzhin	53a60e98de	seekable decompression fixes (#2594) * seekable_format: fix from-file reading (not in-memory) It tries to check the buffer boundary, but there is no buffer for from-file reading. * seekable_decompression: break when ZSTD_seekable_decompress() returns zero * seekable_decompression_mem: break when ZSTD_seekable_decompress() returns zero * seekable_format: cap the offset+len up to the last dOffset This allows reading the whole file without getting a corruption error if the offset is more than the data left in the file, i.e.: $ ./seekable_compression seekable_compression.c 8192 | head $ zstd -cdq seekable_compression.c.zst | wc -c 4737 Before this patch: $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c ZSTD_seekable_decompress() error : Corrupted block detected 0 After: $ ./seekable_decompression seekable_compression.c.zst 0 10000000 | wc -c 4737	2021-05-05 10:05:41 -04:00
Fotis Xenakis	3c6f5d5eca	Fix seekable test to provide valid descriptor	2021-03-13 00:00:08 +02:00
Fotis Xenakis	21697b9c9e	Fix seek table descriptor check when loading	2021-03-12 23:07:15 +02:00
Yann Collet	2fa4c8c405	added code comments for new API ZSTD_seekTable	2021-03-03 22:54:04 -08:00
Yann Collet	6e390ced1f	Merge branch 'seekTable' of github.com:facebook/zstd into seekTable	2021-03-03 22:44:38 -08:00
Yann Collet	16ec1cf355	added test case for seekTable API and simple roundtrip test	2021-03-03 18:56:23 -08:00
Yann Collet	713d4953f7	fixed gcc-7 conversion warning	2021-03-03 18:00:41 -08:00
Yann Collet	6c0bfc468c	fixed wrong assert condition	2021-03-03 15:30:55 -08:00
Yann Collet	a1d7b9d654	fixed gcc conversion warnings	2021-03-03 15:17:12 -08:00
Yann Collet	24d59a655d	Merge branch 'dev' into seekTable	2021-03-03 15:08:40 -08:00
Yann Collet	ac95a30455	various minor style fixes	2021-03-02 16:03:18 -08:00
Yann Collet	029f974ddc	strengthen compilation flags	2021-03-02 15:43:52 -08:00
Yann Collet	c7e42e147b	fixed const guarantees read-only objects are properly const-ified in parameters	2021-03-02 15:24:30 -08:00
Yann Collet	a80b10f5e6	fix potential leak on exit	2021-03-02 15:03:37 -08:00
Sen Huang	527a20c3cd	Fix seekable decompress hanging	2021-03-02 14:30:03 -08:00
Martin Lindsay	3cbdbb888b	ZSTD_seekable_decompress() example that hangs.	2021-03-02 14:25:17 -08:00
Yann Collet	ce6d1b9376	Merge pull request #2113 from mdittmer/expose-seek-table [contrib] Support seek table-only API	2021-03-02 10:50:47 -08:00
Stephen Kitt	adb54293ab	Stop using deprecated reset?Stream functions These are replaced by the corresponding context resets. When converting resetCStream, CCtx_setPledgedSrcSize isn't called if the source size is "unknown". This helps reduce the reliance on "static only" symbols, as well as reducing the use of deprecated functions. Signed-off-by: Stephen Kitt <steve@sk2.org>	2021-02-23 21:56:01 +01:00
Yann Collet	0b39531d75	moving all references to `release` branch was previously `master`	2020-12-16 23:00:35 -08:00
senhuang42	26f89d47aa	Clean up makefile for seekable tests	2020-12-03 09:25:04 -05:00
senhuang42	152b55879c	Add unit tests to seekable	2020-12-02 15:33:12 -05:00
senhuang42	9db49a3989	Add a forward progress requirement bound to seekable streaming decompression	2020-12-02 12:24:16 -05:00
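
Commit 2d4cff69c4 above describes placing completed frames in an ordered linked list and flushing any eligible frames whenever a job finishes. The following is a minimal single-threaded sketch of that idea, not the code from `parallel_compression`: every name here (`Frame`, `FlushQueue`, `newFrame`, `completeFrame`) is hypothetical, the "write" is replaced by recording the frame index in an array, and a threaded version would presumably need to hold a lock around the call.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical types for illustration only. */
typedef struct Frame {
    unsigned index;        /* position of this frame in the output stream */
    struct Frame* next;
} Frame;

typedef struct {
    Frame* head;           /* completed but unflushed frames, sorted by index */
    unsigned nextToFlush;  /* index of the next frame allowed to be written */
} FlushQueue;

/* Allocate a frame record; a real job would also carry its dst buffer. */
static Frame* newFrame(unsigned index)
{
    Frame* f = malloc(sizeof(*f));
    assert(f != NULL);
    f->index = index;
    f->next = NULL;
    return f;
}

/* Called when a job completes: insert the frame in index order, then flush
 * every frame that is now contiguous with what has already been written.
 * Returns how many frames were flushed (possibly zero, possibly several). */
static unsigned completeFrame(FlushQueue* q, Frame* f,
                              unsigned* out, unsigned* outLen)
{
    unsigned flushed = 0;
    Frame** p = &q->head;
    assert(f != NULL);

    /* Sorted insert. */
    while (*p != NULL && (*p)->index < f->index) p = &(*p)->next;
    f->next = *p;
    *p = f;

    /* Flush the contiguous prefix. */
    while (q->head != NULL && q->head->index == q->nextToFlush) {
        Frame* done = q->head;
        q->head = done->next;
        out[(*outLen)++] = done->index;  /* stand-in for writing the frame */
        free(done);                      /* buffers are released right away */
        q->nextToFlush++;
        flushed++;
    }
    return flushed;
}
```

If frames complete in the order 2, 0, 1, the first call flushes nothing, the second flushes frame 0, and the third flushes frames 1 and 2 together, so a frame's memory is released as soon as everything before it has been written.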


77 Commits