1
0
mirror of https://github.com/facebook/zstd.git synced 2025-09-11 11:51:02 +03:00
Commit Graph

2480 Commits

Author SHA1 Message Date
Yann Collet
6b11fc436c fix issue with incompressible sections 2024-02-23 14:53:56 -08:00
Yann Collet
cc4530924b speed optimized version of targetCBlockSize
note that the size of individual compressed blocks will vary more wildly with this modification.
But it seems good enough for a first test, and fix the speed regression issue.
Further refinements can be attempted later.
2024-02-23 14:03:26 -08:00
Christoph Grüninger
b921f1aad6 Reduce scope of variables
This improves readability, keeps variables local, and
prevents the unintended use (e.g. typo) later on.
Found by Cppcheck (variableScope)
2024-02-11 22:00:03 +01:00
Yann Collet
b0e8580dc7 fix fuzz issue 5131069967892480 2024-02-08 16:38:20 -08:00
Yann Collet
22574d848d fix issue 5921623844651008
ossfuzz managed to create a scenario which triggers an `assert`.
This fixes it, by giving +1 more space for the backward search pass.
2024-02-06 13:01:14 -08:00
Yann Collet
b88c593d8f added or updated code comments
as suggested by @terrelln,
to make the code of the optimal parser a bit more understandable.
2024-02-05 18:32:25 -08:00
Yann Collet
6c35fb2e8c fix msan warnings 2024-02-05 01:21:06 -08:00
Yann Collet
641749fc09 fix uasan dictionary_stream_round_trip fuzz test 2024-02-05 00:36:10 -08:00
Yann Collet
fe2e2ad36d use ZSTD_memcpy()
which can be redirected in Linux kernel mode
2024-02-03 19:57:38 -08:00
Yann Collet
0ae21d8c31 removed trace control 2024-02-03 19:32:59 -08:00
Yann Collet
5474edbe60 fixed wrong assert
by introducing ZSTD_OPT_SIZE
2024-02-03 19:31:53 -08:00
Yann Collet
e5af24c5fa fixed wrong assert 2024-02-03 17:48:29 -08:00
Yann Collet
8168a451e5 minor optimization, mostly for clarity 2024-02-03 17:26:47 -08:00
Yann Collet
d31018e223 finally, a version that generalizes well
While it's not always strictly a win,
it's a win for files that see a noticeably compression ratio increase,
while it's a very small noise for other files.

Downside is, this patch is less efficient for 32-bit arrays of integer
than the previous patch which was introducing losses for other files,
but it's still a net improvement on this scenario.
2024-02-03 14:26:18 -08:00
Yann Collet
0166b2ba80 modification: differentiate literal update at pos+1
helps when litlen==1 is cheaper than litlen==0

works great on pathological arr[u32] examples
but doesn't generalize well on other files.

silesia/x-ray is amoung the most negatively affected ones.
2024-01-31 11:20:43 -08:00
Yann Collet
4683667785 refactor optimal parser
store stretches as intermediate solution instead of sequences.
makes it possible to link a solution to a predecessor.
2024-01-31 02:51:46 -08:00
Yann Collet
de10f56be2 improve high compression ratio for file like #3793
this works great for 32-bit arrays,
notably the synthetic ones, with extreme regularity,
unfortunately, it's not universal,
and in some cases, it's a loss.
Crucially, on average, it's a loss on silesia.
The most negatively impacted file is x-ray.
It deserves an investigation before suggesting it as an evolution.
2024-01-29 23:25:24 -08:00
Yann Collet
a07cae3976 Merge pull request #3847 from michoecho/fix_nullptr_deref_in_createCDict
Fix a nullptr dereference in ZSTD_createCDict_advanced2()
2023-12-30 13:23:39 -08:00
Elliot Gorokhovsky
c6cabf9441 Make offload API compatible with static CCtx (#3854)
* Add ZSTD_CCtxParams_registerSequenceProducer() to public API

* add unit test

* add docs to zstd.h

* nits

* Add ZSTDLIB_STATIC_API prefix

* Add asserts
2023-12-28 14:48:46 -05:00
Michał Chojnowski
9a3b17c4d6 Fix a nullptr dereference in ZSTD_createCDict_advanced2()
If the relevant allocation returns NULL, ZSTD_createCDict_advanced_internal()
will return NULL. But ZSTD_createCDict_advanced2() doesn't check for
this and attempts to use the returned pointer anyway, which leads to
a segfault.
2023-12-16 13:02:18 +01:00
Elliot Gorokhovsky
d151a4880b Move offload API params into ZSTD_CCtx_params 2023-11-27 08:11:01 -08:00
Elliot Gorokhovsky
809c7eb6bf Refactor ZSTD_sequenceProducer_F typedef to ZSTD_sequenceProducer_F* 2023-11-27 06:56:37 -08:00
Nick Terrell
8193250615 Modernize macros to use do { } while (0)
This PR introduces no functional changes. It attempts to change all
macros currently using `{ }` or some variant of that to to
`do { } while (0)`, and introduces trailing `;` where necessary.
There were no bugs found during this migration.

The bug in Visual Studios warning on this has been fixed since VS2015.
Additionally, we have several instances of `do { } while (0)` which have
been present for several releases, so we don't have to worry about
breaking peoples builds.

Fixes Issue #3830.
2023-11-21 20:05:17 -05:00
Yann Collet
d988e00a7f baby-step towards solving flexArray issue #3785
the flexArray in structure FSE_DecompressWksp
is just a way to derive a pointer easily,
without risk/complexity of calculating it manually.

Not sure if this change is good enough to avoid ubsan warnings though.
2023-10-18 16:21:39 -07:00
Yann Collet
6bb1688c1a extended the fix to ZSTDMT's Buffer Pool 2023-10-08 00:25:17 -07:00
Yann Collet
ea4027c003 removed unused macro constant 2023-10-07 23:32:22 -07:00
Yann Collet
c87ad5bdb5 fixes suggested by @ebiggers 2023-10-07 23:29:42 -07:00
Yann Collet
e8ff7d18eb removed FlexArray pattern from CCtxPool
within ZSTDMT_.
This pattern is flagged by less forgiving variants of ubsan
notably used during compilation of the Linux Kernel.

There are 2 other places in the code where this pattern is used.
This fixes just one of them.
2023-10-07 21:30:08 -07:00
Yann Collet
c1e588fcb4 Merge pull request #3771 from DimitriPapadopoulos/codespell
Fix new typos found by codespell
2023-10-07 19:29:41 -07:00
Nick Terrell
43118da8a7 Stop suppressing pointer-overflow UBSAN errors
* Remove all pointer-overflow suppressions from our UBSAN builds/tests.
* Add `ZSTD_ALLOW_POINTER_OVERFLOW_ATTR` macro to suppress
  pointer-overflow at a per-function level. This is a superior approach
  because it also applies to users who build zstd with UBSAN.
* Add `ZSTD_wrappedPtr{Diff,Add,Sub}()` that use these suppressions.
  The end goal is to only tag these functions with
  `ZSTD_ALLOW_POINTER_OVERFLOW`. But we can start by annoting functions
  that rely on pointer overflow, and gradually transition to using
  these.
* Add `ZSTD_maybeNullPtrAdd()` to simplify pointer addition when the
  pointer may be `NULL`.
* Fix all the fuzzer issues that came up. I'm sure there will be a lot
  more, but these are the ones that came up within a few minutes of
  running the fuzzers, and while running GitHub CI.
2023-09-28 17:35:05 -04:00
Dimitri Papadopoulos
fe34776c20 Fix new typos found by codespell 2023-09-23 18:56:01 +02:00
Nick Terrell
396ef5b434 Fix & refactor Huffman repeat tables for dictionaries
The Huffman repeat mode checker assumed that the CTable was zeroed in the region `[maxSymbolValue + 1, 256)`.
This assumption didn't hold for tables built in the dictionaries, because it didn't go through the same codepath.

Since this code was originally written, we added a header to the CTable that specifies the `tableLog`.
Add `maxSymbolValue` to that header, and check that the table's `maxSymbolValue` is at least the block's `maxSymbolValue`.

This solution is cleaner because we write this header for every CTable we build, so it can't be missed in any code path.

Credit to OSS-Fuzz
2023-08-25 13:21:58 -04:00
Nick Terrell
bd02c9be6e No longer reject dictionaries with literals maxSymbolValue < 255
We already have logic in our Huffman encoder to validate Huffman tables with missing symbols.
We use this for higher compression levels to re-use the previous blocks statistics, or when the dictionaries table has zero-weighted symbols.
This check was leftover as an oversight from before we added validation for Huffman tables.

I validated that the `dictionary_loader` fuzzer has coverage of every line in the `ZSTD_loadCEntropy()` function to validate that it is correctly testing this function.
2023-08-22 13:22:35 -04:00
W. Felix Handte
9987d2f594 Unpoison Workspace Memory Before Freeing to Custom Free
MSAN is hooked into the system malloc, but when the user provides a custom
allocator, it may not provide the same cleansing behavior. So if we leave
memory poisoned and return it to the user's allocator, where it is re-used
elsewhere, our poisoning can blow up in some other context.
2023-08-16 12:09:12 -04:00
W. Felix Handte
5f5bdc1e5d Easy: Move Helper Functions Up 2023-08-16 12:08:52 -04:00
Jacob Greenfield
55ff3e4e17 Save one byte on the frame epilogue 2023-07-20 18:59:44 -04:00
Elliot Gorokhovsky
c6a888c073 suppress false error message in LDM mode 2023-06-21 19:19:02 -07:00
Yann Collet
1f83b7cfc4 fix a minor inefficiency in compress_superblock
and in `decodecorpus`:
the specific case `nbSeq=127` can be represented using the 1-byte format.
Note that both the 1-byte and the 2-bytes formats are valid to represent this case,
so there was no "error", produced data remains valid,
it's just that the 1-byte format is more efficient.

fix #3667

Credit to @ip7z for finding this issue.
2023-06-05 09:51:52 -07:00
Nick Terrell
d01a2c6929 Fix UBSAN issue (zero addition to NULL)
Fix UBSAN issue that came up internally.
2023-05-26 13:43:47 -07:00
W. Felix Handte
1b65803fe7 Reorder Definitions in zstd_opt.c to Group Under Macro Guards (Slightly) 2023-05-22 12:41:48 -04:00
W. Felix Handte
59c7b2a492 Reorder Definitions in zstd_lazy.c to Group Under Macro Guards 2023-05-22 12:37:03 -04:00
W. Felix Handte
eb9227935e Also Reorganize Zstd Opt Declarations 2023-05-04 12:18:58 -04:00
W. Felix Handte
d09f195ceb Remove blockCompressor NULL Checks 2023-05-04 12:18:58 -04:00
W. Felix Handte
b7add1dd67 Abort if Unsupported Parameters Used 2023-05-04 12:18:58 -04:00
W. Felix Handte
f242f5be8f Re-Order Lazy Declarations; Minimize ifndefs 2023-05-04 12:18:58 -04:00
W. Felix Handte
39b7946b95 Define Macros for Possibly-Present Functions; Use Them Rather than Ifdef Guards 2023-05-04 12:18:58 -04:00
W. Felix Handte
b12e8cb3e7 Merge Ultra and Ultra2 Exclusion
Ultra2 does not exist for dict compression, and so uses ultra. So ultra must
be present if ultra2 is.
2023-05-04 12:18:58 -04:00
W. Felix Handte
6761e1c949 Tweak Ultra/Opt Guards 2023-05-04 12:18:58 -04:00
W. Felix Handte
5a75956001 Adjust Strategy in CParams to Avoid Using Excluded Block Compressors 2023-05-04 12:18:58 -04:00
W. Felix Handte
50cdf84f58 Macro-Exclude Block Compressors from Declaration/Definition 2023-05-04 12:18:58 -04:00