mirror of https://github.com/facebook/zstd.git synced 2025-07-02 20:22:31 +03:00
Commit Graph

368 Commits

Author SHA1 Message Date
04a2a0219c update type names
naming convention: Type names should start with a Capital letter (after the prefix)
2024-12-29 14:25:33 -08:00
b7a9e69d8d added parameter litCapacity
to ZSTD_compressSequencesAndLiterals()
to enforce the litCapacity >= litSize+8 condition.
2024-12-20 10:37:01 -08:00
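The condition named in this commit can be pictured as a simple precondition check. The sketch below is only an illustration: `litBufferIsLargeEnough` is a hypothetical helper, not a zstd function, and the commit does not say how the library itself performs the check.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of the zstd API): returns 1 when the
 * literals buffer satisfies litCapacity >= litSize + 8, 0 otherwise.
 * The first test guards against overflow when computing litSize + 8. */
static int litBufferIsLargeEnough(size_t litSize, size_t litCapacity)
{
    if (litSize > SIZE_MAX - 8) return 0;
    return litCapacity >= litSize + 8;
}
```

A caller would presumably allocate at least `litSize + 8` bytes for the literals buffer and pass that allocation size as `litCapacity`.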
76445bb379 add a check, to return an error if Sequence validation is enabled
since ZSTD_compressSequencesAndLiterals() doesn't support it.
2024-12-20 10:37:01 -08:00
b339efff2b add dedicated error code for special case
ZSTD_compressSequencesAndLiterals() cannot produce an uncompressed block
2024-12-20 10:37:00 -08:00
0a54f6f288 ZSTD_compressSequencesAndLiterals requires srcSize as parameter
this makes it possible to adjust windowSize to its tightest.
2024-12-20 10:37:00 -08:00
12c47d3262 improved speed of the Sequences converter 2024-12-20 10:37:00 -08:00
0b013b2688 added unit tests to ZSTD_compressSequencesAndLiterals()
seems to work as expected,
and correctly checks that `litSize` and `srcSize` are exact.
2024-12-20 10:36:58 -08:00
14a21e43b3 produced ZSTD_compressSequencesAndLiterals() as a separate pipeline
only supports explicit delimiter mode, at least for the time being
2024-12-20 10:36:58 -08:00
047db4f1f8 ZSTD_SequenceCopier_f now returns the nb of bytes consumed from input
which feels much more natural
2024-12-20 10:36:58 -08:00
4ef9d7d585 codemod: ZSTD_cParamMode_e -> ZSTD_CParamMode_e 2024-12-20 10:36:58 -08:00
56cfb7816a codemod: ZSTD_paramSwitch_e -> ZSTD_ParamSwitch_e 2024-12-20 10:36:58 -08:00
13b9296d79 minor simplification 2024-12-20 10:36:58 -08:00
c97522f7fb codemod: ZSTD_sequenceFormat_e -> ZSTD_SequenceFormat_e
since it's a type name.

Note: in contrast with previous names, this one is on the Public API side.
So there is a #define, so that existing programs using ZSTD_sequenceFormat_e still work.
2024-12-20 10:36:56 -08:00
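The compatibility `#define` described in this commit follows a common renaming pattern, sketched below with placeholder names. The `DEMO_` type, its enumerators, and `legacyCaller` are hypothetical and are not the actual zstd declarations.

```c
/* New, capitalized type name (placeholder enumerators). */
typedef enum {
    DEMO_fmt_a = 0,
    DEMO_fmt_b = 1
} DEMO_SequenceFormat_e;

/* Backward-compatibility alias: code written against the old
 * lowercase name keeps compiling unchanged. */
#define DEMO_sequenceFormat_e DEMO_SequenceFormat_e

/* Stand-in for an existing program that still uses the old name. */
static int legacyCaller(DEMO_sequenceFormat_e f)
{
    return (int)f;
}
```

Because the old name is a preprocessor alias, it expands to the new type before compilation, so both spellings refer to the same type.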
b4a40a845f move Sequences definition to zstd_compress_internal.h
they should not be in common/zstd_internal.h,
since these definitions are not shared beyond lib/compress/.
2024-12-20 10:36:55 -08:00
fcf88ae39b Fix new typos found by codespell 2024-11-26 11:15:39 +01:00
2e02cd330d inform manual users that it's automatically generated
suggested by @Eugeny1
2024-10-31 15:06:48 -07:00
d9553fd218 elevated ZSTD_getErrorCode() to stable status
answering #4183
2024-10-31 14:15:50 -07:00
3e7c66acd1 added ascending order example 2024-10-09 01:06:24 -07:00
3b343dcfb1 refactor huffman prefix code paragraph 2024-10-07 17:15:07 -07:00
a8b86d024a refactor documentation of the FSE decoding table build process 2024-10-02 23:09:06 -07:00
d2212c680a Merge pull request #4013 from elasota/spec-clarify-offset-code-overflow
Specify that decoders may reject non-zero probabilities for larger offset codes than implementation supports
2024-09-27 13:42:32 -07:00
3de0541aef Merge pull request #4079 from elasota/truncated-huff-state-error
Throw error if Huffman weight initial states are truncated
2024-06-30 16:17:03 -07:00
0938308ff6 Throw error if Huffman weight initial states are truncated 2024-06-20 17:46:16 -04:00
2d736d9c50 Fix new typos found by codespell 2024-06-20 20:12:16 +02:00
f19c98228f Fix $filter and Msys/Cygwin
- switched the pattern and input of $filter into the right places
- added pattern wildcard to MSYS_NT & CYGWIN_NT as they change with windows versions
- correctly identify MSYS2, even in an env like MINGW64
2024-06-05 18:37:27 +02:00
c54f4783d0 Specify that decoders may reject non-zero probabilities for larger offset codes than supported by the implementation 2024-04-01 20:13:48 -04:00
8cff66f2f5 Remove text specifying probability overflow as invalid, since the variable-size value encoding scheme makes this impossible. 2024-04-01 20:08:42 -04:00
c6e5257240 Merge pull request #3977 from facebook/doc_advanced
Doc update
2024-03-21 12:33:15 -07:00
741b87bbe1 Fuzzing and bugfixes for magicless-format decoding (#3976)
* fuzzing and bugfixes for magicless format

* reset dctx before each decompression

* do not memcmp empty buffers

* nit: decompressor errata
2024-03-20 19:22:34 -04:00
c5da438dc0 fix typo 2024-03-18 12:33:22 -07:00
3d18d9a9ce updated API manual 2024-03-18 12:30:54 -07:00
686e7e4b4b updated version to v1.5.6 2024-03-14 15:38:14 -07:00
eb5f7a7fa2 produced golden sample for the offset==0 decoder test
it is correctly detected as corrupted by the new version,
and is accepted (changed into offset==1) by older versions.

updated documentation accordingly, with a hexadecimal representation.
2024-03-09 00:33:44 -08:00
d2f56ba442 update documentation 2024-03-08 15:55:30 -08:00
e127139ceb Merge pull request #3824 from elasota/specify-zero-offset
Specify offset 0 as invalid and specify required fixup behavior
2024-03-08 15:25:48 -08:00
478e5fedf9 Merge pull request #3816 from elasota/fix-state-table
Fix state table formatting
2024-03-08 15:02:00 -08:00
f77f634d41 update API documentation 2024-02-24 01:28:17 -08:00
7971fd16f7 Merge pull request #3817 from elasota/oversized-probs-clarification
Clarify that probability tables must not contain non-zero probabilities for invalid values
2024-01-13 11:37:54 -08:00
f06b18b3ff Specify offset 0 as invalid 2023-12-28 16:47:09 -05:00
05059e5a48 Clarify that there must be at least 2 weights, i.e. encoding all weights as 0 is invalid 2023-11-24 16:49:40 -05:00
dc84e35138 Clarify that the presence of a value with weight 1 is required 2023-11-24 16:49:40 -05:00
c5bf96fb74 Clarify that a non-zero probability for an invalid symbol is invalid 2023-11-13 00:03:56 -05:00
52e41b9ac8 Fix malformed state table 2023-11-09 12:28:21 -05:00
e61e3ff152 Clarify that decoding too many Huffman weights is a failure condition 2023-11-08 20:06:58 -05:00
324cce4996 Add definition of "log2sup" function 2023-10-31 11:45:10 -04:00
b38d87b476 Clarify that the log2 of the largest possible symbol is the maximum number of bits consumed 2023-10-31 01:17:23 -04:00
fe34776c20 Fix new typos found by codespell 2023-09-23 18:56:01 +02:00
3732a08f5b fixed decoder behavior when nbSeqs==0 is encoded using 2 bytes
The sequence section starts with a number, which tells how many sequences are present in the section.
If this number is 0, the section automatically ends.

The number 0 can be represented using either the 1-byte or the 2-byte format.
That's because the 2-byte format fully overlaps the 1-byte format.

However, when 0 is represented using the 2-byte format,
the decoder was expecting the sequence section to continue,
and was looking for FSE tables, which is incorrect.

Fixed this behavior, in both the reference decoder and the educational decoder.

In practice, this behavior never happens,
because the encoder will always select the 1-byte format to represent 0,
since this is more efficient.

Completed the fix with a new golden sample for tests,
a clarification of the specification,
and a decoder errata paragraph.
2023-06-05 16:03:00 -07:00
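The behavior described in this commit can be sketched as a small decoder for the sequence count field. This is a minimal illustration based on the 1-byte, 2-byte, and 3-byte layouts as I understand them from the Zstandard format specification, not the reference decoder's code; `SeqCount`, `readSeqCount`, and `sectionEndsAfterCount` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type, not part of the zstd API. */
typedef struct {
    int    nbSeq;  /* decoded number of sequences, -1 on truncated input */
    size_t hSize;  /* number of bytes used by the count field */
} SeqCount;

/* Decode the sequence count at the start of a sequences section. */
static SeqCount readSeqCount(const uint8_t* p, size_t len)
{
    SeqCount r = { -1, 0 };
    if (len < 1) return r;
    if (p[0] < 128) {                 /* 1-byte format: 0..127 */
        r.nbSeq = p[0];
        r.hSize = 1;
        return r;
    }
    if (p[0] < 255) {                 /* 2-byte format, overlaps 0..127 */
        if (len < 2) return r;
        r.nbSeq = ((p[0] - 0x80) << 8) + p[1];
        r.hSize = 2;
        return r;
    }
    if (len < 3) return r;            /* 3-byte format */
    r.nbSeq = p[1] + (p[2] << 8) + 0x7F00;
    r.hSize = 3;
    return r;
}

/* The section ends right after the count whenever the decoded value
 * is 0, whichever format encoded it. No FSE tables follow. */
static int sectionEndsAfterCount(SeqCount c)
{
    return c.nbSeq == 0;
}
```

The point of the fix is the last function: the end-of-section decision depends on the decoded value, not on how many bytes were used to encode it.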
8030342eea Merge pull request #3659 from facebook/fixHarness
Fixed a bug in the educational decoder
2023-06-05 15:03:14 -04:00
1f83b7cfc4 fix a minor inefficiency in compress_superblock
and in `decodecorpus`:
the specific case `nbSeq=127` can be represented using the 1-byte format.
Note that both the 1-byte and the 2-bytes formats are valid to represent this case,
so there was no "error", produced data remains valid,
it's just that the 1-byte format is more efficient.

fix #3667

Credit to @ip7z for finding this issue.
2023-06-05 09:51:52 -07:00
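The encoder-side choice discussed in this commit can be sketched as follows. This is a minimal illustration based on my reading of the Zstandard format specification, not the code from `compress_superblock`; `writeSeqCount` is a hypothetical name. The relevant boundary is `nbSeq < 128`, so that 127 takes the 1-byte format.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the zstd API.
 * Writes the sequence count using the shortest valid format.
 * Returns the number of bytes written, or 0 if dst is too small
 * or nbSeq cannot be represented. */
static size_t writeSeqCount(uint8_t* dst, size_t cap, unsigned nbSeq)
{
    if (nbSeq < 128) {                     /* 1-byte format covers 0..127 */
        if (cap < 1) return 0;
        dst[0] = (uint8_t)nbSeq;
        return 1;
    }
    if (nbSeq < 0x7F00) {                  /* 2-byte format */
        if (cap < 2) return 0;
        dst[0] = (uint8_t)((nbSeq >> 8) + 0x80);
        dst[1] = (uint8_t)(nbSeq & 0xFF);
        return 2;
    }
    if (nbSeq > 0x7F00u + 0xFFFFu) return 0;   /* beyond the 3-byte range */
    if (cap < 3) return 0;
    dst[0] = 0xFF;                         /* 3-byte format, little-endian */
    dst[1] = (uint8_t)((nbSeq - 0x7F00) & 0xFF);
    dst[2] = (uint8_t)((nbSeq - 0x7F00) >> 8);
    return 3;
}
```

Writing 127 with the 2-byte format would also produce valid data, as the commit notes; selecting the 1-byte format simply saves one byte.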