mirror of https://github.com/facebook/zstd.git synced 2025-09-11 11:51:02 +03:00
Commit Graph

11318 Commits

Author SHA1 Message Date
Yann Collet
ae64545c6b fixed a potential division by 0 in the cli trace unit 2025-08-19 17:13:15 -07:00
Yann Collet
40c285e0ba Merge pull request #4419 from AZero13/patch-1
Check for job before releasing resources
2025-08-19 17:02:48 -07:00
Yann Collet
cfeb29e397 Merge pull request #4462 from facebook/dependabot/github_actions/actions/checkout-5
Bump actions/checkout from 4 to 5
2025-08-18 09:10:13 -07:00
dependabot[bot]
0e69452a30 Bump actions/checkout from 4 to 5
Bumps [actions/checkout](https://github.com/actions/checkout) from 4 to 5.
- [Release notes](https://github.com/actions/checkout/releases)
- [Commits](https://github.com/actions/checkout/compare/v4...v5)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: '5'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-08-18 08:13:07 +00:00
Yann Collet
3991e8ac84 Merge pull request #4459 from Margen67/premake
Remove need for trailing forward slash in dir
2025-08-17 12:14:42 -07:00
Margen67
1e1db47323 Remove need for trailing forward slash in dir 2025-08-17 00:44:39 -07:00
Yann Collet
e128976193 Merge pull request #4448 from Cyan4973/install_oses
regroup list of OSes for install inside common variable
2025-07-28 11:01:58 -08:00
Yann Collet
8bca04ba9f regroup list of OSes for install inside common variable
within lib/install_oses.mk.

fixes #4445
2025-07-28 11:33:22 -07:00
Yann Collet
5b89189741 Merge pull request #4450 from facebook/dependabot/github_actions/github/codeql-action-3.29.4
Bump github/codeql-action from 3.28.9 to 3.29.4
2025-07-28 07:33:09 -08:00
dependabot[bot]
96f316a246 Bump github/codeql-action from 3.28.9 to 3.29.4
Bumps [github/codeql-action](https://github.com/github/codeql-action) from 3.28.9 to 3.29.4.
- [Release notes](https://github.com/github/codeql-action/releases)
- [Changelog](https://github.com/github/codeql-action/blob/main/CHANGELOG.md)
- [Commits](9e8d0789d4...4e828ff8d4)

---
updated-dependencies:
- dependency-name: github/codeql-action
  dependency-version: 3.29.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-28 06:30:43 +00:00
Yann Collet
9bf5d340ae Merge pull request #4447 from facebook/android-cmake
added android cmake build
2025-07-24 10:07:16 -08:00
Yann Collet
34f3a0ab11 Merge pull request #4413 from arpadpanyik-arm/huf_decode2x
AArch64: Enhance struct access in Huffman decode 2X
2025-07-23 15:03:37 -08:00
Yann Collet
6f1cb87ade Merge pull request #4443 from facebook/opt_simplify_4442
simplify sequence resolution in zstd_opt
2025-07-23 15:01:36 -08:00
Yann Collet
3b23f0c673 added android cmake build
is expected to fail, due to #4444
2025-07-23 15:07:20 -07:00
Yann Collet
0055ce7a02 simplify sequence resolution in zstd_opt
initially hinted by @pitaj in #4442
2025-07-18 21:21:47 -07:00
Yann Collet
f9e26bb42b Merge pull request #4394 from AZero13/zstd
Remove redundant setting of allJobsCompleted to 1
2025-07-18 18:55:47 -08:00
Yann Collet
8c651868ff Merge pull request #4418 from arpadpanyik-arm/decode_seq_opt
AArch64: Improve ZSTD_decodeSequence performance
2025-07-18 18:54:49 -08:00
Yann Collet
a1e11db08a Merge pull request #4435 from zijianli1234/dev
add riscv ci
2025-07-18 18:54:24 -08:00
Yann Collet
afa96bbf25 Merge pull request #4429 from arpadpanyik-arm/convertSequences_Neon
Improve speed of ZSTD_compressSequencesAndLiterals using Neon
2025-07-13 23:52:48 -08:00
Yann Collet
c768d7b94b Merge pull request #4436 from facebook/dependabot/github_actions/cygwin/cygwin-install-action-6
Bump cygwin/cygwin-install-action from 5 to 6
2025-07-13 23:52:32 -08:00
dependabot[bot]
3ce4d1cba3 Bump cygwin/cygwin-install-action from 5 to 6
Bumps [cygwin/cygwin-install-action](https://github.com/cygwin/cygwin-install-action) from 5 to 6.
- [Release notes](https://github.com/cygwin/cygwin-install-action/releases)
- [Commits](f61179d722...f200932376)

---
updated-dependencies:
- dependency-name: cygwin/cygwin-install-action
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-07-14 06:27:46 +00:00
Yann Collet
9a41990883 Merge pull request #4433 from facebook/vs2025
removed VS2019 runners
2025-07-12 19:44:28 -08:00
ZijianLi
534860c90b add -DMEM_FORCE_MEMORY_ACCESS=0 in CI RVV test 2025-07-13 10:51:08 +08:00
Yann Collet
7325384a68 removed VS2019 runners
replaced by one vs2025 runner,
which is badly named since it is still running MSVC 2022,
but it's a good test that shows that the matrix is able to handle multiple MSVC versions.
2025-07-11 10:29:07 -07:00
Arpad Panyik
703f855734 AArch64: Enable optimized QEMU CI builds
Add missing `-O3` flag to the compilation of AArch64 SVE2 builds
executed by QEMU. This can decrease the CI job runtime considerably.
2025-07-10 18:20:57 +00:00
Arpad Panyik
07cd78d366 AArch64: Add Neon path for convertSequences_noRepcodes
Add a 4-way Neon implementation for the convertSequences_noRepcodes
function. Remove 'static' keywords from all of its implementations to
be able to add unit tests.

Relative performance to Clang-18 using: `./fullbench -b18 -l5 enwik5`

Neoverse-V2   before     after
Clang-18:    100.000%  311.703%
Clang-19:    100.191%  311.714%
Clang-20:    100.181%  311.723%
GCC-13:      107.520%  252.309%
GCC-14:      107.652%  253.158%
GCC-15:      107.674%  253.168%

Cortex-A720   before     after
Clang-18:    100.000%  204.512%
Clang-19:    102.825%  204.600%
Clang-20:    102.807%  204.558%
GCC-13:      110.668%  203.594%
GCC-14:      110.684%  203.978%
GCC-15:      102.864%  204.299%

Co-authored by, Thomas Daubney <Thomas.Daubney@arm.com>
2025-07-10 18:20:57 +00:00
Arpad Panyik
8e4400463a Improve ZSTD_get1BlockSummary
Add a faster scalar implementation of ZSTD_get1BlockSummary which
removes the data dependency of the accumulators in the hot loop to
leverage the superscalar potential of recent out-of-order CPUs.
The new algorithm leverages SWAR (SIMD Within A Register) methodology
to exploit the capabilities of 64-bit architectures. It achieves this
by packing two 32-bit data elements into a single 64-bit register,
enabling parallel operations on these subcomponents while ensuring
that the 32-bit boundaries prevent overflow, thereby optimizing
computational efficiency.

Corresponding unit tests are included.
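
The packing idea described above can be sketched as follows. This is only an illustration, not the code from the patch: `DemoSeq`, `DemoSummary`, `demo_pack` and `demo_summarize` are hypothetical names, and the sketch assumes each of the two sums fits in 32 bits, so the low half never carries into the high half.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a sequence record; not zstd's actual type. */
typedef struct {
    uint32_t litLength;
    uint32_t matchLength;
} DemoSeq;

typedef struct {
    uint32_t litSum;
    uint32_t matchSum;
} DemoSummary;

/* Pack both lengths of one sequence into a single 64-bit word:
 * litLength in bits 0..31, matchLength in bits 32..63. */
static uint64_t demo_pack(const DemoSeq* s)
{
    return (uint64_t)s->litLength | ((uint64_t)s->matchLength << 32);
}

/* Sum both fields at once. Two accumulators are used so that consecutive
 * additions do not depend on each other.
 * Assumption: the total of litLength values stays below 2^32, otherwise
 * the low half would carry into the matchLength half. */
static DemoSummary demo_summarize(const DemoSeq* seqs, size_t nbSeqs)
{
    uint64_t acc0 = 0;
    uint64_t acc1 = 0;
    uint64_t total;
    DemoSummary r;
    size_t i = 0;

    for (; i + 2 <= nbSeqs; i += 2) {
        acc0 += demo_pack(&seqs[i]);
        acc1 += demo_pack(&seqs[i + 1]);
    }
    if (i < nbSeqs) acc0 += demo_pack(&seqs[i]);   /* odd tail element */

    total = acc0 + acc1;
    r.litSum   = (uint32_t)total;          /* low 32 bits  */
    r.matchSum = (uint32_t)(total >> 32);  /* high 32 bits */
    return r;
}
```

Each 64-bit addition updates both sums, which is the SWAR part. Splitting the work across `acc0` and `acc1` is one way to shorten the dependency chain in the loop.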

Relative performance to GCC-13 using: `./fullbench -b19 -l5 enwik5`

Neoverse-V2   before     after
GCC-13:      100.000%  290.527%
GCC-14:      100.000%  291.714%
GCC-15:       99.914%  291.495%
Clang-18:    148.072%  264.524%
Clang-19:    148.075%  264.512%
Clang-20:    148.062%  264.490%

Cortex-A720   before     after
GCC-13:      100.000%  235.261%
GCC-14:      101.064%  234.903%
GCC-15:      112.977%  218.547%
Clang-18:    127.135%  180.359%
Clang-19:    127.149%  180.297%
Clang-20:    127.154%  180.260%

Co-authored by, Thomas Daubney <Thomas.Daubney@arm.com>
2025-07-10 18:20:49 +00:00
ZijianLi
d04e7944dd add compiler version check. 2025-07-07 23:07:39 +08:00
ZijianLi
2c3f23b018 fix dereferencing type-punned pointer error 2025-06-29 15:36:25 +08:00
ZijianLi
40f64f3493 add riscv rvv ci 2025-06-29 15:33:50 +08:00
Yann Collet
1dbc2e0908 Merge pull request #4414 from arpadpanyik-arm/copy8
AArch64: Use better block COPY8
2025-06-25 07:47:01 -04:00
Rose
4efbd56749 Check for job before releasing
ZSTDMT_freeCCtx calls ZSTDMT_releaseAllJobResources. When initialization fails, ZSTDMT_freeCCtx is still called, so ZSTDMT_releaseAllJobResources can run before the jobs exist, resulting in a NULL pointer dereference.
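
A minimal sketch of this kind of guard, assuming a context whose job table may still be NULL after a failed initialization. The `DemoJob` and `DemoCtx` types and both functions are hypothetical and do not mirror the real ZSTDMT structures.

```c
#include <stdlib.h>

/* Hypothetical types, for illustration only. */
typedef struct { void* buffer; } DemoJob;
typedef struct { DemoJob* jobs; unsigned nbJobs; } DemoCtx;

static void demo_releaseAllJobResources(DemoCtx* ctx)
{
    unsigned i;
    /* Guard: after a failed init the job table may never have been allocated. */
    if (ctx->jobs == NULL) return;
    for (i = 0; i < ctx->nbJobs; i++) {
        free(ctx->jobs[i].buffer);
        ctx->jobs[i].buffer = NULL;
    }
}

static void demo_freeCtx(DemoCtx* ctx)
{
    if (ctx == NULL) return;
    demo_releaseAllJobResources(ctx);  /* safe even when ctx->jobs is NULL */
    free(ctx->jobs);
    free(ctx);
}
```

With the guard in place, the free path can be called from any error path without first checking how far initialization got.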
2025-06-24 14:05:08 -04:00
Rose
50f169411b Remove redundant setting of allJobsCompleted to 1
This will do it automatically.
2025-06-24 14:04:21 -04:00
Arpad Panyik
a28e8182b1 AArch64: Improve ZSTD_decodeSequence performance
LLVM's alias-analysis sometimes fails to see that a static-array member
of a struct cannot alias other members. This patch:

- Reduces array accesses via struct indirection to aid load/store alias
  analysis under Clang.
- Converts dynamic array indexing into conditional-move arithmetic,
  eliminating branches and extra loads/stores on out-of-order CPUs.
- Reloads the bitstream only when match-length bits are consumed
  (assuming each reload only needs to happen once per match-length
  read), improving branch-prediction rates.
- Removes the UNLIKELY() hint, which recent compilers already handle
  well without cost.
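
As an illustration of the second bullet, a dynamic lookup such as `table[idx]` over three values can be written as a chain of selects. The function below is a hypothetical sketch, not code from the patch, and whether a compiler lowers it to conditional moves (CSEL on AArch64) depends on the compiler and flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pick one of three values by index without
 * indexing into memory. Each ternary is a candidate for a
 * conditional-select instruction rather than a branch or a load. */
static uint32_t demo_select3(uint32_t r0, uint32_t r1, uint32_t r2, unsigned idx)
{
    uint32_t v = r0;
    assert(idx < 3);
    v = (idx == 1) ? r1 : v;
    v = (idx == 2) ? r2 : v;
    return v;
}
```

Because the three candidates are passed by value, they can stay in registers, which avoids the store-then-load round trip that array indexing may otherwise require.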

Decompression uplifts on a Neoverse V2 system, using Zstd-1.5.8
compiled with "-O3 -march=armv8.2-a+sve2":

                 Clang-19  Clang-20   Clang-*    GCC-14    GCC-15
 1#silesia.tar:  +11.556%  +16.203%   +0.240%   +2.216%   +7.891%
 2#silesia.tar:  +15.493%  +21.140%   -0.041%   +2.850%   +9.926%
 3#silesia.tar:  +16.887%  +22.570%   -0.183%   +3.056%  +10.660%
 4#silesia.tar:  +17.785%  +23.315%   -0.262%   +3.343%  +11.187%
 5#silesia.tar:  +18.125%  +24.175%   -0.466%   +3.350%  +11.228%
 6#silesia.tar:  +17.607%  +23.339%   -0.591%   +3.175%  +10.851%
 7#silesia.tar:  +17.463%  +22.837%   -0.486%   +3.292%  +10.868%

* Requires Clang-21 support from LLVM commit hash
  `a53003fe23cb6c871e72d70ff2d3a075a7490da2`
   (Clang-21 hasn’t been released as of this writing)

Co-authored by:
 David Sherwood, David.Sherwood@arm.com
 Ola Liljedahl, Ola.Liljedahl@arm.com
2025-06-24 12:22:23 +00:00
Arpad Panyik
bd38fc2c5f AArch64: Enhance struct access in Huffman decode 2X
In the multi-stream multi-symbol Huffman decoder GCC generates
suboptimal code - emitting more loads for HUF_DEltX2 struct member
accesses. Forcing it to use 32-bit loads and bit arithmetic to extract
the necessary parts (UBFX) improves the overall decode speed.

Also avoid integer type conversions in the symbol decodes, which
leads to better instruction selection in table lookup accesses.

On AArch64 the decoder no longer runs into register-pressure limits,
so we can simplify the hot path and improve throughput.
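
The 32-bit load plus bit extraction mentioned above can be sketched like this. The `DemoDElt` layout (a 16-bit field followed by two 8-bit fields) and all function names are assumptions made for the example, and the shifts assume a little-endian target.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical 4-byte table entry, used for illustration only. */
typedef struct {
    uint16_t sequence;
    uint8_t  nbBits;
    uint8_t  length;
} DemoDElt;

/* Compile-time check that the entry really is 4 bytes wide. */
typedef char demo_size_check[(sizeof(DemoDElt) == 4) ? 1 : -1];

/* Read the whole entry with a single 32-bit load. */
static uint32_t demo_load32(const DemoDElt* e)
{
    uint32_t w;
    memcpy(&w, e, sizeof w);   /* typically compiled to one load */
    return w;
}

/* Field extraction by shift and mask (little-endian layout assumed).
 * On AArch64 such extractions can map to UBFX. */
static uint32_t demo_sequence(uint32_t w) { return w & 0xFFFFu; }
static uint32_t demo_nbBits(uint32_t w)   { return (w >> 16) & 0xFFu; }
static uint32_t demo_length(uint32_t w)   { return w >> 24; }
```

The point of the pattern is that one load feeds all three fields, instead of the compiler emitting a separate narrow load per member access.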

Decompression uplifts on a Neoverse V2 system, using Zstd-1.5.8
compiled with "-O3 -march=armv8.2-a+sve2":

                 Clang-20   Clang-*    GCC-13    GCC-14    GCC-15
 1#silesia.tar:   +0.820%   +1.365%   +2.480%   +1.348%   +0.987%
 2#silesia.tar:   +0.426%   +0.784%   +1.218%   +0.665%   +0.554%
 3#silesia.tar:   +0.112%   +0.389%   +0.508%   +0.188%   +0.261%

* Requires Clang-21 support from LLVM commit hash
  `a53003fe23cb6c871e72d70ff2d3a075a7490da2`
  (Clang-21 hasn’t been released as of this writing)
2025-06-23 14:16:25 +00:00
Yann Collet
3c3b8274c5 Merge pull request #4417 from facebook/dependabot/github_actions/msys2/setup-msys2-2.28.0
Bump msys2/setup-msys2 from 2.27.0 to 2.28.0
2025-06-23 06:32:14 -07:00
dependabot[bot]
7b1b6a0d2d Bump msys2/setup-msys2 from 2.27.0 to 2.28.0
Bumps [msys2/setup-msys2](https://github.com/msys2/setup-msys2) from 2.27.0 to 2.28.0.
- [Release notes](https://github.com/msys2/setup-msys2/releases)
- [Changelog](https://github.com/msys2/setup-msys2/blob/main/CHANGELOG.md)
- [Commits](61f9e5e925...40677d36a5)

---
updated-dependencies:
- dependency-name: msys2/setup-msys2
  dependency-version: 2.28.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-06-23 06:24:00 +00:00
Yann Collet
bdceb81271 Merge pull request #4415 from bgilbert/buildtype
meson: drop unused variable
2025-06-21 20:31:26 -07:00
Yann Collet
2e8ec28b30 Merge pull request #4416 from facebook/test_largeDictionary
added test-largeDictionary to dev-long CI script
2025-06-21 12:37:08 -07:00
Yann Collet
2295826266 update tests duration indications 2025-06-21 12:01:07 -07:00
Yann Collet
d77a7b6895 added test-largeDictionary to dev-long CI script 2025-06-21 11:34:10 -07:00
Yann Collet
528132e9a0 Merge pull request #4402 from mugitya03/tests
Release resources in error paths via cleanup
2025-06-21 11:33:44 -07:00
jinyaoguo
878be1c8f0 fix 2025-06-21 13:43:47 -04:00
jinyaoguo
16e13ebdeb delete 2025-06-21 13:03:13 -04:00
jinyaoguo
a74f7fcabd merge 2025-06-21 12:57:12 -04:00
Benjamin Gilbert
a4b9ebcbeb meson: drop unused variable 2025-06-20 23:34:13 -07:00
Arpad Panyik
1e9d2006ae AArch64: Use better block copy8
The vector copy is only necessary for 16-byte blocks on AArch64.

Decompression uplifts on a Neoverse V2 system, using Zstd-1.5.8
compiled with "-O3 -march=armv8.2-a+sve2":

                 Clang-19  Clang-20    GCC-14    GCC-15
 1#silesia.tar:   +0.316%   +0.865%   +0.025%   +0.096%
 2#silesia.tar:   +0.689%   +1.374%   +0.027%   +0.065%
 3#silesia.tar:   +0.811%   +1.654%   +0.034%   +0.033%
 4#silesia.tar:   +0.912%   +1.755%   +0.027%   +0.042%
 5#silesia.tar:   +0.995%   +1.826%   +0.062%   +0.094%
 6#silesia.tar:   +0.976%   +1.777%   +0.065%   +0.104%
 7#silesia.tar:   +0.910%   +1.738%   +0.077%   +0.110%
2025-06-20 17:05:41 +00:00
Yann Collet
7eefc22169 Merge pull request #4367 from ClickHouse/cfi
Add unwind information in huf_decompress_amd64.S
2025-06-19 23:41:38 -07:00
Yann Collet
354cede369 Merge pull request #4412 from Cyan4973/rm_bd
remove duplicate
2025-06-19 14:32:32 -07:00
Yann Collet
e315155cc2 removed duplicate
this file is already present as `largeDictionary.c`
2025-06-18 15:07:32 -07:00