1
0
mirror of https://github.com/facebook/zstd.git synced 2025-07-29 11:21:22 +03:00

802 Commits

Author SHA1 Message Date
40a7188130 Fix make clangbuild & add CI
Fix the errors for:
* `-Wdocumentation`
* `-Wconversion` except `-Wsign-conversion`
2022-12-21 17:31:04 -08:00
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
7f12f24cf4 Rewrite Copyright Date Ranges from -present to -2022
Apparently it's better. Somehow.

```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done

g co HEAD -- build/meson/
```
2022-12-20 12:44:56 -05:00
36d5c2f326 Update Copyright Year ('2021' -> 'present')
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f);
do
  sed -i 's/\-2021/-present/' $f;
done

g co HEAD -- .github/workflows/dev-short-tests.yml # fix bad match
```
2022-12-20 12:42:50 -05:00
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
e9797b5dc5 [pzstd] Fixes for Windows build
* Add `Portability.h` to fix min/max issues.
* Fix conversion warnings
* Assert that windowLog <= 23, which is currently always the case.
  This could be loosened, but we aren't looking to add new functionality.

Fixes on top of PR #3375 by @eli-schwartz, which added Windows CI for contrib & programs.
2022-12-19 14:09:43 -08:00
358a237484 [api][visibility] Make the visibility macros more consistent
1. Follow the scheme introduced in PR #2501 for both `zdict.h` and `zstd_errors.h`.
2. If the `*_VISIBLE` macro isn't set, but the `*_VISIBILITY` macro is, use that.
   Also make this change for `zstd.h`, since we probably shouldn't have changed
   that macro name without backward compatibility in the first place.
3. Change all references to `*_VISIBILITY` to `*_VISIBLE`.

Fixes #3359.
2022-12-16 12:54:45 -08:00
e2fc93340f Merge branch 'dev' into http-to-https 2022-12-15 10:46:13 -05:00
a78c91ae59 Use proper unaligned access attributes
Instead of using packed attribute hack, just use aligned attribute. It
improves code generation on armv6 and armv7, and slightly improves code
generation on aarch64. GCC generates identical code to regular aligned
access on ARMv6 for all versions between 4.5 and trunk, except GCC 5
which is buggy and generates the same (bad) code as packed access:
https://gcc.godbolt.org/z/hq37rz7sb
2022-12-14 16:00:37 -08:00
72845ebad2 Merge pull request #3346 from daniellerozenblit/seekable-format-empty-string
Seekable format empty string
2022-12-14 14:28:32 -05:00
4dffc35f2e Convert references to https from http 2022-12-14 06:58:35 -08:00
e767d5c7c1 [contrib][linux-kernel] Fix stack detection for newer gcc
Newer gcc versions were getting smart and omitting the `memset()`.
Get around this issue by outlining the `memset()` into a different
function. This test is still hacky, but it works...
2022-12-13 15:56:53 -08:00
aece0f258a free memory in test case 2022-12-13 08:15:16 -08:00
6ad71a3f0b Merge pull request #10 from yhoogstrate/seekable_header_skip
seekable_format no header when compressing empty string to stream
2022-12-12 17:58:58 -05:00
43de2aa17d [contrib][linux] Disable ASM in the kernel
Disable ASM in the kernel for now. It requires a few changes & setup to
get working. Instead of doing it in a zstd version update, I'd prefer to
package that change as a single patch, and propose it separately from
the version update. This makes the version update easier, and reduces
some risk.
2022-10-21 17:14:31 -07:00
330558ad52 [contrib][linux] Add zstd_common module
The zstd_common module was added upstream in commit
637a642f5c.

But the kernel specific code was inlined into the library. This commit
switches it to use the out of line method that we use for the other
modules.
2022-10-21 17:14:31 -07:00
5c1cdba7dd [contrib][linux-kernel] Generate SPDX license identifiers (#3294)
Add a `--spdx` option to the freestanding script to prefix
files with a line like (for `.c` files):

    // SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause

or (for `.h` and `.S` files):

    /* SPDX-License-Identifier: GPL-2.0+ OR BSD-3-Clause */

Given the style of the line to be used depends on the extension,
a simple `sed` insert command would not work.

It also skips the file if an existing SPDX line is there,
as well as raising an error if an unexpected SPDX line appears
anywhere else in the file, as well as for unexpected
file extensions.

I double-checked that all currently generated files appear
to be license as expected with:

    grep -LRF 'This source code is licensed under both the BSD-style license (found in the'  linux/lib/zstd
    grep -LRF 'LICENSE file in the root directory of this source tree) and the GPLv2 (found' linux/lib/zstd

but somebody knowledgable on the licensing of the project should
double-check this is the intended case.

Fixes: https://github.com/facebook/zstd/issues/3293
Signed-off-by: Miguel Ojeda <ojeda@kernel.org>

Signed-off-by: Miguel Ojeda <ojeda@kernel.org>
2022-10-18 16:35:44 -07:00
1613caf8bd use ZSTD_sequenceBound in seqBench 2022-09-09 13:04:41 -07:00
39ab02a71f Merge pull request #3257 from embg/seqBench2
Benchmark program for sequence compression API
2022-09-09 15:53:28 -04:00
0015308c0f Fix typos found by codespell 2022-09-08 23:17:00 +02:00
61c79bf0d6 Benchmark program for sequence compression API 2022-09-08 09:20:50 -07:00
6255f994d3 [largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions
Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.

V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.

Test Plan:
make and run
2022-07-21 11:55:01 -07:00
d993a288e0 [largeNbDicts] Add an option to print out median speed
Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.

Test Plan:
``
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s

$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```
2022-07-20 11:19:41 -07:00
b550f9b77e [largeNbDicts] Print more metrics into csv file
Summary:
Add column headers and data for whether it's a compression or a decompression
run, compression level, nbDicts and dictAttachPref in additional to
compr/decompr speed.

Test Plan:
Example output:

```
./largeNbDicts
Compression/Decompression,Level,nbDicts,dictAttachPref,Speed
Compression,1,1,0,300.9
Compression,1,1,1,296.4
Compression,1,1,2,307.8
Compression,1,10,0,292.3
Compression,1,100,0,293.3
Compression,3,110,0,106.0
Decompression,-1,110,-1,155.6
Decompression,-1,110,-1,709.4
Decompression,-1,120,-1,709.1
Decompression,-1,120,-1,734.6
```
2022-07-19 16:50:28 -07:00
d0c88afe6d [largeNbDicts] Fix decompression segfault in createCompressInstructions
Benchmarking decompression results in a segfault in `createCompressInstructions`
because `cctxParams` is NULL. Skip running that function if we are not benching
compression.
2022-07-19 13:55:52 -07:00
6bd5ac6713 add prefetchCDictTables to largeNbDicts 2022-06-22 16:13:07 -04:00
24364057bc fix typo
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2022-06-14 19:18:49 -04:00
2bbdc9f40e Fix FILE handle leak 2022-06-14 14:57:54 -07:00
f7ebbcd0cc Support advanced API so forceCopy/forceAttach works properly 2022-06-14 14:52:51 -07:00
e0c4863c5c largeNbDicts bugfix + improvements 2022-06-13 17:26:44 -07:00
05796796fd fix some typos
Signed-off-by: cuishuang <imcusg@gmail.com>
2022-04-26 17:40:23 +08:00
b772f53952 Typo and grammar fixes 2022-03-12 08:58:04 +01:00
498ac8238d [contrib][linux] Make zstd_reset_cstream() functionally identical to ZSTD_resetCStream()
- As referenced by Nick Terrelln ~ the ZSTD maintainer in the linux kernel, making zstd_reset_cstream() functionally identical to ZSTD_resetCStream() would be the perfect way to fix the warning without touching any core functions or breaking other parts of the code.

Suggested-by: Nick Terrell <terrelln@fb.com>
Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
2022-03-10 15:32:13 +08:00
8ff20c25f3 [contrib][linux] Use ZSTD_CCtx_setPledgedSrcSize() instead of ZSTD_CCtx_reset()
- The previous patch throws the following warning:

 ../linux/lib/zstd/zstd_compress_module.c: In function ‘zstd_reset_cstream’:
../linux/lib/zstd/zstd_compress_module.c:136:34: error: enum conversion when passing argument 2 of ‘ZSTD_CCtx_reset’ is invalid in C++ [-Werror=c++-compat]
  136 |  return ZSTD_CCtx_reset(cstream, pledged_src_size);
      |                                  ^~~~~~~~~~~~~~~~
In file included from ../linux/include/linux/zstd.h:26,
                 from ../linux/lib/zstd/zstd_compress_module.c:15:
../linux/include/linux/zstd_lib.h:501:20: note: expected ‘ZSTD_ResetDirective’ {aka ‘enum <anonymous>’} but argument is of type ‘long long unsigned int’
  501 | ZSTDLIB_API size_t ZSTD_CCtx_reset(ZSTD_CCtx* cctx, ZSTD_ResetDirective reset);
      |                    ^~~~~~~~~~~~~~~
cc1: all warnings being treated as errors

Since we have a choice to either use ZSTD_CCtx_reset or ZSTD_CCtx_setPledgedSrcSize instead of ZSTD_resetCStream, let's switch to ZSTD_CCtx_setPledgedSrcSize to not have any unnecessary warns alongside the kernel build and CI test build.

Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
2022-03-08 13:38:34 +08:00
e470c940f6 [contrib][linux] Fix a warning in zstd_reset_cstream()
- This fixes the below warning:

../lib/zstd/zstd_compress_module.c: In function 'zstd_reset_cstream':
../lib/zstd/zstd_compress_module.c:136:9: warning: 'ZSTD_resetCStream' is deprecated [-Wdeprecated-declarations]
  136 |         return ZSTD_resetCStream(cstream, pledged_src_size);
      |         ^~~~~~
In file included from ../include/linux/zstd.h:26,
                 from ../lib/zstd/zstd_compress_module.c:15:
../include/linux/zstd_lib.h:2277:8: note: declared here
 2277 | size_t ZSTD_resetCStream(ZSTD_CStream* zcs, unsigned long long pledgedSrcSize);
      |        ^~~~~~~~~~~~~~~~~

ZSTD_resetCstream is deprecated and zstd_CCtx_reset is suggested to use hence let's switch to it.

Signed-off-by: Cyber Knight <cyberknight755@gmail.com>
2022-03-07 11:55:33 +08:00
762898f5e4 Bugfix and new features for largeNbDicts benchmark 2022-02-11 13:15:16 -05:00
f17652931c seekable_format no header when compressing empty string to stream 2022-02-08 14:06:00 +01:00
cc7d23bcec Merge pull request #2965 from facebook/offbase
Converge sumtype (offset | repcode) numeric representation towards offBase
2022-01-24 15:47:42 -08:00
70df5de1b2 AsyncIO compression part 1 - refactor of existing asyncio code (#3021)
* Refactored fileio.c:
- Extracted asyncio code to fileio_asyncio.c/.h
- Moved type definitions to fileio_types.h
- Moved common macro definitions needed by both fileio.c and fileio_asyncio.c to fileio_common.h

* Bugfix - rename fileio_asycio to fileio_asyncio

* Added copyrights & license to new files

* CR fixes
2022-01-24 14:43:02 -08:00
8ea3d57de4 [build][asm] Pass ASFLAGS to the assembler instead of CFLAGS
* Add `-Wa,--noexecstack` to both `ASFLAGS` and `CFLAGS`
* Pass `ASFLAGS` to `.S` compilation instead of `CFLAGS`

Fixes #3006.
2022-01-18 15:11:29 -08:00
03903f5701 fixed minor compression difference in btlazy2
subtle dependency on sumtype numeric representation
2021-12-29 18:51:03 -08:00
99923dfc1a Add typedefs for 8bit (un)signed
To make code more expressive, add U8 and S8 typedefs
2021-12-14 23:47:57 +01:00
c284569457 [asm] Share portability macros and restrict ASM further
Move portability macros to `lib/common/portability_macros.h`. This file
only contains platform/feature detection (e.g. 0/1 macros). This file is
shared between C and ASM code, so it cannot include any C code.

Rename `HUF_` ASM macros to be `ZSTD_` prefixed, and move to the new
header.

Restrict `ZSTD_ASM_SUPPORTED` to `__GNUC__`, because we need the GAS
assembler.

Finally, only include the ASM code if we are actually going to use it.
This disables it on all Windows platforms, which should resolve the
problem brought up in Issue #2789.
2021-12-02 16:58:04 -08:00
a7a469e095 [contrib][pzstd] Fix build issue with gcc-5
gcc-5 didn't like the l-value overload for defaulted operator=. There is
no reason it needs to be l-value overloaded, so just remove it.

I'm not sure why the build broke for @mckaygerhard in Issue #2811, since
this code hasn't changed since it was added. But, there is no harm in
fixing it.

Fixes issue #2811.
2021-11-30 18:29:02 -08:00
7abebc847b Clarify documentation for -c (#2883) 2021-11-29 14:10:43 -05:00
92c41c53ee Merge pull request #2866 from terrelln/linux-O2
[linux-kernel] Don't add -O3 to CFLAGS
2021-11-16 17:23:23 -08:00
e2d01863bc [linux-kernel] Don't add -O3 to CFLAGS
It is no longer necessary to get good performance, there is only a small
speed difference between -O2 and -O3, so just stick to the default of
-O2. I've measured neutral compression speed and a ~3% decompression
speed loss in userspace with clang & gcc. I've also measured neutral
compression speed and a ~1% decompression speed loss in the kernel
benchmarks.

This also fixes the stack space usage on parisc. The compiler was buggy
for -O3 and used ~3KB of stack space for several functions. With -O2 the
problem is completely resolved, and stack space is back to a few hundred
bytes.

Additionally, we get a large code size win on gcc:

| Compiler | Before (Bytes) | After (Bytes) | Delta (Bytes) |
|----------|----------------|---------------|---------------|
| gcc-11   |         952754 |        738954 |       -213800 |
| clang-12 |         976290 |        938826 |        -37464 |
2021-11-16 15:42:36 -08:00
19eb459da3 [linux-kernel] Don't inline function in zstd_opt.c
The optimal parser is unlikely to be used in the linux kernel in
practice. There is no reason these functions should be force inlined,
since we aren't gaining anything, and are losing build size.

| Compiler | Before (Bytes) | After (Bytes) | Delta (Bytes) |
|----------|----------------|---------------|---------------|
| gcc-11   |        1142090 |        952754 |       -189336 |
| clang-12 |        1228402 |        976290 |       -252112 |

This is a temporary solution pending the resolution of PR #2862 in the
`dev` branch.
2021-11-15 20:37:30 -08:00
b10085d97d [contrib][linux-kernel] Add standard warnings and -Werror to CI
Test the kernel build with the standard warnings enabled so that we
don't miss issues like fixed in PR #2802.

Stacked on top of PR #2802 so CI passes, so it includes the fix. But, I
won't merge until it is merged, so @solbjorn gets the credit for the
fix.
2021-09-24 15:08:11 -07:00
32a8443b5c Merge pull request #2790 from solbjorn/huf-asm-kernel
[contrib][linux] Fix build after introducing ASM HUF implementation
2021-09-24 12:42:03 -07:00