1
0
mirror of https://github.com/facebook/zstd.git synced 2025-09-11 11:51:02 +03:00
Commit Graph

46 Commits

Author SHA1 Message Date
Yann Collet
56cfb7816a codemod: ZSTD_paramSwitch_e -> ZSTD_ParamSwitch_e 2024-12-20 10:36:58 -08:00
Yann Collet
4de9d637e8 minor: fix missing newline character in help page 2023-02-08 15:56:49 -08:00
Nick Terrell
40a7188130 Fix make clangbuild & add CI
Fix the errors for:
* `-Wdocumentation`
* `-Wconversion` except `-Wsign-conversion`
2022-12-21 17:31:04 -08:00
W. Felix Handte
5d693cc38c Coalesce Almost All Copyright Notices to Standard Phrasing
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do sed -i '/Copyright .* \(Yann Collet\)\|\(Meta Platforms\)/ s/Copyright .*/Copyright (c) Meta Platforms, Inc. and affiliates./' $f; done

git checkout HEAD -- build/VS2010/libzstd-dll/libzstd-dll.rc build/VS2010/zstd/zstd.rc tests/test-license.py contrib/linux-kernel/test/include/linux/xxhash.h examples/streaming_compression_thread_pool.c lib/legacy/zstd_v0*.c lib/legacy/zstd_v0*.h
nano ./programs/windres/zstd.rc
nano ./build/VS2010/zstd/zstd.rc
nano ./build/VS2010/libzstd-dll/libzstd-dll.rc
```
2022-12-20 12:52:34 -05:00
W. Felix Handte
7f12f24cf4 Rewrite Copyright Date Ranges from -present to -2022
Apparently it's better. Somehow.

```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora -o -path ./tests/regression/data-cache -o -path ./tests/regression/cache \) -prune -o -type f); do echo $f; sed -i 's/\-present/-2022/' $f; done

g co HEAD -- build/meson/
```
2022-12-20 12:44:56 -05:00
W. Felix Handte
8927f985ff Update Copyright Headers 'Facebook' -> 'Meta Platforms'
```
for f in $(find . \( -path ./.git -o -path ./tests/fuzz/corpora \) -prune -o -type f);
do
  sed -i 's/Facebook, Inc\./Meta Platforms, Inc. and affiliates./' $f;
done
```
2022-12-20 12:37:57 -05:00
Han Zhu
6255f994d3 [largeNbDicts] Second try at fixing decompression segfault to always create compressInstructions
Summary:
Freeing an uninitialized pointer is undefined behavior. This caused a segfault
when compiling the benchmark with Clang -O3 and benching decompression.

V2: always create compressInstructions but check if cctxParams is NULL before
setting CCtx params to avoid segfault.

Test Plan:
make and run
2022-07-21 11:55:01 -07:00
Han Zhu
d993a288e0 [largeNbDicts] Add an option to print out median speed
Summary:
Added an option -p# where -p0 (default) sets the aggregation method to fastest
speed while -p1 sets the aggregation method to median. Also added a new column
in the csv file to report this option's value.

Test Plan:
``
$ ./largeNbDicts -1 --nbDicts=1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.0 MB/s
Fastest Speed : 310.6 MB/s

$ ./largeNbDicts -1 --nbDicts=1 -p1 -D ~/benchmarks/html/html_8_16K.32K.dict
~/benchmarks/html/html_8_16K/*
loading 7450 files...
created src buffer of size 83.4 MB
split input into 7450 blocks
loading dictionary /home/zhuhan/benchmarks/html/html_8_16K.32K.dict
compressing at level 1 without dictionary : Ratio=3.03  (28827863 bytes)
compressed using a 32768 bytes dictionary : Ratio=4.28  (20410262 bytes)
generating 1 dictionaries, using 0.1 MB of memory
Compression Speed : 306.9 MB/s
Median Speed : 298.4 MB/s
```
2022-07-20 11:19:41 -07:00
Han Zhu
b550f9b77e [largeNbDicts] Print more metrics into csv file
Summary:
Add column headers and data for whether it's a compression or a decompression
run, compression level, nbDicts and dictAttachPref in additional to
compr/decompr speed.

Test Plan:
Example output:

```
./largeNbDicts
Compression/Decompression,Level,nbDicts,dictAttachPref,Speed
Compression,1,1,0,300.9
Compression,1,1,1,296.4
Compression,1,1,2,307.8
Compression,1,10,0,292.3
Compression,1,100,0,293.3
Compression,3,110,0,106.0
Decompression,-1,110,-1,155.6
Decompression,-1,110,-1,709.4
Decompression,-1,120,-1,709.1
Decompression,-1,120,-1,734.6
```
2022-07-19 16:50:28 -07:00
Han Zhu
d0c88afe6d [largeNbDicts] Fix decompression segfault in createCompressInstructions
Benchmarking decompression results in a segfault in `createCompressInstructions`
because `cctxParams` is NULL. Skip running that function if we are not benching
compression.
2022-07-19 13:55:52 -07:00
Elliot Gorokhovsky
6bd5ac6713 add prefetchCDictTables to largeNbDicts 2022-06-22 16:13:07 -04:00
Elliot Gorokhovsky
24364057bc fix typo
Co-authored-by: Nick Terrell <nickrterrell@gmail.com>
2022-06-14 19:18:49 -04:00
Elliot Gorokhovsky
2bbdc9f40e Fix FILE handle leak 2022-06-14 14:57:54 -07:00
Elliot Gorokhovsky
f7ebbcd0cc Support advanced API so forceCopy/forceAttach works properly 2022-06-14 14:52:51 -07:00
Elliot Gorokhovsky
e0c4863c5c largeNbDicts bugfix + improvements 2022-06-13 17:26:44 -07:00
Elliot Gorokhovsky
762898f5e4 Bugfix and new features for largeNbDicts benchmark 2022-02-11 13:15:16 -05:00
Bimba Shrestha
80053bdae3 updating cold benchmark 2020-09-10 18:51:52 -04:00
Bimba Shrestha
0301ef5d04 [bench] Extending largeNbDicts to compression (#2089)
* adding cdict_collection_t

* adding shuffleCDictionaries()

* adding compressInstructions

* adding compress()

* integrating compression into bench()

* copy paste error fix

* static analyzer uninit value complaint fix

* changing to control

* removing assert

* changing to control

* moving memcpy to seperate function

* fixing static analyzer complaint

* another hacky solution attempt

* Copying createbuffer logic
2020-05-04 10:42:22 -07:00
Yann Collet
9a3de0a535 changed name from createX to assembleX
shows that the resulting object just takes ownership of provided buffer.
2019-11-25 15:34:55 -08:00
Yann Collet
31a0abbfda updated pzstd and largeNbDicts to use the new FileNamesTable* abstraction 2019-11-06 09:10:05 -08:00
Qin Li
04a9d6b828 fix compiling errors with clang-8
Compiling with clang-8 fails with the following errors:

largeNbDicts.c:562:37: error: implicit conversion turns floating-point
number into integer: 'const double' to 'U64' (aka 'unsigned long')
[-Werror,-Wfloat-conversion]
        U64 const dTime_ns = result.nanoSecPerRun;
                  ~~~~~~~~   ~~~~~~~^~~~~~~~~~~~~

zstdcli.c:300:5: error: '@return' command used in a comment that is
not attached to a function or method declaration
[-Werror,-Wdocumentation]
 * @return 1 means that cover parameters were correct
   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

zstdcli.c:301:5: error: '@return' command used in a comment that is
not attached to a function or method declaration
[-Werror,-Wdocumentation]
 * @return 0 in case of malformed parameters
   ~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2019-07-18 19:41:00 -07:00
Josh Soref
a880ca239b Spelling (#1582)
* spelling: accidentally

* spelling: across

* spelling: additionally

* spelling: addresses

* spelling: appropriate

* spelling: assumed

* spelling: available

* spelling: builder

* spelling: capacity

* spelling: compiler

* spelling: compressibility

* spelling: compressor

* spelling: compression

* spelling: contract

* spelling: convenience

* spelling: decompress

* spelling: description

* spelling: deflate

* spelling: deterministically

* spelling: dictionary

* spelling: display

* spelling: eliminate

* spelling: preemptively

* spelling: exclude

* spelling: failure

* spelling: independence

* spelling: independent

* spelling: intentionally

* spelling: matching

* spelling: maximum

* spelling: meaning

* spelling: mishandled

* spelling: memory

* spelling: occasionally

* spelling: occurrence

* spelling: official

* spelling: offsets

* spelling: original

* spelling: output

* spelling: overflow

* spelling: overridden

* spelling: parameter

* spelling: performance

* spelling: probability

* spelling: receives

* spelling: redundant

* spelling: recompression

* spelling: resources

* spelling: sanity

* spelling: segment

* spelling: series

* spelling: specified

* spelling: specify

* spelling: subtracted

* spelling: successful

* spelling: return

* spelling: translation

* spelling: update

* spelling: unrelated

* spelling: useless

* spelling: variables

* spelling: variety

* spelling: verbatim

* spelling: verification

* spelling: visited

* spelling: warming

* spelling: workers

* spelling: with
2019-04-12 11:18:11 -07:00
Yann Collet
59a7116cc2 benchfn dependencies reduced to only timefn
benchfn used to rely on mem.h, and util,
which in turn relied on platform.h.
Using benchfn outside of zstd required to bring all these dependencies.

Now, dependency is reduced to timefn only.
This required to create a separate timefn from util,
and rewrite benchfn and timefn to no longer need mem.h.

Separating timefn from util has a wide effect accross the code base,
as usage of time functions is widespread.
A lot of build scripts had to be updated to also include timefn.
2019-04-10 12:37:03 -07:00
Peter (Stig) Edwards
2b7120ec71 -Wformat-security not needed with -Wformat=2 2019-02-01 09:28:41 +00:00
Yann Collet
34f01e600f fixed multiple conversions
from 64-bit to 32-bit
2018-12-13 14:02:22 -08:00
Yann Collet
b830ccca5c changed benchfn api
to use structure for function parameters
as it expresses much clearer than a long list of parameters,
since each parameter can now be named.
2018-11-13 13:12:50 -08:00
Yann Collet
d38063f8ae separated bench module into benchfn and benchzstd
it shall be possible to use benchfn
without any dependency on zstd.
2018-11-13 11:01:59 -08:00
Yann Collet
483759a3de Improves decompression speed when using cold dictionary
by triggering the prefetching decoder path
(which used to be dedicated to long-range offsets only).

Figures on my laptop :
no content prefetch : ~300 MB/s (for reference)
full content prefetch : ~325 MB/s (before this patch)
new prefetch path : ~375 MB/s (after this patch)

The benchmark speed is already significant,
but another side-effect is that this version
prefetch less data into memory,
since it only prefetches what's needed, instead of the full dictionary.

This is supposed to help highly active environments
such as active databases,
that can't be properly measured in benchmark environment (too clean).

Also :
fixed the largeNbDict test program
which was working improperly when setting nbBlocks > nbFiles.
2018-11-08 17:00:23 -08:00
Rohit Jain
705e0b18ab Making changes to make it compile on my laptop 2018-10-11 15:51:57 -07:00
ko-zu
b053bec2f4 Fix largeNbDicts bench for clangbuild
Remove unsigned to size_t promotion to fix implicit down conversion errors in clangbuild target.
2018-09-17 13:09:08 +09:00
Yann Collet
c49ccbc8e7 largeNbDicts : can select a nb of blocks
will automatically truncate or repeat input as needed,
to create the requested nb of blocks.
default: nb of files, eventually increased appropriately if blockSize is set
2018-09-12 11:31:28 -07:00
Yann Collet
c57a856d64 fixed minor static analyzer warning 2018-09-05 14:33:51 -07:00
Yann Collet
1d487d587f updated documentation 2018-09-04 14:57:45 -07:00
Yann Collet
11b8b8c100 silenced false-positive scan-build warning 2018-08-31 10:01:06 -07:00
Yann Collet
0ff67511e6 fixed link order for old compilers 2018-08-30 16:43:28 -07:00
Yann Collet
f76253bb70 minor : createDictionaryBuffer() can create dictionaries of different sizes 2018-08-30 16:24:44 -07:00
Yann Collet
39c55a118f fixed minor compatibility issues with older compilers 2018-08-30 16:00:57 -07:00
Yann Collet
39ef91a599 -std=c99 for largeNbDicts 2018-08-30 14:59:23 -07:00
Yann Collet
4086b2871b largeNbDicts compatible with multiple source files
splitting is disabled by default, but can be re-enabled using usual command -B#
update commands to look like zstd ones
2018-08-30 14:38:49 -07:00
Yann Collet
a5a77965d3 make all includes contrib/largeNbDicts 2018-08-29 16:17:22 -07:00
Yann Collet
d89fa814c1 added a README
for documentation
2018-08-28 18:19:19 -07:00
Yann Collet
6444c50035 increases randomness of ddict ptrs 2018-08-28 18:13:46 -07:00
Yann Collet
6c398df241 level, block size and nb dicts can be set on command line 2018-08-28 18:05:31 -07:00
Yann Collet
0c66a44d1b first working test program
measures :
- compression ratio with / without dictionary
- create one dictionary per block
- memory budget for dictionaries
- decompression speed, using one different dictionary per block

current limitations :
- only one file
- 4K blocks only
- automatic dictionary built with 4K size

dictionary can be selected on command line, with -D
2018-08-28 15:47:07 -07:00
Yann Collet
274b60e6e6 largeNbDicts can compress and compare dict vs noDict 2018-08-27 17:08:44 -07:00
Yann Collet
6782725155 first sketch for largeNbDicts test program 2018-08-26 19:29:12 -07:00