updated API inline doc and manual

regarding ZSTD_CDict created without a dictBuffer.
2025-12-24 17:21:03 +03:00 · 2019-10-28 11:15:41 -07:00
parent b4037b18ef
commit 74065da4c5
2 changed files with 30 additions and 13 deletions
--- a/doc/zstd_manual.html
+++ b/doc/zstd_manual.html
@@ -692,12 +692,17 @@ size_t ZSTD_freeDStream(ZSTD_DStream* zds);

 <pre><b>ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize,
                             int compressionLevel);
-</b><p>  When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once.
-  ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost.
+</b><p>  When compressing multiple messages or blocks using the same dictionary,
+  it's recommended to digest the dictionary only once, since it's a costly operation.
+  ZSTD_createCDict() will create a state from digesting a dictionary.
+  The resulting state can be used for future compression operations with very limited startup cost.
  ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
- `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict.
-  Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content.
-  Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. 
+ @dictBuffer can be released after ZSTD_CDict creation, because its content is copied within CDict.
+  Note 1 : Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate @dictBuffer content.
+  Note 2 : A ZSTD_CDict can be created from an empty @dictBuffer,
+      in which case the only thing that it transports is the @compressionLevel.
+      This can be useful in a pipeline featuring ZSTD_compress_usingCDict() exclusively,
+      expecting a ZSTD_CDict parameter with any data, including those without a known dictionary. 
 </p></pre><BR>

 <pre><b>size_t      ZSTD_freeCDict(ZSTD_CDict* CDict);
@@ -947,7 +952,7 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
     * to evolve and should be considered only in the context of extremely
     * advanced performance tuning.
     *
-     * Zstd currently supports the use of a CDict in two ways:
+     * Zstd currently supports the use of a CDict in three ways:
     *
     * - The contents of the CDict can be copied into the working context. This
     *   means that the compression can search both the dictionary and input
@@ -963,6 +968,12 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
     *   working context's tables can be reused). For small inputs, this can be
     *   faster than copying the CDict's tables.
     *
+     * - The CDict's tables are not used at all, and instead we use the working
+     *   context alone to reload the dictionary and use params based on the source
+     *   size. See ZSTD_compress_insertDictionary() and ZSTD_compress_usingDict().
+     *   This method is effective when the dictionary sizes are very small relative
+     *   to the input size, and the input size is fairly large to begin with.
+     *
     * Zstd has a simple internal heuristic that selects which strategy to use
     * at the beginning of a compression. However, if experimentation shows that
     * Zstd is making poor choices, it is possible to override that choice with
@@ -970,7 +981,8 @@ size_t ZSTD_sizeof_DDict(const ZSTD_DDict* ddict);
     */
    ZSTD_dictDefaultAttach = 0, </b>/* Use the default heuristic. */<b>
    ZSTD_dictForceAttach   = 1, </b>/* Never copy the dictionary. */<b>
-    ZSTD_dictForceCopy     = 2  </b>/* Always copy the dictionary. */<b>
+    ZSTD_dictForceCopy     = 2, </b>/* Always copy the dictionary. */<b>
+    ZSTD_dictForceLoad     = 3  </b>/* Always reload the dictionary */<b>
 } ZSTD_dictAttachPref_e;
 </b></pre><BR>
 <pre><b>typedef enum {
--- a/lib/zstd.h
+++ b/lib/zstd.h
@@ -808,12 +808,17 @@ ZSTDLIB_API size_t ZSTD_decompress_usingDict(ZSTD_DCtx* dctx,
 typedef struct ZSTD_CDict_s ZSTD_CDict;

 /*! ZSTD_createCDict() :
- *  When compressing multiple messages / blocks using the same dictionary, it's recommended to load it only once.
- *  ZSTD_createCDict() will create a digested dictionary, ready to start future compression operations without startup cost.
+ *  When compressing multiple messages or blocks using the same dictionary,
+ *  it's recommended to digest the dictionary only once, since it's a costly operation.
+ *  ZSTD_createCDict() will create a state from digesting a dictionary.
+ *  The resulting state can be used for future compression operations with very limited startup cost.
 *  ZSTD_CDict can be created once and shared by multiple threads concurrently, since its usage is read-only.
- * `dictBuffer` can be released after ZSTD_CDict creation, because its content is copied within CDict.
- *  Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate `dictBuffer` content.
- *  Note : A ZSTD_CDict can be created from an empty dictBuffer, but it is inefficient when used to compress small data. */
+ * @dictBuffer can be released after ZSTD_CDict creation, because its content is copied within CDict.
+ *  Note 1 : Consider experimental function `ZSTD_createCDict_byReference()` if you prefer to not duplicate @dictBuffer content.
+ *  Note 2 : A ZSTD_CDict can be created from an empty @dictBuffer,
+ *      in which case the only thing that it transports is the @compressionLevel.
+ *      This can be useful in a pipeline featuring ZSTD_compress_usingCDict() exclusively,
+ *      expecting a ZSTD_CDict parameter with any data, including those without a known dictionary. */
 ZSTDLIB_API ZSTD_CDict* ZSTD_createCDict(const void* dictBuffer, size_t dictSize,
                                         int compressionLevel);

@@ -1167,7 +1172,7 @@ typedef enum {
     *   tables. However, this model incurs no start-up cost (as long as the
     *   working context's tables can be reused). For small inputs, this can be
     *   faster than copying the CDict's tables.
-     * 
+     *
     * - The CDict's tables are not used at all, and instead we use the working
     *   context alone to reload the dictionary and use params based on the source
     *   size. See ZSTD_compress_insertDictionary() and ZSTD_compress_usingDict().