diff --git a/docs/source/en/_toctree.yml b/docs/source/en/_toctree.yml index d8fc32f380..858309391f 100644 --- a/docs/source/en/_toctree.yml +++ b/docs/source/en/_toctree.yml @@ -89,6 +89,8 @@ - sections: - local: modular_diffusers/developer_guide title: Developer Guide + - local: modular_diffusers/overview + title: Overview - sections: - local: using-diffusers/cogvideox title: CogVideoX diff --git a/docs/source/en/modular_diffusers/overview.md b/docs/source/en/modular_diffusers/overview.md index 8321430820..ecb7d4cf1f 100644 --- a/docs/source/en/modular_diffusers/overview.md +++ b/docs/source/en/modular_diffusers/overview.md @@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License. # Overview -The Modular Diffusers Framework consist of three main components +The Modular Diffusers Framework consists of three main components: ## ModularPipelineBlocks @@ -23,35 +23,37 @@ Pipeline blocks are the fundamental building blocks of the Modular Diffusers sys - [`AutoPipelineBlocks`](TODO) -Each block defines: - -**Specifications:** -- Inputs: User-provided parameters that the block expects -- Intermediate inputs: Variables from other blocks that this block needs -- Intermediate outputs: Variables this block produces for other blocks to use -- Components: Models and processors the block requires (e.g., UNet, VAE, scheduler) - -**Computation:** -- `__call__` method: Defines the actual computational steps within the block - -Pipeline blocks are essentially **"definitions"** - they define the specifications and computational steps for a pipeline, but are not runnable until converted into a `ModularPipeline` object. - -All blocks interact with a global `PipelineState` object that maintains the pipeline's state throughout execution. 
- -### Load/save a custom `ModularPipelineBlocks` - -You can load a custom pipeline block from a hub repository directly - +To use a `ModularPipelineBlocks` officially supported in 🧨 Diffusers: ```py -from diffusers import ModularPipelineBlocks -diffdiff_block = ModularPipelineBlocks.from_pretrained(repo_id, trust_remote_code=True) +>>> from diffusers.modular_pipelines.stable_diffusion_xl import StableDiffusionXLTextEncoderStep +>>> text_encoder_block = StableDiffusionXLTextEncoderStep() ``` -to save, and publish to a hub repository +Each [`ModularPipelineBlocks`] defines its requirements for components, configs, inputs, intermediate inputs, and outputs. You'll see that this text encoder block uses text encoders and tokenizers, as well as a guider component. It takes user inputs such as `prompt` and `negative_prompt`, and returns a list of conditional text embeddings. -```py -diffdiff_block.save(repo_id) ``` +>>> text_encoder_block +StableDiffusionXLTextEncoderStep( + Class: PipelineBlock + Description: Text Encoder step that generate text_embeddings to guide the image generation + Components: + text_encoder (`CLIPTextModel`) + text_encoder_2 (`CLIPTextModelWithProjection`) + tokenizer (`CLIPTokenizer`) + tokenizer_2 (`CLIPTokenizer`) + guider (`ClassifierFreeGuidance`) + Configs: + force_zeros_for_empty_prompt (default: True) + Inputs: + prompt=None, prompt_2=None, negative_prompt=None, negative_prompt_2=None, cross_attention_kwargs=None, clip_skip=None + Intermediates: + - outputs: prompt_embeds, negative_prompt_embeds, pooled_prompt_embeds, negative_pooled_prompt_embeds +) +``` + +Pipeline blocks are essentially **"definitions"** - they define the specifications and computational steps for a pipeline. However, they do not contain any model states, and are not runnable until converted into a `ModularPipeline` object.
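The idea that blocks only declare what they read and write, while a shared state carries values between them, can be sketched with a toy example. This is plain Python for illustration only - none of these classes or names are part of the diffusers API:

```py
# Toy sketch: blocks declare inputs/outputs and read/write a shared state.
# This mirrors the *idea* only; the real ModularPipelineBlocks API differs.

class ToyState:
    """Stands in for PipelineState: a shared dict of intermediate values."""
    def __init__(self, **user_inputs):
        self.values = dict(user_inputs)

class EncodeBlock:
    inputs = ["prompt"]
    outputs = ["embeddings"]
    def __call__(self, state):
        state.values["embeddings"] = f"emb({state.values['prompt']})"

class DenoiseBlock:
    inputs = ["embeddings"]
    outputs = ["image"]
    def __call__(self, state):
        state.values["image"] = f"img({state.values['embeddings']})"

def run_blocks(blocks, **user_inputs):
    # Create one shared state, let each block read from and write to it.
    state = ToyState(**user_inputs)
    for block in blocks:
        block(state)
    return state

state = run_blocks([EncodeBlock(), DenoiseBlock()], prompt="a cat")
print(state.values["image"])  # img(emb(a cat))
```

Each toy block never talks to another block directly; it only consumes and produces entries in the shared state, which is what lets blocks be recombined freely.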
+ +Read more about how to write your own `ModularPipelineBlocks` [here](TODO) ## PipelineState & BlockState @@ -67,15 +69,9 @@ You typically don't need to manually create or manage these state objects. The ` `ModularPipeline` is the main interface to create and execute pipelines in the Modular Diffusers system. -### Create a `ModularPipeline` +### Modular Repo -Each `ModularPipelineBlocks` has an `init_pipeline` method that can initialize a `ModularPipeline` object based on its component and configuration specifications. - -```py ->>> pipeline = blocks.init_pipeline(pretrained_model_name_or_path) -``` - -`ModularPipeline` only works with modular repositories, so make sure `pretrained_model_name_or_path` points to a modular repo (you can see an example [here](https://huggingface.co/YiYiXu/modular-diffdiff)). +`ModularPipeline` only works with modular repositories. You can find an example modular repo [here](https://huggingface.co/YiYiXu/modular-diffdiff). The main differences from standard diffusers repositories are: @@ -93,7 +89,7 @@ In standard `model_index.json`, each component entry is a `(library, class)` tup In `modular_model_index.json`, each component entry contains 3 elements: `(library, class, loading_specs {})` - `library` and `class`: Information about the actual component loaded in the pipeline at the time of saving (can be `None` if not loaded) -- **`loading_specs`**: A dictionary containing all information required to load this component, including `repo`, `revision`, `subfolder`, `variant`, and `type_hint` +- `loading_specs`: A dictionary containing all information required to load this component, including `repo`, `revision`, `subfolder`, `variant`, and `type_hint` ```py "text_encoder": [ @@ -114,7 +110,16 @@ In `modular_model_index.json`, each component entry contains 3 elements: `(libra 2. 
Cross-Repository Component Loading -Unlike standard repositories where components must be in subfolders within the same repo, modular repositories can fetch components from different repositories based on the `loading_specs` dictionary. In our example above, the `text_encoder` component will be fetched from the "text_encoder" folder in `stabilityai/stable-diffusion-xl-base-1.0` while other components come from different repositories. +Unlike standard repositories where components must be in subfolders within the same repo, modular repositories can fetch components from different repositories based on the `loading_specs` dictionary. For example, the `text_encoder` component will be fetched from the "text_encoder" folder in `stabilityai/stable-diffusion-xl-base-1.0` while other components come from different repositories. + + +### Create a `ModularPipeline` from `ModularPipelineBlocks` + +Each `ModularPipelineBlocks` has an `init_pipeline` method that can initialize a `ModularPipeline` object based on its component and configuration specifications. + +```py +>>> pipeline = blocks.init_pipeline(pretrained_model_name_or_path) +``` @@ -135,7 +140,6 @@ You can read more about Components Manager [here](TODO) - Unlike `DiffusionPipeline`, you need to explicitly load model components using `load_components`: ```py @@ -155,6 +159,23 @@ You can partially load specific components using the `component_names` argument, + + +### Load a `ModularPipeline` from hub + +You can create a `ModularPipeline` from a HuggingFace Hub repository with the `from_pretrained` method, as long as it's a modular repo: + +```py +pipeline = ModularPipeline.from_pretrained(repo_id, components_manager=..., collection=...) +``` + +Loading custom code is also supported: + +```py +diffdiff_pipeline = ModularPipeline.from_pretrained(repo_id, trust_remote_code=True, ...)
+``` + +Similar to the `init_pipeline` method, the modular pipeline will not load any components automatically, so you will have to call `load_components` to explicitly load the components you need. + + ### Execute a `ModularPipeline` The API to run the `ModularPipeline` is very similar to how you would run a regular `DiffusionPipeline`: @@ -170,26 +191,10 @@ There are a few key differences though: Under the hood, `ModularPipeline`'s `__call__` method is a wrapper around the pipeline blocks' `__call__` method: it creates a `PipelineState` object and populates it with user inputs, then returns the output to the user based on the `output` argument. It also ensures that all pipeline-level config and components are exposed to all pipeline blocks by preparing and passing a `components` input. -### Load a `ModularPipeline` from hub - -You can directly load a `ModularPipeline` from a HuggingFace Hub repository, as long as it's a modular repo - -```py -pipeine = ModularPipeline.from_pretrained(repo_id, components_manager=..., collection=...) -``` - -Loading custom code is also supported, just pass a `trust_remote_code=True` argument: - -```py -diffdiff_pipeline = ModularPipeline.from_pretrained(repo_id, trust_remote_code=True, ...) -``` - -The ModularPipeine created with `from_pretrained` method also would not load any components and you would have to call `load_components` to explicitly load components you need. - ### Save a `ModularPipeline` -to save a `ModularPipeline` and publish it to hub +To save a `ModularPipeline` and publish it to hub: ```py pipeline.save_pretrained("YiYiXu/modular-loader-t2i", push_to_hub=True) @@ -197,7 +202,9 @@ pipeline.save_pretrained("YiYiXu/modular-loader-t2i", push_to_hub=True) -We do not automatically save custom code and share it on hub for you, please read more about how to share your custom pipeline on hub [here](TODO: ModularPipeline/CustomCode) +We do not automatically save custom code and share it on hub for you.
Please read more about how to share your custom pipeline on hub [here](TODO: ModularPipeline/CustomCode) + +
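The `__call__` wrapper mechanism described in the execution section above - create a state from user inputs, run the wrapped blocks, return only what the `output` argument asks for - can be sketched with a toy implementation. This is plain Python for illustration; none of these names belong to the diffusers API:

```py
# Toy sketch of the __call__ wrapper: populate a state from user inputs,
# run the wrapped block, then return only the requested output.
# Names are illustrative; the real ModularPipeline internals differ.

class ToyPipeline:
    def __init__(self, block):
        self.block = block  # a callable that mutates a state dict in place

    def __call__(self, output=None, **user_inputs):
        state = dict(user_inputs)   # stands in for PipelineState
        self.block(state)
        if output is None:
            return state            # no output requested: return full state
        return state[output]        # otherwise return just the named value

def add_block(state):
    state["sum"] = state["a"] + state["b"]

pipe = ToyPipeline(add_block)
print(pipe(output="sum", a=2, b=3))  # 5
```

The caller never touches the state object directly, which matches the note above that you typically don't need to create or manage `PipelineState` yourself.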